Tuesday, April 4, 2017

Do We Still Need to Disavow in the Era of Penguin 4.0?

Posted by MarieHaynes

It has now been six months since the launch of Penguin 4.0. In my opinion, Penguin 4.0 was awesome. It took ages for Google to release this update, but when they did, it was much more fair than previous versions of Penguin. Previous versions of Penguin would cause entire sites to be suppressed if the algorithm thought that you'd engaged in manipulative link building. Even if a site did a thorough link cleanup, the suppression would remain present until Google re-ran the Penguin algorithm and recognized your cleanup efforts. It was not uncommon to see situations like this:

I saw many businesses that had looooooong periods of time of suppression — even years!

According to Google spokesperson Gary Illyes, the new version of Penguin that was released in September of 2016 no longer suppresses sites:

Now, instead of causing a sitewide demotion when Penguin detects spam, Google simply devalues that spam so that it can’t help improve a site’s rankings.

I’m guessing that it took a lot of brainpower to figure out how to do this. Google now has enough trust in their ability to find and devalue spam that they are comfortable removing the punitive aspect of Penguin. That’s impressive.

This change brings up a question that I am asked several times a week now:

If Penguin is able to devalue spam, is there any reason to disavow links any more?

I've been asked this enough times now that I figured it was a good idea to write an article on my answer to this question.

A brief refresher: What is the disavow tool?

The disavow tool was given to us in October of 2012.

You can use it by uploading a file to Google that contains a list of either URLs or domains. Then, as Google crawls the web, if they come across a URL or domain that is in your disavow file, they won’t use links from that page in their calculations of PageRank for your site. Those links also won’t be used by the Penguin algorithm when it decides whether your site has been involved in webspam.
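
For readers who would rather script this than edit a text file by hand, here is a minimal Python sketch of what such a file can look like. It assumes the plain-text format as I understand it: one entry per line, domain-level entries written with a "domain:" prefix, URL-level entries written as the full URL, and lines starting with "#" treated as comments. The function name and the example domains are made up for illustration.

```python
def build_disavow_file(domains, urls, note=""):
    """Return disavow-file text: an optional '#' comment, then one entry per line."""
    lines = []
    if note:
        lines.append(f"# {note}")
    # Normalize and de-duplicate domains, then write them with the "domain:" prefix.
    domain_entries = sorted({d.strip().lower() for d in domains if d.strip()})
    lines.extend(f"domain:{d}" for d in domain_entries)
    # URL-level entries are written as the full URL, one per line.
    lines.extend(sorted({u.strip() for u in urls if u.strip()}))
    return "\n".join(lines) + "\n"


# Hypothetical example data.
text = build_disavow_file(
    domains=["spammy-directory.example", "article-farm.example"],
    urls=["http://blog.example/comment-spam-page.html"],
    note="Low-quality directory and article links",
)
print(text)
```

The resulting text is what you would save as a .txt file and upload through the disavow tool.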

For sites that were affected by Penguin in the past, the disavow tool was an integral part of getting the suppression lifted off the site. It was essentially a way of saying to Google, “Hey... in the past we made some bad links to our site. But we don’t want you to use those links in your calculations.” Ideally, it would be best to remove bad links from the web, but that’s not always possible. The disavow tool was, in my opinion, super important for any site that was hit by Penguin.

For more in-depth information on using the disavow tool, see this Moz post: https://moz.com/blog/guide-to-googles-disavow-tool

What does Google say about using the disavow tool now?

It wasn’t long after the release of Penguin 4.0 before people started asking Google whether the disavow tool was still necessary. After all, if Google can just devalue spam links on their own, why should I have to disavow them?

Here are some replies from Google employees:

Now, the conspiracy theorists out there will say, “Of course Google wants you to disavow! They need that data to machine-learn for Penguin!”

Google has said that Penguin is not a machine learning algorithm:

And even if they ARE using disavow data for some kind of machine learning training set, really, does it matter? In my opinion, if Google is saying that we should be still using the disavow tool, I don’t think they're trying to trick us. I think it still has a real purpose.

Three reasons why I still recommend using the disavow tool

There are three main reasons why I still recommend disavowing. However, I don’t recommend it in as many cases as I used to.

1) Manual actions still exist

You do NOT want to risk getting a manual unnatural links penalty. I have written on Moz before about the many cases I've seen where a manual unnatural links penalty was devastating to the long-term health of a site.

Google employee Gary Illyes commented during a podcast that, when a Google webspam team member looks at your site’s links, they can often see labels next to the links. He said the following:

If the manual actions team is reviewing a site for whatever reason, and they see that most of the links are labeled as Penguin Real-Time affected, then they might decide to take a much deeper look on the site... and then maybe apply a manual action on the site because of the links.

In other words, if you have an unnatural link profile and you leave it up to Penguin to devalue your links rather than disavowing, then you’re at risk for getting a manual action.

Of course, if you actually do have a manual action, then you’ll need to use the disavow tool as part of your cleanup efforts along with manual link removal.

2) There are other algorithms that use links

Link quality has always been important to Google. I believe that Penguin is just one way in which Google fights against unnatural links algorithmically. One example of another algorithm that likely uses links is the Payday Loans algorithm. This algorithm isn’t just for payday loans sites; it also affects sites in many high-competition verticals.

Bill Slawski recently posted this interesting article on his thoughts about a recent patent filed by Google. In one place, the patent talks about a situation where a resource may have a large number of links pointing to it but receive a disproportionately small amount of traffic. In cases like that, the page being linked to might actually be demoted in rankings.

Now, that’s just a patent, so it doesn’t mean for sure that there's actually an algorithm behind this... but there could be! Makes you think, right?

Google is always trying to fight against link spam and Penguin is just one of the ways in which they do this. If there are links that are potentially causing my link profile to look spammy to Google, then I don’t want them to count in any calculations that Google is making.

3) Can we trust that Penguin is able to devalue all spam pointing to our site?

The official announcement from Google on Penguin is here. Here's what it says about devaluing as opposed to demoting:

"Penguin is now more granular. Penguin now devalues spam by adjusting ranking based on spam signals, rather than affecting ranking of the whole site."

This statement is not clear to me. I have questions:

  • When Google says they are “adjusting ranking,” could that also be negative adjustments?
  • Can Penguin possibly demote rankings for certain pages rather than affecting the whole site?
  • Can Penguin possibly demote rankings for certain keywords rather than affecting the whole site?

As posted above, we received some clarification on this from Google employees in a Facebook post (and again via tweets) to tell us that Penguin 4.0 doesn’t penalize, but rather devalues spam. However, these are not official statements from Google. These statements may mean that we never have to worry about any link pointing to our site ever again. Perhaps? Or they could mean that there's less need to worry than there was previously.

Personally, if my business relies on Google organic rankings in order to succeed, I'm a little leery about putting all of my trust in this algorithm’s ability to ignore unnatural links and not let them hurt me.

Who should be disavowing?

While I do still recommend use of the disavow tool, I only recommend it in the following situations:

  1. For sites that have made links for SEO purposes on a large scale – If you or an SEO company on your behalf made links in low-quality directories, low-quality article sites, bookmark sites, or as comment spam, then these need to be cleaned up. Here's more information on what makes a link a low-quality link. You can also run links past my disavow blacklist if you're not sure whether it’s a good one or not. Low-quality links like this are probably being devalued by Penguin, but they're the type of link that could lead to a manual unnatural links penalty if you happen to get a manual review by the webspam team and they haven’t been disavowed.
  2. For sites that previously had a manual action for unnatural links – I've found that if a site has enough of a spam problem to get an unnatural links penalty, then that site usually ends up collecting more spam links over the years. Sometimes this is because low-quality directories pop up and scrape info from other low-quality directories. Sometimes it's because old automated link-generating processes keep on running. And sometimes I don’t have an explanation, but spammy links just keep appearing. In most cases, sites that have a history of collecting unnatural links tend to continue to collect them. If this is the case for you, then it’s best to disavow those on a regular basis (either monthly or quarterly) so that you can avoid getting another manual action.
  3. For sites under obvious negative SEO attacks – The key here is the word "obvious." I do believe that in most cases, Google is able to figure out that spam links pointed at a site are links to be ignored. However, at SMX West this year, Gary Illyes said that the algorithm can potentially make mistakes. If you have a bunch of pharma and porn links pointing at your site, it’s not a bad idea to disavow them, but in most cases I just ignore these. Where I do recommend disavowing for negative SEO attacks is when the links pointing at your site contain anchors for keywords for which you want to rank. If it’s possible that a webspam team member could look at your link profile and think that there are a lot of links there that exist just for SEO reasons, then you want to be sure that those are cleaned up.

Who does NOT need to disavow?

If you look at your links and notice some "weird" links that you can’t explain, don’t panic!

Every site gets strange links, and often quite a few of them. If you haven’t been involved in manipulative SEO, you probably do not need to be disavowing links.

When Google takes action either manually or algorithmically against a site for unnatural linking, it's because the site has been actively trying to manipulate Google rankings on a large scale. If you made a couple of directory links in the past, you’re not going to get a penalty.

You also don’t need to disavow just because you notice sitewide links pointing to you. It can look scary to see in Google Search Console that one site is linking to you thousands of times, especially if that link is keyword-anchored. However, Google knows that this is a sitewide link and not thousands of individual links. If you made the link yourself in order to help your rankings, then sure, go ahead and disavow it. But if it just appeared, it’s probably nothing to worry about.

Borderline cases

There are some cases where it can be difficult to decide whether or not to disavow. I sometimes have trouble advising on cases where a company has hired a medium- to high-quality SEO firm that's done a lot of link building — rather than link earning — for them.

Here's an example of a case that would be difficult:

Let’s say you've been getting most of your links by guest posting. These guest posts are not on low-quality sites that exist just to post articles, but rather on sites that real humans read. Are those good links?

According to Google, if you're guest posting primarily for the sake of getting links, then these are unnatural links. Here's a quote from Google employee John Mueller:

"Think about whether or not this is a link that would be on your site if it weren’t for your actions…When it comes to guest blogging it’s a situation where you are placing links on other people’s sites together with this content, so that’s something I kind of shy away from purely from a link building point of view. It can make sense to guest blog on other people’s sites to drive some traffic to your site… but you should use a nofollow."

If you have a small number of guest posts, Google is unlikely to go after you. But what if a webspam team member looks at your links and sees that you have a very large number of links built via guest posting efforts? That makes me uncomfortable.

You could consider disavowing those links to avoid getting a manual action. It’s quite possible, though, that those links are actually helping your site. Disavowing them could cause you to drop in rankings.

This article could easily turn into a discussion on the benefits and risks of guest posting if we had the space and time. My point in mentioning this is to say that some disavow decisions are tough.

In general, my rule of thumb is that you should use the disavow file if you have a good number of links that look like you made them with SEO as your primary goal.

Should you be auditing your disavow file?

I do believe that some sites could benefit from pruning their disavow file. However, I have yet to see a report from anyone who has done this and seen a benefit that we can reasonably attribute to the recovery of PageRank flowing through those links.

If you have used your disavow file in the past in an effort to remove a manual action or recover from a Penguin hit, then there's a good possibility that you were overly aggressive in your disavow efforts. I know I've had some manual penalties that were really difficult to remove and we likely disavowed more links than were necessary. In cases like those, we could go through our disavow files and remove the domains that were questionable disavow decisions.

It’s not always easy to do this, though, especially if you've done the correct thing and have disavowed on the domain level. If this is the case, you won’t have actual URLs in your disavow file to review. It’s hard to make reavowing decisions without seeing the actual link in question.

Here's a process you can use to audit your disavow file. It gets a little technical, but if you want to give it a try, here it is:

(Note: Many of these steps are explained in greater detail and with pictures here.)

  • Download your disavow file from Google: https://www.google.com/webmasters/tools/disavow-links-main
  • Get a list of your links from Google Search Console. (It’s not a bad idea to also get links from other sources, as well.)
  • On your CSV of links, make a column for domains. You can extract the domain by using this formula, assuming your URLs are in Column B:

    =LEFT(B1,FIND("/",B1,9)-1)

    You can then use Find and Replace to replace https://, http://, and www. with blanks. Now you have a list of domains.
  • On your disavow file, get a list of domains you've disavowed by replacing domain: with blanks. (This is assuming you have disavowed on the domain level and not the URL level.)
  • Put your new list of disavowed domains on the second sheet of your links spreadsheet and fill Column B down with "disavowed".
  • Now, on the links list, we’re going to use a VLOOKUP to figure out which of our current live links are ones that we've previously disavowed. In this formula, your domains are in the first column of each spreadsheet and I've used 1000 as the total number of domains in my disavow list. Here goes:

    =VLOOKUP(A1,sheet2!$A$1:$B$1000,2,FALSE)
  • Now you can take the domains that are in your disavow file and audit those URLs.
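
If spreadsheets aren't your thing, the steps above can also be sketched in Python. This is only an illustration of the same matching logic, not a finished tool. It assumes your link URLs include the http:// or https:// prefix, and that your disavow file uses "domain:" lines. The function names and example data are hypothetical.

```python
from urllib.parse import urlparse


def domain_of(url):
    """Return the host of a URL, lowercased, without a leading 'www.'."""
    host = (urlparse(url.strip()).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def disavowed_domains(disavow_text):
    """Collect domains from 'domain:' lines, skipping comments and URL-level entries."""
    domains = set()
    for line in disavow_text.splitlines():
        line = line.strip()
        if line.lower().startswith("domain:"):
            # Reuse domain_of so both lists are normalized the same way.
            domains.add(domain_of("http://" + line[len("domain:"):].strip()))
    return domains


def links_to_audit(link_urls, disavow_text):
    """Return (domain, url) pairs for live links whose domain was disavowed."""
    disavowed = disavowed_domains(disavow_text)
    return [(domain_of(u), u) for u in link_urls if domain_of(u) in disavowed]


# Hypothetical example data.
disavow_text = """# disavowed in 2014
domain:spammy-directory.example
domain:www.guest-posts.example
http://other.example/single-page.html
"""
links = [
    "http://spammy-directory.example/listing/123",
    "https://www.guest-posts.example/my-article/",
    "https://news.example/story",
]
for domain, url in links_to_audit(links, disavow_text):
    print(domain, url)
```

Like the VLOOKUP, this is an exact match on the domain once "www." is stripped, so a link from a different subdomain of a disavowed domain would not be flagged by this sketch.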

What we’re looking for here are URLs where we had disavowed them just to be safe, but in reality, they are probably OK links.

Note: Just as in regular link auditing work, do not make decisions based on blanket metrics. While some of these metrics can help us make decisions, you do not want to base your decision for reavowing solely on Domain Authority, spam score, or some other metric. Rather, you want to look at each domain and think, “If a webspam team member looked at this link, would they think it only exists for SEO reasons, or does it have a valid purpose outside of SEO?”

Let’s say we've gone through the links in our disavow file and have found 20 links that we'd like to reavow. We would then go back to the disavow file that we downloaded from Google and remove the lines that say "domain:example.com" for each of those domains which we want to reavow.
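
For a long disavow file, removing those lines by hand gets tedious. Here is a minimal Python sketch of that removal step, assuming domain-level entries of the form "domain:example.com". The function name and sample file contents are made up for illustration, and it's worth checking the output by eye before uploading anything.

```python
def reavow(disavow_text, domains_to_reavow):
    """Return the disavow text without the 'domain:' lines for the given domains."""
    targets = {d.strip().lower() for d in domains_to_reavow}
    kept = []
    for line in disavow_text.splitlines():
        entry = line.strip().lower()
        if entry.startswith("domain:") and entry[len("domain:"):].strip() in targets:
            continue  # drop this line so the domain is no longer disavowed
        kept.append(line)  # comments, URL entries, and other domains stay as they were
    return "\n".join(kept) + "\n"


# Hypothetical example data.
original = "# cleanup 2014\ndomain:example.com\ndomain:spammy-directory.example\n"
updated = reavow(original, ["example.com"])
print(updated)
```

Everything that isn't an exact match for one of the listed domains is left untouched.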

Upload your disavow file to Google again. This will overwrite your old file. At some point in the future Google should start counting the links that you've removed from the file again. However, there are a few things to note:

  • Matt Cutts from Google mentioned in a video that reavowing a link takes "a lot longer" than disavowing. They built a lag function into the tool to try to stop spammers from reverse-engineering the algorithm.
  • Matt Cutts also said in the same video that a reavowed link may not carry the same weight it once did.

If this whole process of reavowing sounds too complicated, you can hire me to do the work for you. I might be willing to do the work at a discount if you allow me to use your site (anonymously) as a case study to show whether reavowing had any discernible benefit.

Conclusions

Should we still be using the disavow tool? In some cases, the answer to this is yes. If you have links that are obviously there for mostly SEO reasons, then it's best to disavow these so that they don’t cause you to get a manual action in the future. Also, we want to be sure that Google isn't using these links in any sort of algorithmic calculations that take link quality into account. Remember, it’s not just Penguin that uses links.

I think that it is unlikely that filing a disavow will cause a site to see a big improvement in rankings, unless the site is using it to recover from a sitewide manual action. Others will disagree with me, however. In fact, a recent Moz blog post showed a possible recovery from an algorithmic suppression shortly after a site filed a disavow. I think that, in this case, the recovery may have been due to a big algorithm change that SEOs call Fred that happened at the same time, rather than the filing of a disavow file.

In reality, though, no one outside of Google knows for sure how effective the disavow tool is now. We know that Google says we should still use it if we find unnatural links pointing to our site. As such, my advice is that if you have unnatural links, you should still be disavowing.

I’d love to hear what you think. Please do leave a comment below!

