The Definitive Guide to Identify Nofollow & Juiceless Links

One of the most important factors when negotiating any link deal is ensuring that the link is both indexed and dofollow - in other words, that it actually passes "link juice" or link power. Time and time again, clients ask how best to link out to, or get a link from, their partners, associations or government agencies.

It's not rare for a super enthusiastic client to come to me and say "Paul!!! Check out this great link I've negotiated with our ABC partner/sponsor/government agency!" And then I have to burst their bubble and tell them that the link is actually nofollow, or a 302 redirection, or on a page that isn't even indexed - and so forth. As a result, I decided to unleash the definitive guide to identifying nofollow and "juiceless" links, to avoid this situation going forward.

For you advanced SEOers, you're probably well aware of nofollow vs dofollow links and index vs noindex, and I know the real question is: "How much link juice does an individual link pass on?" Well, that's a whole other post and discussion for another day. Let's start off with the basics: in case you don't know what a nofollow link is, check out one of our old posts, which covers the definition of a nofollow link.

Down below I've identified 10 different ways webmasters can ensure that a link is nofollow or noindex in the search engines - ensuring that no link juice is passed on to your website. Once you know these methods, you'll have all the ammo you need to make sure that all future links you negotiate are in fact dofollow, indexed and passing lots of link juice = Let the Link Juice Flow!


1.) Basic Nofollow Tag:

This is by far the easiest way to identify a juiceless link. If you see the rel="nofollow" attribute within the link's HTML, then the link is nofollow and does not pass any juice. Down below is an example of a nofollow link:

<a href="http://www.yourjuicelesslink.com" rel="nofollow">Anchor text</a>

A great way to quickly see nofollow/dofollow links would be to install the SEO for Firefox plugin; once activated, nofollow links will be highlighted in red.
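If you'd rather script the check than eyeball the HTML, here's a minimal Python sketch (my own quick illustration, not an official tool - the URL is a placeholder) that prints every anchor on a page carrying rel="nofollow":

from html.parser import HTMLParser
from urllib.request import urlopen

class NofollowFinder(HTMLParser):
    # flag every <a> tag whose rel attribute contains "nofollow"
    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attrs = dict(attrs)
        if "nofollow" in (attrs.get("rel") or "").lower():
            print("juiceless (nofollow):", attrs.get("href"))

html = urlopen("http://www.example.com/").read().decode("utf-8", "replace")
NofollowFinder().feed(html)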

2.) Nofollow Page Level:

This tag is set within the <head> of a specific page. By adding it, webmasters tell the search engine robots to nofollow the content and any links on that page. The page will still be indexed within the search engine results, but your link will be nofollow and hence completely juiceless. The easiest way to identify the nofollow tag at the page level is to view the source of the page and look for the following code within the header:

<meta name="robots" content="nofollow">

If you see this tag, then you know that the page's content (links included) is nofollow and it's not worth your time and effort to have your link there.
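Here's a rough companion sketch in Python that pulls a page and checks for that page-level directive (again, the URL is just a placeholder):

import re
from urllib.request import urlopen

# fetch the page and look for a meta robots tag anywhere in the source
html = urlopen("http://www.example.com/some-page.html").read().decode("utf-8", "replace")
m = re.search(r'<meta[^>]+name=["\']robots["\'][^>]*>', html, re.IGNORECASE)
if m and "nofollow" in m.group(0).lower():
    print("Page-level nofollow - any link on this page is juiceless.")
else:
    print("No page-level nofollow meta tag found.")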

3.) Robots.txt File:

The robots.txt is a file uploaded to the root level of a website, and its main purpose is to tell the robots not to retrieve or crawl the content at a specific page or directory level; thereby making it inaccessible to the search engines and ensuring no link juice gets passed. Down below is an example of a robots.txt file:

User-agent: *
Disallow: /contact.php
Disallow: /directory-link-exchange.php*?

The Disallow: /contact.php line tells the search engines not to retrieve or crawl a specific page, in this case the Contact page, while the Disallow: /directory-link-exchange.php*? line tells the search engines not to retrieve or crawl a specific directory. For example, if you happened to negotiate a link placement on the ...directory-link-exchange.php/business page, that page and all other pages under the /directory-link-exchange.php sub-directory would not be retrieved or crawled.

It's also possible that the robots.txt file can be dynamically written and hidden from public view; it's essential that you check to see whether or not the page has been cached (enter cache:http://www.yourwebsite.com into Google).
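You don't have to read the robots.txt by hand, either; Python's standard library ships a parser for it. A small sketch, with the domain and paths as placeholders based on the example above:

from urllib.robotparser import RobotFileParser

# load and parse the live robots.txt file
rp = RobotFileParser("http://www.yourwebsite.com/robots.txt")
rp.read()
for path in ("/contact.php", "/directory-link-exchange.php/business"):
    ok = rp.can_fetch("*", "http://www.yourwebsite.com" + path)
    print(path, "->", "crawlable" if ok else "blocked by robots.txt")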

Webmasters who are unable to upload a robots.txt file can use the different meta robots tags as another option. These include:

<meta name="robots" content="nofollow"> as outlined in #2 nofollow page level. As stated before, this tag will disallow any link juice from being passed on, although the page will get indexed.

<meta name="robots" content="noindex"> which tells the search engines just to noindex the page, but the links are actually followed and can pass link juice.

<meta name="robots" content="noindex, nofollow"> which tells the search engines to both nofollow and  noindex the page; thereby blocking the search engines completely and ensuring zero juice is being passed on.

4.) 302 Redirection:

If you've followed everything that I've spoken about up to this point, you should be very familiar with 301 vs 302 redirections. In a nutshell, 301 redirections pass link juice while 302 redirections do not. Most of the time I see 302 redirections from major corporate business directories and web guides that use a counter to count the visits your listing receives from their directory. In this scenario, they use a 302 redirection to redirect the user to your specific listing page; thereby not passing on any link power from their strong sub-pages.

The easiest way to detect 301 vs 302 redirections is to use a redirection analysis tool.
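If you'd rather skip the tools entirely, a bare-bones Python sketch works too: http.client doesn't follow redirects, so the raw status code is exposed (the URL below is a placeholder for whatever tracking link you're testing):

import http.client
from urllib.parse import urlparse

url = "http://www.example-directory.com/visit?listing=123"  # placeholder
parts = urlparse(url)
conn = http.client.HTTPConnection(parts.netloc)
conn.request("GET", parts.path + ("?" + parts.query if parts.query else ""))
resp = conn.getresponse()
print(resp.status, resp.reason)                 # 301 passes juice, 302 doesn't
print("Location:", resp.getheader("Location"))  # where you're being sent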

Another option, which is a bit more complicated, is to use the Live HTTP Headers plugin for Firefox. This tool allows you to view the HTTP headers of a page while browsing. Here's an example of how to use the plugin:

You think your expensive listing on Goldbook is SEO friendly? Think again! They employ the 302 redirections from their sub-pages as well.

Check this out... I did a quick search for beer suppliers in Ajax (best hockey playoffs in years + warm weather finally = how could I resist?) and here's an example of a specific listing page (for Dial-a-Bottle) that utilizes 302 redirections, meaning the listing is actually not SEO friendly at all.

Here is the link which you can go to once you've activated the plugin.

http://www.goldbook.ca/goldbook/ajax/Beer-Ale/dialabottle.html

After you enter this URL, scroll down and click on "Visit Website" and you will be redirected to the Dial-A-Bottle website; then navigate to Tools>Live HTTP headers and wait for the information to populate. Down below is a snippet from the Live HTTP headers output which clearly shows it has detected a 302 redirection. Just scroll down to the 2nd paragraph and you will see:

HTTP/1.x 302 Found
Date: Wed, 20 May 2009 18:10:40 GMT
Server: Apache/2.0.59 (CentOS)
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
Pragma: no-cache
Location: http://www.thebeerstoredelivery.ca/

So before renewing one of your company's listings on major business directories, make sure you are well aware that they are definitely not SEO friendly if they employ 302 redirections.

5.) Meta Refresh Redirection aka Poor Man's Redirect:

This is a very basic and outdated type of redirection, used mainly when webmasters don't have access to PHP or JavaScript coding within the website. This method of redirection is becoming less popular these days and has largely been replaced by 301/302 redirections. The tag basically tells the browser to refresh the content after X seconds, redirecting the user to the specified URL. Here's an example of the meta refresh redirection within the <head> tag:

<meta http-equiv="refresh" content="5;url=http://example.com/" />

The key here is how long the redirection takes to kick in. If it's less than 4-5 seconds, then it's treated like a 301 redirection; if it exceeds 5 seconds, it is considered a 302 redirection. But again, I must stress that this technique is pretty outdated and has been replaced with the other types of redirections mentioned in this post.

Another way of detecting whether a meta refresh redirection is being used is to check the source code of the page. You can also run the link through the link analysis tool; if it returns a 200 status code, then you know they've used either a meta refresh or a JavaScript redirection, and you need to check the source code of the page to find out which one has been used.
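To automate that source check, here's a rough Python sketch that scans a page for a meta refresh and reports the delay and target (placeholder URL again):

import re
from urllib.request import urlopen

html = urlopen("http://www.example.com/").read().decode("utf-8", "replace")
m = re.search(
    r'<meta[^>]+http-equiv=["\']refresh["\'][^>]+content=["\'](\d+)\s*;\s*url=([^"\']+)',
    html, re.IGNORECASE)
if m:
    print(f"Meta refresh: {m.group(1)} second delay, redirects to {m.group(2)}")
else:
    print("No meta refresh found.")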

6.) Javascript Redirection:

This type of redirection is similar to the meta refresh redirection in that it's outdated and mainly used on static websites, where the site doesn't support any server-side language. Webmasters insert the following script on any page within their website in order to activate the redirection:

<script type="text/javascript">
// send the visitor to the target site as soon as the page loads
window.location = "http://www.yourwebsite.com";
</script>

To detect this type of redirection, it's easiest to just run the link through the link analysis tool; if it returns a 200 status code, then you know they've used either a meta refresh or a JavaScript redirection, so once again you need to check the source code of the page to find out which one has been used.
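And the same idea as a rough Python sketch: if the checker comes back with a 200 but the page still bounces you somewhere else, grep the source for a window.location assignment (placeholder URL):

import re
from urllib.request import urlopen

html = urlopen("http://www.example.com/").read().decode("utf-8", "replace")
m = re.search(r'window\.location(?:\.href)?\s*=\s*["\']([^"\']+)', html)
if m:
    print("JavaScript redirect to:", m.group(1))
else:
    print("No window.location redirect found.")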

7.) Flash Links:

This is pretty basic; any clickable links built within Flash will not pass on any link juice. You see this with Flash-based games and widgets: when you download a Flash widget or game that links back to a site, no power will be passed on**. Flash links can be easily manipulated and as a result are not counted by the search engines.

** For widgets, links can be placed within HTML in order to pass link juice.

8.) Javascript Links:

This is also basic and similar to Flash links, whereby any links built within JavaScript will not pass any link juice. A good example here is Google AdSense links: they are built with JavaScript and simply redirect you to the destination URL without passing on any link juice.

9.) Cloaking:

And now for the not so basic stuff... cloaking! This is a very advanced, and often very shady, technique which is extremely difficult to detect. Cloaking works by showing the public a version of the page with your links visible, while actually serving up a different version of the page to the search engines - with your links nofollowed or, in some cases, without any links at all.

To detect potential cloaking redirections, download the User Agent Switcher plugin for Firefox, and then set yourself as a Google bot. You can do this by going to Tools>User Agent Switcher>Options>User Agents>Add and then entering "googlebot" into the first three fields.

It is possible, however, that the cloaking is IP specific, so even though you've identified yourself as Googlebot, you won't be able to spot the cloaking because your computer's IP does not match Googlebot's IP addresses. In this case it's important to check the cache of the page you are analyzing, and set the view to 'text only' in order to see which version of the page is actually being indexed by Google.
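For a quick first pass before firing up the plugin, here's a hedged Python sketch: request the same page twice, once with a normal browser User-Agent and once claiming to be Googlebot, and compare what comes back. Identical sizes prove nothing, but a big difference is a red flag (and as noted above, IP-based cloaking will slip right past this):

from urllib.request import Request, urlopen

URL = "http://www.example.com/"  # placeholder
AGENTS = {
    "browser":   "Mozilla/5.0 (Windows; U; Windows NT 5.1) Firefox/3.0",
    "googlebot": "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
}
for name, ua in AGENTS.items():
    body = urlopen(Request(URL, headers={"User-Agent": ua})).read()
    print(f"{name}: {len(body)} bytes")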

[Ed.: it's also possible to get a potential view on some cloaking attempts by "going through" Google using Google Translate]

10.) Non-Cached Pages:

Now this is where it gets tricky. Many, if not most, non-cached pages simply aren't indexed by the search engines. They don't use any specific tricks and they don't contain no-cache meta tags; they're simply not in the index.

I've identified a good majority of the different reasons for a page not being cached, so your first step should be to go through this list and check the source code of the page to see whether any tags or redirections are being used (see #1-#9).

Another reason could be that the site is very weak and your link is buried many sub-pages deep, as is often the case with directories. The crawlers have never gone through the 3rd, 4th, 5th (and so forth) levels of the website, so those deep-level pages have never been indexed and therefore do not pass any link juice. That's why it's so important to always check the cache of a page before submitting or negotiating links to be placed on it.

Another reason is that the page could be an "orphan page", whereby it is not being linked to from any other pages within the website. A good way to check for orphan pages is to type the specific sub-page URL into Yahoo Site Explorer and check for inbound links. If there are zero inbound links, then you know it's an orphan page and therefore will not pass any link juice.

As a general rule of thumb, any non-cached page will not pass any link juice unless it is using the <meta name="robots" content="noindex"> tag as outlined in point #3 of this list - in that case the page stays out of the index, but its links are still followed.

Conclusion

It's very important to keep in mind that there is no guarantee that all search engines will treat exclusion tags or specific redirections the same way; in some cases they may not obey the robots.txt for a few months, and certain pages that have been labeled noindex can still appear within the search engines. That's why I've tried to cover all the different ways to identify whether or not a link actually passes power - to be used in combination with each other, and to show you just how many different ways a link can be nofollow/noindex/302 redirected.

Truly hope you enjoyed the post. I know this topic is sure to raise some debate, so please feel free to add your comments, questions and constructive criticism. And always remember: have fun linkin'!