Crawlability Enhancers: 7 Tips for Deep Crawling Success

by Paul Teitelman April 14th, 2010 

One of the biggest issues I’ve seen with Google lately comes down to crawlability and the way larger sites are being indexed. Crawlability and indexability are among the most important aspects of SEO, because your site needs to be both crawled and indexed before it can get any organic traffic from Google and the other search engines.

The importance of crawlability and indexability cannot be overstated: without good deep-level crawling, your site’s pages (especially deep-level pages) won’t get indexed, which means those pages (and your site) won’t get the search engine traffic they deserve. Crawlability and indexability are separate from one another, and just because your site is being crawled by Googlebot doesn’t mean its pages are being indexed. The end goal is clear: get as many pages of your site as possible crawled and then indexed by Google, which is the only way to receive organic search traffic from Google and the other search engines.

Before we get into the tips, it’s important to state that the issues raised in this blog post are really aimed at larger websites with upwards of 10,000 pages. Smaller 10-50 page websites shouldn’t have these problems with crawlability or with pages appearing in and disappearing from the index. Another caveat: a crawled page is not necessarily an indexed page. Google may feel there isn’t a strong enough case to keep that page in the index, for instance when a page has duplicate content or very little content. In short, there needs to be a legitimate reason for Google to keep your page in the index, and I’ll provide some tips in a follow-up post to help you improve your indexability and maintain your indexed pages.

7 Tips for Enhancing Crawlability


1.) Make sure the site’s navigation and information architecture are readable by the search engines

This really should be a given, but a lot of sites still fail basic SEO principles: avoid Flash menus and complicated JavaScript (or any JavaScript at all, if you can) in the navigation. Also avoid AJAX navigation. Finally, if you use fly-out or drop-down menus, make sure they are built in CSS.

Goal: Make sure that you don’t shoot yourself in the foot with site navigation that the search engines can’t read. If the navigation isn’t readable by the search engines, your deep sub-pages are not going to get crawled and, as a result, won’t receive any traffic.
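As a rough way to sanity-check a navigation menu, here is a minimal Python sketch that lists only the links a crawler could follow without running any JavaScript. It is an approximation under stated assumptions, not a description of how any search engine actually parses pages: the `SAMPLE_NAV` markup, the `openMenu`/`loadSection` handlers and the function names are all made up for illustration.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collects href values from plain <a> tags, ignoring script-only links."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # Skip links that only work when JavaScript runs (assumed heuristic).
        if href and href != "#" and not href.lower().startswith("javascript:"):
            self.links.append(href)


def crawlable_links(html):
    """Return the hrefs that are followable without executing JavaScript."""
    parser = LinkExtractor()
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical navigation markup: two plain links, two script-driven ones.
SAMPLE_NAV = """
<ul id="nav">
  <li><a href="/services/">Services</a></li>
  <li><a href="/blog/">Blog</a></li>
  <li><a href="javascript:openMenu('products')">Products</a></li>
  <li><a href="#" onclick="loadSection('contact')">Contact</a></li>
</ul>
"""

print(crawlable_links(SAMPLE_NAV))  # prints ['/services/', '/blog/']
```

In this sample only the Services and Blog links survive; the Products and Contact pages would need plain `href` links to be discoverable the same way.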

2.) Link to the most important pages from within the site navigation off of the homepage

Link to all of your most important pages from the site navigation on your homepage. Include key service or product pages, blogs, directories and additional second-level pages of high importance. You can also link to additional sub-pages in the footer or sidebar if you aren’t able to squeeze them all into the header navigation.

Goal: The whole point here is to pass PageRank (PR) through the homepage and down to the most important sub-pages within your site.

3.) Have good interlinking between the most important pages

Interlink the most important pages with each other in order to share and distribute link power amongst these pages. Good opportunities for interlinking are:

- Blog posts: link to the homepage and to specific sub-pages mentioned in the post

- Related products/services: link to products or services that are similar (if you have an ecommerce site)

- Locations: link to other nearby areas and cities (if you have a directory site or directory structure)

Goal: To distribute link juice amongst the most important pages of the site and to keep the link power consistent among these pages. Once again you’re passing PR to the most important pages, and with interlinking you’re sharing and continuously distributing this link power throughout the site.
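To make the idea of "passing PR" a bit more concrete, here is a toy Python sketch of the simplified PageRank iteration from the published literature, run on a four-page site. It is only an illustration: the page names, the damping factor of 0.85 and the iteration count are assumptions, and Google's real ranking systems are far more involved than this.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank over a dict of page -> list of linked pages."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with the same small baseline.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its rank evenly across its outgoing links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank everywhere.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical site: "deep-page" has no internal links pointing at it.
SITE_WITHOUT_INTERLINK = {
    "home": ["services", "blog"],
    "services": ["home"],
    "blog": ["home"],
    "deep-page": ["home"],
}

# Same site, but the blog now also links to "deep-page".
SITE_WITH_INTERLINK = dict(SITE_WITHOUT_INTERLINK, blog=["home", "deep-page"])

without_links = pagerank(SITE_WITHOUT_INTERLINK)
with_links = pagerank(SITE_WITH_INTERLINK)
print(without_links["deep-page"], with_links["deep-page"])
```

In this toy model the deep page scores higher once the blog links to it, which is the effect interlinking is meant to produce.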

4.) Inbound links to both the homepage and deeper level hubs of the site

This is one of the most important factors in improving crawlability. Build up homepage authority and PR in order to distribute that power to the deeper levels of the site.

If you don’t build up the authority of the homepage, deep-level crawling will be very difficult. Many lower-level pages of a site won’t get crawled unless there is a continuous and powerful flow of link juice from the homepage down through to the lower levels of the site.

Another very important strategy is to build links to the major hubs or intersections of your site. You shouldn’t rely solely on the homepage to funnel link power down; besides, deep linking is good SEO strategy either way, and especially when it comes to crawlability. For a directory structure, a good hub to target would be the main city-level pages rather than every specific business within each city. For an ecommerce site, you would target the major top-level product or brand categories rather than the specific product pages under each brand or category.

Goal: To improve the power and authority of the site by building up the PR of the homepage, and to spread that power by linking to deeper levels and major hubs. Links to hubs increase the crawlability of pages that sit so deep within the site that they aren’t getting any link love from the homepage. This counteracts a problem faced by large websites that have many categories and hundreds of specific listings or pages under each one: the homepage can only distribute so much link juice to the lower levels, which is why you also need to build links to the major hubs of the site.
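One way to picture why hub links help is to count how many clicks a crawler needs to reach each page from wherever it enters the site. The Python sketch below does that with a breadth-first search. It is a toy model under assumptions: the directory structure, the page names and the idea of treating externally linked pages as "entry points" are all invented for illustration.

```python
from collections import deque


def crawl_depths(links, entry_points):
    """Breadth-first search: clicks needed to reach each page from any entry point."""
    depths = {page: 0 for page in entry_points}
    queue = deque(entry_points)
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


# Hypothetical directory site: homepage -> city list -> city hub -> listing.
SITE = {
    "home": ["cities"],
    "cities": ["toronto", "ottawa"],
    "toronto": ["toronto/plumber-a"],
    "ottawa": ["ottawa/plumber-b"],
}

# Inbound links point only at the homepage.
home_only = crawl_depths(SITE, ["home"])

# Inbound links also point at the Toronto city hub.
with_hub = crawl_depths(SITE, ["home", "toronto"])

print(home_only["toronto/plumber-a"])  # prints 3
print(with_hub["toronto/plumber-a"])   # prints 1
```

The Ottawa listing, whose hub has no inbound links in this example, stays three clicks deep in both runs.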

5.) Have a good sitemap and make sure it’s submitted to the search engines

It doesn’t take that long and it does help, so make sure you have a good, dynamically updated sitemap and that you submit it through both Google Webmaster Central and Bing’s webmaster tools. Include a link to the sitemap in your robots.txt file.

Goal: This is a more direct way of telling the search engines about your site’s information architecture and showing the bots the paths to the more important pages on your site.
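For anyone generating the sitemap themselves, here is a minimal Python sketch that builds an XML sitemap in the sitemaps.org format and the matching robots.txt line. The example.com URLs, the dates and the `build_sitemap` helper are placeholders; on a real site the page list would come from your database or CMS, and very large sites have to split their sitemaps into several files.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build sitemap XML from (url, last_modified) pairs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "lastmod").text = lastmod  # YYYY-MM-DD
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Placeholder pages; in practice these would be pulled from the site itself.
PAGES = [
    ("https://www.example.com/", "2010-04-14"),
    ("https://www.example.com/services/", "2010-04-10"),
]

sitemap_xml = build_sitemap(PAGES)
print(sitemap_xml)

# Line to add to robots.txt so crawlers can find the sitemap.
robots_line = "Sitemap: https://www.example.com/sitemap.xml"
print(robots_line)
```

Regenerating this file whenever pages are added or removed is what keeps the sitemap "dynamically updated".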

6.) Have a good RSS feed

Contrary to popular belief, you don’t actually need a blog to have an RSS feed. You can use the feed for products, listings of new products, upcoming events, sales and other news-related updates, or anything else that can be itemized and that changes over time.

Goal: To use the RSS feed as a link resource and as a way of pointing Google to the most important pages on your site.
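As an illustration of a feed that has nothing to do with a blog, here is a short Python sketch that builds a basic RSS 2.0 feed from a list of new products. The store name, URLs, product data and the `build_rss` helper are all hypothetical, and the sketch covers only the basic channel and item elements.

```python
import xml.etree.ElementTree as ET


def build_rss(title, link, description, items):
    """Build a minimal RSS 2.0 feed from a list of item dicts."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = description
    for item in items:
        node = ET.SubElement(channel, "item")
        ET.SubElement(node, "title").text = item["title"]
        ET.SubElement(node, "link").text = item["link"]
        ET.SubElement(node, "description").text = item["description"]
    return ET.tostring(rss, encoding="unicode")


# Hypothetical new-product listings for an ecommerce site.
NEW_PRODUCTS = [
    {
        "title": "Maple Snare Drum",
        "link": "https://www.example.com/drums/maple-snare/",
        "description": "New this week: 14-inch maple snare.",
    },
    {
        "title": "Hockey Stick Tape",
        "link": "https://www.example.com/hockey/stick-tape/",
        "description": "Back in stock.",
    },
]

feed = build_rss(
    "Example Store: New Products",
    "https://www.example.com/",
    "Latest product listings",
    NEW_PRODUCTS,
)
print(feed)
```

Each item links straight to a deep product page, which is what makes the feed useful as a crawl path.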

7.) Make sure your page load time is fast

This is a pretty basic point, but if you have a big site then you need a fast server so that Googlebot can crawl as many pages as possible in the shortest amount of time. Check the speed reports in Google Webmaster Central to see how fast your server truly is.

The point here is to make sure that the size of your site’s pages (in KB) is as small as possible, which helps decrease page loading times. Two tips are to move any JavaScript or CSS to external files, and to remove any unused or commented-out code from your pages.

Goal: Simply put, to have the fastest-loading website and pages possible. This requires a fast server that can handle the number of requests coming in from the search engines (crawlability) and users (traffic).
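To get a feel for how much weight commented-out code adds, here is a small Python sketch that strips HTML comments from a page and compares the size before and after. The sample page is invented, and the regular expression is a simplification: it leaves IE conditional comments alone but does not understand comments inside inline scripts, so treat it as a measuring aid rather than a production minifier.

```python
import re

# Matches HTML comments, but not IE conditional comments such as <!--[if IE]>.
HTML_COMMENT = re.compile(r"<!--(?!\[if).*?-->", re.DOTALL)


def strip_html_comments(html):
    """Remove ordinary HTML comments from the markup."""
    return HTML_COMMENT.sub("", html)


def page_weight_kb(html):
    """Size of the markup in kilobytes when encoded as UTF-8."""
    return len(html.encode("utf-8")) / 1024


# Hypothetical page with leftover commented-out markup.
PAGE = """<html>
<head>
<!-- old tracking snippet, no longer used -->
<link rel="stylesheet" href="/css/site.css">
<script src="/js/site.js"></script>
</head>
<body>
<!-- <div id="promo">Expired spring sale banner</div> -->
<p>Hello</p>
</body>
</html>"""

cleaned = strip_html_comments(PAGE)
print(f"before: {page_weight_kb(PAGE):.3f} KB")
print(f"after:  {page_weight_kb(cleaned):.3f} KB")
```

The sample already references its CSS and JavaScript as external files, in line with the first tip above.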

Conclusion

Once again, crawlability and indexability are some of the most important aspects of SEO as you need your site to get both crawled and indexed in order to get any organic traffic from Google and the other search engines. Stay tuned as next week I'll assist you with improving your indexability and maintaining your indexed pages.

Paul Teitelman

I'm an SEO Manager here at SEP and am responsible for overseeing the organic ranking of clients for their major keywords. When I'm not in front of computers my main passions are drumming, hockey and hanging out up north at my cottage in Muskoka.


6 Responses to “Crawlability Enhancers: 7 Tips for Deep Crawling Success”

  1. Kimi says:

    About the menubar, didn't know it before reading this post. I was interested in creating the navigation with javascript, your post makes me think twice lol :D

  2. FF says:

    Great article Paul, you make a lot of valid points. I tell a lot of clients about the simple things that you mention, some take it on board, some don't.

  3. [...] Crawlability Enhancers: 7 Tips for Deep Crawling Success – this is an interesting area (discovery/indexation) I’ve been personally doing some research into lately. This post is actually quite important for those that aren’t actively seeking better craw-ability. Read it! [...]

  4. [...] is to get the pages initially indexed and then more importantly, to keep those pages in the index. As mentioned earlier, just because a page is crawled doesn’t mean it will necessarily get indexed, so make sure that [...]

  5. Philip says:

    All the points are good, but the post leaves out the most important information. To wit, the crawl rate and the depth of crawling are directly proportional to your PR/backlinks (Google has recently admitted this). Another major omission is URL canonicalization for duplicate content, not to mention more minor ones.

  6. Michael Lucy says:

    Hi Paul .. Thank you for the article, we will be citing this publication for some upcoming SEO workshops we have scheduled here in Detroit .. Is there a generalized strategy for linking back to your site; example: should you go about building backlinks by focusing on homepage for the initial part of your campaign THEN after a certain amount of backlinks are established start driving traffic to deeper more embedded parts of your site? Vice Versa or a mixture of both? Is there an order of priority?