"Thin" content is one of the most misused and loose terms in our industry. Whenever there's an inexplicable drop in rankings, a lot of experts are ready to jump in and blame an alleged "thin" or "duplicate" content penalty.
On the other hand, whenever Google is willing to confirm an update, they include "thin content" in the list of possible reasons for a hit, making it next to impossible to diagnose the issue properly.
Let's look at the term in more detail and try to create an easy-to-digest (and, more importantly, easy-to-act-on) definition for it.
What Is Thin Content?
Let's first look at the usual way search experts describe the phenomenon:
Thin content is anything that doesn't provide any real value to the people who visit your site. It is badly written articles that are only there for SEO. It is something generated purely to draw in clicks, or full pages only there to target a different variation of a keyword. Finally, thin content is something people would never share on social media.
All of the above is a highly subjective way of looking at the concept. There's no good way (at least at our disposal) to identify whether content is badly written (unless you read each page of the site) or whether it objectively provides any value. And Google has clearly confirmed that they are not using social media signals (whether a page was shared on social media or not) to determine its quality either.
While the above definition of thin content may be true in general, it doesn't provide any actionable value. By looking at a website, how do we determine if there's thin content somewhere inside?
For the sake of simplicity, here's my action-oriented definition of thin content:
- Content that has a low word count (pages with little to no content on them, usually, just a paragraph or two)
- Content that is already repeated throughout the site (internal duplicate content)
The most important thing to remember here is that both of the above become an issue ONLY on a large scale. If you have one or two pages on your website with little content, there's nothing to worry about. If 70% of your site's pages have fewer than 300 words, that is very likely the reason why you cannot get it to appear in Google's top 10.
Another important thing about a large amount of thin content on your site is that it may drag your higher-quality pages down too. To put it simply, if 30% of your site's pages are well-written, well-researched long-form content, they may still have trouble ranking because of the other 70% of thin pages.
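To make the "large scale" idea concrete, here is a minimal sketch of how you might measure the share of thin pages from a crawl export. The input (a mapping of URL to word count) and the function name are assumptions made for illustration, and the 300-word cutoff is simply the example figure used above, not an official threshold.

```python
def thin_share(word_counts, min_words=300):
    """Return (share of thin pages, sorted list of thin URLs).

    word_counts: dict mapping URL -> word count, e.g. built from a
    crawler export (hypothetical input format).
    min_words: illustrative cutoff taken from the example above.
    """
    if not word_counts:
        return 0.0, []
    thin = sorted(url for url, count in word_counts.items() if count < min_words)
    return len(thin) / len(word_counts), thin


# Made-up sample data for demonstration only.
sample = {"/a": 120, "/b": 1500, "/c": 250, "/d": 40}
share, thin_urls = thin_share(sample)
print(f"{share:.0%} of pages are below the cutoff: {thin_urls}")
```

A share close to zero suggests the handful of short pages is not worth worrying about, while a share in the majority is the kind of large-scale pattern described above.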
Google vs Thin Content: Brief History
Believe it or not, Google has a long history of dealing with the above issue. Back when they had the Supplemental Index, those 70% of thin-content pages were not such a big issue because they didn't affect your overall rankings. Google would just put them in the Supplemental Index and treat your higher-quality pages separately.
With the introduction of the Panda update (which is now part of Google's algorithm, mind you), everything changed.
For more context, check out this video by Jim Boykin:
Whenever your site is affected, almost always the answer is "You have too much thin content"
Another well-discussed update targeting "thin content" that came after Panda was Fred, which devalued low-quality affiliate sites with thin content.
How to Identify Thin Content on Your Site
The easiest way to identify thin content (and where you may want to start) is:
1. Use an SEO crawler to find pages with the lowest word count
I am using Netpeak Spider here: Download the trial version, crawl your (or your client's) website, and scroll to the column called "Content size":
2. Use an SEO crawler to find content that can potentially cause internal duplicate content issues
Using the same tool, check the "Duplicate Titles" and "Duplicate Descriptions" sections in the "Issues" sidebar.
Note: In the screenshot above, there's no large-scale issue, but those 8 pages still need to be addressed because they may be competing with one another in search results.
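If you would rather check for duplicate titles yourself (for example, from a spreadsheet export of URLs and titles), the grouping logic is simple. The sketch below is only an illustration: the input format and the function name are assumptions, and it is not how Netpeak Spider or any other crawler implements its report.

```python
from collections import defaultdict


def find_duplicate_titles(titles_by_url):
    """Group URLs that share the same title.

    titles_by_url: dict mapping URL -> page title (hypothetical export format).
    Titles are compared after collapsing whitespace and ignoring case.
    Returns a dict of normalized title -> sorted list of URLs, for titles
    used by more than one page.
    """
    groups = defaultdict(list)
    for url, title in titles_by_url.items():
        normalized = " ".join(title.split()).casefold()
        groups[normalized].append(url)
    return {title: sorted(urls) for title, urls in groups.items() if len(urls) > 1}


# Made-up sample data for demonstration only.
sample = {
    "/hiking-boots": "Best Hiking Boots",
    "/hiking-boots-2": "best hiking  boots",
    "/tents": "Camping Tents",
}
print(find_duplicate_titles(sample))
```

The same function works for meta descriptions if you pass a URL-to-description mapping instead.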
The above two exercises should have given you a solid list to look at. Now proceed to your Google Analytics dashboard and look at some of those pages: Do they ever receive any clicks?
For more details, proceed to this article by Ahrefs explaining how they identified thin content on their blog:
So, what are some of the red flags?
1. Look at those backlinks
After the above exercises, you should have a solid list of pages that may be deemed thin. Go ahead and check whether any of them have backlinks. From Ahrefs' guide above, here's another screenshot of how to do that:
Note: Backlinks can be a solid positive signal to consider. And I am not talking about "link juice" here. What I mean is: if people were linking to that page of yours, it must have offered them some value, so probably that thin content is not that thin after all!
2. Check if your content is being stolen
This one isn't about what you have done, but what someone else has. Content is stolen all the time and posted on other websites in an attempt to steal traffic. While this is a less common issue (Google can figure things out quite well), it is something to look at.
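The traffic and backlink checks described in this section can be combined into a rough first-pass triage. This is a sketch under stated assumptions: the `PageSignals` fields stand in for whatever your analytics and backlink exports provide, and the rule (any clicks or any backlinks means the page shows some sign of value) is a deliberately simple illustration, not a definitive quality test.

```python
from dataclasses import dataclass


@dataclass
class PageSignals:
    url: str
    clicks: int     # hypothetical field, e.g. from an analytics export
    backlinks: int  # hypothetical field, e.g. from a backlink tool export


def triage(pages):
    """Split candidate thin pages into two lists of URLs.

    Returns (showing_value, no_signals): pages with at least one click or
    backlink, and pages with neither.
    """
    showing_value, no_signals = [], []
    for page in pages:
        if page.clicks > 0 or page.backlinks > 0:
            showing_value.append(page.url)
        else:
            no_signals.append(page.url)
    return showing_value, no_signals


# Made-up sample data for demonstration only.
candidates = [
    PageSignals("/short-but-linked", clicks=0, backlinks=4),
    PageSignals("/forgotten-tag-page", clicks=0, backlinks=0),
    PageSignals("/short-faq", clicks=35, backlinks=0),
]
keep, review = triage(candidates)
print("Showing some value:", keep)
print("No signals at all:", review)
```

Pages in the first list may not be that thin after all, while pages in the second list are the ones to look at first.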
How to Fix Thin Content Issues
Now let's move on to the long-awaited part: Is there a way to fix it? There are two options to choose from:
1. Improve your old content with new upgrades.
The best option is to improve the content itself and see if that helps to increase the interest in the original piece. Googlers seem to be of the same opinion: It's best to work on it rather than get rid of it:
— Gary "鯨理" Illyes (@methode) October 7, 2015
Keep in mind that you will have to market it again so your users are aware of the improvement. Label it as an update if it gained traction the first time, but not enough.
It helps to use keyword research tools to see how best to expand it. I use Ubersuggest's Keyword Tool to discover related terms and concepts I may want to incorporate into the copy to turn it into better-researched, more valuable content. For example, type in [hiking] to see all kinds of related terms (which can be used to expand the initial idea):
Ubersuggest is free and requires no login, so I find it easy to use when delegating research to remote editors and writers.
While updating and expanding your content, you may also want to look at re-packaging and featured snippet opportunities. Featured snippet research may help you get some extra visibility in search, and it will almost certainly give you lots of ideas on how to improve your content quality by breaking it into sections and providing definitions, lists, and charts.
Note: When you are dealing with huge database-driven websites with hundreds of products or user-generated listings (that have little or no original content), you'll need to consider a more complicated / technical approach. Check out this Whiteboard Friday article discussing just that: How to deal with non-original content on a large scale. For example:
- Find ways to organize / visualize non-original / user-generated content in a way that provides new value (generate scores, charts, comparisons, etc.)
- Use available industry APIs to generate more information on each page to help your users (For example, if you are in the real estate niche, you could use an API to show crime records for each neighborhood or school stats in the given district. This would help you make a syndicated property listing much more useful to users).
2. Remove it (or go with a no-index meta tag)
One of the simplest ways to get rid of thin, useless content is just to remove it entirely. But if there has been some interest in the post, you might prefer to noindex it, which means the page stays available to your visitors while Google is told to keep it out of its index.
While I consider this a last-resort measure, sometimes it's unavoidable when you are dealing with a large number of pages that need to be fixed.
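For reference, the noindex directive is usually added as a robots meta tag in the page's head, i.e. `<meta name="robots" content="noindex">`. After rolling it out across many pages, it can be worth verifying that the tag is actually present. The sketch below uses only Python's standard library; the function and class names are made up for illustration, and it only inspects the HTML you pass in (it does not look at HTTP headers such as `X-Robots-Tag`).

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Records whether a robots meta tag with a noindex directive was seen."""

    def __init__(self):
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = {name: (value or "") for name, value in attrs}
        if attributes.get("name", "").lower() != "robots":
            return
        directives = [d.strip().lower() for d in attributes.get("content", "").split(",")]
        if "noindex" in directives:
            self.noindex = True


def has_noindex(html):
    """Return True if the HTML contains <meta name="robots"> with noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.noindex


# Made-up sample markup for demonstration only.
page = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
print(has_noindex(page))
```

You would feed it the HTML of each page you meant to noindex and flag any page where it returns False.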
Do you have thin content? Maybe you have some tips on how you got yours up to snuff? Let us know in the comments!