Search Engine People - Search Engine Positioning, Placement Service

How Search Really Works: The Keyword Density Myth

by Ruud Hein
February 1, 2008

This post is part of an ongoing series: How Search Really Works.
Last week: Keyword Stuffing.

What is Keyword Density?

Keyword Density is a function, a calculation, of keyword frequency.

It’s calculated as the number of times the keyword occurs divided by the total number of words in the document, and is usually expressed as a percentage.
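To make the arithmetic concrete, here is a minimal sketch of that calculation in Python. The function name `keyword_density`, the simple tokenizer and the sample sentence are all assumptions made for illustration; real tools differ in how they split text into words.

```python
import re


def keyword_density(text: str, keyword: str) -> float:
    """Return occurrences of `keyword` divided by total words, as a percentage."""
    # Naive tokenizer (an assumption): lowercase runs of letters, digits, apostrophes.
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        return 0.0
    occurrences = words.count(keyword.lower())
    return 100.0 * occurrences / len(words)


# Hypothetical sample text: 11 words, 3 of which are "cheap".
sample = "Cheap flights to Paris. Book cheap flights today and fly cheap."
density = keyword_density(sample, "cheap")
print(f"{density:.1f}%")  # prints 27.3%
```

Note that the only input is the document itself. Nothing about the index the document lives in enters the calculation.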

[Image: keyword density example]

What is Keyword Density Used For?

Nothing much, really.

Keyword density can help in readability calculations.

Keyword density is also sometimes used as a simplified way to introduce local keyword weight, but the two should never be confused.

Why don’t Search Engines use Keyword Density?

[Image: local keyword density]

Search engines deal with calculations that say something about the words in a document in relation to the index that document appears in.

Keyword density says something about words in a document in relation to the document itself. It doesn’t help you to compare and thus sort or rank a set of documents.

Frequency <> Relevance

The fact is that frequency in and of itself doesn’t equate to relevance.

The word “the” is the most commonly used English word: it appears with the highest frequency. If a search engine calculated relevance as frequency, virtually every document in its index would have “the” as its topic.

Likewise, the word “time” is the most commonly used English noun. By the same logic, a multitude of documents would be relevant to “time” before anything else.
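A quick way to see this for yourself is to count the words in any ordinary piece of text. The sketch below uses a made-up sentence about search engines; the helper name `most_frequent_words` and the sample text are assumptions for illustration only.

```python
import re
from collections import Counter


def most_frequent_words(text, n=3):
    """Return the n most frequent words in `text` as (word, count) pairs."""
    # Naive tokenizer (an assumption): lowercase runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words).most_common(n)


# Hypothetical document: it is about crawling, indexing and ranking.
doc = ("The crawler fetches the page, the indexer stores the words, "
       "and the ranker sorts the results for the searcher.")

top_word, top_count = most_frequent_words(doc, 1)[0]
print(top_word, top_count)  # prints: the 7
```

Ranked by raw frequency, this document is “about” the word “the”, which is clearly not what a searcher would consider relevant.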

Keyword Weight

To make sense of word occurrences in a document a search engine has to see those words in the context of its index.

This is done by calculating the overall importance of words both in the document and in the index.

This importance is called term weight.

To calculate the importance of a word in a document, 3 variables are needed:

  • local weight: a calculation based on keyword frequency in this document. This variable can be calculated in many ways, but it is not simply a straightforward count of how many times the word appears in the document.
  • global weight: a calculation based on the number of documents in the index divided by the number of documents that contain the keyword.
  • normalization: a calculation designed to remove the unfair advantages and disadvantages of document length. Usually the end values are expressed between 0 and 1.

None of the search engines have ever disclosed which published or unpublished scales they use for local weight or global weight.

What we’re looking for is high values for terms (words or phrases) that occur many times in the relevant documents but infrequently in the index as a whole.
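The three variables can be sketched in a few lines of Python. Since no search engine has disclosed its formulas, everything specific here is an assumption: the local weight is a log-dampened frequency, the global weight is the logarithm of the documents-in-index to documents-with-keyword ratio, and normalization divides by the vector length. These are common textbook choices, not the method of any particular engine. The function names and the three-document index are hypothetical too.

```python
import math
import re


def tokenize(text):
    # Naive tokenizer (an assumption): lowercase runs of letters, digits, apostrophes.
    return re.findall(r"[a-z0-9']+", text.lower())


def term_weights(index):
    """Return one {term: weight} dict per document in `index` (a list of strings)."""
    docs = [tokenize(d) for d in index]
    n_docs = len(docs)

    # Count in how many documents each term appears.
    doc_freq = {}
    for words in docs:
        for term in set(words):
            doc_freq[term] = doc_freq.get(term, 0) + 1

    weighted = []
    for words in docs:
        raw = {}
        for term in set(words):
            local = 1.0 + math.log(words.count(term))       # dampened frequency
            global_ = math.log(n_docs / doc_freq[term])     # rarity in the index
            raw[term] = local * global_
        # Normalization: divide by vector length so values fall between 0 and 1.
        length = math.sqrt(sum(w * w for w in raw.values()))
        weighted.append({t: (w / length if length else 0.0) for t, w in raw.items()})
    return weighted


# Hypothetical three-document index.
index = [
    "the keyword density myth and the keyword stuffing myth",
    "the index stores the documents",
    "the ranking of the documents",
]
first = term_weights(index)[0]
print(first["the"], first["keyword"] > first["and"])  # prints: 0.0 True
```

In this toy index “the” appears in every document, so its global weight, and with it its term weight, is zero no matter how often it occurs. “keyword” is frequent in the first document and absent from the others, so it scores high there.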

[Image: term weight]

Keyword Density Myth Summary

Search engines use term weight to rank documents by relevance.

Term weight is calculated from the results of two other calculations, local weight and global weight, and is then normalized for document length.

Without knowing the function used for local weight we can’t calculate it — but we do know that it’s not just pure keyword frequency.

Without knowing the size of the index, the number of documents containing the term, and the function used for global weight, we can’t calculate that either.

Using keyword density as a guesstimator of weight or relevance is therefore utterly useless. It’s like being given only the height of a three-dimensional object and being asked not only to return its volume but also to say whether it is larger or smaller than any other unseen object in a collection you know nothing about.

[Image: keyword density calculation]

Hungry for more? I recommend The Keyword Density of Non-Sense.

As posted in How Search Really Works.

23 Responses to “How Search Really Works: The Keyword Density Myth”

  1. Nick James (1 comments.) Says:
    February 1st, 2008 at 7:00 pm

    Once again Ruud, you dispel the myths that a lot of beginners (including myself a few moons back) think equals SEO, and expand upon to show the true path.
    Many thanks.

  2. SEO Design Solutions (4 comments.) Says:
    February 2nd, 2008 at 1:08 pm

    Ruud:

    I love your writing style. I wish I had your patience to lay out the fundamentals so well. I know I am partially responsible for having the appearance of jumping to conclusions and calling them “rules of thumb” but the most important factor “in my summation” I have found that translates into high ranking SERPs is global weight in conjunction with back link relevance and authority (that produce ranking rock stars for pages).

    Thanks for breaking it down so well. Love the writing style.

  3. Mike (10 comments.) Says:
    February 2nd, 2008 at 1:17 pm

    Keyword Density = High Rankings. I am so sick of hearing this, nice to see a post explaining that it is a myth and maybe people will learn what it really takes to rank a website.

  4. Sergey Rusak (3 comments.) Says:
    February 2nd, 2008 at 2:03 pm

    You are right. It is great for beginners, and websites that provide all these keyword density tools just waste people’s time.
    There are millions of examples where new websites with a small number of backlinks got #1 positions for keywords that seem to be difficult. On the other hand, large companies with popular websites can’t rank for specific keywords…
    I always say that domain power and trustrank are the most important factors in SEO. A good reputation will give you great results over time and amazing average traffic from all search engines.

  5. Lex G (1 comments.) Says:
    February 2nd, 2008 at 3:06 pm

    Good thinking man … And it’s very hard for me to like something …

    Lex G

  6. Utah SEO (45 comments.) Says:
    February 3rd, 2008 at 7:09 pm

    Very good that you shed light on this for people. Thanks.

  7. graywolf (2 comments.) Says:
    February 4th, 2008 at 11:01 am

    Of course KWD matters. It doesn’t matter more than trust and links/anchor text, but of course it matters.

    now of course every time I post an example some engineer comes along and kills it, even though it was working fine for months or years before.

  8. Ruud Hein Says:
    February 4th, 2008 at 11:29 am

    lol, graywolf, good one :)

  9. evisibility (3 comments.) Says:
    February 4th, 2008 at 4:48 pm

    I have to agree with graywolf on this. KWD matters, but to what extent it matters is up in the air. I always explain SEO as hundreds of little things all done the right ways at the right times that help you get ranked.

    If you do not put any thought into KWD for your campaigns what do you recommend for keyword content? How do you attack on page SEO in relation to keywords?

  10. Ruud Hein Says:
    February 4th, 2008 at 6:31 pm

    I suspect he was making a joke :)
    From the field of information retrieval we know, don’t suspect but know, that keyword density cannot be used to rank documents according to relevance. Except for very basic in-classroom kind of search engines no-one knows of any type of commercial search engine using KD as a relevance factor.

    As a non-relevance/spam factor it makes even less sense, increasing the number of calculations a search engine has to do.

    Keyword frequency, keyword distribution, keyword distance, topic relevance, etc.: these all matter. But keyword *density*?

    Take your favorite KD analyzer. Do 20 searches. See if the ranking matches the KD.

  11. cipher (1 comments.) Says:
    February 4th, 2008 at 10:45 pm

    Nice post… Am trying a lot to get my pages listed in search engines. Even tried this keyword density stuff long back. It would be more helpful if you could explain in simple terms, what is it, that we should do to get our pages more relevant? In terms of content and articles how must we use keywords?

  12. Ruud Hein Says:
    February 5th, 2008 at 11:11 am

    @cipher As the series progresses we’ll see that information come forward more and more.

    Sounds really lame — but the best way to be “more relevant” is to *be* relevant. Seed & promote that and backlinks confirming and voting for that relevance come into play.

  13. evisibility (3 comments.) Says:
    February 5th, 2008 at 11:23 am

    Ruud,

    But if KWD affects local weight and global weight (which is basically the density of pages in the index that mention a given KW), how can you say that it is not important? It is not a main factor in ranking but is still part of the puzzle. You cannot say that if your KWD is too high you will not be affected. You will be considered spammy and drop in the rankings. That being said, KWD is a factor, not only in the body but in the Title tag and in the code (such as alt and hyperlink title tags).

    Your reasoning about the words “the” and “time” make sense but I feel that maybe Google is sophisticated enough to not use KWD for these types of terms. Rather it is used more for KW phrases.

  14. Ruud Hein Says:
    February 5th, 2008 at 7:56 pm

    You confuse, or mix up, keyword density (words:keywords ratio) and keyword frequency. Keyword frequency is part of the tf*IDF calculation: keyword density isn’t.

    I’ve held on to the idea of keyword density as a spam measure for a while but Dr. E. Garcia does an amazingly eloquent job of dispelling that notion in Keyword Density Myth - The Devil’s Advocate and Keyword Density (KD): Revisiting an SEO Myth.

    With term weighting it’s easy to see where a search engine gets its values from. Using keyword density as a relevance measure (or spam measure), where does it get its values from? How would it come up with x% is relevant, y% is not and z% is spam? If these are absolute values, how does that relate to long/short content? If these are variable, again, where do those numbers come from?

    As for keyword *phrases* read the paragraphs about linearization and “burning the trees” on Garcia’s article (link).

    Quote: “Two term sequences illustrate the point: “Find Information About Food on sale!” and “Clients Visit our Partners”. This state of the content is probably hidden from the untrained eyes of average users. Clearly, linearization has a detrimental effect on keyword positioning, proximity, distribution and on the effective content to be “judged” and scored. The effect worsens as more nested tables and html tags are used, to the point that after linearization content perceived as meritorious by a human can be interpreted as plain garbage by a search engine. Thus, computing localized KD values is a futile exercise.”

  15. graywolf (2 comments.) Says:
    February 6th, 2008 at 9:06 am

    so what if I was to show a blog post where all of the text was a 4 word phrase repeated 10 times. What if I was also to tell you these keywords were part of an seo contest. What if I was to tell you my page wasn’t created until after the contest was over, with no active link building on my part (I get scraped to death). What if I was to tell you this page ranked for its term until I mentioned it and some engineer came along and killed it … twice.

    Of course google should be “smart” enough to catch this but they aren’t

  16. Ruud Hein Says:
    February 6th, 2008 at 6:49 pm

    Graywolf, excellent example of Google not using KD as an anti-spam measure. Or as a relevance measure.

    On the whole I think Google errs on the side of safety. They can do more to remove a lot of possible noise but then you end up throwing out way too much good stuff too.

    “smart” is as smart does. With a clean database it’s already hard enough to return and rank *really* relevant data. With a tainted set like Google’s… wow.

  17. Rajput Jitendra (1 comments.) Says:
    February 18th, 2008 at 10:38 am

    Your article is nice, with very useful information. If possible, could you please explain what normalization means and how it is calculated?

    Rajput Jitendra
    http://www.tatvasoft.com
