How Search Really Works: Simple Query Optimization

by Ruud Hein March 21st, 2008 

This post is part of an ongoing series: How Search Really Works.
Last week: The Compressed Index.

While human beings can scan a page and see if the whole phrase "a grandiloquent dictionary" appears on it, a search engine can't.

A search engine needs to:

  1. Look up the occurrences of each word in the phrase
  2. See if the positions of words in the document fit the phrase
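The two steps above can be sketched in a few lines of Python. This is only a toy illustration, not how any real engine is built: the names `build_positional_index` and `phrase_search`, the three sample documents, and the naive whitespace tokenizing are all assumptions made up for the example.

```python
from collections import defaultdict


def build_positional_index(docs):
    """Map each word to {doc_id: [positions where the word occurs]}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for pos, word in enumerate(text.lower().split()):
            index[word].setdefault(doc_id, []).append(pos)
    return index


def phrase_search(index, phrase):
    """Return the ids of documents that contain the exact phrase."""
    words = phrase.lower().split()
    if not words:
        return set()

    # Step 1: look up the occurrences of each word in the phrase.
    postings = [index.get(word, {}) for word in words]

    # Only documents that contain every word can contain the phrase.
    candidates = set(postings[0])
    for posting in postings[1:]:
        candidates &= set(posting)

    # Step 2: see if the word positions in the document fit the phrase,
    # i.e. word i sits exactly i places after the first word.
    matches = set()
    for doc_id in candidates:
        for start in postings[0][doc_id]:
            if all(start + i in postings[i][doc_id]
                   for i in range(1, len(words))):
                matches.add(doc_id)
                break
    return matches


# Hypothetical sample documents, invented for this sketch.
docs = {
    1: "a grandiloquent dictionary of rare words",
    2: "a dictionary is rarely grandiloquent",
    3: "a plain dictionary",
}
index = build_positional_index(docs)
result = phrase_search(index, "a grandiloquent dictionary")
print(result)
```

Document 2 contains all three words but not in the right order, so only the position check in step 2 can rule it out.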

As a search engine isn't smart, it needs to work smart.

Leverage Keyword Frequency

[Figure: sort by frequency]

By storing how often each word appears across the whole index, we can immediately cut down to the smallest set of documents from which to draw results.

Instead of selecting the 15,570,000,000 documents in which "a" occurs and then checking which of them also contain the words grandiloquent and dictionary, we can immediately limit the set to 222,000 documents: those that contain the relatively rare grandiloquent.
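Here is a minimal sketch of that idea, assuming each word's posting list is simply a set of document ids. The function name `intersect_rarest_first` and the tiny posting lists are made up for illustration; a real index would hold billions of entries and store the frequencies ahead of time rather than counting them at query time.

```python
def intersect_rarest_first(postings, words):
    """Intersect posting lists, starting with the rarest word.

    postings: dict mapping word -> set of ids of documents containing it
    Returns (words ordered from rarest to most common, matching doc ids).
    """
    if not words:
        return [], set()

    # Document frequency = number of documents a word occurs in.
    # Computed on the fly here; the article's point is that it is stored.
    doc_freq = {word: len(postings.get(word, ())) for word in words}
    ordered = sorted(words, key=lambda word: doc_freq[word])

    # Start from the smallest set so every later step only has to
    # check the few candidates that are still left.
    candidates = set(postings.get(ordered[0], ()))
    for word in ordered[1:]:
        if not candidates:
            break  # nothing left to match, stop early
        candidates &= postings.get(word, set())
    return ordered, candidates


# Toy posting lists (hypothetical): "a" is everywhere, "grandiloquent" is rare.
postings = {
    "a": {1, 2, 3, 4, 5, 6},
    "dictionary": {1, 2, 3},
    "grandiloquent": {1, 2},
}
ordered, candidates = intersect_rarest_first(
    postings, ["a", "grandiloquent", "dictionary"]
)
print(ordered)
print(candidates)
```

The result is the same whatever the order. What changes is how much work it takes to get there: the candidate set never grows beyond the documents containing the rarest word.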

Ruud Hein

My paid passion at Search Engine People sees me applying my passions and knowledge to a wide array of problems, ones I usually experience as challenges. People who know me know I love coffee.

7 Responses to “How Search Really Works: Simple Query Optimization”

  1. [...] How Search Really Works: Simple Query Optimization [...]

  2. I wish I wrote this! Nice work explaining a techie concept Ruud!

    Next up, I'd love to see you explain query-dependent and query-independent stuff, because I don't understand that very well, personally. My understanding is limited to some factors being processed ahead of time (prior to the search occurring, and being general relevance factors like PR and domain trust/age) and others being calculated on the fly (intitle, inanchor etc.)

  3. Ruud Hein says:

    Thanks Gab. I can't promise your topic is "next up" but I do have a whole slew of posts still to go!

  4. [...] This post is part of an ongoing series: How Search Really Works. Previously: Simple Query Optimization. [...]

  5. Utah SEO says:

    I haven't been able to catch up on your posts for awhile but I'm doing so now. I hope you keep up this series for awhile.

  6. Ruud Hein says:

    Happy to see you like it Jordan. Thanks for adding me on Twitter, by the way!

    I hope to keep the series going for a while, yes.