How Search Really Works: Relevance (1)

by Ruud Hein April 4th, 2008 

This post is part of an ongoing series: How Search Really Works.
Previously: Simple Query Optimization.

Search is always boolean: yes or no. True or false.

Either the words are in the document or not.
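To make the yes-or-no nature concrete, here is a minimal sketch of a boolean match. The tiny corpus, the document ids, and the function names are made up for illustration; a real engine would use an inverted index rather than scanning every text.

```python
# Toy corpus: ids and texts are hypothetical, for illustration only.
docs = {
    "doc1": "coffee brewing guide how to brew great coffee at home",
    "doc2": "a travel diary that mentions coffee once",
    "doc3": "tea tasting notes",
}


def boolean_match(query, text):
    """Return True only if every query word occurs in the text."""
    words = set(text.lower().split())
    return all(term in words for term in query.lower().split())


def boolean_search(query, docs):
    """Return the ids of all matching documents, in no particular rank order."""
    return [doc_id for doc_id, text in docs.items() if boolean_match(query, text)]


results = boolean_search("coffee", docs)
```

Both the document about coffee and the one that merely mentions it come back as plain matches; nothing in the result says which one is the better bet.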

boolean-search

But as you see, not all documents are "born alike". Some are about our topic, some just mention it.

What we need, what we want, is not just a big list of results — we want a relevant list of results, preferably sorted so that the best bet appears on top.

Boolean Zone Scoring

Zone scoring uses multiplication values (weights) to calculate the "relevance" of each occurrence of our search term, based on which zone of the document it appears in.

Document zones we're all familiar with are header/title, body/content, footer.

boolean-zones
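A minimal sketch of boolean zone scoring could look like the following. The weights (0.5, 0.3, 0.2), the zone names used as dictionary keys, and the sample documents are assumptions chosen for illustration, not values any search engine is known to use.

```python
# Hypothetical zone weights; they sum to 1 so scores fall between 0 and 1.
ZONE_WEIGHTS = {"title": 0.5, "body": 0.3, "footer": 0.2}


def zone_score(query, doc, weights=ZONE_WEIGHTS):
    """Add up the weight of every zone that contains all query words."""
    terms = query.lower().split()
    score = 0.0
    for zone, weight in weights.items():
        words = set(doc.get(zone, "").lower().split())
        if all(term in words for term in terms):  # boolean test per zone
            score += weight
    return score


# Made-up documents, each split into the three familiar zones.
docs = {
    "about_coffee": {
        "title": "coffee brewing guide",
        "body": "how to brew coffee at home",
        "footer": "contact us",
    },
    "mentions_coffee": {
        "title": "travel diary",
        "body": "we stopped for coffee",
        "footer": "contact us",
    },
}

# Sort so that the highest-scoring document comes first.
ranked = sorted(docs, key=lambda d: zone_score("coffee", docs[d]), reverse=True)
```

The match in each zone is still yes or no, but because a title hit counts for more than a body hit, the document that is about the topic rises above the one that just mentions it.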

These weights are generally machine-learned by running test queries on a clean, non-spammed, non-gamed index. Human relevance judges rate how relevant the test results are, and the weights are tuned until the scores agree with those ratings as closely as possible.
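As a rough illustration of how weights might be fitted to relevance judgments, the sketch below tries every weight combination on a coarse grid and keeps the one with the smallest squared error. The judged examples are invented, and the plain grid search stands in for whatever learning method a real engine uses; treat it as an assumption, not a description of any actual system.

```python
from itertools import product

# Hypothetical training data: which zones matched (title, body, footer)
# and whether a human judge called the result relevant (1) or not (0).
JUDGED = [
    ((1, 0, 0), 1),
    ((0, 1, 0), 0),
    ((0, 1, 0), 1),
    ((0, 0, 1), 0),
    ((1, 1, 0), 1),
]


def squared_error(weights, judged):
    """Total squared gap between zone scores and the judges' ratings."""
    total = 0.0
    for matches, relevance in judged:
        score = sum(w * m for w, m in zip(weights, matches))
        total += (relevance - score) ** 2
    return total


def fit_weights(judged, steps=10):
    """Try all weight triples on a grid that sum to 1; keep the best one."""
    best, best_error = None, float("inf")
    for i, j in product(range(steps + 1), repeat=2):
        if i + j > steps:
            continue  # the footer weight would be negative
        weights = (i / steps, j / steps, (steps - i - j) / steps)
        error = squared_error(weights, judged)
        if error < best_error:
            best, best_error = weights, error
    return best


title_w, body_w, footer_w = fit_weights(JUDGED)
```

With these made-up judgments the title ends up with the largest weight and the footer with none, which matches the intuition that a title match says more about a document than a footer match.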

Next week: Term Weight Scoring

Ruud Hein

My paid passion at Search Engine People sees me applying my passions and knowledge to a wide array of problems, ones I usually experience as challenges. People who know me know I love coffee.


13 Responses to “How Search Really Works: Relevance (1)”

  1. Hey man,
    that's a pretty nice post. got to sign up for the RSS feed and stumble :) .

    I'm looking forward to the rest of the series.

  2. Dave says:

    Interesting post. It depends in what context the keyword phrase is being used as well.

  3. Didn't know that it was called 'zone scoring'. Thanks for the informative post Ruud

  4. Forumistan says:

    Yep, it depends on the keyword phrase…

  5. [...] How Search Really Works: Relevance (1) [...]

  6. gtd says:

    Relevance is the reason why many of us prefer del.icio.us to Google when we want good results. Instead of a search term, you use a tag. You can also check which are the "best" results by seeing how many people have linked the same URL to that tag. And you can feed the links for a given tag into your RSS aggregator to check for new content. Maybe we just need to check alternatives to search sites.

  7. By non-spammed/gamed, you mean NOT like the knitting for grandkids SERPs?

  8. Ruud Hein says:

    Thanks for the comments and feedback. It always helps to see which posts, or part of a post, resonate.

  9. [...] This post is part of an ongoing series: How Search Really Works. Previously: Relevance (1) [...]

  10. Utah SEO says:

    Wouldn't search engines use synonymy in conjunction with exact match keyphrasing to weigh relevance?

    I guess the ultimate question is could a document rank highly for a long tail phrase merely based off semantically related natural language with no direct keyphrase mention?

  11. Ruud Hein says:

    @jordan Good questions to which I don't want to say yes or no; not showing the hand of future posts :) (but psst… in general… no)

  12. Kurt says:

    Well, now I know better how to search, thanks for the information.

  13. [...] This post is part of an ongoing series: How Search Really Works. Previously: Relevance (2) [...]