How Search Really Works: Relevance (1)

by Ruud Hein.


This post is part of an ongoing series: How Search Really Works.
Previously: Simple Query Optimization.

Search is always boolean: yes or no. True or false.

Either the words are in the document or not.
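To make the yes/no idea concrete, here is a minimal sketch in Python. The function name `matches`, the sample documents, and the naive whitespace tokenizing are all my own assumptions for illustration; a real engine would look words up in an index rather than scan the text.

```python
def matches(query, document):
    """Boolean AND match: True only if every query word occurs in the document."""
    # Naive tokenizing (lowercase, split on whitespace); punctuation is not handled.
    doc_words = set(document.lower().split())
    return all(word in doc_words for word in query.lower().split())


# Hypothetical mini-collection, made up for this example.
docs = {
    "doc1": "Knitting patterns for beginners and knitting tips",
    "doc2": "My grandmother mentioned knitting once",
    "doc3": "Search engine basics",
}

# Every document is either in the result list or not; there is no ordering yet.
hits = [name for name, text in docs.items() if matches("knitting", text)]
print(hits)
```

Note that doc1 and doc2 both come back as plain hits, even though only one of them is really about knitting.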

[Figure: boolean-search]

But as you see, not all documents are "born alike". Some are about our topic, some just mention it.

What we need, what we want, is not just a big list of results — we want a relevant list of results, preferably sorted so that the best bet appears on top.

Boolean Zone Scoring

Zone scoring uses multipliers (weights) to calculate the "relevance" of an occurrence of our search term based on the zone of the document in which it appears.

Document zones we're all familiar with are header/title, body/content, and footer.
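As a rough sketch of how that could look in code: each zone gets a boolean test (is the term in this zone, yes or no?) and every "yes" adds that zone's weight to the score. The weights below, the function name `zone_score`, and the sample documents are assumptions I made up for the example, not values from any real search engine.

```python
# Hypothetical weights: illustrative only, not taken from any real engine.
ZONE_WEIGHTS = {"title": 0.6, "body": 0.3, "footer": 0.1}


def zone_score(term, zones, weights=ZONE_WEIGHTS):
    """Sum the weights of all zones that contain the term (one boolean test per zone)."""
    term = term.lower()
    return sum(
        weights.get(zone, 0.0)
        for zone, text in zones.items()
        if term in text.lower().split()  # naive whitespace tokenizing
    )


# Made-up documents, each split into the three familiar zones.
documents = {
    "mentions_knitting": {
        "title": "Family news",
        "body": "Grandma took up knitting last month",
        "footer": "Contact us",
    },
    "about_knitting": {
        "title": "Knitting for beginners",
        "body": "Learn knitting step by step",
        "footer": "Contact us",
    },
    "unrelated": {
        "title": "Search engine basics",
        "body": "How crawlers find pages",
        "footer": "Contact us",
    },
}

# Sort so that the best bet appears on top.
ranked = sorted(
    documents,
    key=lambda name: zone_score("knitting", documents[name]),
    reverse=True,
)
print(ranked)
```

With these assumed weights, the document that has the term in both title and body outranks the one that only mentions it in the body, and the document without the term scores zero.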

[Figure: boolean-zones]

These weights are generally machine-learned by running test queries on a clean, non-spammed, non-gamed index. Relevance judges gauge how relevant the test results are.
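How the learning is actually done is not something I can show, but a toy version gives the idea. The sketch below assumes just two zones (title and body), judgments of 1 (relevant) or 0 (not relevant), and a plain grid search that picks the title weight with the smallest squared error against the judges. The function name `learn_title_weight` and the judged data are invented for illustration; real systems may use very different methods.

```python
def learn_title_weight(judged, steps=100):
    """Grid-search the title weight g in [0, 1]; the body weight is 1 - g.

    `judged` holds (in_title, in_body, relevance) tuples, each value 0 or 1.
    """
    best_g, best_error = 0.0, float("inf")
    for i in range(steps + 1):
        g = i / steps
        # Squared difference between the zone score and the judge's verdict.
        error = sum(
            (g * in_title + (1 - g) * in_body - relevance) ** 2
            for in_title, in_body, relevance in judged
        )
        if error < best_error:
            best_g, best_error = g, error
    return best_g


# Made-up judgments: (term in title?, term in body?, judged relevant?)
judged = [
    (1, 0, 1),
    (1, 0, 1),
    (1, 0, 0),
    (0, 1, 0),
    (0, 1, 0),
    (0, 1, 1),
    (1, 1, 1),
    (0, 0, 0),
]

title_weight = learn_title_weight(judged)
print(title_weight, 1 - title_weight)
```

In this made-up data a title match is judged relevant more often than a body match, so the search settles on a title weight above one half.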

Next week: Term Weight Scoring




As posted in How Search Really Works on April 4, 2008.

13 Responses so far: 10 comments and 3 trackbacks

  1. Hey man,
    that's a pretty nice post. Got to sign up for the RSS feed and stumble :).

    I'm looking forward to the rest of the series.

  2. Dave says:

    Interesting post. It depends in what context the keyword phrase is being used as well.

  3. Didn't know that it was called 'zone scoring'. Thanks for the informative post, Ruud.

  4. Forumistan says:

    Yep, it depends on the keyword phrase…

  5. gtd says:

    Relevance is the reason why many of us prefer del.icio.us to Google when we want good results. Instead of a search term, you use a tag. You can also check which are the "best" results by seeing how many people have linked the same URL to that tag, and you can feed the links for a given tag into your RSS aggregator to check for new content. Maybe we just need to check alternatives to search sites.

  6. By non-spammed/gamed, you mean NOT like the knitting for grandkids SERPs?

  7. Ruud Hein says:

    Thanks for the comments and feedback. It always helps to see which posts, or part of a post, resonate.

  8. Utah SEO says:

    Wouldn't search engines use synonymy in conjunction with exact match keyphrasing to weigh relevance?

    I guess the ultimate question is could a document rank highly for a long tail phrase merely based off semantically related natural language with no direct keyphrase mention?

  9. Ruud Hein says:

    @jordan Good questions to which I don't want to say yes or no; not showing the hand of future posts :) (but psst… in general… no)

  10. Kurt says:

    Well, now I know better how to search. Thanks for the information.

