This post is part of an ongoing series: How Search Really Works.
Previously: Simple Query Optimization.
Search is always boolean: yes or no. True or false.
Either the words are in the document or not.

But as you see, not all documents are “born alike”. Some are about our topic, some just mention it.
What we need, what we want, is not just a big list of results — we want a relevant list of results, preferably sorted so that the best bet appears on top.
Boolean Zone Scoring
Zone scoring uses multipliers (weights) to calculate the “relevance” of each occurrence of our search term, based on which zone of the document it appears in.
Document zones we’re all familiar with are header/title, body/content, footer.
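
To make that concrete, here is a minimal sketch of boolean zone scoring. The zone names follow the list above, but the weight values, the sample documents and the `zone_score` helper are all made up for illustration; no real engine is claimed to use these numbers.

```python
# Hypothetical zone weights (assumed values, chosen so they sum to 1).
ZONE_WEIGHTS = {"title": 0.5, "body": 0.3, "footer": 0.2}


def zone_score(document, term, weights=ZONE_WEIGHTS):
    """Add up the weight of every zone that contains the term."""
    term = term.lower()
    score = 0.0
    for zone, weight in weights.items():
        words = document.get(zone, "").lower().split()
        if term in words:  # still boolean per zone: the term is there or it is not
            score += weight
    return score


# Two made-up documents: one about the topic, one that only mentions it.
docs = {
    "about_topic": {
        "title": "Knitting patterns",
        "body": "Free knitting patterns for beginners",
        "footer": "Contact us",
    },
    "mentions_topic": {
        "title": "Weekend notes",
        "body": "We tried knitting once",
        "footer": "Contact us",
    },
}

# Sort so the best bet appears on top.
ranked = sorted(docs, key=lambda name: zone_score(docs[name], "knitting"), reverse=True)
print(ranked)
```

Both documents match the query in the boolean sense, but the one with the term in its title as well as its body collects more weight and sorts first.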

These weights are generally machine-learned by running test queries on a clean, non-spammed, non-gamed index. Relevance judges gauge how relevant the test results are, and the weights are tuned until the scores agree with those judgments as closely as possible.
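
How such a weight might be learned can be sketched with a toy example. Everything here is an assumption for illustration: only two zones (title and body), a handful of invented judgments, and a plain grid search that picks the title weight with the smallest squared error against the judges. Real systems use far more data and more sophisticated learning methods.

```python
# Made-up judgments: (term in title?, term in body?, judged relevant?)
JUDGMENTS = [
    (1, 0, 1), (1, 0, 1), (1, 0, 1), (1, 0, 0),
    (0, 1, 1), (0, 1, 0), (0, 1, 0), (0, 1, 0),
]


def total_error(title_weight, judgments):
    """Squared difference between zone scores and the judges' verdicts."""
    body_weight = 1.0 - title_weight  # two zones, weights sum to 1
    error = 0.0
    for in_title, in_body, relevant in judgments:
        score = title_weight * in_title + body_weight * in_body
        error += (score - relevant) ** 2
    return error


def learn_title_weight(judgments, steps=100):
    """Try title weights 0.00, 0.01, ..., 1.00 and keep the best one."""
    candidates = [i / steps for i in range(steps + 1)]
    return min(candidates, key=lambda w: total_error(w, judgments))


best = learn_title_weight(JUDGMENTS)
print(best)
```

With these invented judgments, title matches were judged relevant more often than body matches, so the search settles on a title weight well above one half.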
Next week: Term Weight Scoring
13 Responses to “How Search Really Works: Relevance (1)”

April 5th, 2008 at 9:56 am
Hey man,
that’s a pretty nice post. Got to sign up for the RSS feed and stumble :).
I’m looking forward to the rest of the series.
April 6th, 2008 at 11:18 am
Interesting post. It depends in what context the keyword phrase is being used as well.
April 7th, 2008 at 2:41 pm
Didn’t know that it was called ‘zone scoring’. Thanks for the informative post, Ruud.
April 7th, 2008 at 4:20 pm
Yep, it depends on the keyword phrase…
April 9th, 2008 at 3:24 am
Relevance is the reason why many of us prefer del.icio.us to Google when we want good results. Instead of a search term, you use a tag. You can also check which results are “best” by seeing how many people have linked the same URL to that tag, and you can feed the links for a given tag into your RSS aggregator to check for new content. Maybe we just need to check alternatives to search sites.
April 9th, 2008 at 7:01 am
By non-spammed/gamed, you mean NOT like the knitting for grandkids SERPs?
April 9th, 2008 at 9:16 am
Thanks for the comments and feedback. It always helps to see which posts, or part of a post, resonate.
April 13th, 2008 at 1:34 am
Wouldn’t search engines use synonymy in conjunction with exact match keyphrasing to weigh relevance?
I guess the ultimate question is could a document rank highly for a long tail phrase merely based off semantically related natural language with no direct keyphrase mention?
April 13th, 2008 at 9:49 am
@jordan Good questions to which I don’t want to say yes or no; not showing the hand of future posts
(but psst… in general… no)
April 14th, 2008 at 1:49 pm
Well, now I know better how to search. Thanks for the information.