Search Engine People - Search Engine Positioning, Placement Service

How Search Really Works: Grabbing Most Red M&M’s

by Ruud Hein
May 2, 2008

This post is part of an ongoing series: How Search Really Works.
Previously: Relevance (2)

Instead of painstakingly grabbing the absolute best matches for your query and then ranking them with infinite precision, one time-saving strategy has search engines go for “close enough”.

Painstaking Precision

[Image: sorted M&M’s]

Given all the time, money and resources in the world, here’s what we’d normally do.

Word by word, you go through a search. You look in your documents and see which has word one… word two… word three… You get the picture.

Read the full post (520 words)

How Search Really Works: Relevance (2) - Vector Space

by Ruud Hein
April 11, 2008

This post is part of an ongoing series: How Search Really Works.
Previously: Relevance (1)

Another way we can assess the relevance of a document is by term weighting.

From the keyword density myth we know that true term weighting is done collection-wide.

By looking at the number of documents in the index that a term appears in, we can make a measurement of information: how good, how special… how meaningful is this word?

The word “the” would not be special at all, appearing in way too many documents. Its worth would be close to zero.
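
A minimal sketch of that measurement, assuming the common inverse document frequency formula log(N / df). The excerpt does not say which formula is used, and the three-document collection below is made up for illustration.

```python
import math

# Hypothetical toy collection; each document is a list of lowercase words.
docs = [
    "the cat sat on the mat".split(),
    "the dog chased the cat".split(),
    "the search engine ranks documents".split(),
]

def inverse_document_frequency(term, docs):
    """Return log(N / df): N documents in total, df of them contain the term."""
    df = sum(1 for doc in docs if term in doc)
    if df == 0:
        return 0.0  # term not in the collection; given no weight in this sketch
    return math.log(len(docs) / df)

idf_the = inverse_document_frequency("the", docs)        # appears in every document
idf_engine = inverse_document_frequency("engine", docs)  # appears in one document
```

Here “the” appears in all three documents, so its weight comes out as zero, while “engine” appears in only one and scores higher.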

Read the full post (455 words)

How Search Really Works: Relevance (1)

by Ruud Hein
April 4, 2008

This post is part of an ongoing series: How Search Really Works.
Previously: Simple Query Optimization.

Search is always boolean: yes or no. True or false.

Either the words are in the document or not.

[Image: boolean search]

But as you see, not all documents are “born alike”. Some are about our topic, some just mention it.

What we need, what we want, is not just a big list of results — we want a relevant list of results, preferably sorted so that the best bet appears on top.
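
A small sketch of why a purely boolean match is not enough. The three documents and the `boolean_match` helper are hypothetical, made up for this illustration.

```python
# Hypothetical documents: one is about the topic, one merely mentions it.
docs = {
    "about": "search engine basics how a search engine ranks pages",
    "mention": "my holiday photos found with a search engine",
    "other": "a recipe for apple pie",
}

def boolean_match(query, docs):
    """Return ids of documents containing every query word: yes or no, no score."""
    words = query.lower().split()
    return {
        doc_id
        for doc_id, text in docs.items()
        if all(word in text.split() for word in words)
    }

matches = boolean_match("search engine", docs)
```

Both "about" and "mention" come back, and the result is an unordered set: nothing in it says which document is the better bet.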

Read the full post (157 words)

How Search Really Works: Simple Query Optimization

by Ruud Hein
March 21, 2008

This post is part of an ongoing series: How Search Really Works.
Last week: The Compressed Index.

While human beings can scan a page and see if the whole phrase "a grandiloquent dictionary" appears on it, a search engine can’t.

A search engine needs to:

  1. Look up the occurrences for each word in the phrase
  2. See if the positions of words in the document fit the phrase
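
The two steps above can be sketched as follows. This assumes an index that stores word positions per document; the function names and the sample documents are hypothetical, not taken from the full post.

```python
from collections import defaultdict

def build_positional_index(docs):
    """Map each word to {doc_id: [positions of the word in that document]}."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in docs.items():
        for pos, word in enumerate(text.lower().split()):
            index[word][doc_id].append(pos)
    return index

def phrase_search(phrase, index):
    words = phrase.lower().split()
    if not words:
        return set()
    # Step 1: look up the occurrences of each word.
    postings = [index.get(word, {}) for word in words]
    # Only documents containing every word can contain the phrase.
    candidates = set(postings[0])
    for posting in postings[1:]:
        candidates &= set(posting)
    # Step 2: check that the words sit at consecutive positions.
    hits = set()
    for doc_id in candidates:
        for start in postings[0][doc_id]:
            if all(start + i in postings[i][doc_id] for i in range(1, len(words))):
                hits.add(doc_id)
                break
    return hits

docs = {
    1: "he owns a grandiloquent dictionary",
    2: "a dictionary can be grandiloquent",
}
index = build_positional_index(docs)
hits = phrase_search("a grandiloquent dictionary", index)
```

Document 2 contains all three words but not in phrase order, so only document 1 is a hit.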

Because a search engine isn’t smart, it needs to work smart.

Leverage Keyword Frequency

[Image: sort by frequency]

Read the full post (146 words)

How Search Really Works: The Compressed Index

by Ruud Hein
March 14, 2008

This post is part of an ongoing series: How Search Really Works.
Last week: Recognize this index?

Reading from memory is much faster than looking things up on disk.

In order for a search engine in high demand to serve its users efficiently, it should keep things in memory instead of looking them up on disk.

Traditionally, large-scale search engines keep their complete dictionary in memory and the posting lists on disk.

[Image: dictionary in memory, postings on disk]
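
A rough sketch of that layout, under stated assumptions: the file format (fixed 4-byte document ids), the `write_index` and `read_postings` helpers, and the sample postings are all hypothetical and chosen only to keep the example short.

```python
import os
import struct
import tempfile

# Hypothetical postings: term -> sorted list of document ids.
postings = {"engine": [1, 4, 9], "people": [4, 7], "search": [1, 4, 7, 9]}

def write_index(postings, path):
    """Write posting lists to disk; return the in-memory dictionary term -> (offset, count)."""
    dictionary = {}
    with open(path, "wb") as f:
        for term in sorted(postings):
            doc_ids = postings[term]
            dictionary[term] = (f.tell(), len(doc_ids))
            f.write(struct.pack(f"<{len(doc_ids)}I", *doc_ids))
    return dictionary

def read_postings(term, dictionary, path):
    """Find the term in memory, then fetch its posting list with one seek and one read."""
    if term not in dictionary:
        return []
    offset, count = dictionary[term]
    with open(path, "rb") as f:
        f.seek(offset)
        data = f.read(4 * count)
    return list(struct.unpack(f"<{count}I", data))

path = os.path.join(tempfile.mkdtemp(), "postings.bin")
dictionary = write_index(postings, path)
result = read_postings("people", dictionary, path)
```

The dictionary lookup never touches the disk; only the posting list itself costs a disk action.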

Inefficient Storage

Obviously the more you can keep in memory and the more information can be read back with one disk action, the better.

Read the full post (413 words)

How Search Really Works: Recognize This Index?

by Ruud Hein
March 7, 2008

This post is part of an ongoing series: How Search Really Works.
Last week: "The" Index (2).

Oversimplified: we have at least a few pages in our index, have extracted every single word from those pages, and have written down in an index which pages those words occur in and where.

Want to talk numbers? We have some very precise ones for the English language.

Google says:

"We processed 1,024,908,267,229 words of running text and are publishing the counts for all 1,176,470,663 five-word sequences that appear at least 40 times. There are 13,588,391 unique words, after discarding words that appear less than 200 times."

Read the full post (525 words)

How Search Really Works: "The" Index (2)

by Ruud Hein
February 29, 2008

This post is part of an ongoing series: How Search Really Works.
Last week: "The" Index (1).

Last week we saw how an inverted index (where a list of words points to a list of documents in which they appear) is insanely useful for doing AND queries.

[Image: inverted index]

But what if you’re not looking for any document that has the words search AND people AND engine but you’re looking for Search Engine People?

Well, if document 42 in our example reads "the engine was found after a search by some people" or "people use a search engine such as Google" then a traditional inverted index would think it’s spot-on for your search. Ai…
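
To make the problem concrete, here is a minimal sketch of an inverted index without word positions. Document 42 uses one of the sentences above; document 7 and the helper names are made up as a contrast.

```python
from collections import defaultdict

docs = {
    42: "the engine was found after a search by some people",
    7: "search engine people helps sites rank",  # hypothetical document
}

def build_inverted_index(docs):
    """Map each word to the set of documents it appears in (no positions kept)."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index

def and_query(words, index):
    """Return the documents that contain every one of the words."""
    sets = [index.get(word, set()) for word in words]
    return set.intersection(*sets) if sets else set()

index = build_inverted_index(docs)
result = and_query(["search", "people", "engine"], index)
```

Both documents match the AND query, even though only document 7 contains the words as a phrase. Without positions, the index cannot tell the two apart.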

Read the full post (516 words)

How Search Really Works: "The" Index (1)

by Ruud Hein
February 22, 2008

This post is part of an ongoing series: How Search Really Works.
Previous Instalment: The Keyword Density Myth.

If a search engine searched "live" through the documents it knows about for occurrences of the word we’re looking for, it could take its time and then simply report where it found our word.

In this example our search engine has only one index: the documents themselves.

[Image: document-only index]
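
A minimal sketch of such a "live" search, with hypothetical documents and a hypothetical `live_search` helper. Every query reads every document from start to finish.

```python
# Hypothetical documents; the documents themselves are the only "index".
docs = {
    1: "people use a search engine",
    2: "the engine was found after a search",
    3: "apple pie recipe",
}

def live_search(word, docs):
    """Scan every document in full and return {doc_id: [positions of the word]}."""
    found = {}
    for doc_id, text in docs.items():
        positions = [i for i, w in enumerate(text.lower().split()) if w == word]
        if positions:
            found[doc_id] = positions
    return found

found = live_search("search", docs)
```

The work grows with the total amount of text, no matter how rare the word is.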

However, time is something a search engine doesn’t have; the query needs to be answered now.

What we need is a real index!

Read the full post (315 words)

How Search Really Works: The Keyword Density Myth

by Ruud Hein
February 1, 2008

This post is part of an ongoing series: How Search Really Works.
Last week: Keyword Stuffing.

What is Keyword Density?

Keyword Density is a function, a calculation, of keyword frequency.

It’s calculated as number of occurrences divided by number of words and is usually expressed as a percentage.

[Image: keyword density example]
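
The calculation described above, as a short sketch. Splitting on whitespace is an assumption made for simplicity, and the sample sentence is made up.

```python
def keyword_density(keyword, text):
    """Occurrences of the keyword divided by the number of words, as a percentage."""
    words = text.lower().split()
    if not words:
        return 0.0
    return 100.0 * words.count(keyword.lower()) / len(words)

# "search" occurs 2 times in 5 words.
density = keyword_density("search", "search engines make search easy")
```

Two occurrences in five words gives a density of 40 percent.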

What is Keyword Density Used For?

Nothing much, really.

Keyword density can help in readability calculations.

Keyword density is also sometimes used as a simplified way to introduce local keyword weight, but it should never be confused with it.

Why don’t Search Engines use Keyword Density?

[Image: local keyword density]

Read the full post (564 words)

How Search Really Works: Keyword Stuffing

by Ruud Hein
January 25, 2008

This post is part of an ongoing series: How Search Really Works.
Last week: Keyword Links.

Left to their own devices, people will assign keywords (tag or link) as they please.

They paint a rich picture of the linked content.

[Image: natural linking]

Keyword stuffing is the unnatural repetitive use of a specific word or phrase.

In your content…

[Image: keyword stuffing in content]

…or your links…

[Image: keyword stuffing in links]

Permanent link to this post (61 words, estimated 15 secs reading time)

SEO Toronto - Search Engine Optimization Specialists
Copyright © Search Engine People - All Rights Reserved.
Contact Us at 1-877-486-7875 or 905-426-9340 - contact@searchenginepeople.com