I've wanted to write about search from scratch for a while now. Seeing the enthusiasm and openness with which Kimberly Bock (aka SpostareDuro) learns and writes about SEO basics inspires me to turn that "someday" into "today".

In its simplest form, a computer can be taught to search by giving it two sets of information: a series of documents to be indexed and a list of irrelevant words to leave out of the index.

The list of irrelevant words, often called stop words, includes common words (a, an, the, etc.) but can also include any other words you don't want to have in your index.

This type of search is very accurate. It will find every occurrence of the word "Google" in a set of 10 thousand documents. And it will do so faster than you or I can.

The drawback of this type of search is that unless something is expressly mentioned in a document, search can't find it. If in our example set of 10 thousand documents the word "Google" never appears, our search for "Google" will never return any results even though every single document might be about Google founders Larry Page and Sergey Brin.
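To make this concrete, here is a minimal sketch of words-on-page-only search in Python. The stop-word list, the two sample documents and the function names are all made up for illustration; a real index would be far larger and smarter about splitting text into words.

```python
import re

# Assumed, illustrative stop-word list; a real one would be much longer.
STOP_WORDS = {"a", "an", "and", "the", "of", "is", "in", "this"}


def tokenize(text):
    """Lowercase the text, split it into words and drop the stop words."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return [w for w in words if w not in STOP_WORDS]


def build_index(documents):
    """Map each remaining word to the set of document ids that contain it."""
    index = {}
    for doc_id, text in documents.items():
        for word in tokenize(text):
            index.setdefault(word, set()).add(doc_id)
    return index


def search(index, word):
    """Return the sorted ids of the documents that contain the word."""
    return sorted(index.get(word.lower(), set()))


# Hypothetical sample documents.
documents = {
    "doc1": "An article about Larry Page and Sergey Brin.",
    "doc2": "The word Google appears in this one.",
}

index = build_index(documents)
print(search(index, "Google"))  # ['doc2']
print(search(index, "Brin"))    # ['doc1']
```

Searching for "Google" finds doc2 only. doc1 is about the Google founders, but since the word itself never appears there, this kind of search has no way of knowing that.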

[Figure: words on page only]

Meta Keywords

One way to deal with this obvious shortfall of simple search is to add extra words to each document. In the above example we would add the word "Google" to every document about Larry Page and Sergey Brin. And Google is about? Search. OK, add the word "search" and anything else you can think of that might be relevant as well.

Words of this type, used not to write the actual content but simply to list the topics a document might be relevant to, are often called keywords. They're similar to labels or tags on paper files and products: "Bank Statements", "To Do's", "Folgers Coffee".

The type of information they provide is called metadata: information about the information. In our case, information about the content, the subjects in a document.

Web pages can come with a rich set of metadata, and keywords are one part of it.
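On a web page, keywords usually sit in a meta tag in the head of the HTML. The sketch below pulls them out using Python's standard html.parser module. The sample page, the class name and the function name are invented for this example.

```python
from html.parser import HTMLParser


class MetaKeywordsParser(HTMLParser):
    """Collect the comma-separated values of <meta name="keywords">."""

    def __init__(self):
        super().__init__()
        self.keywords = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() == "keywords":
            content = attributes.get("content") or ""
            for keyword in content.split(","):
                keyword = keyword.strip().lower()
                if keyword:
                    self.keywords.append(keyword)


def extract_meta_keywords(html):
    """Return the list of meta keywords found in an HTML string."""
    parser = MetaKeywordsParser()
    parser.feed(html)
    return parser.keywords


# Hypothetical page: the body never mentions Google, the keywords do.
page = """<html><head>
<title>Larry Page and Sergey Brin</title>
<meta name="keywords" content="Google, search, founders">
</head><body>A page about Larry Page and Sergey Brin.</body></html>"""

print(extract_meta_keywords(page))  # ['google', 'search', 'founders']
```

A search index could add these keywords to the words found in the page itself, so a search for "Google" would now return this page.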

[Figure: meta keywords]

Unfortunately people aren't always completely truthful in describing the content of their documents. Or they can't be bothered to do so. Or when they do, they use words and phrases which are anything but helpful.

Links as Meta Keywords

Google worked around people's dishonesty and unhelpfulness by largely ignoring the meta keywords embedded in web pages. Instead they treated the words used to link to a web page (the anchor text) as meta keywords, in effect adding them to the list of words that web page is made of.

How does that look? In our example we would have a web page which mentions Larry Page and Sergey Brin but doesn't mention Google.

Someone writing about Google links to that page using the words "Google founders Larry Page and Sergey Brin." Those words are now added to the list of words found in our document.
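Here is a rough sketch of that idea in Python. It is only an illustration of "link words count as words of the linked page", not a description of how Google actually implements it. The URL, the sample text and the function names are hypothetical.

```python
import re


def split_words(text):
    """Lowercase the text and split it into words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index_with_link_words(pages, links):
    """Index each page by its own words plus the words of links pointing to it.

    pages: dict mapping a URL to the text of that page
    links: list of (target_url, link_text) pairs found on other pages
    """
    index = {}
    for url, text in pages.items():
        for word in split_words(text):
            index.setdefault(word, set()).add(url)
    for target_url, link_text in links:
        if target_url not in pages:
            continue  # we only index pages we actually have
        for word in split_words(link_text):
            index.setdefault(word, set()).add(target_url)
    return index


# Hypothetical page that never mentions Google.
pages = {
    "https://example.com/founders": "A page about Larry Page and Sergey Brin.",
}
# Hypothetical link from another site pointing to that page.
links = [
    ("https://example.com/founders", "Google founders Larry Page and Sergey Brin"),
]

index = build_index_with_link_words(pages, links)
print(sorted(index["google"]))  # ['https://example.com/founders']
```

The page itself never says "Google", yet a search for "Google" now finds it, because someone else used that word when linking to it.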

[Figure: links as keywords]