Google has built semantic search into its core algorithm with the recent introduction of Hummingbird. This is a major change and one of the biggest since Caffeine. Many webmasters and internet marketers are still confused about this new technology. In this post, I will try to clear up that confusion by discussing semantic search and how Google uses semantics to predict the searcher's intent in order to display results or return answers based on it.
What Is Semantics?
Semantics is the study of the relationship between words, phrases and symbols and the meanings they denote. It draws on linguistics, syntax, etymology, communication, semiotics and related fields.
The Semantic Search
Semantic search applies semantics to search technology in order to find the real intent behind a search query and to present the answers or set of results that most closely relate to what the user is looking for. It takes the context of the query into account and identifies the relationship between the terms used in it before presenting the final search results.
Where Does It Apply?
Search engines use semantics to return results that are relevant to the query. Ambiguous queries (queries that have more than one meaning) are broken down and processed against a set of pre-defined words, which helps the engines grasp the real context of the query. Semantics applies mainly to research-related queries, where the user is looking for answers instead of navigating to a specific web page. Google applies semantics in its Knowledge Graph.
PageRank And Relevancy Score: Two Basic Factors For Document Ranking
Google applies two basic factors to judge the importance and relevance of a webpage before ranking it. These factors are PageRank (which measures popularity by analyzing backlinks) and relevancy (which analyzes how the keywords or search query terms are used in the webpage). However, this form of document ranking does not help to find pages that are relevant to the searcher's intent but do not use the exact query terms, because the popularity factor may push semantically relevant documents down the rankings. This is why Google uses semantics to identify and prioritize pages with semantically relevant content rather than only counting keywords and backlinks.
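To make the problem concrete, here is a minimal sketch of how a popular page can outrank a more relevant one when relevancy is judged by exact keywords alone. Everything in it is an assumption made for illustration: the page data, the popularity values, the synonym table and the equal weighting are invented, and this is not Google's actual scoring formula.

```python
from dataclasses import dataclass


@dataclass
class Page:
    url: str
    popularity: float  # made-up stand-in for a link-based score in [0, 1]
    text: str


# Hypothetical synonym table; a real system would derive this from data.
SYNONYMS = {
    "car": {"car", "automobile", "vehicle"},
    "repair": {"repair", "fix", "servicing"},
}


def keyword_relevance(query: str, page: Page) -> float:
    """Fraction of query terms that appear verbatim in the page text."""
    terms = query.lower().split()
    words = set(page.text.lower().split())
    return sum(t in words for t in terms) / len(terms)


def semantic_relevance(query: str, page: Page) -> float:
    """Fraction of query terms matched by the term itself or a listed synonym."""
    terms = query.lower().split()
    words = set(page.text.lower().split())
    return sum(bool(SYNONYMS.get(t, {t}) & words) for t in terms) / len(terms)


def rank(query, pages, relevance):
    # Equal weighting of popularity and relevance is an arbitrary choice.
    def score(page):
        return 0.5 * page.popularity + 0.5 * relevance(query, page)

    return sorted(pages, key=score, reverse=True)


pages = [
    Page("popular.example", 0.9, "car dealers and new car prices"),
    Page("niche.example", 0.5, "how to fix your automobile at home"),
]

print([p.url for p in rank("car repair", pages, keyword_relevance)])
print([p.url for p in rank("car repair", pages, semantic_relevance)])
```

With exact keyword matching the popular page comes first, even though it says nothing about repairs. Once synonyms are counted, the niche page that actually answers the query moves to the top.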
Query Processing In A Semantic Environment
The figure below describes the steps involved in the processing of a query by Google. The search query received by Google is parsed (using a parser) to identify one or more members (the first and second search terms). In this process, synonyms or other replacement terms are identified. These are known as candidate synonyms, and they are filtered further to produce qualified synonyms.

Then, a relationship engine identifies the relationship between the members based on their respective domains. Here, a domain simply means a category of similar words (do not confuse it with a domain name). The first search term is matched against a first domain, which is a semantic category holding a collection of pre-defined entities. Similarly, the second term is matched against a second domain, which also contains a database of similar entities. This helps Google relate the terms to the closest matching entities. One essential point to note is that Google can only relate words in the query to entities already present in its database, which is the Knowledge Graph. Hence, some queries might not trigger this processing even though they are semantically similar.

A query engine then conducts a separate search using the domain-matching relationship, and the final results are displayed once a semantic query has been identified (the query engine may pluralize or rephrase the query if required). In simple words, a complex query entered by the user is broken down and simplified through several steps into a semantic query. Thereafter, relevant web pages are identified and displayed as the final set of results.
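The steps above can be sketched in a few lines of Python. This is only a toy model built on assumptions: the domains, the candidate synonyms and the function names (`find_domain`, `qualify`, `parse`, `to_semantic_query`) are all hypothetical, and the real system is certainly far more elaborate.

```python
# Hypothetical domains: each is a category holding pre-defined entities.
DOMAINS = {
    "person": {"justin bieber", "barack obama"},
    "attribute": {"age", "height", "spouse"},
}

# Hypothetical candidate synonyms for the leading phrase of a query.
CANDIDATE_SYNONYMS = {
    "how old is": ["age", "years"],
    "how tall is": ["height", "size"],
}


def find_domain(term):
    """Return the name of the domain that contains the term, or None."""
    for name, entities in DOMAINS.items():
        if term in entities:
            return name
    return None


def qualify(candidates):
    """Keep only the candidate synonyms that match an entity in some domain."""
    return [c for c in candidates if find_domain(c) is not None]


def parse(query):
    """Split a query into (first term, second term) using the known phrases."""
    q = query.lower().strip(" ?")
    for phrase in CANDIDATE_SYNONYMS:
        if q.startswith(phrase):
            return phrase, q[len(phrase):].strip()
    return q, ""


def to_semantic_query(query):
    first, second = parse(query)
    qualified = qualify(CANDIDATE_SYNONYMS.get(first, []))
    if not qualified or find_domain(second) is None:
        return None  # terms missing from the database cannot be related
    return {
        "relation": (find_domain(qualified[0]), find_domain(second)),
        "semantic_query": f"{second} {qualified[0]}",
    }


print(to_semantic_query("How old is Justin Bieber?"))
print(to_semantic_query("How old is Some Unknown Person?"))
```

The first call relates an attribute to a person and yields the simplified query "justin bieber age". The second returns None because the name is not in the toy database, which mirrors the point that only entities already known can be related.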
Many search engine optimizers and internet marketers miss the crucial step of identifying semantically related queries during keyword research, even though the main query is broken down into a semantic query before Google processes it. The chance of ranking therefore increases when the content of the webpage is written with the semantic variants in mind and mentions the entities that match the relevant domains.
Hummingbird And Semantics
Hummingbird is a change to the search algorithm that uses several factors to hold a conversation with the searcher and provide real answers to queries instead of returning keyword-matching documents. It is the dream of Google fellow Amit Singhal (senior VP and head of Google search) to build a search engine similar to the Star Trek computer, one that returns direct answers so that Google can be used as a personal assistant rather than a search engine. In his words, "The destiny of search is to become the Star Trek computer, a perfect assistant by my side." Hummingbird is all about conversation, and conversation often involves long tail queries. A conversation also involves one or more entities, and this is where the Knowledge Graph and semantics enter. The crux is that, with the introduction of Hummingbird, Google has adapted its search algorithm to handle complex and conversational queries. It now uses semantics and the Knowledge Graph in much greater depth than it did in the past. As I have mentioned before, do not mistake Hummingbird for a ranking factor; it is a change aimed at understanding the search query better. The signals for ranking documents remain the same, and Panda, Penguin, EMD etc. are all parts of the main algorithm, which is now Hummingbird. Factors like Domain Authority, PageRank, Social Popularity, Overall Content Relevancy, TF-IDF Score, Domain Age, Google Authorship, Use of Meta Data etc. all contribute towards ranking a specific document. But we can certainly use this new model to adapt our existing content to the way a query gets parsed and identified.
As shown in the example below, a conversational query like "How old is Justin Bieber" returns the direct answer "19 years" along with a Knowledge Graph panel. Here, Justin Bieber is an entity that Google has identified with the help of the Knowledge Graph, which lets it answer the user's query accurately.
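A rough sketch of how such a direct answer could be produced from an entity database follows. The tiny `KNOWLEDGE_GRAPH` dictionary, the function names and the reference date are assumptions made for illustration and say nothing about how Google stores or queries its data. The birth date of 1 March 1994 is the publicly known one.

```python
from datetime import date

# Tiny hand-made stand-in for an entity database (not Google's data model).
KNOWLEDGE_GRAPH = {
    "justin bieber": {"type": "person", "born": date(1994, 3, 1)},
}


def age_on(born, today):
    """Whole years elapsed between a birth date and a reference date."""
    before_birthday = (today.month, today.day) < (born.month, born.day)
    return today.year - born.year - before_birthday


def answer_age(query, today):
    """Answer 'how old is X' when X is a known entity; otherwise return None."""
    prefix = "how old is "
    q = query.lower().strip(" ?")
    if not q.startswith(prefix):
        return None
    entity = KNOWLEDGE_GRAPH.get(q[len(prefix):])
    if entity is None:
        return None
    return f"{age_on(entity['born'], today)} years"


# The reference date is an assumed day in late 2013.
print(answer_age("How old is Justin Bieber", date(2013, 10, 1)))
```

The answer is computed from a stored fact about the entity rather than from text on a matching web page, which is why the engine can return it directly.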
Hummingbird takes semantics into account and uses the Knowledge Graph to identify the relationships between search terms before presenting the search results. A good point to note here is that semantics is not new to Google. The search engine giant has been using semantics for quite a long time, but it lacked a detailed database of entity relationships that would make entities easy to identify. With the introduction of the Knowledge Graph on 16 May 2012, Google added that missing database of entities, which made it much easier to find the relationships between them. Hence, Hummingbird powered by the Knowledge Graph is the new semantic model for Google.