Search Engine People Blog

Everything You Need To Know To Understand Google’s Knowledge Graph

Back in May 2012 Google announced the launch of the Knowledge Graph, a brand new mechanism to help searchers "discover new information quickly and easily". But what exactly is Knowledge Graph? Where does Google get the knowledge? And how, with so much information out there, does Google know what to display?

If these are the sort of questions on your mind then keep reading to find out more about the Knowledge Graph.

What is the Knowledge Graph?

Knowledge Graph is a system which has been around for a while now but is arguably one of the biggest changes to Google in recent years. In a nutshell the Knowledge Graph is a system utilised by Google to enhance the relevancy of search results by providing a range of facts, figures and relevant data related to the user's most likely search intent. This approach allows a user quick access to additional information and provides the ability to explore related subjects within that search.

Knowledge Graph is Google's first step towards showcasing its semantic search capabilities, becoming more of a "knowledge engine" rather than the traditional "information engine" model which we've come to know and love. Google Product Management Director Johanna Wright stated:

"We're in the early phases of moving from being an information engine to becoming a knowledge engine and these enhancements are one step in that direction"

As people, we know that some words can have a variety of meanings depending on the context in which they are used. Until now Google have never truly understood that context. Semantic search and Knowledge Graph are the first steps towards truly understanding the meaning of a given word and how it relates to other real-world things (or entities as they are otherwise known).

Google is rapidly moving away from serving search results based on a 'keyword match' model to a more intelligent model which can interpret the meaning of the search query in the same way as people do.

This approach allows Google far more flexibility and grants the ability to answer far more complex search queries, more successfully predicting the intent of the searcher and refining search results based on the initial query.

For example, if I perform a search for my company High Position, I'm instantly provided with relevant information such as a map, the business address and telephone number all within search results. There's even a somewhat questionable team photo (I should really change that!).

Embarrassing photo aside, if I were a customer looking for the postal address or a contact telephone number (which prior to Knowledge Graph I would have obtained from the website), Google have predicted my intent and provided the information directly within search results, thus shortening my user journey. Pretty cool, hey?

So, at the heart of the Knowledge Graph sits a collection of semantic information deriving relationships and connections between real-world things or "entities", and it's this semantic data which allows Google to better understand user intent - identifying the contextual meaning of a search query, thus providing the ability to serve more relevant data within search. And at the front of the Knowledge Graph sits a collection of knowledge panels and data carousels which provide quick access to the related data.

How Intelligent is Knowledge Graph?


Rather than questioning its intelligence in its customary sense, perhaps a more pertinent question would be "how many entity relationships is Google aware of and how many corresponding facts does Google hold?" Or just how much does it know?

Back in May 2012 when Knowledge Graph was initially launched Google stated that their pool of knowledge contained information about more than 500 million entities with more than 3.5 billion corresponding facts about the different entities and the relationships between them. At the time this seemed like an eye-wateringly large amount of data, although many searchers were still not aware of the far-reaching implications this data would hold.

In December 2012, a little over 6 months after the launch, Google revealed that they now held data on 570 million entities and 18 billion facts about the entities and connections between them.

So in a little over 6 months Google's entity knowledge grew by 14% and the fact base grew by an astonishing 414% - and that was almost a year ago! So it's safe to say Knowledge Graph will have obtained a far greater depth and breadth of information since then.

Between May 2012 and December 2012 Knowledge Graph grew at a rate of roughly 10 million entities and 2 billion facts per month. If Knowledge Graph has continued to develop at this steady rate from December 2012 until now, it may now contain somewhere in the region of 680 million entities and 40 billion facts/connections! That's a lot of knowledge! I'm sure the mathematicians out there will somehow prove that the growth curve must slow with time, but you get the idea!
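For anyone who wants to check the sums, here is a rough Python sketch of the back-of-an-envelope calculation above. The entity and fact counts are the ones Google quoted; the month counts (about 7 months between the two announcements, about 11 months since) are my own assumptions, and the projection is a simple straight-line extrapolation rather than anything Google has published.

```python
# Figures quoted by Google: entities in millions, facts in billions.
ENTITIES_MAY_2012, ENTITIES_DEC_2012 = 500, 570
FACTS_MAY_2012, FACTS_DEC_2012 = 3.5, 18.0

# Assumptions, not Google figures: roughly 7 months between the two
# announcements and roughly 11 months from December 2012 to the time of writing.
MONTHS_OBSERVED = 7
MONTHS_SINCE = 11


def percent_growth(start, end):
    """Percentage increase from start to end."""
    return (end - start) / start * 100


def monthly_rate(start, end, months):
    """Average increase per month over the observed period."""
    return (end - start) / months


def project(latest, rate, months):
    """Straight-line projection: latest value plus rate times months elapsed."""
    return latest + rate * months


entity_rate = monthly_rate(ENTITIES_MAY_2012, ENTITIES_DEC_2012, MONTHS_OBSERVED)
fact_rate = monthly_rate(FACTS_MAY_2012, FACTS_DEC_2012, MONTHS_OBSERVED)

projected_entities = project(ENTITIES_DEC_2012, entity_rate, MONTHS_SINCE)
projected_facts = project(FACTS_DEC_2012, fact_rate, MONTHS_SINCE)

print(f"Entity growth: {percent_growth(ENTITIES_MAY_2012, ENTITIES_DEC_2012):.0f}%")
print(f"Fact growth: {percent_growth(FACTS_MAY_2012, FACTS_DEC_2012):.0f}%")
print(f"Projected entities: {projected_entities:.0f} million")
print(f"Projected facts: {projected_facts:.1f} billion")
```

Changing `MONTHS_OBSERVED` or `MONTHS_SINCE` shifts the projection noticeably, which is a good reminder that these numbers are only a ballpark.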

Where Does Google Source the Knowledge Graph Data?

A common question on the minds of many people when discussing the Knowledge Graph is how and where do Google even begin to acquire this level of data? Where do these facts and connections come from and how do Google know it's accurate? They must get the data from somewhere, right?

Correct, they outsource!

Google are a clever bunch, but it's fair to say they don't know everything about everything, so they have to rely on data from third parties to build this pool of knowledge. To do this Google extract data from a wide variety of sources such as Wikipedia and the CIA World Factbook, subject-specific resources such as Weather Underground for weather information or the World Bank for economic statistics, and public data sources such as Freebase, as well as Google's own massive pool of search data.

Whilst services like those mentioned above are the primary source for entity and fact based data and establishing connections between them, other factors such as structured data can also influence visibility of data within the Knowledge Graph. For example, the Data Highlighter in Google Webmaster Tools can assist webmasters in notifying Google of structured data for use in Knowledge Graph and SERPs (Search Engine Results Pages).

There are also many other factors which a webmaster can utilise to inform Knowledge Graph such as schema.org Organization markup for designating a business's logo, inclusion of your business within Google Places for Business to supply Knowledge Graph with business and contact details (as per the previous High Position example), and the verification/use of Google+ with Authorship markup to generate recent news and 'follower' information.
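To give a feel for what schema.org Organization markup involves, here is a minimal Python sketch that builds the data as JSON-LD, one of the syntaxes schema.org can be written in (microdata and RDFa are others). The company name and URLs are placeholders, and the helper function `organization_markup` is purely hypothetical; treat it as an illustration rather than a recipe guaranteed to influence Knowledge Graph.

```python
import json


def organization_markup(name, url, logo_url):
    """Build a schema.org Organization description as a JSON-LD dictionary."""
    return {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": url,
        "logo": logo_url,
    }


# Placeholder values for illustration only.
markup = organization_markup(
    name="Example Company",
    url="https://www.example.com/",
    logo_url="https://www.example.com/images/logo.png",
)

# The resulting JSON would typically be embedded in a page inside a
# <script type="application/ld+json"> element.
print(json.dumps(markup, indent=2))
```

The `name`, `url` and `logo` properties are all defined for the Organization type on schema.org; which of them a search engine actually uses is up to the search engine.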

Data Accuracy

So with the dependence on third party data how can Google be sure of data accuracy? Data accuracy is without doubt one of the most concerning issues for Google when it comes to Knowledge Graph. I'm sure it's a massive bugbear for the big G and it's fair to say they don't always get it right.

Shortly after the launch of the Knowledge Graph Conductor provided this report which showed a minimal 20% accuracy for selected trending search terms. There have also been various eSafety concerns such as inappropriate nude photos appearing in knowledge panels as well as various other anomalies and errors. Some people have called Knowledge Graph a "best guess graph" but Google say:

"Our goal is to be useful; we realize we'll never be perfect, just as a person's or library's knowledge is never complete" - Source: webpronews.com

So Google are clearly aware that the Knowledge Graph data may never be perfect. Google Senior VP Amit Singhal told Search Engine Land's Danny Sullivan "Google will use a combination of computer algorithms and humans" to determine whether facts are indeed facts or fiction, and if it is the latter then the relevant service provider will be informed of the error.

It's all too easy to be negative about Knowledge Graph and find examples of areas where data may not be 100% accurate. Just writing this post I've encountered a few examples of misinformation such as this example in the knowledge panel for Amit Singhal in which his recent post is apparently this article on fastcompany.com, actually written by Ayana Byrd of Fast Company.

But let's not forget Google also rely on us, the users, to provide feedback when we encounter wrong information within Knowledge Graph.

No, Knowledge Graph isn't perfect, but in reality it's a massively comprehensive system sourcing and cross-referencing data from a range of sources to provide the most relevant response to a query. That's no easy feat. Nothing is perfect and Knowledge Graph is not without fault, but this is a massive step for Google and a work in progress.

So What Type of Information Does Knowledge Graph Present?

With awareness of at least 570 million entities, 18 billion facts and their semantic relations it's fair to say that Knowledge Graph covers a broad range of information.

Google state that there are several fundamental types of information that you should expect to find within Knowledge Graph results.

These fundamental data types can take many forms depending on the intent of the user. We've covered several types of data already in this post but here are a few more which you may expect to see on a day to day basis.

Answer Boxes

A very common knowledge panel aimed at providing direct answers to a given question or otherwise providing a relevant definition, such as this example for the very pertinent question of "what is knowledge graph".

Personal Information


When performing a search for an iconic figure such as an actor, musician or an otherwise famous person you may expect to see information such as images and a brief description of the person. However, you may also see information such as their children or spouse, TV shows they've appeared in or albums which they've made, such as this example for Will Smith.

Now you can really begin to see how Knowledge Graph starts bringing together information through connections between different entities.

By the way, each of the nuggets of information is a link (in typical Google blue) to the relevant topic.
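To picture how connections like these might be stored, here is a minimal Python sketch that holds a few facts as (subject, relation, object) triples and looks up what an entity is linked to. Google have not published their internal format, so the structure, the relation names (`spouse`, `appeared_in`, `type`) and the helper functions are all assumptions made for illustration, and the handful of facts is a hand-made sample rather than data taken from Knowledge Graph.

```python
from collections import defaultdict

# Illustrative (subject, relation, object) facts; a tiny hand-made sample.
FACTS = [
    ("Will Smith", "spouse", "Jada Pinkett Smith"),
    ("Will Smith", "appeared_in", "The Fresh Prince of Bel-Air"),
    ("Will Smith", "appeared_in", "Men in Black"),
    ("The Fresh Prince of Bel-Air", "type", "TV show"),
    ("Men in Black", "type", "Film"),
]


def build_graph(facts):
    """Index facts by subject, then by relation, for quick lookups."""
    graph = defaultdict(lambda: defaultdict(list))
    for subject, relation, obj in facts:
        graph[subject][relation].append(obj)
    return graph


def related(graph, entity, relation):
    """Return the entities linked to `entity` by `relation` (empty list if none)."""
    return list(graph.get(entity, {}).get(relation, []))


graph = build_graph(FACTS)

# Everything the sample graph says Will Smith has appeared in.
for title in related(graph, "Will Smith", "appeared_in"):
    kinds = related(graph, title, "type")
    print(title, kinds)
```

Because each object can itself be a subject (a TV show has its own facts), following the links from one entity to the next is what lets a panel about a person lead on to panels about their work.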

Episode & Cast Information


With the Personal Information example we began to see how Knowledge Graph pulls in related data on topics such as Movies & TV Shows; however, Google have taken this one step further by integrating data around the episodes and the cast.

For example in this search for Doctor Who, which would have been very appropriate last week following the recent 50th anniversary of the first ever episode of Doctor Who, we can see details of the first and final episodes (spot the error!), the writers and the cast.

Did you spot the error? If not, pay close attention to the date of the final episode. I mentioned that Doctor Who celebrated its 50th anniversary recently, but according to Google episodes only ran for 26 years from 1963 - 1989!

Image Carousel

You may also run into the carousel from time to time depending on your search intent. The carousel provides a very visual integration of relevant search results and may consist of a number of entities.

Searches involving filmography and discography often reveal a carousel of image results relevant to that query, such as this example for "will smith filmography".

Tourist Attractions Carousel

Likewise, searching for tourist attractions in a specific location often yields a carousel of relevant attractions.

You may notice the arrow on the right hand side which allows scrolling directly within SERPs where additional information is available.

Points of Interest

Talking of tourist attractions, searching for a specific location also often produces Points of Interest relevant to that particular location, such as this example for Colchester which pulls in the very popular Colchester Zoo and the historic Colchester Castle.

You'll also notice in this instance how Google have integrated the current weather into the knowledge panel. It's pretty accurate too!

What's the Weather Like?

Talking about weather, do you want to learn a little more on what the weather is going to be like in your location? Just ask Google!

The screenshot below shows a non-geographic query that combines Google's geo-targeting with data sourced from the Weather Channel, Weather Underground and AccuWeather, presented in this nifty knowledge panel.

Nutritional Information

Since nutritional data was launched in May 2013, you may also see a variety of facts and figures, and perhaps even a calculator, when searching for nutritional information, such as this example for chicken meat.

In this example for nutritional information about chicken we can see a calorie calculator for the type and weight of the chicken as well as a whole bunch of other nutritional facts. This may be extremely useful in the run up to Christmas to ensure you keep off those Christmas pounds. Unfortunately a search for how many calories are in a turkey didn't produce any nutritional information, nor did one for Christmas pudding - perhaps Google don't want to ruin our Christmas!

Medical Information

Also relevant to the cold, horrible winter period, you may seek medical information on specific medicines. Medical information was initially added back in November 2012, and Knowledge Graph can now provide a variety of information related to medicines.

Let's hope the accuracy of medical information is sufficient. This is definitely one of those instances whereby Google don't want to get the information wrong!

Comparison Search

Finally, you may not be aware that Knowledge Graph allows you to compare certain types of information, such as this example of Earth compared to Mars.

In this example Google provide popular information such as distance from the sun, radius and surface area, utilising a show/hide mechanism (the downward arrow at the bottom) to reveal various other nuggets of information from gravity and density to orbital period and escape velocity.

Filters

I thought I'd end this section by briefly mentioning that Google are getting ever more savvy with Knowledge Graph and entity relationships and are now capable of filtering certain types of information. In this example you can filter Foo Fighters albums by most popular, oldest to newest and vice versa.

Or how about this commonly used example whereby a user can filter artists by genre?

No doubt there are many more examples of Knowledge Graph data which I haven't covered, but I hope by seeing these examples you can begin to see how Google is evolving Knowledge Graph into an intuitive system designed to enhance the user's experience, providing a vast array of valuable data and insight.

How Does Google Determine What Information to Display?

With such a breadth and depth of entity relationship and fact-driven data there's potentially a lot for Google to choose from. So how does Google decide which information should and should not be displayed?

The data displayed in each knowledge panel is often responsive to the search query received, meaning that the intent of the search often determines what data is displayed. Google state:

"[they refine] the information based on the most popular questions people ask about that subject"

So if users often ask a particular type of question then that can influence what information is shown in the Knowledge Graph. This alone is a pretty interesting subject, so for those interested, Bill Slawski of SEO by the Sea created this great post on the subject which details the Knowledge Graph patent and breaks down how Google may decide which data should be displayed. Well worth a read.

Knowledge Graph is Here to Stay

Knowledge Graph now has a massive role in search results. When launched back in May 2012 it was initially only available in English, in the US. Knowledge Graph quickly expanded into Spanish, French, German, Portuguese, Japanese, Russian and Italian in December 2012, followed by Polish, Turkish, Simplified Chinese and Traditional Chinese in May 2013. More recently Knowledge Graph has been rolled out to cover Croatian, Serbian, Hindi, Slovak, Lithuanian, Slovenian, Persian, Catalan, Latvian and Filipino. That makes Knowledge Graph available in 22 languages worldwide!

Google are striving to change alongside the natural evolution of our search behaviour. Sometimes, as webmasters, it feels like we're continually chasing Google round and round in circles trying to find out what works, what doesn't work, what we can do better and where on earth Google is headed next. But in reality Google is chasing us, the users, always trying to evolve with us, providing the most relevant results and the best user experience.

As user search habits evolve we're naturally becoming savvier about what we want to know, and so our search behaviour is changing. The growth of mobile has played a massive role in this, where the introduction of voice search has led to a huge expansion in natural language queries and conversational search. You only need look at the totally hands free Google Glass and "parameterless search": there's no keyboard in those!

As users, we're naturally moving away from the generic keyword-match approach and moving towards more complex queries looking for more precise answers. So Google must evolve to cater for our needs, not the other way around! The introduction of the Hummingbird algorithm was a move towards a more "human friendly" way to provide answers to direct questions by achieving a better understanding of the meaning behind the words - semantic search. All of these factors combine with the aim of creating a user experience like no other.

It's clear that the evolution of Knowledge Graph is one which is not stopping here. What's coming next? Who knows! But one thing is for sure, Knowledge Graph and the move to the semantic web are here to stay!

What are your experiences with Knowledge Graph? Do you think it's missing any vital information or have Google hit the nail on the head? Will Google ever be able to identify intent accurately enough to one day evolve search beyond the keyword-based search we know today?