Search Engine People Blog

Ruud Questions: Mike Grehan

To people in the search industry, Mike Grehan needs no introduction. To those outside of it: Mike Grehan consistently comes up with ideas about search which, in hindsight, often prove to be right. He's able to articulate those ideas in a simple way that either causes people to smack their foreheads as they wonder why they didn't see this -- or infuriates them, as we love to disagree with what some perceive as linkbait.

In short: Mike Grehan is to SEO what Jakob Nielsen is to usability.

He is at Acronym Media, where he also published the amazingly interesting New Signals to Search Engines: Future Proofing Your Search Marketing Strategy.

Mike will be speaking at SES Toronto on Signals: What Relevancy Indicators Are Search Engineers Watching For Today? and Is PageRank Broken? The Future of Search. He'll also moderate Universal and Blended Search: Comprehensive Visibility Challenges and oversee Extreme Makeover: Live Site Clinic.

Highlights
  • Search returns results from only a fraction of the web
  • Toolbar data is about navigation data; landing page quality, discovery
  • New search is harder to game
  • "Ranking" nodes on a social network is entirely different from ranking static pages
  • The Internet and the web are two different things: apps very important for marketing of the future
  • Social media/networking is more marketing than SEO

You give two rules of thumb for the new SEO (or New SEO):

1. to understand where SEO is going, see where *search* is going
2. the end user experience is key

Besides your gut instinct grown from years of experience with search, which dots do you tend to try to connect normally? Patents? White papers? Media quotes from folks on the inside?

I became completely intrigued by information retrieval on the web after a contact at a search engine many years ago gave me a tip to read Professor Gerard Salton's book "Introduction To Modern Information Retrieval (IR)." Back then I could only grasp the bare fundamentals because, as a practicing marketer, I'd never had a need to read scientific literature.

However, even this basic knowledge opened my eyes to what was really happening under the hood at search engines. Salton is readily quoted as the father of modern information retrieval. And although the book was first published back in 1983, beyond the vector space model he developed for automatic text retrieval and indexing, he was already talking about the use of citation analysis for ranking. Of course, that's a term we hear quite frequently in SEO circles, as Jon Kleinberg's HITS algorithm and the more widely known PageRank algorithm both draw on this principle.

Once I had immersed myself in the world of IR I started to gain a much better understanding of how search really works. Prior to that, like most of the early search engine positioning (as we were known then) practitioners, I worked mainly on anecdotal evidence.

Yes, I do read patents and monitor the literature, as well as keeping up to speed with lectures from various researchers in the field. But I'd probably say that, as a member of the ACM, the access I get to their huge digital library, in particular the SIGIR (information retrieval special interest group) section, is one of my main resources.

You've commented on the size of the index, its crawlability and the tension between those two: sheer size makes constant updating of the entire web impossible -- and when possible it's not real-time enough for the end user.

Have we reached the "good enough" saturation point where grabbing the first 10-1000 sites for any query gives enough value?

I think it's safe to say that most search engines (if not all) work with a tiered index. So keywords don't necessarily map to every single document. And certainly, there's an element of local ranking involved. By that I don't mean local results, I mean more about trying to draw on clusters and communities.

But the web continues to grow exponentially every day, in particular with user generated content, so search engines have only ever been able to serve results from the fraction of the web that they have managed to capture in the crawl. That means it has really always been a "good enough" situation. From day one of search, there has probably always been a better result for certain queries somewhere else on the web that a search engine hasn't yet discovered.

I don't think crawling the web, or the HTTP/HTML protocol is going to go away any time soon, but certainly the paper I wrote on "new signals" to search engines gives an indication of how, progressively, information retrieval on the web will change.

An essential part of the new search is feeding user data back in the system and using that to shape the user experience. One of the sources you mention is toolbar data. Is what is being measured that way not simply another version of the rich get richer? (PDF)

The toolbar data allows search engines to get an indication of something they had no idea about previously: where did the searcher go next? It's a little presumptuous to think that each time a searcher clicks on a link in the SERPs that they automatically land on exactly the result they were looking for.

Just because they don't hit the back-button on the browser doesn't necessarily mean that they were satisfied with the result. They may take a journey of six or seven clicks before they find exactly what they were looking for. So the toolbar gives a search engine the opportunity to discover user trails. And it's the user trails to wherever they drop out which give a high indication of quality.

It's not really like the rich get richer scenario, because a search engine can follow your direct navigation too. So if you bookmark a page which you originally found through search and make repeat visits, that's a good signal of quality. And just being able to follow direct navigation regardless of whether search was involved can give a strong signal of popular resource locations that the search engine may not be aware of via the crawl. There's a ton of data that can be mined to get a better insight into the behavior of the end user.
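The "user trails" idea above can be illustrated with a minimal sketch. It assumes toolbar sessions are available as ordered lists of visited pages and simply counts where each trail ends (the drop-out page). The session data, the page names and the `trail_endpoints` helper are all hypothetical, and this is not a description of how any search engine actually processes toolbar data.

```python
from collections import Counter

def trail_endpoints(trails):
    """Count how often each page is the last stop on a user trail."""
    return Counter(trail[-1] for trail in trails if trail)

# Hypothetical toolbar sessions: each list is the ordered pages one user visited.
trails = [
    ["serp", "a.example/home", "a.example/guide", "b.example/answer"],
    ["serp", "c.example/list", "b.example/answer"],
    ["b.example/answer"],  # direct navigation, e.g. a repeat visit via a bookmark
    ["serp", "a.example/home"],
]

endpoints = trail_endpoints(trails)
for page, count in endpoints.most_common():
    print(page, count)
```

Note that the third trail never touches a results page at all, which is the point made above: direct navigation contributes to the signal regardless of whether search was involved.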

Relevancy indicators search engineers watch for today (links to SES page) include email, link sharing on social networks, up/down voting, etc.

In other words, new search calculates citations and co-citations taking into account *more* sources than "just" web pages.

Does that mean that the new search will suffer from the same manipulation issues?

Well, the outside world is full of good guys and bad guys and everything in between. So there are always people looking to game the system (whatever the system is). But I do believe as we move further away from just looking at links and text, it will be a harder system to game. In the case of social search (when we eventually get there) you're dealing with information in a network of trust. It's harder to game a tight community of people in a social network than it is to cloak for a crawler.

With pure link citation calculations engineers figured out that they could adjust for some manipulation *and* make the results more relevant if links from some sites would carry more meaning (authority) than links from others.

Are we going to see the same applied to the social link spectrum then so that a link shared by Jane Doe has less power, less meaning, less authority, than a link shared by Dave Winer, Matt Cutts, or Steve Rubel?

Network theory is going to play a vital role in the future of marketing. Already it has been applied to the web with the algorithms I've mentioned (HITS/PageRank) and both of those algorithms are being tested in social networks.

Just as these citation analysis based algorithms determine mathematically the prestige of a particular page or file type, in the same way they can be used to find the prestigious people in a network. In marketing we refer to these people as opinion leaders, or early adopters. So data mining the links between people is going to be very important. The internet is not just the world wide web anymore. It's a network of networks of people constantly connected and communicating. Understanding these networks, I believe, is the future not just of search marketing, but all marketing.

Having said that, we're talking about living, breathing human beings and not static web pages, so ranking is going to be entirely different from discovering which static web pages link to one another.
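As a rough illustration of citation analysis applied to people rather than pages, here is a simplified power-iteration PageRank over a small "who shares whose links" graph. The names, the graph and the parameter values are hypothetical, and this textbook version is only a sketch: as noted above, ranking people in a real network would involve far more than this.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified power-iteration PageRank.

    links maps each node to the list of nodes it points to.
    """
    nodes = set(links) | {t for targets in links.values() for t in targets}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every node starts each round with the "random jump" share.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = links.get(node, [])
            if targets:
                # Split this node's rank evenly among the nodes it points to.
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling node: spread its rank evenly over all nodes.
                share = damping * rank[node] / n
                for other in nodes:
                    new_rank[other] += share
        rank = new_rank
    return rank

# Hypothetical graph: an edge means "shares links posted by".
shares = {
    "alice": ["dave"],
    "bob": ["dave"],
    "carol": ["dave", "alice"],
    "dave": ["alice"],
}

ranks = pagerank(shares)
for person, score in sorted(ranks.items(), key=lambda kv: -kv[1]):
    print(f"{person}: {score:.3f}")
```

In this toy graph the person whose links are shared by the most others ends up with the highest score, which is the sense in which such algorithms could surface opinion leaders.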

To seed user data you'd have to come from a wide spread of IPs, different configurations, different surfing behaviour, different regions, different queries... In other words, you'd need to rent a botnet.

If you were in a decision making position at Google and someone suggested integrating user data -- how would you suggest the company protects itself against the New Black Hat?

A senior researcher recently said: "As Google moves away from a web of content to a web of applications," or something very close to that, and I realized I wasn't too far wrong in my thought paper about side-stepping the browser. The internet and the world wide web are two different things. So the development of apps for everything from your laptop to your mobile device to Amazon's Kindle and so much more will probably see the end of basic HTML spam in the wider marketing universe.

I also read recently that Google has a guy who they believe to be the best programmer in the Galaxy writing Chrome from the ground up. This is not another version of Chrome as it exists. So when I put that together with the quote I mentioned above, it suggests to me that Chrome is about to become much more of an operating system than a browser.

And add to that their foray into mobile devices with Android and you can see how important apps will be in the future of marketing (and search).

Two basic use scenarios for social media and social networks keep coming back:

1. make something, anything, "go viral"; get links,
2. branding; especially branding through mere engagement, not just presence.

Is that it? Is that all we can do with these tools?

So we need to break this down into components and stop calling it one thing: social media. First we need social networks and then we can have social media. After that we can start to talk about social search, as I've mentioned previously.

I say this with true respect, Ruud, but you shouldn't make the same mistake as many in the industry by trying to draw too strong a link between SEO and (as we know it) social media. The link is actually very tenuous and applies mainly to (I hate this term) link bait.

Social media (as it's known) is literally just the further development of public relations amalgamated with the world's biggest panel testing service. As marketers we listen, we learn and then we respond. There's a whole lot more to marketing online in this sphere than there is in SEO.

It's a much longer conversation and it's beyond the scope of this general interview. But maybe we can do a return some time in the future to cover the many dimensions of social on the web and not just the little bit that SEOs want to claim for themselves.

Finally for the geek in me:

- favorite email application/service?

If I haven't already been boring enough, I'm going to go into boring overdrive with these questions. I'm an Outlook guy on my desktop, with a Gmail account for freedom mail.

- favorite application to store searchable information?

I hear the sound of zzzzzzs coming on again. I keep all of my stuff in various locations around the planet (I travel a lot and live in different places). So my favorite tool for accessing everything I own digitally is a GoToMyPC account. This way I can access all of my various computers from any location in the world (which usually means from my laptop).

- and you time manage/organize.... how?

And if you haven't slipped into a coma already, I use Outlook for just about everything. I have a number of different add-ins I use, such as a CRM component. But generally, even though reading my blog or my Twitter posts you'd never believe it, I'm a very organized person.

Yes, I head up the biz-dev and marketing department of a very busy and fast growing New York agency, but I still travel so, so much. That means I need to use technology to keep things as simple as possible. So no digital wizardry for me. I'm very much a KISS person.