Inside Google’s search quality group, Amit Singhal runs the core ranking team, which is responsible for those algorithms you hear so much about. The team ran some 6,000 experiments last year that tried tweaking those mathematical formulas, ultimately producing between 450 and 500 small and large changes in Google’s search engine.
Singhal, a former researcher at AT&T Labs (the former Bell Labs), joined Google in 2000 and now is a Google Fellow, a title reserved for its most accomplished engineers. For a story on Google’s search efforts in the latest issue of BusinessWeek, he spoke to me about some of the inner workings of the team whose work determines the results you see after you type a query into Google.
This is the second in a series of four interviews with the leaders of Google’s search quality team, following an interview posted yesterday with Udi Manber, VP of technology for search. Interviews with Scott Huffman, who runs search evaluation, and Matt Cutts, head of the anti-Web spam team, will run on Tech Beat on Saturday and Sunday. And if you want the big picture, we just posted an interview with Google CEO Eric Schmidt.
Q: What attracted you to Google given your academic background?
A: There was an open hunger for new ideas. Most of these people were relatively new to search at the time compared to me. The environment was and still is that no idea is a bad idea. All ideas are great ideas to try and experiment. You try hundreds of ideas and not all of them will succeed.
Google has evolved over the years. But we have tried very hard to maintain the same culture of innovation in each group. Within each group, we have tried very hard to maintain the same hunger. Each group of 300-500 people works like a small company and moves very fast.
Q: Can you give me examples of how your team improves search?
A: We launch hundreds of changes every year. Some are small, such as getting an acronym right, some are as big as Universal Search.
Q: Tell me about a small one, like acronyms.
A: A few years ago, our engineers were noticing that on acronyms, we were returning lots of good results but the bolding on the page was not sufficient. If you type CIA, it could mean Central Intelligence Agency, or it could mean Culinary Institute of America. If we did not highlight that this result will go to the government agency, and that result is related to food, users were taking more time to click, wasting more of their time. Could we shave 30 or 40 milliseconds off their reaction time? Within a few weeks, we had an experiment, users were liking it, the clickthrough rates were great, and their response times were down.
Q: How do you decide what improvements to try?
A: Sometimes someone sends us a complaint on a query result and we look at it and say, Why is this happening, this is just broken. Most any query that any Googler complains about comes to my email box. I track every query that is brought by a Googler or a friend or family member.
We don’t do things by hand. When you actually build algorithms to solve problems, not only do you benefit the users you are most conversant with, the U.S. users, but our algorithms go around the world. By building this algorithmic antidote and observing what signals make it possible, we can make improvements to our system worldwide.
Q: Language is inherently ambiguous. How do you infer what people are really looking for with a query?
A: One big issue is stemming—the idea that (for example) run, running, and runner are all variants of the same word. This has been studied deeply in academia for 30 to 50 years. In my academic life, I said, of course, apple means the same as apples.
But when I come to the real world, apple means one thing and apples means something else. Humans understand that in some contexts, Apple means a computer, and in some contexts it means a fruit. Or in some contexts, GM means General Motors and in some contexts, GM means genetically modified. Or does the word bio mean biography? It depends on what’s around it.
When I came here, my first stemming system just exploded in my face. It didn’t work. Someone would type “growing apples” and I would stem that query to “grow apple” or “growth apple” and the documents we returned would be about Apple’s quarter-over-quarter growth. The user meant nothing like that. Clearly my stemming algorithm did not give the user what they wanted.
So we got a team together and tried to figure out how do we do it for users on the Web, and that’s how we came up with a few patents on this technique on how you make word equivalence work with search systems. Does auto mean the same as car? It depends if it’s about trains or not. What seems obvious to human beings, machines cannot do. It seems so easy to human beings. But algorithms cannot do it that easily.
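The stemming failure Singhal describes can be illustrated with a small Python sketch. This is not Google's algorithm or the patented technique he mentions; the `naive_stem` and `matches` helpers and the two sample documents are hypothetical, written only to show how crude suffix stripping makes "growing apples" match a page about Apple the company just as well as a page about fruit.

```python
import re


def naive_stem(word: str) -> str:
    """Strip a couple of common suffixes. Deliberately crude (an assumption
    for illustration, not a real production stemmer)."""
    word = word.lower()
    for suffix in ("ing", "s"):
        # Keep at least three characters so short words like "is" survive.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def stems(text: str) -> set[str]:
    """Split text into alphabetic tokens and stem each one."""
    return {naive_stem(token) for token in re.findall(r"[a-z]+", text.lower())}


def matches(query: str, document: str) -> bool:
    """True when every stemmed query term also appears in the document."""
    return stems(query) <= stems(document)


# Hypothetical documents: one about fruit, one about the company.
orchard_doc = "Tips for growing apples in a backyard orchard"
company_doc = "Apple is growing its quarterly revenue"

query = "growing apples"
print(stems(query))
print(matches(query, orchard_doc))
print(matches(query, company_doc))
```

Both documents match, because stemming collapses "apples" and "Apple" into the same term and discards the context that tells a human which one the user meant.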
Q: Udi talked about the search quality team doing more than 6,000 experiments a year. How does Google do that many in a year?
A: You need that infrastructure where they can go from idea to data in a few hours. What if I make that yellow? If I can try it on a full-blown Web search system and get an evaluation back in a few hours, then you know if it’s a good idea. Much less than 1% of our traffic is enough to gauge something.
Just today, I had a discussion with a team, they’re trying to do something that indexes a whole lot more information than we currently index about a document. You can imagine it requires a whole lot more disk [drives] and a whole lot more CPU [computers], and I say, What’s taking you guys so long, what’s the slowdown? They said, well, we need three times the disk and this X the CPU, and I go and talk to our infrastructure leads today and they’re going to get their disk and CPU this week, to experiment more.
Q: How are those experiments run?
A: Both using our internal raters, human beings, who say, yeah, good thing, and observing real users in a tiny fraction of our traffic, observing how they interact with something. And we have metrics, like clickthroughs.
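One common way to observe "a tiny fraction of our traffic" is to assign users to an experiment deterministically by hashing an identifier, then compare a metric such as clickthrough rate. The Python sketch below assumes that approach; the interview does not say how Google actually selects experiment traffic, and the function names, the experiment label, and the 1% fraction are all hypothetical.

```python
import hashlib


def in_experiment(user_id: str, experiment: str, fraction: float) -> bool:
    """Deterministically place roughly `fraction` of users in an experiment.

    Hashing the experiment name together with the user ID means the same
    user always gets the same answer, and different experiments get
    independent samples. (Assumed scheme, not Google's documented method.)
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    # Map the first 32 bits of the hash to a number in [0, 1).
    bucket = int(digest[:8], 16) / 0x100000000
    return bucket < fraction


def clickthrough_rate(clicks: int, impressions: int) -> float:
    """Clicks divided by impressions, or 0.0 when there were no impressions."""
    return clicks / impressions if impressions else 0.0


# Hypothetical usage: sample about 1% of 100,000 synthetic user IDs.
users = [f"user-{i}" for i in range(100_000)]
sampled = [u for u in users if in_experiment(u, "acronym-bolding", 0.01)]
print(len(sampled) / len(users))
```

Comparing the clickthrough rate of the sampled group against everyone else gives one of the signals Singhal mentions; the human rater judgments are a separate input.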
Q: So how do you evaluate whom to give the go-ahead to? It must be hard to say no.
A: It is always hard to say no. That’s where my experience comes in. We will try the outlandish ideas, many of them. But we will guide them into success. What Google has done in the search group is we have dramatically shrunk the innovation cycle. We can guard people against some of the potholes we have visited in the past. We have mapped the surface. That institutional knowledge allows us to take those outlandish ideas and make them real in a matter of quarters instead of years.
This is what we breathe day in and day out. Hundreds of us breathe search algorithms day in and day out.
Q: Is there anything you do to make sure you don’t get in a rut? It’s often so easy for that to happen with big, successful companies.
A: That’s the biggest dilemma. That’s what I tell my team. What we may have is a very well-oiled oval wheel. And when the first round wheel comes along....
I would be lying if I said I had a magic bullet to solve that. It’s the openness to innovation, the desire to dismantle status quo. And getting the best people hired into the group is how we do it. No one should feel, if I dismantle the current search system, someone will get upset. That’s the wrong environment. When I came, I dismantled [Google cofounders] Larry and Sergey’s whole ranking system. That was the whole idea. I just said, That’s how I think it should be done, and Sergey said, Great!
Q: Is there anyone who can dismantle your ranking system?
A: I truly hope so. And I hope it’s someone inside Google.
Q: You’ve seen some flow of engineers leaving Google. Why is that?
A: Within search, I can pretty much count on my fingers people we have lost that I wish we had not, in nine years of doing this. Less than 1% attrition. In search, we have maintained a very, very healthy crew.
Some attrition is natural. One guy, great guy who worked with me at Bell Labs, worked here five years, but wanted to work in Australia. Those are the people I’m counting on my fingers. I hear in the press that there’s a brain drain. And the people that they mention, I know what they did here.
Q: How does the week-to-week process work in the search group?
A: Tuesday meetings, the leads [leaders of the search quality group] meet. We discuss a lot of new projects, new ideas and what can be done. That’s an hour to two hours Tuesday morning.
Then there’s the launch meeting on Thursdays where some of these 5,000 or so experiments are brought to the meeting by the engineers doing the experiments. Matt [Cutts, head of the anti-Web spam team], myself, Udi, and many others evaluate the results of the experiments and all the other data we have gathered, whether it’s positive for our users or not, what are some of the potential downsides, and there’s an open discussion with the team in the room.
Then we decide that, yes, we’re going to go ahead and launch this feature, or this feature needs to be changed in this manner. Those are the two key processes. Also my staff meeting, on what we need to invest in next, six things we’re not investing enough in. Freshness [of search results] was one of those things that came back several years ago.
So we pick up things like big, gaping holes in our system. Build a team. Get them off less important things.
Q: Anything you’ve focused on more recently than freshness?
A: Localization. We were not local enough in multiple countries, especially in countries where there are multiple languages or in countries whose language is the same as the majority country.
So in Austria, where they speak German, they were getting many more German results because the German Web is bigger, the German linkage is bigger. Or in the U.K., they were getting American results, or in India or New Zealand. So we built a team around it and we have made great strides in localization. And we have had a lot of success internationally.
Q: What about truly real-time search?
A: Freshness has been a focus for a while. There are lots of things that we are working on. We are deeply aware of the potential and the challenges in real-time search.
It’s not just Twitter. It’s all of the stuff being updated on a real-time basis. Twitter is a great service. But when it comes to information and information needs for users, the quality of content is critical. Someone who has put thought and hours into writing a story about Google and someone [else] comes along and says either Google’s great or Google sucks in three words. Just because something was said 20 seconds ago doesn’t quite make it something we should put in front of our users.
Not to say someone couldn’t say something important in the last 20 seconds. And that’s where our years of experience with crawling, indexing, and relevance come in.
Q: Is it possible Google is too tied to the traditional signals that it has used?
A: I don’t have any religion when it comes to what should or should not be used to return results. It’s not religious, it’s very pragmatic—what the results say. We have a real responsibility as a company to respect people’s time.
Q: Will Google need to provide a different interface or some other fundamental method to give people real-time information besides simply inserting a few links into the first results page?
A: It’s a very deep question. With Universal Search, the “10 blue links” paradigm has already been broken. We show three or five image results. Sometimes you might see two videos. We are experimenting with the user interface at a more fundamental level. It’s available more than is apparent out there.
Q: Do you have a philosophy that holds you back from making a fundamental change to do real-time or social search?
A: Our philosophy that the user comes first is the only philosophy we should have, because if they want social aspects to search, we’ll find a way to make it happen. Same with real-time. We’ll find the technology. We’ll find the interface. We’ll get it to our users.
Q: How has Google’s cost-cutting affected you in search?
A: Some of the changes the company has made are all the right things. We were somewhat inefficient earlier. The search group is still growing. We are hiring. We’re investing more in our infrastructure, we are hiring, we are doing more things than ever before. A large proportion of our resources goes to search. And ads.
Q: Are there any new kinds of tools you’ve developed to help improve search?
A: One of the tools that we developed recently is when we were doing our freshness work, and we are still doing it: It was getting hard for engineers to [prove] something happened on Google. But by the time they complained about something that happened, our freshness algorithms would catch up. Other engineers would read their email 10 minutes later, and say, it’s working fine for me. This was becoming incredibly frustrating for engineers.
So they developed a tool called Replay where they can freeze time retrospectively and take a picture of it. I can say, What would have happened had I run this query at 12:49? Then we can go debug all the systems and say, Ah, that system failed. Oh, we need to reduce latency over there.
Q: Anything else we should be paying attention to in your group?
A: People cannot write stories about Google innovating at a breakneck pace because people like writing stories about other things that are more attention-grabbing. But how has Google served its users for a decade? That fire in the belly is stronger than ever inside Google. The innovation in search is faster than ever before in my nine years here. I come in every day feeling like a kid in a candy store.
Q: How do you feel about the oft-mentioned notion that Google is a one-trick pony?
A: Companies do one or two or three things very well. In time, they learn a lot more and they do many more things well. We have done a great job for the world with search and a great job with the economic system on the Internet with our ads products. We have invested enough in other things, some of which over time have become reasonably useful for our users.
So it’s clear that one thing that we are best known for brings a large part of our revenue. But if you look at Google as a company, the value that we bring to the world is quite a bit more than just search. Everything gets lumped under search. But if you look at these other things, they are some of the best systems in their respective worlds.
Q: I think the criticism is: Where’s the money in those?
A: The right way to look at it is not the money. Is there value to the users? If you bring value to the users, I think we will succeed in the long run. Some things make more money than others, but as long as we keep bringing value to the world, we will be successful.