Google has revealed that AI is now playing a part in its search results. We don't have much information about exactly what it does or how it works, but this is the start of something big.
Search is essentially a problem in applied intelligence. In the early days humans organized the things they found on the web so that others could find them, but it didn't take long for the web to get too big for manual indexing to work. Google found a way of doing things automatically when its founders Sergey Brin and Larry Page invented PageRank - a way of using the statistics of links to work out how important a page is.
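To make the idea of "using the statistics of links" concrete, here is a minimal sketch of the published PageRank iteration on a made-up four-page web. The page names, the damping factor of 0.85 and the iteration count are assumptions chosen for illustration, and nothing here reflects how Google runs it in production.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Estimate page importance from link structure by repeated redistribution.

    links maps each page name to the list of pages it links to.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}  # start with equal importance

    for _ in range(iterations):
        # Every page keeps a small baseline share regardless of its links.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its rank on, split equally among its outgoing links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical toy web: "c" is linked to by every other page, "d" by none.
toy_web = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(toy_web)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(page, round(score, 3))
```

In this toy web the page that everyone links to ends up with the highest score and the page nobody links to ends up with the lowest, which is the intuition behind the original algorithm.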
What isn't widely appreciated is that Google doesn't rely on PageRank anymore - if it uses it at all. Instead, Google's search algorithm is based on a large set of "signals" that are combined to provide an indication of how important a page is. These signals are mostly ad-hoc and created by Google engineers on a "try it and see what happens" basis.
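Google has not said how its signals are combined, so the following is only a sketch of one simple possibility: a weighted sum. The signal names, values and weights are all hypothetical and are there purely to show what "combining signals" could mean.

```python
# Hypothetical signal weights; real signals and weights are not public.
weights = {"link_score": 0.5, "title_match": 0.3, "freshness": 0.2}

# Hypothetical per-page signal values, each scaled to the range 0..1.
pages = {
    "page_a": {"link_score": 0.9, "title_match": 0.2, "freshness": 0.1},
    "page_b": {"link_score": 0.4, "title_match": 0.9, "freshness": 0.8},
    "page_c": {"link_score": 0.1, "title_match": 0.3, "freshness": 0.2},
}


def combined_score(signals, weights):
    """Combine individual signal values into one score with a weighted sum."""
    return sum(weights[name] * value for name, value in signals.items())


# Order the pages from highest to lowest combined score.
ranked = sorted(pages, key=lambda name: combined_score(pages[name], weights), reverse=True)
print(ranked)
```

Tuning the weights by hand is exactly the kind of "try it and see what happens" work that a learned signal could eventually take over.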
In a recent interview for Bloomberg it was revealed that Google has been using an AI-based signal - RankBrain - for the past few months. This isn't surprising: it has always been clear that Google's extensive interest in AI extends beyond self-driving cars and classifying cat videos on YouTube and into its core search business.
It is also obvious that search is something that really needs AI to do properly. If a user types in a complex query then AI is what is needed to find out what the query is about and find pages that are on topic. You can also see that AI could be used to rank the quality of pages that are on the topic - you don't need PageRank, just enough intelligence to discover whether the page is high-quality information or not.
What RankBrain seems to do is deal with search queries that Google hasn't seen before, which make up about 15% of all queries.
Although we don't have details of what RankBrain is doing - Google is very secretive about how its search engine works for obvious reasons - it seems to be based on Word2vec. This is a technique that uses a shallow neural network to capture the way words relate to each other. It was invented by Google's AI researchers led by Tomas Mikolov. The neural network takes the input words and maps each word to a vector in a high dimensional space. The way that this is done captures many of the semantic relationships between the words, so that words that mean similar things correspond to vectors pointing in similar directions in the space, and differences between vectors capture many regularities. For example, the vector operation:
vector('king') - vector('man') + vector('woman')
is close to vector('queen'), and so on.
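The arithmetic can be tried out with a small sketch. The vectors below are hand-made three-dimensional toys, not real Word2vec output, and the axis meanings (royalty, maleness, femaleness) are an assumption made so that the example is easy to follow. Real embeddings have hundreds of dimensions with no such tidy interpretation.

```python
import math

# Hand-made toy vectors, NOT trained Word2vec output.
# Hypothetical axes: (royalty, maleness, femaleness).
vectors = {
    "king":     [1.0, 1.0, 0.0],
    "queen":    [1.0, 0.0, 1.0],
    "man":      [0.0, 1.0, 0.0],
    "woman":    [0.0, 0.0, 1.0],
    "prince":   [0.7, 1.0, 0.0],
    "princess": [0.7, 0.0, 1.0],
}


def cosine(u, v):
    """Cosine similarity: 1.0 means the two vectors point the same way."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def analogy(a, b, c):
    """Return the word whose vector is closest to vector(a) - vector(b) + vector(c)."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    # The three input words are excluded, as is usual for this kind of test.
    candidates = [word for word in vectors if word not in (a, b, c)]
    return max(candidates, key=lambda word: cosine(target, vectors[word]))


result = analogy("king", "man", "woman")
print(result)  # queen
```

Closeness is measured here by the angle between vectors rather than by distance, which is the usual choice for word embeddings.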
What RankBrain seems to be doing is semantic processing on the input query, enabling the search algorithm to return pages that are more relevant to the query. It seems to have nothing to do with ranking the importance of the pages.
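How RankBrain actually processes a query has not been published. As a rough sketch of what semantic processing of an unseen query could look like, the example below averages toy word vectors to get a vector for the whole query and then finds the most similar query that has been seen before. The vocabulary, the vectors and the averaging approach are all assumptions made for illustration.

```python
import math

# Hand-made toy word vectors, NOT real embeddings.
# Hypothetical axes: (price, air travel, cooking).
word_vectors = {
    "cheap":       [0.90, 0.10, 0.00],
    "inexpensive": [0.85, 0.15, 0.00],
    "flights":     [0.10, 0.90, 0.00],
    "airfare":     [0.15, 0.85, 0.00],
    "pasta":       [0.00, 0.10, 0.90],
    "recipes":     [0.00, 0.00, 1.00],
}


def embed(query):
    """Represent a query as the average of its word vectors.

    Every word is assumed to be in the toy vocabulary.
    """
    rows = [word_vectors[word] for word in query.split()]
    return [sum(column) / len(rows) for column in zip(*rows)]


def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def closest_seen_query(new_query, seen_queries):
    """Return the previously seen query whose vector is most similar to the new one."""
    new_vector = embed(new_query)
    return max(seen_queries, key=lambda q: cosine(new_vector, embed(q)))


seen = ["cheap flights", "pasta recipes"]
match = closest_seen_query("inexpensive airfare", seen)
print(match)  # cheap flights
```

The new query shares no words with either of the seen queries, yet it is still matched to the one with the same meaning, which is the kind of help that plain keyword matching cannot give.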
Even so, RankBrain has surprised its creators by becoming, in a few short months, the third most important signal among the hundreds that the search algorithm uses.
From helping connect ambiguous search phrases with relevant pages, AI will surely move deeper into the core of Google's search and eventually replace the hundreds of ad-hoc signals that are currently used.
When this happens, Google search will no longer be the hit-and-miss affair we currently put up with. You will be able to ask a question and the search engine really will provide you with the most relevant, high-quality pages. This is when the true impact of the web on human intelligence will finally become apparent.