The Allen Institute's Semantic Scholar
Written by Nikos Vaggalis   
Tuesday, 10 November 2015

A new tool has been launched, one that applies the power of AI to searching through scientific papers.


The problem that Semantic Scholar sets out to address is that there's a vast pool of research papers in the public domain that remain in the shadows, unable to be found, unable to be read. 

Needless to say, there is a lot of untapped potential, which if harvested would push forward the development of collective research, ideas, blueprints and other areas of endeavour.

The main person behind the project is Oren Etzioni, a veteran of the search engine AI field, who is now involved in Semantic Scholar's development at the Allen Institute for Artificial Intelligence (AI2).


[Screenshot: Semantic Scholar search page]


So how is AI involved in the process?

The AI is tasked with identifying connections between documents, even within scanned documents, images and diagrams, based on key phrases, authors, topics, citations and other data.

In other words, the task is to crawl, relate and classify, and classification on this scale would not be feasible without raw computing power and intelligent software.

To witness the experience first hand, we put it to the test by searching for a few simple terms. We then compared the results with those from Google Scholar.

When searching for 'C# Traits', 6 of the 10 results on the first and most important page were relevant, whereas the same query in Google Scholar produced 8 relevant results.


[Screenshot: Semantic Scholar results for 'C# Traits']


But when simply searching for 'C#' we got zero(!) relevant results, as the search engine searches on the author's name by default. So this query brought up unrelated papers, like Adaptive Subgradient Methods for Online Learning and Stochastic Optimization by John C. Duchi and Image Quality Assessment by Alan C. Bovik, because the letter C was found in their names; hardly useful. Skimming through subsequent result pages changed nothing.
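We can only guess at what happens internally, but one plausible explanation is that the query is stripped of punctuation before being matched against author names, so 'C#' collapses to a bare 'c' and matches any middle initial 'C.'. The following Python sketch illustrates that guess; the `tokenize` and `search_authors` functions and the 'Jane Doe' entry are hypothetical and do not reflect Semantic Scholar's actual code.

```python
import re

def tokenize(text):
    """Lowercase the text and keep only runs of letters and digits.

    Assumption: punctuation such as '#' and '.' is discarded, which is
    a common default in search tokenizers.
    """
    return re.findall(r"[a-z0-9]+", text.lower())

# Hypothetical author index; 'Jane Doe' is a made-up entry with no 'C' initial.
AUTHORS = ["John C. Duchi", "Alan C. Bovik", "Jane Doe"]

def search_authors(query):
    """Return the authors whose name shares at least one token with the query."""
    terms = set(tokenize(query))
    return [name for name in AUTHORS if terms & set(tokenize(name))]

if __name__ == "__main__":
    print(tokenize("C#"))
    print(search_authors("C#"))
```

Under this assumption, a tokenizer that treated 'c#' as a single token, or a search that defaulted to titles and key phrases rather than author names, would avoid the false matches.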

The same query in Google Scholar yielded a staggering 10 out of 10. Of these, 8 related to books, which Google has the luxury of having. Unfair advantage?

It seems that the search engine needs to be steered with a hint to help it perform a better search, for example by feeding it multiple terms. It is also likely that the results will improve as the tool evolves and is refined.

Once the search is completed, the results can be filtered by Author, Publication Date, Data Set Used, Key Phrase and Overview, since the raw data has already been collected.

So should Google Scholar watch out?

Not yet, I think, as Semantic Scholar hosts CS papers only, with a view to expanding into other scientific fields shortly. While not looking to take on Google Scholar, it certainly aims to become a more modern, more advanced tool than the rest.

After a promising start, it remains to be seen whether it fulfils its great ambition. We just have to wait and see.








Last Updated ( Tuesday, 17 November 2015 )