CROKAGE Gets Stack Overflow Answers For You
Written by Sue Gee   
Wednesday, 21 August 2019

CROKAGE, which stands for Crowd Knowledge Answer Generator, is a fledgling tool that finds answers from Stack Overflow Q & A threads that have both relevant code and succinct explanations.

Searching Stack Overflow's vast collection of questions and answers is what we all do when we get stuck. But with more than 18 million questions and 27 million answers, finding useful information is a bit hit and miss. CROKAGE sets out to provide comprehensive solutions for daily programming tasks containing code examples and succinct explanations. It isn't the first attempt at improving the results of a Stack Overflow search, but it appears to outperform the alternatives.


News of this tool and how it originated comes from the Stack Overflow blog, which in turn takes details from the paper authored by Rodrigo Fernandes, Masud Rahman, Chanchal Roy, Kevin A. Schneider and Marcelo de Almeida Maia, "Recommending comprehensive solutions for programming tasks by mining crowd knowledge", which was presented at the 27th International Conference on Program Comprehension, held in Montreal in May.

As Ben Popper explains:

As computer scientists, this group of academics knew that developers searching for solutions to coding questions are impaired by a lexical gap between their query (task description) and the information (lines of actual code) associated with the solution that they are looking for. Given these obstacles, developers often have to browse dozens of documents in order to synthesize a full solution.

The team was assembled at the University of Saskatchewan by Professor Chanchal K. Roy and was able to build on two previous tools developed there, RACK and NL2API, along with BIKER, a tool that uses a word-embedding technique to calculate the similarity score between two text descriptions. Using millions of Q&A threads from Stack Overflow as the training corpus, the team trained a word-embedding model with FastText.
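To give a feel for how a word-embedding similarity score can bridge the lexical gap, here is a minimal sketch in Java. It is not the code used by BIKER or CROKAGE: the three-dimensional vectors are invented for illustration, and the class and method names (EmbeddingSimilarity, averageVector, cosine, similarity) are hypothetical. A real model such as one trained with FastText would supply learned vectors with far more dimensions. The sketch averages the vectors of the words in each text and compares the averages with cosine similarity.

```java
import java.util.Map;

class EmbeddingSimilarity {
    // Toy vectors invented for this example; a trained model would provide these.
    static final Map<String, double[]> TOY_VECTORS = Map.of(
        "read",     new double[] {0.9, 0.1, 0.0},
        "load",     new double[] {0.8, 0.2, 0.1},
        "file",     new double[] {0.1, 0.9, 0.0},
        "document", new double[] {0.2, 0.8, 0.1},
        "sort",     new double[] {0.0, 0.1, 0.9},
        "list",     new double[] {0.1, 0.0, 0.8}
    );

    // Average the vectors of all known words in the text; unknown words are skipped.
    static double[] averageVector(String text) {
        double[] sum = new double[3];
        int known = 0;
        for (String word : text.toLowerCase().split("\\s+")) {
            double[] v = TOY_VECTORS.get(word);
            if (v == null) {
                continue;
            }
            for (int i = 0; i < sum.length; i++) {
                sum[i] += v[i];
            }
            known++;
        }
        if (known > 0) {
            for (int i = 0; i < sum.length; i++) {
                sum[i] /= known;
            }
        }
        return sum;
    }

    // Cosine similarity; returns 0.0 when either vector is all zeros.
    static double cosine(double[] a, double[] b) {
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    static double similarity(String first, String second) {
        return cosine(averageVector(first), averageVector(second));
    }

    public static void main(String[] args) {
        // No words in common, yet the meanings are close.
        System.out.println(similarity("read file", "load document"));
        // No words in common and the meanings differ.
        System.out.println(similarity("read file", "sort list"));
    }
}
```

With these toy vectors the first pair scores far higher than the second, even though neither pair shares a single word, which is the kind of match a purely keyword-based search would miss.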

More information about CROKAGE comes from GitHub, where a replication package for the tool is available to clone:

CROKAGE receives as input a query written in natural language and uses state-of-art text retrieval models combined with three state-of-art API recommender tools to retrieve the most related Stack Overflow answers to that query, sorted by relevance. CROKAGE then uses natural language processing to extract the code and relevant sentences to compose a summary containing the solution for the query. 


In their paper the researchers report:

We evaluate our approach using 97 programming queries, of which 50% was used for training and 50% was used for testing, and show that it outperforms six baselines including the state-of-art by a statistically significant margin. Furthermore, our evaluation with 29 developers using 24 tasks (queries) confirms the superiority of CROKAGE over the state-of-art tool in terms of relevance of the suggested code examples, benefit of the code explanations and the overall solution quality (code + explanation).

The best way to understand what CROKAGE does, and whether it outperforms other methods of looking for programming answers, is to use it. The current version only provides solutions for Java and you are asked to evaluate its performance with a 5-star rating. I tried the query:
    How do I create a function

which you'll notice is a bit of a trick question, as you can't create standalone functions in Java, unless you are thinking of lambdas, of course.
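For readers who don't use Java daily, the following short sketch shows why the question is a trick one. The class and member names (FunctionDemo, square, CUBE) are made up for illustration. A static method is the nearest thing to a standalone function but still has to live inside a class, while a lambda (available since Java 8) gives you a function value that can be stored and passed around.

```java
import java.util.function.Function;

class FunctionDemo {
    // The closest Java gets to a standalone function: a static method inside a class.
    static int square(int x) {
        return x * x;
    }

    // A lambda stored in a field; it is an object implementing the Function interface.
    static final Function<Integer, Integer> CUBE = x -> x * x * x;

    public static void main(String[] args) {
        // A method reference turns the static method into a function object too.
        Function<Integer, Integer> squareRef = FunctionDemo::square;

        System.out.println(square(4));                          // 16
        System.out.println(CUBE.apply(3));                      // 27
        // Compose the two: square first, then cube the result.
        System.out.println(squareRef.andThen(CUBE).apply(2));   // 64
    }
}
```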

You have the choice of 1, 5 or 20 answers. Trying for 5 answers, I was only able to award 3 out of 5 stars because, while the top answer was right, stating "You cannot" and then suggesting a couple of workarounds, the rest were not relevant and drifted further and further off topic.



It did perform this and other searches very quickly and, after a few tests, its top answers tended to be good ones.

Finding good answers from Stack Overflow is difficult because natural language understanding is difficult and, as my demo question indicates, it may be that even a valid answer isn't the full story. I asked about functions and the ideal answer would be something that told me that originally Java didn't support functions but now it has lambdas. Perhaps such an answer doesn't even exist on Stack Overflow as a single answer. What is really needed is someone to curate the database that is Stack Overflow: weed it, prune it, and merge answers into something complete. Perhaps this is too much to ask of AI at the moment.

More Information

CROKAGE Beta experimental site

CROKAGE: A New Way to Search Stack Overflow

CROKAGE-replication-package on GitHub 

Related Articles

How To Ask A Successful Question on Stack Overflow 

DeepCode Gets Cash And Opens Free Tier



Last Updated ( Friday, 14 October 2022 )