CheckCell Detects Bugs In Spreadsheets
CheckCell Detects Bugs In Spreadsheets
Written by Janet Swift   
Monday, 27 October 2014

A free Excel  plug-in that takes a new approach to automatically finding data errors in spreadsheets has been released. Rather than looking for errors in programs and formulas, CheckCell sets out to debug the underlying data.

The tool, called CheckCell comes from Daniel Barowy and Dimitar Gochev, computer science doctoral students at the University of Massachusetts Amherst. It was launched at this year's international computer programming language design conference, OOPSLA.

It addresses the problem that spreadsheet data errors can be consequential referring to the story we reported in April 2013, see Is Excel To Blame For Our Economic Pain?.

In his conference session Daniel Barowy told delegates:

"Consider the case of a paper written by Harvard economists Carmen Reinhart and Kenneth Rogoff a couple of years ago. The paper was influential, lending credibility to government austerity measures in Europe and the United States. But in 2013, UMass Amherst economist Thomas Herndon and colleagues found, in combing through the data by hand, that methodological errors undermined Reinhart and Rogoff's argument. In particular, Reinhart and Rogoff exaggerated the impact of key data values in a spreadsheet."

The CheckCell group, supervised by Prof Emery Berger. wondered whether software might be developed to find these kinds of errors automatically and successfully devised an approach that found a number of the same errors uncovered by Herndon.

The paper presented at OOPSLA refers to the approach as data debugging, which combines program analysis and statistical
analysis to automatically find potential data errors by locating data that has a disproportionate impact on the computation. 

Putting  it more informally Barowy explains:

"Basically, CheckCell identifies data points that have a big impact on the final result, even if the impact is super subtle and difficult to detect. CheckCell immediately flags data points that are very suspicious, the ones that deserve a second look. It's like having a helper who says, 'pay attention to these cells, they really matter.'"

The technique uses what Berger calls "a threshold of unusualness." CheckCell marks high-impact data points in red and asks the spreadsheet designer to check them. If they are indeed correct, they turn green and will not be flagged in subsequent analyses.

The example cited involves getting the digits in a data point transposed, so when a teacher has an “A” student who would be expected to get a 94 on a test and the spreadsheet says that student got a 49, CheckCell would flag it.  A similar example is used in the paper: 

 checkcelleg

Isn't this just range checking of the sort we have always told users to build into spreadsheets? 

I think it is fair to say it is a valuable extension of the idea that isn't as coarse grained as manual checks would be and by working from the unexpected outcome to the data brings a welcome degree of automation to the problem.

Berger says that in the future the team plan to extend CheckCell's use to large-scale data sets where errors that have a large impact are difficult to find. Having already shown that CheckCell would have uncovered some of the flaws in the Reinhart/Rogoff data, this seems a worthwhile endeavor. 

You can install CheckCell 1.0.2 for Excel 10 and later via GitHub and its source code is also available. gCheckCell, a version for Google Spreadsheet is also on GitHub.

 

Banner


250,000 Pi Zero W Sold
05/05/2017

In nine weeks the Raspberry Pi Foundation states that it has sold 250,000 Pi Zero W single board computers. This is all the more amazing because the one Zero per order restriction is still in force. T [ ... ]



Button Feedback With An Electric Arc
21/05/2017

Yes, its another odd ball idea from the ACM CHI Human-Computer Interactions conference. Have you ever wished that your keyboard could be more responsive? Well careful what you wish for. Your next keyb [ ... ]


More News

 

 
 

 

blog comments powered by Disqus

Last Updated ( Thursday, 06 November 2014 )
 
 

   
Banner
Banner
RSS feed of news items only
I Programmer News
Copyright © 2017 i-programmer.info. All Rights Reserved.
Joomla! is Free Software released under the GNU/GPL License.