COVID Results Skewed By Faulty Data Import
Written by Alex Denham   
Monday, 05 October 2020

The official number of coronavirus cases in the UK was under-reported by almost 16,000 in recent days because of a data import error. As well as skewing the figures, the error meant that people who had tested positive weren't notified, so their contacts also went unnotified.

Public Health England (PHE), an executive agency of the UK Department of Health and Social Care, said that 15,841 cases between 25 September and 2 October were left out of the UK daily case figures. The missing cases were added back at the weekend, causing an apparent spike in case numbers.

The problem has now been resolved, according to Public Health England. Its interim chief executive, Michael Brodie, said that a "technical issue" was identified overnight on Friday, 2 October in the process that transfers Covid-19 positive lab results into reporting dashboards. It was caused by some of the data files that report positive test results exceeding the maximum file size.

News outlets and social media have reported that the problem arose when an Excel spreadsheet reached its maximum size, meaning no further rows could be added. In this scenario, labs carrying out Covid tests automatically enter their results into files, which are then sent to a central PHE facility to be collated. An Excel worksheet is limited in the number of rows it can hold (1,048,576 in the current XLSX format, 65,536 in the older XLS format), while a CSV file has no such limit. If a CSV file with more rows than that is opened in Excel, the rows beyond the maximum are dropped.
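To make the truncation scenario concrete, here is a minimal Python sketch that counts how many rows of a CSV file would not fit on a single Excel worksheet. The function name `rows_lost_in_excel` is made up for illustration, and nothing here reflects how PHE's pipeline actually worked; the two constants are the worksheet row limits of the XLSX and XLS formats.

```python
import csv
import io

# Maximum rows per worksheet in the two Excel file formats.
XLSX_MAX_ROWS = 1_048_576
XLS_MAX_ROWS = 65_536


def rows_lost_in_excel(csv_text: str, max_rows: int = XLSX_MAX_ROWS) -> int:
    """Return how many CSV rows would not fit on one worksheet.

    Hypothetical helper for illustration: it only counts rows and
    compares the total with the chosen worksheet limit.
    """
    total_rows = sum(1 for _ in csv.reader(io.StringIO(csv_text)))
    return max(0, total_rows - max_rows)
```

A check like this, run before a file is handed to a row-limited tool, turns a silent loss of data into a number that someone can act on.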

If that were the case, it would be quite shocking that a government agency was trying to run a major data analysis on a spreadsheet. I'm not saying it wouldn't happen and doesn't happen, but for something of this magnitude?

A (hopefully more likely) view is that a script importing CSV data into something other than Excel timed out. The sources reporting this say the fix was simply to set the timeout parameter to something suitably massive. The Press Association, meanwhile, reports that the data files have been split into several smaller subfiles to overcome the problem. Whichever version is correct, the problem shouldn't recur.
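The reported fix of splitting the data into smaller subfiles can be sketched in a few lines of Python. This is an illustration only: `split_into_chunks` is a hypothetical name, and the chunk size would be whatever limit the downstream tool imposes, which the reports don't specify.

```python
from typing import Iterable, Iterator, List


def split_into_chunks(
    rows: Iterable[List[str]], chunk_size: int
) -> Iterator[List[List[str]]]:
    """Yield consecutive groups of at most chunk_size rows.

    Hypothetical helper for illustration: each yielded group could be
    written out as its own subfile.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    chunk: List[List[str]] = []
    for row in rows:
        chunk.append(row)
        if len(chunk) == chunk_size:
            yield chunk
            chunk = []
    if chunk:
        # The final group may hold fewer rows than chunk_size.
        yield chunk
```

Every row ends up in exactly one group, so the sizes of the subfiles should add up to the size of the original file, which is itself a useful sanity check.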

Either way, it's a reminder to developers everywhere. Error trapping and reporting can make the difference between a private 'aargh, let's run that again' and far-too-public reproaches.
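As a minimal sketch of the kind of error trapping meant here, the Python below wraps an import step and raises an exception if the destination reports storing fewer rows than it was given. The names `checked_import` and `ImportIncompleteError` are invented for this example, and the `load` callable stands in for whatever actually writes the data; it is assumed to return the number of rows it stored.

```python
from typing import Callable, Iterable, List


class ImportIncompleteError(Exception):
    """Raised when fewer rows were stored than were supplied."""


def checked_import(
    rows: Iterable[List[str]],
    load: Callable[[List[List[str]]], int],
) -> int:
    """Run load(rows) and fail loudly on a row-count mismatch.

    Hypothetical wrapper for illustration: load is assumed to return
    the number of rows it actually stored.
    """
    rows = list(rows)
    stored = load(rows)
    if stored != len(rows):
        raise ImportIncompleteError(
            f"expected {len(rows)} rows, destination stored {stored}"
        )
    return stored
```

The point of the design is that a shortfall stops the run and names the numbers involved, rather than letting a partial import pass as a complete one.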

More Information

Public Health England Website

Related Articles

What Skills Do Data Scientists Need

Programmer's Guide To Theory - Error Correction

End Manual Data Entry in Excel - Thanks AI!

Excel Adds New Data Types 

John Conway Dies From Coronavirus

Fighting Coronavirus At Home With Exascale Power

Smartphone App Borrows Power For Corona Virus Research  


Last Updated ( Monday, 05 October 2020 )