Human Genes Renamed To Please Excel
Written by Janet Swift   
Friday, 07 August 2020

More than two dozen human genes have been renamed so that they can be typed into a spreadsheet without being formatted as dates. New guidelines for standardized gene naming explicitly allow for renaming genes to avoid problems with data handling.

The human genome has tens of thousands of unique genes. Originally the number was assumed to be more than 100,000, but it has since been revised downwards. Giving each gene a meaningful name is seen as important for effective communication, and the fact that some genes have had to be renamed on account of Excel has attracted a great deal of attention.

It was The Verge that initially carried this story, alerted by a tweet that drew attention to this extract from the newly published Guidelines for human gene nomenclature:

[Image: extract from the HGNC guidelines]

The Verge outlined the problem with:

when a user inputs a gene's alphanumeric symbol into a spreadsheet, like MARCH1 -- short for "Membrane Associated Ring-CH-Type Finger 1" -- Excel converts that into a date: 1-Mar. This is extremely frustrating, even dangerous, corrupting data that scientists have to sort through by hand to restore. It's also surprisingly widespread and affects even peer-reviewed scientific work. One study from 2016 examined genetic data shared alongside 3,597 published papers and found that roughly one-fifth had been affected by Excel errors.
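To make the problem concrete, here is a minimal Python sketch that flags gene symbols a spreadsheet might mistake for dates before they are typed or pasted in. The list of month prefixes and the function names `looks_like_date` and `flag_date_like` are assumptions made for illustration; this approximates the behaviour described above and is not Excel's actual parsing logic.

```python
import re

# Month names and abbreviations that a spreadsheet might read as the start of a date.
# This list is an assumption for illustration, not Excel's real rule set.
MONTH_PREFIXES = (
    "JAN", "FEB", "MAR", "MARCH", "APR", "APRIL", "MAY", "JUN", "JUNE",
    "JUL", "JULY", "AUG", "SEP", "SEPT", "OCT", "NOV", "DEC",
)

# A month-like prefix, an optional hyphen, then a day number from 1 to 31.
DATE_LIKE = re.compile(
    r"^(?:%s)-?(?:[1-9]|[12][0-9]|3[01])$" % "|".join(MONTH_PREFIXES),
    re.IGNORECASE,
)


def looks_like_date(symbol: str) -> bool:
    """Return True if the symbol resembles a month followed by a day number."""
    return DATE_LIKE.match(symbol.strip()) is not None


def flag_date_like(symbols):
    """Return the symbols that may be autoconverted, in their original order."""
    return [s for s in symbols if looks_like_date(s)]


if __name__ == "__main__":
    print(flag_date_like(["MARCH1", "SEPT2", "TP53", "BRCA1", "DEC1"]))
```

A check like this could run over a symbol list before it goes anywhere near a spreadsheet, although it will miss any pattern not in the assumed prefix list.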

Elspeth Bruford, coordinator of the HUGO Gene Nomenclature Committee (HGNC), told The Verge that the names of some 27 genes have been changed so far. She noted that, while there has been some dissent about the decision, it was easier to rename human genes than to change how Excel works.

In fact, HGNC had initially tried to change the way that geneticists used Excel: last year it posted a YouTube video showing how to enter data in Excel without gene names being converted to dates.

So, by changing gene names, are the geneticists now caving in when they should be asking Microsoft to fix the date formatting issues, which annoy other groups of users as well? 

The consensus, both among those commenting on The Verge's article and on Hacker News, which linked to it, is that eliminating names that Excel reads as dates is a sensible move. Excel is a useful tool for scientists across all disciplines, and while it is possible to "tame" its autoformatting, this isn't foolproof, especially if you want to share spreadsheets with other users who have their own formatting options.
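Because shared files may already have been through someone else's copy of Excel, it can help to check them for damage. The following is a rough Python sketch, assuming the data has been exported as CSV and that converted values appear in the "1-Mar" form quoted earlier; the function name `find_converted_cells` and the sample data are hypothetical.

```python
import csv
import io
import re

# Values such as "1-Mar" or "2-Sep": a day number, a hyphen, a three-letter month.
# This covers only the display form quoted in the article; other locales
# and date formats would need their own patterns.
CONVERTED = re.compile(
    r"^(?:[1-9]|[12][0-9]|3[01])-"
    r"(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)$"
)


def find_converted_cells(csv_text, column):
    """Return (row_number, value) pairs where the column holds a date-like value."""
    reader = csv.DictReader(io.StringIO(csv_text))
    hits = []
    for row_number, row in enumerate(reader, start=2):  # the header is row 1
        value = (row.get(column) or "").strip()
        if CONVERTED.match(value):
            hits.append((row_number, value))
    return hits


# Hypothetical sample: two gene symbols have already been turned into dates.
sample = "gene,score\nTP53,0.9\n1-Mar,0.4\nBRCA1,0.7\n2-Sep,0.1\n"

if __name__ == "__main__":
    print(find_converted_cells(sample, "gene"))
```

The sketch only reports suspicious cells. It does not try to restore the original symbol, since a value like "1-Mar" does not say which gene name it started out as.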

To us, this seems the biggest case of the tail wagging the dog we have encountered in some time. It makes you wonder what would have happened if Excel had wielded such power in former times. Perhaps E=mc² would have been E1=M1*C1*C1, or quark might have been autocorrected to quart.

 

More Information

Guidelines for human gene nomenclature

Related Articles

Calculating with Dates in Excel

Dates Are Difficult


Last Updated ( Friday, 07 August 2020 )