Human Genes Renamed To Please Excel
Written by Janet Swift   
Friday, 07 August 2020

More than two dozen human genes have been renamed so that they can be typed into a spreadsheet without being formatted as dates. New guidelines for standardized gene naming explicitly allow for renaming genes to avoid problems with data handling.


The human genome has tens of thousands of unique genes. The number was originally assumed to be more than 100,000 but has since been revised downwards. Giving each gene a meaningful name is seen as important for effective communication, and the fact that some genes have had to be renamed on account of Excel has attracted a great deal of attention.

It was The Verge that initially carried this story, alerted by a tweet that drew attention to an extract from the newly published Guidelines for human gene nomenclature.



The Verge outlined the problem with:

when a user inputs a gene's alphanumeric symbol into a spreadsheet, like MARCH1 -- short for "Membrane Associated Ring-CH-Type Finger 1" -- Excel converts that into a date: 1-Mar. This is extremely frustrating, even dangerous, corrupting data that scientists have to sort through by hand to restore. It's also surprisingly widespread and affects even peer-reviewed scientific work. One study from 2016 examined genetic data shared alongside 3,597 published papers and found that roughly one-fifth had been affected by Excel errors.
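The kind of symbol described above can be screened for before a gene list ever reaches a spreadsheet. What follows is a minimal sketch in Python, not Excel's actual parsing logic: the month list, the day range and the function names `looks_like_date` and `flag_risky_symbols` are assumptions made for illustration.

```python
import re

# Month names and abbreviations that a spreadsheet might treat as the start
# of a date. This list is an assumption for illustration; it is not taken
# from Excel's real conversion rules.
_MONTHS = (
    "JAN", "FEB", "MAR", "MARCH", "APR", "APRIL", "MAY", "JUN", "JUNE",
    "JUL", "JULY", "AUG", "SEP", "SEPT", "OCT", "NOV", "DEC",
)

# A month word, an optional hyphen, then one or two digits, e.g. MARCH1 or SEPT-2.
_DATE_LIKE = re.compile(r"^(?:%s)-?(\d{1,2})$" % "|".join(_MONTHS), re.IGNORECASE)


def looks_like_date(symbol: str) -> bool:
    """Return True if the symbol resembles a month followed by a day number."""
    match = _DATE_LIKE.match(symbol.strip())
    return bool(match) and 1 <= int(match.group(1)) <= 31


def flag_risky_symbols(symbols):
    """Return the symbols that could plausibly be autoformatted as dates."""
    return [s for s in symbols if looks_like_date(s)]


if __name__ == "__main__":
    # MARCH1 is the example quoted above; the other symbols are illustrative.
    print(flag_risky_symbols(["MARCH1", "BRCA1", "SEPT2", "TP53", "DEC1"]))
```

A check like this only flags candidates for a human to review. It cannot tell whether a particular spreadsheet, with its own locale and formatting options, would actually convert them.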

Elspeth Bruford, coordinator of the HUGO Gene Nomenclature Committee (HGNC), revealed to The Verge that the names of some 27 genes have been changed so far. She noted that while there has been some dissent about the decision, it was easier to rename human genes than it was to change how Excel works.

In fact, HGNC had initially tried to change the way that geneticists used Excel, and last year posted a YouTube video that showed how to enter data in Excel so that gene names are not converted to dates.

So, by changing gene names, are the geneticists now caving in when they should be asking Microsoft to fix the date formatting issues, which annoy other groups of users as well? 

The consensus, both among those commenting on The Verge's article and on Hacker News, which linked to it, is that eliminating names that can be read as dates is a sensible move. Excel is a useful tool for scientists across all disciplines to work with data, and while it is possible to "tame" Excel's autoformatting, this isn't foolproof, especially if you want to share spreadsheets with other users who have their own formatting options.

To us, this seems to be the biggest case of the tail wagging the dog we have encountered in some time. It makes you wonder what would have happened if Excel had wielded such power in former times. Perhaps e=mc2 would have been E1=M1*C1*C1, or quark might have been autocorrected to quart.



More Information

Guidelines for human gene nomenclature

Related Articles

Calculating with Dates in Excel

Dates Are Difficult

