Tabs Or Spaces - One Billion Files Later An Answer
Written by Mike James   
Saturday, 03 September 2016

This issue splits the programming world like no other topic. It's a cross-platform, cross-language divide that pits programmers against the barbarians who simply format their code the wrong way. For indents, should it be spaces or tabs?


This really is an important question, unless you are a real newbie or a real barbarian who doesn't bother with code indenting at all. Worse still, perhaps you mix tabs and spaces without noticing. For Python programmers indenting isn't even optional, as the indenting structure is part of the language, but even here you can choose to use tabs or spaces to achieve an indent. Having just had to sort out a Python file with inconsistent spaces and tabs, I for one wish that it were standardized.
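
If you suspect a file has been indented with a mixture, a few lines of Python will point at the offending lines. This is only a minimal sketch: the function name `lines_with_mixed_indent` is invented for illustration, and it only catches lines whose own leading whitespace contains both characters, not files that switch style from one line to the next.

```python
import re

# Matches the run of spaces and tabs at the start of a line (possibly empty).
_LEADING_WS = re.compile(r"[ \t]*")


def lines_with_mixed_indent(text):
    """Return the 1-based numbers of lines whose indent has both tabs and spaces."""
    mixed = []
    for number, line in enumerate(text.splitlines(), start=1):
        indent = _LEADING_WS.match(line).group()
        if " " in indent and "\t" in indent:
            mixed.append(number)
    return mixed


if __name__ == "__main__":
    sample = "def f():\n\t x = 1\n    y = 2\n \treturn x\n"
    print(lines_with_mixed_indent(sample))
```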

The key arguments for tabs are that they replace n spaces with a single control code and that the actual spacing produced can be altered without having to edit the file. However, having control codes that look like white space in a program can cause problems when the code is moved from one editor to another, when it is used in a code analyzer or when it is pasted into a word processor. Hence many programmers prefer to indent with spaces because of the simplicity. The only real disadvantages of using spaces are that you can't easily change the indentation and that, if you paste code into a word processor using a variable-width font, any embedded spacing is ruined.
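
The "alter the spacing without editing the file" argument is easy to see with Python's built-in `str.expandtabs`, which does roughly what an editor does when it displays a tab. The `render` helper and the sample snippet here are made up for the illustration.

```python
def render(source, tab_width):
    """Show tab-indented source as an editor with the given tab width would."""
    return source.expandtabs(tab_width)


# One tab of indentation; the stored text never changes.
SNIPPET = "if ready:\n\tstart()\n"

narrow = render(SNIPPET, 2)  # indent displayed as 2 columns
wide = render(SNIPPET, 8)    # indent displayed as 8 columns

if __name__ == "__main__":
    print(narrow)
    print(wide)
```

With spaces the width is baked into the file, so getting the same effect means rewriting every indented line.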

This spaces versus tabs argument drove Googler Felipe Hoffa to extreme lengths to find out the truth. He wrote a program that examined 400,000 GitHub repositories, which held 1 billion files and 14 terabytes of code, to answer one question - tabs or spaces.
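
The details of Hoffa's analysis are in his write-up listed under More Information, so what follows is not his program. It is only a rough sketch of how a single file might be classified, assuming the simplest possible rule: count the lines that start with a tab, count the lines that start with a space, and let the majority win. The function names are hypothetical.

```python
from collections import Counter


def count_indent_styles(text):
    """Count lines that begin with a tab and lines that begin with a space."""
    counts = Counter()
    for line in text.splitlines():
        if line.startswith("\t"):
            counts["tabs"] += 1
        elif line.startswith(" "):
            counts["spaces"] += 1
    return counts


def classify(text):
    """Label a file 'tabs', 'spaces' or 'undecided' by simple majority."""
    counts = count_indent_styles(text)
    if counts["tabs"] > counts["spaces"]:
        return "tabs"
    if counts["spaces"] > counts["tabs"]:
        return "spaces"
    return "undecided"


if __name__ == "__main__":
    print(classify("int main() {\n\treturn 0;\n}\n"))
```

Running something like this over a billion files is a matter of scale rather than cleverness, which is presumably why it took a Googler to do it.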

The results are surprising. There is no doubt that the spaces beat the tabs by a large margin. What is really interesting is the breakdown by language:


Most languages seem to have programmers who really don't know where the tab key is. But what is it with C and Go programmers? Whatever drives the C programmer to use tabs more than spaces, it doesn't seem to affect the C++ programmers, who seem to have lost the tab key almost as much as the Python and Ruby guys.

Wait a minute... Python programmers don't use tabs?! That seems almost anti-pythonic, but it is true. PEP 8, the style guide for Python code, says quite clearly:

Spaces are the preferred indentation method.

Tabs should be used solely to remain consistent with code that is already indented with tabs.

Python 3 disallows mixing the use of tabs and spaces for indentation.

Python 2 code indented with a mixture of tabs and spaces should be converted to using spaces exclusively.

When invoking the Python 2 command line interpreter with the -t option, it issues warnings about code that illegally mixes tabs and spaces. When using -tt these warnings become errors. These options are highly recommended!
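
You can see the Python 3 rule in action without saving a file. The sketch below compiles a string whose block is indented with a tab on one line and eight spaces on the next, and checks for the built-in `TabError`. The helper name `mixes_tabs_and_spaces` and the sample strings are invented for the example.

```python
# The second line is indented with one tab, the third with eight spaces.
MIXED = "if True:\n\tx = 1\n        y = 2\n"

# The same code indented consistently with four spaces.
CONSISTENT = "if True:\n    x = 1\n    y = 2\n"


def mixes_tabs_and_spaces(source):
    """Return True if Python 3 rejects the source for inconsistent indentation."""
    try:
        compile(source, "<example>", "exec")
    except TabError:
        return True
    return False


if __name__ == "__main__":
    print(mixes_tabs_and_spaces(MIXED))
    print(mixes_tabs_and_spaces(CONSISTENT))
```

Note that this only reports the mixtures Python 3 itself considers ambiguous. A file that uses tabs in one block and spaces in another, each consistently, can still compile.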

Well, that explains that and leaves the C programmers as the only tabbing mystery.

So why do so many C programmers, but not C++ programmers, use tabs?



More Information

400,000 GitHub repositories, 1 billion files, 14 terabytes of code: Spaces or Tabs?

Related Articles

Wire Up The Programmer To Avoid Bugs

You Don't Need To Touch Type To Go Fast

Weak typing - the lost art of the keyboard 






Last Updated ( Saturday, 03 September 2016 )