Undefined Behavior Just Not Worth The Effort!
Written by Mike James   
Wednesday, 30 April 2025

Some very interesting research has just been published that throws a lot of light on the crazy belief that undefined behavior is useful, essential even, to certain types of optimization rather than the huge mistake it really is.

I keep writing about Undefined Behavior, or UB, in C and C++ largely because I cannot believe that we can be so stupid as to think that leaving bits of a language undefined is a good idea rather than a disaster. What makes it even stupider is that the compiler writers claim to optimize code by making use of it!

So often I encounter arguments along the lines of "without undefined behavior we would lose some serious optimizations". I have never believed this and now there is proof positive that I'm right!

Two researchers at Lisbon University, Lucian Popescu and Nuno P. Lopes, have tried to quantify the advantage of UB optimization:

"Although there is a common belief within the compiler community that UB enables certain optimizations that would not be possible otherwise, no rigorous large-scale studies have been conducted on this subject. At the same time, there is growing interest in eliminating UB from programming languages to improve security."

They use the LLVM compiler, which is well known for its UB-based optimizations, a reason I tend to stay well away from it. The bottom line of their research is that the gains are slight and, even when there is a gain, it can be obtained in more rational ways without the need to rely on UB.

Many programmers struggle to see what possible optimization could result from UB. This is because UB is very varied in its nature and not all UB is useful for optimization. The main reason for defining something as UB was a desire not to tie C to one type of machine architecture. For example, it was only recently, in C23, that signed integers were required to use two's complement representation. Before that, only unsigned arithmetic was fully defined, wrapping around when overflow occurred, while signed overflow was undefined behavior. Even with two's complement now mandated, signed overflow remains undefined.

What exactly undefined means is open to different practical interpretations. The most reasonable is that it is only undefined in the language. When you actually run a program with signed overflow, the behavior you get depends on the machine it is run on. This is not an unreasonable situation given that C programs in particular often target a particular architecture. However, the purist view is that this is undefined in the world, not just the language, and if it happens the program is not a valid program.

How does this help with optimization?

The answer is that if it never happens in a legal program, you can assume that it really never happens and use this to optimize code. Suppose you have a program that contains a test for signed overflow. As overflow is never going to happen in a valid program, the compiler can remove the test and get an instant optimization. Crazy or what!
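To make this concrete, here is a minimal sketch of an overflow test that does not itself depend on undefined behavior, so there is nothing for the optimizer to assume away. The function name add_would_overflow is hypothetical and is not taken from the paper.

```c
#include <limits.h>
#include <stdbool.h>

/*
 * A test written as "a + b < a" performs the addition first. If the
 * addition overflows, the program has already hit UB, so a compiler
 * is entitled to assume the test is false and delete it.
 *
 * This version compares against the limits before adding, so no
 * signed overflow ever occurs.
 */
static bool add_would_overflow(int a, int b) {
    if (b > 0)
        return a > INT_MAX - b;  /* INT_MAX - b cannot overflow when b > 0 */
    if (b < 0)
        return a < INT_MIN - b;  /* INT_MIN - b cannot overflow when b < 0 */
    return false;                /* adding zero never overflows */
}
```

A caller would check add_would_overflow(a, b) and only compute a + b when it returns false.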

The paper gives the example of optimizing the test a+b>a into b>0, a rewrite that is not valid if signed arithmetic wraps around, as it does on two's complement hardware. The paper gives lots of examples, which makes reading it an education in itself.
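The difference between the two forms of the test can be seen without invoking UB by simulating wrapping addition with unsigned arithmetic, which is defined to wrap. This is only an illustrative sketch. The helper names are hypothetical, and 32-bit fixed-width types are assumed for clarity.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Two's complement wrapping addition with no UB: the sum is computed
 * in unsigned arithmetic and the bits are copied back into an int32_t. */
static int32_t wrapping_add(int32_t a, int32_t b) {
    uint32_t sum = (uint32_t)a + (uint32_t)b;
    int32_t result;
    memcpy(&result, &sum, sizeof result);
    return result;
}

/* The original test, a + b > a, evaluated with wrapping semantics. */
static bool sum_greater_wrapping(int32_t a, int32_t b) {
    return wrapping_add(a, b) > a;
}

/* The simplified test, b > 0, which is only equivalent if overflow
 * is assumed never to happen. */
static bool sum_greater_simplified(int32_t a, int32_t b) {
    (void)a;  /* a no longer matters once the test is simplified */
    return b > 0;
}
```

For a = INT32_MAX and b = 1 the wrapping version is false, because the sum wraps to INT32_MIN, while the simplified version is true. For inputs that do not overflow, the two agree.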

The best known example is:

"Over time, compilers have evolved to exploit UB for optimization, operating under the assumption that programs are well-defined. However, real-world programs often contain bugs that can trigger UB. A notable example (below) from the Linux kernel highlights the dangers of such optimizations. Since dereferencing a null pointer triggers UB, the compiler can assume that after line 4, the pointer tun is non-null. Consequently, the compiler optimizes away the if statement, creating a security vulnerability."

1 unsigned tun_chr_poll(struct file *file) {
2   struct tun_file *tfile = file->private_data;
3   struct tun_struct *tun = __tun_get(tfile);
4   struct sock *sk = tun->sk; // dereferences tun; implies tun != NULL
5   if (!tun) // always false
6     return POLLERR;
7   ...
8 }
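The usual way to avoid this trap is to test the pointer before anything dereferences it, so the compiler has no earlier dereference from which to infer that the pointer is non-null. The following is a self-contained sketch of that ordering, not the kernel's actual code. The demo_ types, the demo_poll function and the DEMO_POLLERR value are hypothetical stand-ins.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types and constant. */
enum { DEMO_POLLERR = 8 };

struct demo_sock { int ready; };
struct demo_tun  { struct demo_sock *sk; };

static unsigned demo_poll(struct demo_tun *tun) {
    /* Check first: nothing has dereferenced tun yet, so the
     * compiler cannot conclude that this test is always false. */
    if (!tun)
        return DEMO_POLLERR;

    /* Only now is it safe to read through the pointer. */
    struct demo_sock *sk = tun->sk;
    return (sk && sk->ready) ? 1u : 0u;
}
```

Called with a null pointer, demo_poll returns DEMO_POLLERR instead of reaching the dereference.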

The paper then goes on to document and explain many other optimizations. The benchmarks are then run with the UB-exploiting optimizations enabled and with them disabled.

The conclusions are striking:

"The results show that, in the cases we evaluated, the performance gains from exploiting UB are minimal. Furthermore, in the cases where performance regresses, it can often be recovered by either small to moderate changes to the compiler or by using link-time optimizations."

My conclusion from this is that there is clearly little to no justification for UB existing as an optimization principle.

A nice conclusion, but do read the paper, as it is easy to understand and has a lot to teach.

  • Mike James is Chief Editor of I Programmer and the author of several programming books in the I Programmer Library. His recently published Deep C Dives: Adventures in C  looks in depth at specific aspects of C that make it a unique language and tackles the topic of undefined behaviour.

 


More Information

L. Popescu and N. P. Lopes. Exploiting Undefined Behavior in C/C++ Programs for Optimization: A Study on the Performance Impact. Proc. of the ACM on Programming Languages, Volume 9 Issue PLDI, June 2025.

Related Articles

Undefined Behavior Begone!

C Undefined Behavior - Depressing and Terrifying (Updated)

C Pointer Declaration And Dereferencing

C23 - What We Have To Suffer

GCC Gets An Award From ACM And A Blast From Linus        


Last Updated ( Wednesday, 30 April 2025 )