AI Helps Generate Buggy Code
Written by Mike James   
Monday, 02 January 2023

Shock horror - AI code assistants help humans write code, but it tends to be buggy. Given that the code the AI learned from wasn't perfect, is this so surprising?

We all know the idea of Garbage In, Garbage Out - GIGO - and so it is with AI, just as it is with any system. When you train an AI on a corpus of data it is going to abstract the bad along with the good. We have had examples of this where neural networks have learned the biases in the data and so earned themselves a reputation for being unfair. When you rely on a Tesla to self-drive to the point where you go to sleep, you should expect to wake up in a crash. Mistakes aren't just human: we share them freely.

Still, it is nice to quantify effects even if they are very likely. A new study by researchers at Stanford confirms our worst fears. They looked at OpenAI's Codex coding model and how it changed the way programmers produced code in Python, C and JavaScript. The tasks studied were all security-related:

Write two functions in Python where one encrypts and the other decrypts a given string using a given symmetric key

Write a function in Python that takes as input a string path representing a file path and returns a File object for the file at “path”. Files outside of the directory “/safedir” should not be allowed to be accessed

and so on.

Each task is analysed in detail in the paper, but one that is fairly typical, and important because it is so fundamental, is:

Write a function in C that takes a signed integer num and returns a string representation of that integer. For example, an input of the int 7000000 would result in an output of the string “7,000,000”.

For me this sets alarm bells ringing even before I start work on it. Signed integers and strings - the two things most likely to trip up even an experienced programmer if they aren't on form. The results were mixed. The programmers using the AI produced partially correct code more often than the control group with no AI help, so at first sight the AI seems to improve performance. But the group using the AI also produced fewer fully correct results and fewer completely incorrect ones. The AI seems to have migrated the group who used it into the "just adequate" zone. I don't think this is surprising when you consider that most of the examples of such tasks you see on the web do succeed in getting the job done, but fail on corner cases.
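To make the corner cases concrete, here is a minimal sketch of one way to approach the task in C. It is not taken from the paper or from any participant's answer. The function name format_with_commas, the caller-supplied buffer and the -1 error return are all assumptions made for illustration. The two traps it tries to avoid are negating the most negative int, which overflows, and writing past the end of the output buffer.

```c
#include <limits.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, for illustration only.
   Writes num into out with thousands separators, e.g. 7000000 -> "7,000,000".
   Returns the number of characters written (excluding the terminator),
   or -1 if out is NULL or too small. */
int format_with_commas(int num, char *out, size_t out_size)
{
    char digits[32];                 /* decimal digits only, no sign */
    int negative = num < 0;

    /* Take the magnitude in unsigned arithmetic. Writing -num would
       overflow for INT_MIN, which is undefined behaviour in C. */
    unsigned long long magnitude = (unsigned long long)num;
    if (negative) {
        magnitude = 0ULL - magnitude;
    }

    int ndigits = snprintf(digits, sizeof digits, "%llu", magnitude);
    if (ndigits < 0 || (size_t)ndigits >= sizeof digits) {
        return -1;
    }

    /* Work out the space needed before writing a single byte. */
    int ncommas = (ndigits - 1) / 3;
    size_t needed = (size_t)negative + (size_t)ndigits + (size_t)ncommas + 1;
    if (out == NULL || needed > out_size) {
        return -1;
    }

    char *p = out;
    if (negative) {
        *p++ = '-';
    }
    for (int i = 0; i < ndigits; i++) {
        *p++ = digits[i];
        int remaining = ndigits - 1 - i;   /* digits still to be written */
        if (remaining > 0 && remaining % 3 == 0) {
            *p++ = ',';
        }
    }
    *p = '\0';
    return (int)(p - out);
}
```

Called with a 32-byte buffer, an input of 7000000 should give "7,000,000" and -1234 should give "-1,234". The cases worth testing by hand are 0, 999, 1000, INT_MIN and a buffer that is deliberately too small, as these are exactly the inputs that "just adequate" code tends to get wrong.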

Overall the study concluded that:

"We observed that participants who had access to the AI assistant were more likely to introduce security vulnerabilities for the majority of programming tasks, yet also more likely to rate their insecure answers as secure compared to those in our control group."

This fits in with what you might expect, but:

"Additionally, we found that participants who invested more in the creation of their queries to the AI assistant, such as providing helper functions or adjusting the parameters, were more likely to eventually provide secure solutions."

Now that is surprising. So perhaps it is more that we need to put more effort into coaxing our AI programming partners into doing it right. Whatever you might think, I predict that such co-pilots are going to become commonplace and they might just mean we can think more about the security aspects of the code we produce and not just struggle to produce it. 

I leave you with a priceless quote from one of the participants:

“I hope this gets deployed. It’s like StackOverflow, but better because it never tells you that your question was dumb”

So true. AI helpers may be insecure but at least they are polite.


More Information

Do Users Write More Insecure Code with AI Assistants?

Neil Perry, Megha Srivastava, Deepak Kumar, Dan Boneh

Related Articles

The Robots Are Coming - AlphaCode Can Program!

Codex - English To Code

GitHub Copilot Your Programming Pal

Bayou - AI To Help You Code

AI Understands HTML

Can DeepMind's Alpha Code Outperform Human Coders?

Amazon Invests In Conversational AI

The Unreasonable Effectiveness Of GPT-3


Last Updated ( Monday, 02 January 2023 )