AI Finds Vulnerabilities - Not Everyone Is Happy
Written by Mike James
Wednesday, 13 August 2025
An obvious use for AI, the right sort of AI, is to get it to scan a code base and point out security vulnerabilities. What could possibly go wrong? Google has a project called Big Sleep (is it only me who thinks this is not a good name - even if it did evolve from Project Naptime?) which is a collaboration between Project Zero and DeepMind. The aim is to build an AI that can automate software vulnerability research. It found its first problem, in SQLite, back in November 2024:

"Today, we're excited to share the first real-world vulnerability discovered by the Big Sleep agent: an exploitable stack buffer underflow in SQLite, a widely used open source database engine. We discovered the vulnerability and reported it to the developers in early October, who fixed it on the same day. Fortunately, we found this issue before it appeared in an official release, so SQLite users were not impacted."

If you are unfamiliar with this class of bug, see the sketch below. I still think that joy in finding a serious security problem is slightly off-putting, but it is essential work. Open source is supposed to be more secure because of Linus's law: "given enough eyeballs, all bugs are shallow". Perhaps now we have to consign the law to the bin along with Moore's law. Who needs eyeballs when you have an AI agent to check the code?

The latest news is that Big Sleep, now powered by Gemini, has reported 20 vulnerabilities. We can't currently know exactly what they are due to Big Sleep's disclosure policy, which gives maintainers time to fix problems before they are made public. You can, however, see which programs the bugs are in and an indication of their severity.
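For readers who haven't met the term, here is a minimal sketch of a stack buffer underflow in C. This is a hypothetical illustration of the bug class only, not the actual SQLite flaw Big Sleep found; the function and values are invented for the example:

#include <stdio.h>

/* Hypothetical example of a stack buffer underflow: writing
 * *before* the start of a stack-allocated buffer. */
static void store_at(int index, char value) {
    char buf[8] = {0};
    /* Bug: only the upper bound is checked. A negative index,
     * say -1, writes one byte before buf on the stack, potentially
     * corrupting adjacent locals or saved state. */
    if (index < 8)                 /* missing: && index >= 0 */
        buf[index] = value;
    printf("stored %c at index %d\n", value, index);
}

int main(void) {
    store_at(3, 'A');    /* in bounds: fine */
    store_at(-1, 'X');   /* underflow: writes before buf */
    return 0;
}

Built with AddressSanitizer (gcc -fsanitize=address), the second call should be reported as a stack-buffer-underflow; without instrumentation it silently corrupts the stack, which is what makes such bugs exploitable.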
We will have to wait to see whether these turn out to be important or not. There have been reports of maintainers asking for AI-generated issues to be banned. For example, mcclure on GitHub commented:

"Filtering out AI-generated issues/PRs will become an additional burden for me as a maintainer, wasting not only my time, but also the time of the issue submitters (whose generated "AI" content I will not respond to), as well as the time of your server (which had to prepare a response I will close without response)."

It's not just that AI-generated issues are likely to overwhelm maintainers; it seems that AI is as likely to hallucinate a problem here as it is in other applications. It also seems likely that, to cope with the unavoidable rise in AI-generated bug reports, maintainers will have to use AI to screen issues. It looks like another escalating war, with AI on both sides.

Related Articles

GitHub Code Scanning Now Uses Machine Learning
EU Bug Bounty - Software Security as a Civil Right
Codacy - Automated Code Review
Build Apps with Windsurf's AI Coding Agents - The Course
Last Updated ( Wednesday, 13 August 2025 )