An autonomous AI agent at depthfirst, an applied AI security startup, scanned FFmpeg’s 1.5 million lines of C code and found 21 zero-day vulnerabilities at a total compute cost of roughly $1,000. Shortly after, Google shipped Chrome 149 to the stable channel with patches for a record 429 security vulnerabilities, a figure that already exceeds the total count of Chrome security fixes released across all of 2025.
Both events arrived independently. Together, they describe a structural shift the security industry hasn’t fully processed: automated discovery has dropped below the cost threshold of a small startup, while the pipeline that converts bug reports into deployed patches still runs at human speed, on human budgets.
The $1,000 Bug Hunt
FFmpeg is embedded in virtually everything that touches video. The open-source media library runs inside browsers, streaming backends, Python packages, container images, and embedded appliances. Its parsers and demuxers handle raw, untrusted input from network streams and media files, which has always made that code the highest-value surface for anyone probing the codebase for exploitable conditions.
The agent produced a reproducible proof-of-concept for each of the 21 findings, per depthfirst’s technical writeup on the FFmpeg scan. Nine have been assigned CVE identifiers, running from CVE-2026-39210 through CVE-2026-39218. The rest were fixed upstream before receiving numbers.
- CVE-2026-39210: Heap buffer overflow in the TS demuxer, introduced in 2010, lacking length bounds checks before a two-byte read
- CVE-2026-39211: Integer overflow from a 2010 swscale refactor, where a size factor formula had no upper bounds on user-controlled parameters
- CVE-2026-39212: Stack overflow from a July 2025 regression in ffmpeg_opt.c, where a preset file could trigger option parsing recursively without a depth limit
- CVE-2026-39213: Heap buffer overflow in the yuv4mpegenc rawvideo input path, introduced in 2023 without validating dimensions against packet size
One stack overflow in service-description-table parsing code traces to a 2003 commit, sitting undiscovered across 23 years of security reviews and automated fuzzing. Parsers for obscure legacy broadcast standards rarely receive focused attention in a standard audit, and fuzzers need structured seed inputs to reach them in the first place. The startup also developed a working remote code execution exploit primitive from the findings and published proof-of-concept code.
The compute bill puts things in context. Anthropic’s Claude Mythos model ran approximately 300 separate scans across the same library at a cost of around $10,000 to produce its findings there. The startup used what it describes as previous-generation models and reached comparable depth at a tenth of the price. Serious vulnerability discovery is no longer priced out of a small team’s reach.
Chrome 149’s Record Patch
Google promoted Chrome 149 to the stable channel on June 2, 2026, carrying the largest patch haul in the browser’s history. The release surpassed the total count of Chrome security fixes shipped across all of 2025 in a single version bump.
| Severity | Count |
|---|---|
| Critical | 22 |
| High | 87 |
| Medium | 226 |
| Low | 94 |
| Total | 429 |
The worst of the 22 critical bugs is CVE-2026-10881, rated 9.6 on the Common Vulnerability Scoring System (CVSS). It’s an out-of-bounds read and write in ANGLE (Almost Native Graphics Layer Engine), Chrome’s graphics abstraction layer. A crafted HTML page can trigger it to break out of the browser’s sandbox and execute arbitrary code on the host machine. Google paid $97,000 to the external researcher who reported it.
Google found 371 of the total bugs internally; outside researchers contributed 58. Among the 22 critical-severity vulnerabilities, 19 were Google’s own finds. Chrome 148, the preceding major release, had patched 151 vulnerabilities in a May 2026 update, and Chrome 149 nearly tripled that count in one cycle. That acceleration across consecutive major releases points to an internal scanning program ramping up, not a one-time catch-up.
Google has not formally attributed the surge to AI tooling. The context is hard to separate from it. In April 2026, the company restructured its Chrome Vulnerability Reward Program (VRP) explicitly in response to a surge in AI-generated submissions, shifting requirements toward concise, reproducible proofs-of-concept rather than the verbose write-ups that automated tools characteristically produce. None of the vulnerabilities had been reported as actively exploited at the time of release.
A Discovery Cost Curve That Keeps Falling
The depthfirst run fits into a pattern that has been building for about 18 months. Google’s Big Sleep agent found confirmed vulnerabilities in the same library in 2025. Anthropic’s Mythos pulled out additional bugs there, including a 16-year-old flaw in the H.264 codec that had been triggered more than five million times by automated testing tools without being caught. The startup then found 21 more using older, cheaper models.
Mythos also found a 27-year-old denial-of-service vulnerability in OpenBSD’s TCP SACK implementation and a 17-year-old remote code execution bug in FreeBSD’s NFS server (CVE-2026-4747) that granted unauthenticated root access, per Anthropic’s Mythos technical assessment. In each case, the model read source code, formed hypotheses, ran the software in an isolated container, and produced a proof-of-concept exploit. One Linux kernel exploit chain, starting from a public CVE identifier, completed in under a day at a compute cost under $2,000.
The same progression has shown up across other codebases. Anthropic’s Claude Opus 4.6, a widely available commercial model, found 22 Firefox vulnerabilities in two weeks, including 14 Mozilla classified as high severity, all previously missed by human reviewers and fuzzers. A separate AI agent discovered an authenticated remote code execution flaw in Redis that had gone undetected for more than two years. A February 2026 study showed an AI agent could reproduce working exploits for more than half of 100 real Linux kernel bugs, outperforming traditional fuzzing on the same benchmark. Discovery has moved fast enough that the 90-day responsible disclosure window, long treated as an industry standard, is increasingly mismatched to the pace at which organizations can act on what they receive.
The Pipeline That Hasn’t Scaled
The HackerOne Pause
On March 27, 2026, HackerOne suspended new submissions to its Internet Bug Bounty (IBB) program, one of the open-source community’s most significant reward programs since its 2013 launch. The program had paid more than $1.5 million in rewards in its lifetime. HackerOne cited a worsening imbalance between vulnerability discoveries and the ability of open-source maintainers to address them, saying the balance had “substantively shifted.” AI-assisted research, HackerOne said, was expanding discovery across the ecosystem faster than the organizations responsible for fixing it could absorb.
curl had already shut down. In January 2026, Daniel Stenberg, curl’s founder and lead maintainer, ended the tool’s seven-year bug bounty program after the cost of triaging AI-generated reports made it unsustainable. The program had paid more than $90,000 for 81 verified vulnerabilities. Valid submissions, those describing a real and exploitable issue, had fallen from roughly 15% of incoming reports to below 5%, according to John Morello, co-founder and chief technology officer at Minimus, a security startup, speaking to Dark Reading.
AI-assisted hunting hasn’t necessarily found more critical zero-days; instead, it’s shifted the bottleneck entirely to validation, forcing triage teams to wade through thousands of plausible-sounding but non-exploitable reports.
Morello, CTO of Minimus, speaking to Dark Reading in April 2026.
97 of 1,596
Anthropic has been tracking its own disclosure pipeline through Project Glasswing, the company’s ongoing vulnerability research program. Help Net Security’s May coverage of the Project Glasswing update showed that as of May 22, 2026, the company had disclosed 1,596 vulnerabilities across 281 open-source projects. Of those, 97 were confirmed patched and 88 had been assigned CVE or GHSA (GitHub Security Advisory) identifiers. Anthropic’s dashboard explicitly flags independent human triage and review as the rate-limiting step.
The company stated the problem plainly: “The relative ease of finding vulnerabilities compared with the difficulty of fixing them amounts to a major challenge for cybersecurity.”
Open-source volunteers are receiving reports at a volume that makes distinguishing real findings from plausible-sounding hallucinations effectively impossible at current staffing levels. Triage fatigue has become the working term for what those maintainers now face. When a real vulnerability sits buried under hundreds of noise reports, time to patch grows and users stay exposed longer. Mythos was scanning thousands of files across codebases in weeks; the 281 projects on the receiving end had months to review what arrived.
In March 2026, the Linux Foundation responded with a $12.5 million emergency security funding push backed by Anthropic, AWS, GitHub, Google, Google DeepMind, Microsoft, and OpenAI. Anthropic also partnered with the Open Source Security Foundation’s Alpha-Omega project to direct triage resources and released Claude Security in public beta for Claude Enterprise customers as a remediation aid.
Google Rewrites the Rules
Google’s April 2026 restructuring of its Chrome and Android Vulnerability Reward Programs is the clearest formal acknowledgment that the discovery economy has shifted permanently. Per the updated Chrome Vulnerability Reward Program rules, the major changes included:
- Zero-click Pixel Titan M2 full-chain exploits with persistence: raised from $1 million to $1.5 million
- Same exploit without persistence: raised from $500,000 to $750,000
- Full-chain Chrome browser process exploits: up to $250,000
- Low-severity issues removed from the program entirely
- Concise, reproducible proof-of-concept now required for most submissions; verbose write-ups no longer sufficient without one
The logic is a price signal. Google said it was prioritizing vulnerability categories “that remain more challenging for automated AI tooling to find,” a direct acknowledgment that anything an agent can produce at low compute cost is no longer scarce enough to reward at premium rates. The most difficult bugs, those requiring deep system knowledge and creative exploitation chains, command top prices. Code-pattern memory safety bugs that AI scanners find in bulk have become routine, and the program has priced them accordingly.
The company paid a record $17.1 million to 747 researchers in 2025, a 40% increase over 2024, and estimates total 2026 payouts will rise further even with cuts to some individual categories. Chrome 150, the next major release, is expected at the end of June. Google’s internal teams accounted for 371 of Chrome 149’s patches, more than 86% of the total.
The Project Glasswing disclosure tracker showed 1,596 vulnerabilities reported to open-source maintainers as of May 22; 97 had been patched.




