
73 Seconds to Breach, 24 Hours to Patch: The Case for Autonomous Validation


By Sila Ozeren Hacioglu, Security Research Engineer at Picus Security.

In April 2026, Anthropic released its newest frontier model, codename Mythos, to twelve partners under a gated preview. Not general availability; the company explicitly held it back as it was (correctly) deemed too dangerous for open release.

In its first 14 days inside that sandbox, it wrote 181 working Firefox exploits. The previous state-of-the-art model managed two. Uh oh.

It surfaced thousands of zero-days across every major OS and browser, including a 27-year-old bug in OpenBSD, an operating system whose entire reputation is built on not having bugs like this.

Over 99% of what Mythos found is still unpatched in production today.

That’s not a forecast. That happened.

Now pair it with what’s already in the wild. 

Let’s back up a bit. In February, AWS Threat Intelligence published a postmortem on a FortiGate campaign run by a single operator. One person, low skill, no hands on keyboard.

The AI did the work, and it hit 2,516 devices across 106 countries in parallel, taking just minutes per target. Zero days weren’t required. Known CVEs and misconfigurations were enough; the AI simply operated faster than anyone could respond.

Figure 1. AWS Threat Intelligence FortiGate campaign hits 2,516 devices in 106 countries

Two data points, one message: offense now runs at machine speed. And the question every defender should be asking isn’t “are we compliant?” or “are we covered?” It’s more granular, and more pressing:

What’s actually getting through my controls today, and how far?

If the honest answer involves a quarterly pentest report and some dashboard screenshots, consider the rest of this piece required reading.

How Fast Can Attackers Exploit a Published CVE in 2026?

A decade ago, the median time from a CVE’s publication to a working exploit appearing in the wild was measured in months, long enough for a real patch cycle. By 2024, that window had shrunk to about 56 days. By 2025, it was down to 23 days. 

Recent CVE-to-exploit pairings from CISA KEV, VulnCheck KEV, and exploit databases now show a median delta of roughly 10 hours.
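That median is straightforward to compute once you have publication and first-exploit timestamps. A minimal sketch, using hypothetical timestamp pairs in the shape you might extract from a KEV-style feed (the dates and values below are illustrative, not real KEV data):

```python
from datetime import datetime
from statistics import median

# Hypothetical (published, first_exploit_seen) pairs, illustrative only.
pairs = [
    ("2026-01-03T09:00", "2026-01-03T17:30"),
    ("2026-01-07T12:00", "2026-01-08T02:00"),
    ("2026-01-11T08:00", "2026-01-11T19:30"),
    ("2026-01-15T10:00", "2026-01-15T16:45"),
]

def hours_to_exploit(published: str, exploited: str) -> float:
    """Delta between CVE publication and first observed exploit, in hours."""
    fmt = "%Y-%m-%dT%H:%M"
    delta = datetime.strptime(exploited, fmt) - datetime.strptime(published, fmt)
    return delta.total_seconds() / 3600

deltas = [hours_to_exploit(p, e) for p, e in pairs]
print(f"median CVE-to-exploit window: {median(deltas):.1f} hours")
# prints: median CVE-to-exploit window: 10.0 hours
```

The point isn’t the arithmetic; it’s that the patch-cycle math has to be redone with hours, not months, as the unit.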

Figure 2. Average CVE-to-exploit window: 2.3 years (2018) vs. ~10 hours (2026).

Reversing a published fix into a working exploit is no longer a specialist craft; it’s now a prompt.

This means the comfortable assumptions of vulnerability management (that CVSS scores meaningfully prioritize, that “exploitability” is a useful filter, that you have time between disclosure and weaponization) have all quietly broken.

The safer working assumption is now: every vulnerability has an exploit, or will, before you finish your next change-management meeting.

Unfortunately, an autonomous immune system for defense doesn’t exist yet.

And blue-team AI without validation is just guesswork at machine speed, which is an expensive hunch to deploy into production.

Over 99% of Mythos findings remain unpatched. The Glasswing public report lands in July.

This guide from Picus Labs covers the 12 operational recommendations security teams need to close the gap between AI-speed offense and human-speed defense, including five actions for week one.

Download Now

The Real Bottleneck Isn’t Tooling — It’s the Spaghetti Handoff

Let’s start with the attacker.

At second zero, the AI script kicks off. By second five, a CVE is exploited. MFA bypassed by twenty. Web shell dropped at thirty. Credentials dumped at forty-five. By second seventy-three, the compromise is complete. 

No human in the loop, no hesitation, no team meetings, no coffee breaks.

Now picture the defender. 

The SIEM alert fires at one minute, after the attacker is already done. A Tier 1 analyst picks it up around minute five. Someone triggers a SOAR playbook, by hand, at minute fifteen. A Jira ticket gets filed an hour in. Four hours later, it lands in the IT ops queue.

The patch goes out the next day, twenty-four hours after the breach that took seventy-three seconds to complete.

Figure 3. The agility gap: AI compromise (73s) vs. patching (24h) due to cross-team friction.

Notice where the time goes. It isn’t inside any one tool. The EDR is fast. The SIEM is fast. The vulnerability scanner is fast. The time dies between the tools: the Slack messages, the copy-pasted hash, the PDF report emailed for review, the ticket waiting for approval, the red team script being rebuilt by hand for the blue team.

This is the spaghetti handoff, and it’s as messy as it sounds. 

You can buy a faster scanner, plug in a smarter EDR, even bolt an LLM onto your SIEM, and none of them will markedly speed up your response, because the gap isn’t inside any of your tools. It lives between teams and between systems. Accelerating one node in a graph doesn’t accelerate the graph.

This is a big part of why this conversation has moved out of the CISO’s office. 

Six months ago, AI-driven cyber risk was a technical problem to delegate. Today, boards are treating it as existential and governing it directly. Budgets are unlocked, but not for ‘more of the same.’ They’re funding credible, evidence-based plans.

What Are the Three Pillars of Cyber Resilience in the Age of AI-Powered Attacks?

The fundamentals that made organizations resilient before Mythos still apply. There are three:

Pillar 1: Identify. You can’t defend what you can’t see. That means comprehensive exposure visibility across network, endpoint, cloud, and identity, plus aggressive attack surface management, because the blind spots (orphaned remote access, missing segmentation, MFA gaps) are where machine-speed attackers live.

Pillar 2: Protect. Effective network and endpoint controls, properly tuned. Tailored detection focused on credential access, lateral movement, and privilege escalation rather than generic vendor rules.

Pillar 3: Validate. This is the one most programs undervalue, and it’s the one that actually answers the question we started with. Validation has two halves, and yes, you need both.

Defensive validation: Breach and Attack Simulation (BAS). Are my prevention and detection controls actually catching what’s hitting me right now? Which assets do my controls fail to protect? What’s the residual risk after my stack runs?

Offensive validation: Autonomous Pentesting. Can an attacker actually breach us? Which exposures chain together into a real path to our crown jewels? What’s truly exploitable in our environment, not just theoretically vulnerable?

Figure 4. BAS and Automated Penetration Testing Together

Run only BAS, and you’ll know your controls work in isolation but not whether an attacker can route around them. Run only autonomous pentesting, and you’ll find attack paths but won’t know which controls are silently failing on the assets the pentest never touched. Run them as one continuous loop, where each informs the other, and you’ll finally have an answer to “what gets through, and how far” that’s grounded in evidence rather than opinion.
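The loop itself is simple to sketch: BAS output (which controls fail, and where) becomes pentest input (can those gaps chain?). The function names and findings below are hypothetical placeholders, not a real product API:

```python
# Hypothetical sketch of the BAS <-> autonomous pentest loop described
# above. Names and findings are illustrative, not a real API.

def run_bas(assets):
    """Defensive validation: which controls silently fail, and where?"""
    # Pretend prevention gaps exist only on legacy assets.
    return {a: "prevention gap" for a in assets if a.endswith("-legacy")}

def run_pentest(exposed_assets):
    """Offensive validation: do those gaps chain into a real attack path?"""
    return [f"{a} -> domain-admin" for a in exposed_assets]

def validation_loop(assets):
    gaps = run_bas(assets)     # BAS: residual risk after the stack runs
    paths = run_pentest(gaps)  # pentest: what is truly exploitable
    return gaps, paths

gaps, paths = validation_loop(["web-01", "db-legacy", "vpn-legacy"])
print(paths)  # prints: ['db-legacy -> domain-admin', 'vpn-legacy -> domain-admin']
```

The design choice that matters is the data flow: each half consumes the other’s findings, so a control gap is never just a dashboard entry, and an attack path is never just a one-off report.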

But evidence isn’t enough on its own. When offense runs at machine speed, the loop itself has to run at machine speed.

How Picus Approaches Autonomous Validation in a Post-Mythos World

A continuous loop is the right answer. But “continuous” still implies a human pacing it. In a post-Mythos world, the gap that matters isn’t between seeing and detecting; it’s between detecting and proving, fast enough that an AI-driven adversary doesn’t find out for you first.

That’s where validation goes from continuous to autonomous: agents reading the alert, scoping the test, running the simulation, pushing the fix, and writing the report, while the SOC catches up on some much-needed sleep.
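Those five stages (read, scope, simulate, fix, report) can be sketched as a pipeline. Everything below is a hypothetical illustration of that flow, with placeholder stage functions and made-up findings, not Picus product code:

```python
# Illustrative agent pipeline: alert -> scope -> simulate -> fix -> report.
# Stage functions and data shapes are assumptions for this sketch.

def scope_test(alert):
    """Turn a raw alert into a bounded validation scope."""
    return {"technique": alert["technique"], "targets": alert["assets"]}

def run_simulation(scope):
    """Replay the technique against the scoped assets (stand-in result)."""
    return {"blocked": scope["targets"][:-1], "missed": scope["targets"][-1:]}

def push_fix(result):
    """Draft a mitigation for every asset the controls missed."""
    return [f"new detection rule for {asset}" for asset in result["missed"]]

def write_report(alert, result, fixes):
    return (f"{alert['technique']}: {len(result['blocked'])} blocked, "
            f"{len(result['missed'])} missed, {len(fixes)} fix(es) pushed")

alert = {"technique": "T1003 credential dumping", "assets": ["dc-01", "srv-07"]}
scope = scope_test(alert)
result = run_simulation(scope)
fixes = push_fix(result)
print(write_report(alert, result, fixes))
# prints: T1003 credential dumping: 1 blocked, 1 missed, 1 fix(es) pushed
```

No stage waits on a ticket queue; each consumes the previous stage’s output directly, which is the whole difference between “continuous” and “autonomous.”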

We’ll be unpacking exactly what that looks like (the architecture, the agentic workflows, the operational reality of running it inside a real enterprise) at the Autonomous Validation Summit on May 12 & 14, hosted with Frost & Sullivan and featuring practitioners from Kraft Heinz and Glow Financial Services, alongside Picus CTO Volkan Erturk.

>> See it in action at the summit.

Sponsored and written by Picus Security.

