Claude Mythos exposed a hard truth: your company’s patching process is too slow



In 2024, researchers from the University of Illinois found that GPT-4, when given the Common Vulnerabilities and Exposures (CVE) description, could autonomously exploit 87% of a curated data set of 15 one-day vulnerabilities. Without the description, it could exploit only 7%. This gave the industry a “margin of safety”: AI could exploit known vulnerabilities, but it could not discover them.

However, on April 7 Anthropic announced that Claude Mythos Preview had closed that gap, with the model autonomously discovering thousands of zero-day vulnerabilities in major operating systems and browsers. Mythos also scored 83.1% on CyberGym’s vulnerability reproduction benchmark. In a campaign targeting OpenBSD across 1,000 scaffold runs, the total compute cost was less than $20,000.

Exploitation times are collapsing. Langflow’s CVE-2026-33017 (CVSS 9.8) was exploited 20 hours after disclosure, with no public proof of concept. Marimo’s CVE-2026-39987 (CVSS 9.3) was hit in 9 hours and 41 minutes.

The defensive infrastructure that most organizations rely on was not designed for this. Rapid7’s 2026 Threat Landscape Report states that the average time from CVE publication to inclusion in CISA’s Known Exploited Vulnerabilities (KEV) catalog is five days. Google’s M-Trends 2026 report found that exploitation occurs before a patch is even released. When the Langflow advisory was published, the first exploit arrived within 20 hours. For Marimo’s advisory, it took less than 10 hours.

The assumption that your patch window is safe because exploitation takes time no longer holds. Here are the basic components of a response.

Replace CVSS-only prioritization with a three-layer filter

Most vulnerability management programs still prioritize based on CVSS score alone. CVSS quantifies the “theoretical” severity of a vulnerability without considering whether it is being exploited in the wild or how quickly someone could weaponize it. Under CVSS-only ranking, a CVSS 8.8 vulnerability with a history of active exploitation (such as Docker’s CVE-2026-34040) gets a lower priority than a CVSS 9.8 vulnerability that may never be exploited in the wild.

A recent study validated against 28,377 real-world vulnerabilities offers a concrete replacement: a three-layer decision tree that combines CISA KEV status, Exploit Prediction Scoring System (EPSS) scores, and CVSS into a single prioritization filter.

Three-layer vulnerability prioritization filter

Layer                     | Data source        | Threshold     | Action                      | SLA
--------------------------|--------------------|---------------|-----------------------------|----------
1. Active exploitation    | CISA KEV Catalog   | Listed        | Immediate patch             | Hours
2. Predicted exploitation | EPSS via FIRST.org | Score ≥ 0.088 | Escalate to Level 0 channel | 24 hours
3. Severity baseline      | CVSS via NVD       | Score ≥ 7.0   | Standard remediation        | By policy

Validated result: 18x efficiency increase, 85.6% coverage of exploited vulnerabilities, ~95% reduction in urgent remediation workload. All three data sources are open and free.

The integration described is completely automatable. You can write a script that queries the CISA KEV API, the FIRST.org EPSS API, and the NVD, and run it against your asset inventory for each published CVE. The human in this process should stay in the loop as an approver, not as a trigger.
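As a rough illustration of the three-layer filter, the decision logic might be sketched in Python as follows. This is a minimal sketch, not the study's implementation: the function names, the returned labels, and the "backlog" fallback for CVEs below every threshold are assumptions made here. The JSON field names (`vulnerabilities`, `cveID`, `data`, `epss`) reflect how the KEV feed and EPSS API are commonly described and should be checked against the live schemas before use.

```python
from typing import Optional

EPSS_THRESHOLD = 0.088  # layer 2 threshold from the table
CVSS_THRESHOLD = 7.0    # layer 3 threshold from the table


def kev_ids_from_feed(feed: dict) -> set:
    """Collect CVE IDs from a decoded KEV JSON feed.

    Assumes a top-level "vulnerabilities" list whose entries carry
    a "cveID" field; verify against the published schema.
    """
    return {entry["cveID"] for entry in feed.get("vulnerabilities", [])}


def epss_from_response(response: dict, cve_id: str) -> Optional[float]:
    """Extract one EPSS score from a decoded EPSS API response.

    Assumes a "data" list of rows with "cve" and a string "epss" value.
    """
    for row in response.get("data", []):
        if row.get("cve") == cve_id:
            return float(row["epss"])
    return None


def prioritize(cve_id: str, kev_ids: set,
               epss: Optional[float], cvss: Optional[float]) -> str:
    """Apply the layers in order: KEV first, then EPSS, then CVSS."""
    if cve_id in kev_ids:
        return "patch-immediately"      # layer 1: SLA in hours
    if epss is not None and epss >= EPSS_THRESHOLD:
        return "escalate-24h"           # layer 2: 24-hour SLA
    if cvss is not None and cvss >= CVSS_THRESHOLD:
        return "standard-remediation"   # layer 3: SLA by policy
    return "backlog"                    # assumption: not defined in the table


if __name__ == "__main__":
    # Placeholder IDs and scores, for illustration only.
    kev = kev_ids_from_feed({"vulnerabilities": [{"cveID": "CVE-0000-0001"}]})
    print(prioritize("CVE-0000-0001", kev, 0.01, 8.8))
    print(prioritize("CVE-0000-0002", kev, 0.20, 5.0))
```

Checking KEV before EPSS and CVSS means a listed vulnerability is never downgraded by a low score in the other two sources.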

Close the agent authorization gap

Rapid exploit creation changes not only how patches are prioritized but also how controls must be configured for every system operated by agents that now hold privileged credentials. Your authorization policies have not been evaluated against the behavior of AI agents, and that is now a measurable risk. CVE-2026-34040 showed that the Docker authorization plugin architecture silently skips all plugins when the request body exceeds 1 MB. Common AuthZ plugins (OPA, Casbin, Prisma Cloud) cannot see this type of bypass, because it occurs in the Docker middleware before the request reaches the plugin.

When Cyera demonstrated this vulnerability, it showed that an AI agent debugging framework could infer the bypass path while completing a legitimate task, without any instructions to exploit anything.

The Internet Engineering Task Force (IETF) is working on authorization models for agents. The draft draft-klrc-aiagent-auth-01, published in March by participants from AWS, Zscaler, Ping Identity, and OpenAI, proposes using the existing Secure Production Identity Framework for Everyone (SPIFFE) and OAuth 2.0 so that AI agents obtain short-lived, dynamically provisioned credentials.

Separately, the IETF draft for the Agent Identity Protocol (draft-prakash-aip-00) reports that of approximately 2,000 Model Context Protocol (MCP) servers surveyed, none had authentication.

But these standards are still months or years away from being implemented. For now, security teams should proactively add agent-level test scenarios for every authorization boundary, such as large request sizes, burst frequency, and multi-step privilege escalation.

Map the blast radius of your credential

In a survey conducted by CSA and Zenity and published on April 16, 53% of organizations said they had already seen cases where AI agents exceeded intended permissions, and 47% had experienced a security incident involving an agent.

When AI builder tools such as Flowise (CVE-2025-59528, CVSS 10.0), Langflow, or n8n are compromised, the blast radius extends far beyond the host. These tools hold API keys for frontier models, database credentials, vector store tokens, and OAuth tokens for enterprise systems. A compromised AI builder host is not just a breach of a single system. It is a collection of credentials that unlocks authenticated access to every connected service.

Without a credential dependency map for each AI tool host, responding to an incident that compromises an agent is guesswork. For each instance, document every credential, the scope of its access, and the relevant rotation process. Also start migrating static API keys to short-lived tokens wherever downstream services allow it.
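One way to keep such a map is as a small structured inventory that can be queried during an incident. The sketch below is illustrative only: the `Credential` and `ToolHost` classes, their field names, and the sample entries are hypothetical, not part of any tool's API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Credential:
    name: str      # label for the secret, never the secret itself
    kind: str      # "static" or "short-lived"
    scope: str     # service or data the credential unlocks
    rotation: str  # how the credential is rotated


@dataclass
class ToolHost:
    host: str
    tool: str
    credentials: List[Credential] = field(default_factory=list)


def blast_radius(tool_host: ToolHost) -> List[str]:
    """Every downstream scope reachable if this host is compromised."""
    return sorted({c.scope for c in tool_host.credentials})


def migration_candidates(tool_host: ToolHost) -> List[str]:
    """Static keys that should move to short-lived tokens."""
    return [c.name for c in tool_host.credentials if c.kind == "static"]


if __name__ == "__main__":
    # Hypothetical inventory entry for illustration.
    builder = ToolHost(
        host="ai-builder-01",
        tool="Langflow",
        credentials=[
            Credential("model-api-key", "static", "frontier model API", "manual"),
            Credential("crm-oauth", "short-lived", "CRM", "automatic refresh"),
        ],
    )
    print(blast_radius(builder))
    print(migration_candidates(builder))
```

Storing only credential names and scopes, not secret values, keeps the map itself from becoming another high-value target.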

Five actions for this quarter

1. Implement three-layer KEV-EPSS-CVSS filter

Replace CVSS-only prioritization with the filter in the table above. Automate data collection from all three APIs in a scheduled script that runs against your asset inventory. Desired outcome: 18x efficiency increase, 85.6% coverage of exploited vulnerabilities, 95% reduction in urgent remediation workload.

2. Deploy event-based patches for Level 0 services.

Determine which services belong to the critical exposure level: services exposed directly to Internet users, AI builder hosts, and the container orchestration control plane. For this level, trigger patching when a CVE is published instead of waiting for the next maintenance window.

Goal: Deploy the patch to canary within four hours of a critical CVE being published. Use the CISA KEV and EPSS feeds to trigger event-based patches. Where the four-hour goal cannot be met because of legacy dependencies, change freeze windows, or rollback risk, immediately apply compensating controls: remove Internet exposure from the vulnerable service, rotate its credentials, disable the affected functionality (if applicable), and assign an exception owner for the exposure until a patch can be deployed.

It is not acceptable to leave an exposure open indefinitely while waiting for a maintenance window.
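The decision rule above can be captured in a few lines. The following is a minimal sketch under stated assumptions: the `PatchEvent` type, its `blocked` flag (standing in for a legacy dependency, change freeze, or rollback risk), and the action labels are hypothetical names introduced here, not part of any patching product.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

CANARY_SLA = timedelta(hours=4)  # four-hour canary goal from the text

COMPENSATING_CONTROLS = (
    "remove Internet exposure",
    "rotate service credentials",
    "disable affected functionality",
    "assign exception owner",
)


@dataclass(frozen=True)
class PatchEvent:
    cve_id: str
    service: str
    published_at: datetime
    blocked: bool  # hypothetical: patch cannot ship within the SLA


def plan(event: PatchEvent) -> dict:
    """Return the action and deadline for a Level 0 patch event."""
    deadline = event.published_at + CANARY_SLA
    if event.blocked:
        # The patch is delayed, so compensating controls apply immediately.
        return {"action": "compensate", "deadline": deadline,
                "controls": list(COMPENSATING_CONTROLS)}
    return {"action": "patch-canary", "deadline": deadline, "controls": []}


if __name__ == "__main__":
    published = datetime(2026, 1, 1, 8, 0, tzinfo=timezone.utc)
    print(plan(PatchEvent("CVE-0000-0001", "edge-api", published, blocked=False)))
```

Every blocked event still carries a deadline and an explicit control list, so an exception cannot quietly turn into an open-ended exposure.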

3. Test authorization limits at the agent level.

Create test cases for each API that AI agents can reach through AuthZ policies. Specifically, include requests with body sizes exceeding 1 MB, 5 MB, and 10 MB, burst rates above 100 requests per second, and unusual parameter combinations (privileged flags, host mounts, capability additions). In addition, upgrade to Docker Engine 29.3.1 to fix CVE-2026-34040.
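The body-size cases can be generated mechanically. The sketch below assumes a hypothetical `send` callable, supplied by your own test harness, that posts a request body and returns the HTTP status code. It takes a request your policy denies at normal size and reports any padded size at which the same request is accepted. The `_pad` filler field and the sample payload are illustrative assumptions.

```python
import json
from typing import Callable, Iterable, List

MB = 1024 * 1024
BODY_SIZES = (1 * MB, 5 * MB, 10 * MB)  # sizes named in the text


def padded_body(payload: dict, min_bytes: int) -> bytes:
    """Return a JSON body strictly larger than min_bytes.

    Adds a filler field named "_pad" (an arbitrary choice) to the payload.
    """
    base = json.dumps(payload).encode()
    filler = "x" * max(0, min_bytes - len(base) + 1)
    return json.dumps({**payload, "_pad": filler}).encode()


def find_bypasses(send: Callable[[bytes], int], denied_payload: dict,
                  sizes: Iterable[int] = BODY_SIZES) -> List[int]:
    """List the sizes at which a normally denied request gets a 2xx."""
    baseline = send(json.dumps(denied_payload).encode())
    if 200 <= baseline < 300:
        raise ValueError("payload must be denied at normal size")
    return [size for size in sizes
            if 200 <= send(padded_body(denied_payload, size)) < 300]


if __name__ == "__main__":
    # Stand-in endpoint that mimics a size-based bypass, for demonstration.
    def vulnerable_send(body: bytes) -> int:
        return 403 if len(body) <= MB else 200

    payload = {"HostConfig": {"Privileged": True}}
    print(find_bypasses(vulnerable_send, payload))
```

Comparing against a denied baseline, rather than expecting a fixed status, means a legitimate rejection of an oversized body (for example a 413) is not reported as a bypass.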

4. Map the credential blast radius for all AI builder hosts.

Document every credential held by each Langflow, Flowise, n8n, and custom AI pipeline instance. Classify each credential by its lifetime (static key vs. short-lived token). Identify what each credential can access. Configure alerts for credential access from anomalous IPs or identities.

5. Run a shadow AI discovery scan this week.

According to the CSA data, the odds are better than even that agents in your environment have already exceeded their intended permissions. Check your security information and event management (SIEM) and network monitoring tools for traffic to the default ports of AI builder tools: Langflow 7860, Flowise 3000, and n8n 5678. Any unauthorized instance is an unmonitored attack surface.
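A first pass over exported connection records could look like the sketch below. The input format, an iterable of (destination host, destination port) pairs, and the function name are assumptions made here; adapt them to whatever your SIEM exports. The port numbers are the defaults listed above.

```python
from typing import Dict, Iterable, Set, Tuple

# Default ports named in the text.
DEFAULT_PORTS = {7860: "Langflow", 3000: "Flowise", 5678: "n8n"}


def find_shadow_ai(flows: Iterable[Tuple[str, int]],
                   approved_hosts: Set[str]) -> Dict[str, Set[str]]:
    """Map each unapproved host to the AI builder tools its ports suggest."""
    hits: Dict[str, Set[str]] = {}
    for host, port in flows:
        tool = DEFAULT_PORTS.get(port)
        if tool is not None and host not in approved_hosts:
            hits.setdefault(host, set()).add(tool)
    return hits


if __name__ == "__main__":
    # Hypothetical flow records for illustration.
    flows = [("10.0.0.5", 7860), ("10.0.0.9", 443), ("10.0.0.7", 5678)]
    print(find_shadow_ai(flows, approved_hosts={"10.0.0.7"}))
```

A port match is only a lead: port 3000 in particular is a common default for many unrelated web applications, so each hit needs confirmation before it is treated as an unauthorized instance.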

The takeaway

AI agents are emerging and standards bodies are responding. The IETF has multiple drafts related to agent authentication and authorization. The Coalition for Secure AI has published its MCP Security Taxonomy and Security by Design Principles.

But these standards move at the speed of standards bodies, and the exploitation window is now measured in hours. Organizations that implement the three-layer filter and event-based patching this quarter will see a measurable reduction in exposure. Those that wait will be running calendar-based patch cycles against an adversary that operates in less than 20 hours.

Nik Kale is a principal engineer specializing in security and enterprise AI platforms.


