OCSF Explained: The Shared Data Language Security Teams Have Been Missing



The security industry has spent the last year talking about models, copilots and agents, but a quieter shift is happening one level below all that: vendors are aligning around a shared way of describing security data. The Open Cybersecurity Schema Framework (OCSF) is emerging as one of the strongest candidates for that role.

It offers vendors, enterprises and practitioners a common way to represent security events, findings, objects and context. That means less time rewriting field names and custom parsers and more time correlating detections, running analytics, and building workflows that work across products. In a market where every security team is stitching together endpoint, identity, cloud, SaaS and AI telemetry, a common infrastructure has long felt like a pipe dream, and OCSF now puts it within reach.

OCSF in plain English

OCSF is an open source framework for cybersecurity schemas. It is vendor neutral by design and deliberately agnostic of storage format, data collection, and ETL choices. In practical terms, it gives application teams and data engineers a shared structure for events so that analysts can work with a more consistent language for threat detection and investigation.

That sounds dry until you look at the daily work inside a security operations center (SOC). Security teams have to spend a lot of effort normalizing data from different tools in order to correlate events. For example, detecting an employee logging in from San Francisco at 10 a.m. on their laptop and then accessing a cloud resource from New York at 10:02 a.m. could reveal a leaked credential.
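The San Francisco and New York example comes down to simple arithmetic: if the distance between two logins divided by the time between them implies an implausible speed, the pair is suspicious. A minimal sketch follows, assuming both timestamps are on the same clock and using a hypothetical `Login` record and an assumed speed threshold; none of these names come from OCSF itself.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0
MAX_PLAUSIBLE_KMH = 1000.0  # assumed threshold, roughly airliner speed


@dataclass
class Login:
    """Hypothetical, already-normalized login record."""
    user: str
    when: datetime
    lat: float
    lon: float


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometers."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def is_impossible_travel(first, second, max_kmh=MAX_PLAUSIBLE_KMH):
    """True when the speed implied by two logins exceeds max_kmh."""
    hours = abs((second.when - first.when).total_seconds()) / 3600
    distance = haversine_km(first.lat, first.lon, second.lat, second.lon)
    if hours == 0:
        return distance > 0
    return distance / hours > max_kmh


# Laptop login in San Francisco at 10:00, cloud access from New York at 10:02.
sf = Login("alice", datetime(2025, 1, 6, 10, 0, tzinfo=timezone.utc), 37.7749, -122.4194)
ny = Login("alice", datetime(2025, 1, 6, 10, 2, tzinfo=timezone.utc), 40.7128, -74.0060)
print(is_impossible_travel(sf, ny))
```

The check itself is trivial; the hard part in practice is getting both events into a shape where `user`, time and location can be compared at all.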

However, setting up a system that can correlate those events is not an easy task: different tools describe the same idea with different fields, nested structures, and assumptions. OCSF was created to reduce this tax. It helps vendors map their own schemas to a common model and helps customers move data across lakes, pipelines, and security information and event management (SIEM) tools without requiring time-consuming translation at each hop.

The last two years have been unusually fast
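To make the mapping idea concrete, here is a minimal sketch of two mappers that turn differently shaped login events into one common shape. Both vendor payloads are hypothetical. The target field names are loosely modeled on OCSF's Authentication event class but are a simplified, illustrative subset, and the `class_uid` value should be checked against the published schema before relying on it.

```python
from datetime import datetime

# Two hypothetical vendor payloads describing the same successful login.
vendor_a_event = {"usr": "alice", "src": "203.0.113.7",
                  "ts": "2025-01-06T10:00:00Z", "result": "OK"}
vendor_b_event = {"actor": {"login": "alice"}, "client": {"ipAddress": "203.0.113.7"},
                  "eventTime": 1736157600, "outcome": "SUCCESS"}


def common_login(user, ip, time_ms, success, vendor):
    """Build a simplified, OCSF-style authentication event (illustrative subset)."""
    return {
        "class_uid": 3002,  # assumed ID for the Authentication class; verify against the schema
        "time": time_ms,    # epoch milliseconds
        "user": {"name": user},
        "src_endpoint": {"ip": ip},
        "status": "Success" if success else "Failure",
        "metadata": {"product": {"vendor_name": vendor}},
    }


def from_vendor_a(event):
    """Vendor A uses flat keys and an ISO 8601 timestamp."""
    parsed = datetime.fromisoformat(event["ts"].replace("Z", "+00:00"))
    return common_login(event["usr"], event["src"], int(parsed.timestamp() * 1000),
                        event["result"] == "OK", "Vendor A")


def from_vendor_b(event):
    """Vendor B nests its fields and reports epoch seconds."""
    return common_login(event["actor"]["login"], event["client"]["ipAddress"],
                        event["eventTime"] * 1000, event["outcome"] == "SUCCESS", "Vendor B")


a = from_vendor_a(vendor_a_event)
b = from_vendor_b(vendor_b_event)

# Once normalized, a single query works across both sources.
for event in (a, b):
    print(event["user"]["name"], event["src_endpoint"]["ip"], event["time"])
```

Each mapper is written once, at the edge. Everything downstream (detections, dashboards, lake queries) then reads `user.name` and `src_endpoint.ip` without knowing which product produced the event.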

Most of OCSF’s visible acceleration has occurred in the past two years. The project was announced in August 2022 by AWS and Splunk, building on work contributed by Symantec, a division of Broadcom, along with other well-known infrastructure giants: Cloudflare, CrowdStrike, IBM, Okta, Palo Alto Networks, Rapid7, Salesforce, Securonix, Sumo Logic, Tanium, Trend Micro and Zscaler.

The OCSF community has maintained a steady cadence of releases over the past two years.

The community has grown rapidly. AWS said in August 2024 that OCSF had expanded from a 17-company initiative to a community with more than 200 participating organizations and 800 contributors, which expanded to 900 when OCSF joined the Linux Foundation in November 2024.

OCSF is popping up across the industry

In the observability and security space, OCSF is everywhere. Amazon Security Lake converts natively supported AWS logs and events to OCSF and stores them in Parquet. AWS AppFabric can output OCSF-normalized audit data. AWS Security Hub findings use OCSF, and AWS publishes an extension for cloud-specific resource details.

Splunk can translate incoming data to OCSF with Edge Processor and Ingest Processor. Cribl supports seamless conversion of streaming data to OCSF and compatible formats.

Palo Alto Networks can forward data from Strata Logging Service to Amazon Security Lake in OCSF format. CrowdStrike is positioned on both sides of the OCSF pipeline, with Falcon data translated to OCSF for Security Lake and Falcon Next-Gen SIEM positioned to ingest and analyze OCSF-formatted data. OCSF is one of those rare standards that has crossed the chasm from an abstract specification to industry-wide operational plumbing.

AI is giving new urgency to the OCSF story

When companies deploy AI infrastructure, large language models (LLMs) sit at the center, surrounded by complex distributed systems such as model gateways, agent runtimes, vector stores, tool calls, retrieval systems, and policy engines. These components generate new forms of telemetry, many of which span product boundaries. Security teams across the SOC are increasingly focused on capturing and analyzing this data. The central question is often what an agentic AI system actually did, rather than just the text it produced, and whether its actions led to security breaches.

That puts more pressure on the underlying data model. An AI assistant that calls the wrong tool, retrieves the wrong data, or chains together a risky sequence of actions creates a security event that must be understood across all systems. A shared security schema becomes more valuable in that world, especially when AI is also used on the analytics side to correlate more data, faster.

For OCSF, 2025 was all about AI

Imagine a company uses an AI assistant to help employees search internal documents and trigger tools like ticketing systems or code repositories. One day, the assistant starts pulling the wrong files, calling tools it shouldn’t be using, and exposing sensitive information in its responses.

Updates in OCSF versions 1.5.0, 1.6.0, and 1.7.0 help security teams piece together what happened by flagging unusual behavior, showing who had access to connected systems, and tracing the assistant's tool calls step by step. Instead of just looking at the final answer the AI gave, the team can investigate the entire chain of actions that led to the problem.

What’s on the horizon?

Imagine that a company uses an AI-enabled customer service bot and one day the bot starts giving long, detailed responses that include internal troubleshooting guidance intended only for staff. With the types of changes being developed for OCSF 1.8.0, the security team would be able to see what model was handling the exchange, what provider was supplying it, what role each message played, and how token counts changed throughout the conversation.

A sudden increase in prompt or completion tokens could indicate that the bot received an unusually large hidden prompt, pulled too much background data from a vector database, or generated an overly long response that increased the chance of sensitive information being leaked. This gives investigators a handy clue about where the interaction went off course, rather than leaving them with only the final answer.
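One simple way to surface that kind of jump is to compare each exchange's token counts against the median of the exchanges before it. The sketch below assumes hypothetical per-exchange records; the field names `model`, `prompt_tokens` and `completion_tokens`, the model name, and the 5x threshold are illustrative choices, not the attributes or values OCSF 1.8.0 will define.

```python
from statistics import median

# Hypothetical per-exchange telemetry for one customer service conversation.
exchanges = [
    {"model": "support-bot-v2", "prompt_tokens": 120, "completion_tokens": 80},
    {"model": "support-bot-v2", "prompt_tokens": 135, "completion_tokens": 95},
    {"model": "support-bot-v2", "prompt_tokens": 128, "completion_tokens": 70},
    {"model": "support-bot-v2", "prompt_tokens": 140, "completion_tokens": 90},
    {"model": "support-bot-v2", "prompt_tokens": 2900, "completion_tokens": 1100},
    {"model": "support-bot-v2", "prompt_tokens": 150, "completion_tokens": 85},
]


def find_token_spikes(records, field, factor=5.0, min_history=3):
    """Return indexes of records whose `field` exceeds factor x the median of earlier records."""
    spikes = []
    for i, record in enumerate(records):
        history = [earlier[field] for earlier in records[:i]]
        if len(history) < min_history:
            continue  # not enough earlier exchanges to form a baseline
        baseline = median(history)
        if baseline > 0 and record[field] > factor * baseline:
            spikes.append(i)
    return spikes


prompt_spikes = find_token_spikes(exchanges, "prompt_tokens")
completion_spikes = find_token_spikes(exchanges, "completion_tokens")
print("prompt spikes at exchanges:", prompt_spikes)
print("completion spikes at exchanges:", completion_spikes)
```

A median baseline is used here so that one oversized exchange does not inflate the baseline for the exchanges that follow it. A flagged index only tells investigators where to look; deciding whether the cause was a large hidden prompt, an oversized retrieval, or a long response still requires the surrounding events.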

Why this is important for the overall market

The bigger story is that OCSF has quickly gone from being a community effort to becoming an actual standard that security products use every day. Over the past two years, it has gained stronger governance, frequent releases, and hands-on support across data lakes, ingestion pipelines, SIEM workflows, and partner ecosystems.

In a world where AI expands the security landscape through scams, abuse, and new attack paths, security teams can rely on OCSF to connect data from many systems without losing context along the way, which helps them keep their data safe.

Nikhil Mungel has been building distributed systems and AI teams at SaaS companies for over 15 years.


