Liquid AI's smallest model yet, LFM2.5-230M, outperforms models 4x its size in data extraction and can run "anywhere"

Liquid AI, founded by former MIT computer scientists, today released its smallest AI language model yet, LFM2.5-230M, and companies would do well to consider it for data extraction and local deployment on smartphones, laptops, and robots.

This is a 230 million parameter model designed explicitly for on-device agent workflows, and as Liquid states in its launch blog post, that small size makes it possible to run almost "anywhere." According to Liquid, it also outperforms models more than 4 times its size on select benchmarks, performing better at data extraction than Alibaba’s Qwen3.5-0.8B (Instruct), at 800 million parameters, and Google’s Gemma 3 1B, at 1 billion parameters.

The model is aimed at developers and engineers building lightweight data extraction pipelines and autonomous edge systems.

Operating under a dual-use commercial license, the model remains free for individuals and businesses generating less than $10 million in annual revenue, while requiring a paid enterprise agreement for larger corporations.

This release distinguishes itself from other small AI models by using the LFM2 architecture to achieve high inference speeds without the huge memory overhead typical of parameter-heavy transformers.

While leading AI companies—Anthropic, OpenAI, Google, Microsoft, Meta, and others—push parameter counts into the hundreds of billions or trillions to achieve cutting-edge performance, a parallel race is focused entirely on edge and on-premise deployments.

Liquid AI’s release of LFM2.5-230M signals a fundamental shift toward architectural efficiency rather than brute force scaling. By squeezing 19 trillion tokens of pre-training data into a footprint of 230 million parameters, the company demonstrates that edge devices don’t need massive computing power or persistent cloud connections to run complex, multi-step agent workflows.

How LFM2.5-230M works

The LFM2.5-230M model departs from the standard transformer architecture and is instead built on the LFM2 architecture, a hybrid that interleaves short-range gated convolutions with grouped query attention to process information efficiently.
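To make the convolution half of that hybrid concrete, here is a minimal sketch in plain Python of a gated short-range causal convolution over a single channel. It is an illustration under assumptions, not Liquid AI's implementation: the function names, kernel values, and gates passed in as plain lists are hypothetical, and in a real model the gates would be produced from the input by learned projections.

```python
from typing import List


def causal_short_conv(x: List[float], kernel: List[float]) -> List[float]:
    """Causal 1-D convolution: output[t] only sees x[t], x[t-1], ..., x[t-k+1]."""
    out = []
    for t in range(len(x)):
        acc = 0.0
        for j, weight in enumerate(kernel):
            idx = t - j
            if idx >= 0:  # positions before the start of the sequence contribute nothing
                acc += weight * x[idx]
        out.append(acc)
    return out


def gated_short_conv(
    x: List[float],
    in_gate: List[float],
    out_gate: List[float],
    kernel: List[float],
) -> List[float]:
    """Gate the input, mix nearby positions with a short kernel, then gate the output.

    Assumption: gates are supplied by the caller; a trained model would derive them from x.
    """
    gated_in = [g * v for g, v in zip(in_gate, x)]
    mixed = causal_short_conv(gated_in, kernel)
    return [g * v for g, v in zip(out_gate, mixed)]


if __name__ == "__main__":
    signal = [1.0, 2.0, 3.0, 4.0]
    open_gates = [1.0] * len(signal)
    print(gated_short_conv(signal, open_gates, open_gates, kernel=[0.5, 0.5]))
```

Because the kernel has a small, fixed width, the work per token stays constant as the sequence grows, which is what keeps this kind of layer cheap compared with attention over the full context.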

For those following the evolution of efficient architectures, the goal of Liquid’s approach will be familiar: handling long contexts and sequential data effectively on edge hardware without the quadratic memory costs of pure attention mechanisms. The model supports a 32K context window, enough to ingest substantial documents or continuous streams of robot telemetry.

The performance charts provided at launch make the architectural efficiency evident. The model maintains a memory footprint of less than 400 MB while achieving prefill and decode speeds that surpass comparable models such as Gemma 3 1B IT and Granite 4.0-H-350M.
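As a rough sanity check on that footprint, the weights alone can be estimated from the parameter count and the number of bits stored per parameter. The sketch below is back-of-the-envelope arithmetic only: the article does not say at which precision the sub-400 MB figure was measured, the bit widths are assumptions, and the estimate ignores the KV cache, activations, and runtime overhead.

```python
def weight_memory_mb(num_params: int, bits_per_param: int) -> float:
    """Approximate size of the model weights in megabytes (1 MB = 1,000,000 bytes)."""
    return num_params * bits_per_param / 8 / 1_000_000


PARAMS = 230_000_000  # parameter count reported for LFM2.5-230M

if __name__ == "__main__":
    # Assumed precisions, chosen for illustration only.
    for bits in (16, 8, 4):
        print(f"{bits:>2}-bit weights: {weight_memory_mb(PARAMS, bits):.0f} MB")
```

Under these assumptions, 16-bit weights would come to about 460 MB and 8-bit weights to about 230 MB, so a footprint below 400 MB is plausible for a quantized build.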

On a Samsung Galaxy S25 Ultra equipped with a Qualcomm Snapdragon Gen4 CPU, the model achieves a decoding speed of 213 tokens per second. Even on a resource-constrained Raspberry Pi 5, it maintains a decoding speed of 42 tokens per second. Additionally, according to Liquid’s internal benchmarking, the GPU inference stack delivers lower end-to-end latency than competing small models at all concurrency levels.
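To translate those throughput figures into wait time, divide the number of generated tokens by the decoding speed. The snippet below does that for a hypothetical 300-token structured output. The output length is an assumption, and prefill time for the prompt is not included.

```python
def decode_seconds(num_tokens: int, tokens_per_second: float) -> float:
    """Time to generate num_tokens at a steady decoding speed, excluding prefill."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return num_tokens / tokens_per_second


OUTPUT_TOKENS = 300  # hypothetical length of one structured response

if __name__ == "__main__":
    # Decoding speeds as reported in the article.
    for device, speed in (("Galaxy S25 Ultra", 213.0), ("Raspberry Pi 5", 42.0)):
        print(f"{device}: {decode_seconds(OUTPUT_TOKENS, speed):.1f} s")
```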

Why it is important for companies

To understand why a 230 million parameter model is necessary, you have to look at how companies currently manage data.

Organizations have traditionally relied on rigid, rule-based extract, transform, and load (ETL) scripts to move and process data. However, these legacy systems are notoriously fragile: a simple change to a document layout or a schema update can break the entire pipeline.

To solve this, the industry is shifting toward "AI ETL," where machine learning infers mappings, detects schema drift, and adapts to changes automatically. In a modern, lightweight data extraction pipeline, an AI model connects to unstructured sources (such as PDF files, emails, or web forms) and structures the data into formats such as JSON without the need for hard-coded rules.
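In practice, a pipeline like this still needs a guard between the model and the downstream database. The sketch below shows one way to check a model's JSON output against an expected schema and flag drift. The schema, field names, and sample strings are hypothetical, and the model call itself is left out: the function only validates whatever text the model returned.

```python
import json
from typing import Any, Dict, List, Tuple

# Hypothetical target schema for an invoice record: field name -> expected Python type.
INVOICE_SCHEMA: Dict[str, type] = {"invoice_id": str, "vendor": str, "total": float}


def validate_record(raw_json: str, schema: Dict[str, type]) -> Tuple[Dict[str, Any], List[str]]:
    """Parse model output and return (record, problems); an empty problems list means it conforms."""
    try:
        record = json.loads(raw_json)
    except json.JSONDecodeError as exc:
        return {}, [f"invalid JSON: {exc.msg}"]
    if not isinstance(record, dict):
        return {}, ["top-level value is not an object"]

    problems: List[str] = []
    for field, expected in schema.items():
        if field not in record:
            problems.append(f"missing field: {field}")
            continue
        value = record[field]
        # Accept whole numbers where a float is expected (but not booleans).
        if expected is float and isinstance(value, int) and not isinstance(value, bool):
            record[field] = float(value)
        elif not isinstance(value, expected):
            problems.append(f"wrong type for {field}: expected {expected.__name__}")
    # Fields the schema does not know about are a sign of drift.
    for field in record:
        if field not in schema:
            problems.append(f"unexpected field: {field}")
    return record, problems


if __name__ == "__main__":
    sample = '{"invoice_id": "A-17", "vendor": "Acme", "total": 120}'
    print(validate_record(sample, INVOICE_SCHEMA))
```

Records that come back with a non-empty problem list can be retried or routed to a human reviewer instead of being written downstream.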

For enterprises, using a massive flagship model like Claude Opus 4.6 (which costs $5 per million input tokens) to parse routine invoices, format addresses, or route telemetry data is economically infeasible.
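A quick calculation shows how that per-token price adds up at volume. The document count and tokens per document below are hypothetical figures chosen for illustration. Only the $5 per million input tokens comes from the pricing cited above, and output tokens are not counted.

```python
def input_cost_usd(num_docs: int, tokens_per_doc: int, usd_per_million_tokens: float) -> float:
    """Cost of the input tokens alone for a batch of documents."""
    total_tokens = num_docs * tokens_per_doc
    return total_tokens * usd_per_million_tokens / 1_000_000


if __name__ == "__main__":
    # Assumed workload: one million invoices at roughly 1,500 input tokens each.
    cost = input_cost_usd(num_docs=1_000_000, tokens_per_doc=1_500, usd_per_million_tokens=5.0)
    print(f"Input cost: ${cost:,.2f}")
```

Under those assumptions the input tokens alone cost $7,500 per million documents, a bill that recurs every time the batch is reprocessed.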

This is where models like the LFM2.5-230M become critical. Explicitly designed as a lightweight extraction engine, it allows enterprises to automate repetitive formatting and data analysis at a fraction of the compute cost and latency, running directly on local hardware rather than relying on costly and continuous API calls in the cloud.

Small model benchmarks: LFM vs. the 3B class

The AI industry in mid-2026 is experiencing a renaissance in "small" models, but the definition of "small" varies enormously.

Recently, the open-weight community was stunned by Weibo’s VibeThinker-3B, a 3 billion parameter model built on a Qwen2-style backbone that achieved a massive 94.3 on the AIME 2026 math benchmark, rivaling 600 billion parameter giants through aggressive data curation and reinforcement learning.

Similarly, Google’s Gemma 4 family, which recently surpassed 200 million downloads, brings cutting-edge AI to edge devices and includes the E2B variant (2 billion parameters), designed specifically for mobile and IoT deployments.

In contrast, Liquid AI’s LFM2.5-230M operates in a completely different weight class. With just 230 million parameters, it’s roughly a tenth the size of both Google’s smallest Gemma 4 model and VibeThinker-3B.

Due to its microscopic footprint, LFM2.5-230M is not designed to compete in reasoning-intensive workloads such as advanced math, coding, or creative writing, a limitation that Liquid AI explicitly recognizes.

However, in its intended domains of data extraction and tool calling, the model punches well above its weight class.

Benchmarks published by Liquid AI show that LFM2.5-230M scored 43.26 on the BFCLv3 tool-use benchmark, ahead of IBM’s Granite 4.0-350M (39.58) and far ahead of larger, billion-parameter models like Google’s Gemma 3 1B IT (16.61).

On CaseReportBench, a data extraction benchmark, it scores 22.51, outperforming Qwen3.5-0.8B (Instruct).

LFM2.5-230M demonstrates that while 3 billion parameter models like VibeThinker can solve advanced math problems, a highly optimized 230 million parameter model is the better option for executing structured tool calls and keeping agent workflows running efficiently on constrained hardware.

Advanced research uses

Because it excels at tool calling, LFM2.5-230M primarily functions as a skill selection layer. Liquid AI demonstrated this capability by implementing the model on a Unitree G1 humanoid robot.

Running entirely on the device via the NVIDIA Jetson Orin computing module integrated into the robot, the model successfully processes complex environmental commands.

As noted on the company’s technical blog, the model takes free-form instructions like *"Stay still for 2 seconds, then walk forward at 1 meter per second for 3 meters, keep one knee forward with one leg for 5 seconds, and walk backward at 0.5 meters per second for 3 meters."* and automatically translates them into a structured multi-step plan that calls on pre-trained low-level skills provided by NVIDIA’s SONIC framework.
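A skill selection layer of this kind is only useful if the plan it emits can be checked before anything moves. The sketch below validates a JSON plan against a small skill registry. Everything in it is hypothetical: the skill names, argument names, and plan format are invented for illustration and are not taken from NVIDIA's SONIC framework or from Liquid AI's demo.

```python
import json
from typing import Any, Dict, List, Set

# Hypothetical registry: skill name -> the exact set of arguments it requires.
SKILL_REGISTRY: Dict[str, Set[str]] = {
    "stand_still": {"duration_s"},
    "walk": {"direction", "speed_mps", "distance_m"},
}


def parse_plan(plan_json: str) -> List[Dict[str, Any]]:
    """Parse a model-generated plan and reject unknown skills or malformed arguments."""
    steps = json.loads(plan_json)
    if not isinstance(steps, list):
        raise ValueError("plan must be a JSON array of steps")
    for i, step in enumerate(steps):
        if not isinstance(step, dict):
            raise ValueError(f"step {i}: expected an object")
        skill = step.get("skill")
        if not isinstance(skill, str) or skill not in SKILL_REGISTRY:
            raise ValueError(f"step {i}: unknown skill {skill!r}")
        args = step.get("args", {})
        if not isinstance(args, dict):
            raise ValueError(f"step {i}: args must be an object")
        required = SKILL_REGISTRY[skill]
        if set(args) != required:
            raise ValueError(f"step {i}: {skill} needs exactly {sorted(required)}")
    return steps


if __name__ == "__main__":
    # A plan covering the standing and walking parts of the quoted instruction.
    plan = json.dumps([
        {"skill": "stand_still", "args": {"duration_s": 2}},
        {"skill": "walk", "args": {"direction": "forward", "speed_mps": 1.0, "distance_m": 3}},
        {"skill": "walk", "args": {"direction": "backward", "speed_mps": 0.5, "distance_m": 3}},
    ])
    for step in parse_plan(plan):
        print(step["skill"], step["args"])
```

Rejecting a plan outright when a step names an unknown skill keeps the language model in the role of selector, with the low-level controllers remaining the only code that drives the hardware.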

Base and post-trained models are available immediately on Hugging Face, with day-one support across the inference ecosystem, including llama.cpp (GGUF), MLX, vLLM, SGLang, and ONNX.

Customized and dual-use LFM open license

Liquid AI ships LFM2.5-230M under the LFM Open License v1.0. Despite the word "open" in the title, this is not an Open Source Initiative (OSI) compliant license; it operates as a restricted, dual-use commercial framework.

For independent developers, researchers, and startups, the license works identically to open source software.

Users are granted a perpetual, worldwide, royalty-free license to reproduce, modify, and distribute the model, provided they retain the original copyright notices and prominently indicate any modifications.

However, the license includes a strict "Commercial use limitation". Any legal entity that generates $10 million or more in annual revenue loses the right to use the model commercially under this agreement.

Large companies that cross this financial threshold must negotiate a separate, paid commercial agreement with Liquid AI to deploy the model into production.

This strategy protects the company from having its intellectual property freely absorbed by the major technology conglomerates, while continuing to seed the model at the grassroots developer level.
