As AI models become increasingly commoditized, startups are rushing to build the software layer on top of them. An interesting participant in this space is Osaurus, an Apple-exclusive open source LLM server that lets users move between different AI models, whether running locally or in the cloud, while keeping their files and tools on their own hardware.
Osaurus evolved from the idea behind Dinoki, a desktop AI companion that Osaurus co-founder Terence Pae described as a sort of “AI-powered Clippy.” Dinoki’s customers had asked him why they should buy the app if they still had to pay for tokens – the usage units that AI companies charge for processing messages and generating responses.
That made Pae think more deeply about running the AI locally.
“That’s how Osaurus started,” Pae, previously a software engineer at Tesla and Netflix, told TechCrunch during a call. The idea, he explained, was to try to run an AI assistant locally. “You can do pretty much everything on your Mac locally, like browse your files, access your browser, access your system settings. I thought this would be a great way to position Osaurus as a personal AI for individuals.”
Pae began building the tool in public as an open source project, adding features and fixing bugs along the way.

Today, Osaurus can flexibly connect with locally hosted AI models or cloud providers such as OpenAI and Anthropic. Users can freely choose which AI models they use while keeping other aspects of the AI experience on their own hardware, such as the models’ memory or their files and tools.
Since different AI models have different strengths, the advantage of this system is that users can switch to the AI model that best suits their needs.
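In practice, this kind of switching usually comes down to pointing the same client at a different endpoint and model name. Here is a minimal sketch in Python, assuming Osaurus exposes an OpenAI-compatible API on localhost; the port and model identifiers are illustrative placeholders, not confirmed Osaurus defaults:

```python
# Sketch: one client, two backends. Assumes an OpenAI-compatible
# endpoint on localhost; port and model ids are placeholders.
from openai import OpenAI

# Local backend: requests never leave the machine.
local = OpenAI(base_url="http://localhost:1337/v1", api_key="not-needed")

# Cloud backend: same client code, different endpoint and key.
cloud = OpenAI(api_key="sk-...")  # reads OPENAI_API_KEY in real use

def ask(client: OpenAI, model: str, prompt: str) -> str:
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

# Pick whichever model suits the task; only the endpoint changes.
print(ask(local, "llama-3.2-3b", "Summarize this folder's notes."))
print(ask(cloud, "gpt-4o-mini", "Draft a longer analysis."))
```

Only the endpoint and model name change; the files, tools, and memory around the conversation stay on the user’s machine.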
This structure turns Osaurus into what’s called a “harness”: a control layer that connects different AI models, tools, and workflows through a single interface, similar to tools like OpenClaw or Hermes. The difference is that those tools are usually aimed at developers who are comfortable in a terminal. And sometimes, as in the case of OpenClaw, they can introduce security holes to worry about.
Meanwhile, Osaurus features an easy-to-use interface aimed at consumers, and it addresses security concerns by running things in a hardware-isolated virtual sandbox. This limits the AI to a defined scope, keeping your computer and data safe.

Of course, running AI models on your own machine is still in its infancy, as it is resource-intensive and hardware-dependent. To run local models, your system will need at least 64GB of RAM; to run larger models, such as DeepSeek V4, Pae recommends systems with approximately 128GB.
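Those RAM figures track a common rule of thumb: model weights alone take roughly the parameter count times the bytes per weight, with headroom on top for the KV cache and the rest of the system. A back-of-the-envelope sketch (the quantization widths here are typical values, not Osaurus’s sizing method):

```python
# Back-of-the-envelope weight memory: params * bytes_per_weight.
# Real usage adds KV cache, activations, and OS overhead on top.
def weight_gb(params_billions: float, bits_per_weight: int) -> float:
    return params_billions * (bits_per_weight / 8)  # billions of bytes -> GB

for params, bits in [(8, 4), (70, 4), (120, 8)]:
    print(f"{params}B model @ {bits}-bit ~= {weight_gb(params, bits):.0f} GB of weights")
# 8B model @ 4-bit ~= 4 GB of weights
# 70B model @ 4-bit ~= 35 GB of weights
# 120B model @ 8-bit ~= 120 GB of weights
```

On Apple Silicon, the model shares unified memory with everything else running on the machine, which is why recommendations land well above the raw weight size.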
But Pae believes the hardware requirements of local AI will decline over time.
“I can see its potential, because the intelligence per wattage, which is like the metric of local AI, has increased significantly. It’s on its own innovation curve. Last year, local AI could barely finish sentences, but today it can run tools, write code, access your browser and order things from Amazon (…) it’s getting better and better,” he said.

Today, Osaurus can run MiniMax M2.5, Gemma 4, Qwen3.6, GPT-OSS, Llama, DeepSeek V4 and other models. It also supports Apple’s on-device foundation models and Liquid AI’s LFM family of on-device models, and it can connect to cloud providers including OpenAI, Anthropic, Gemini, xAI/Grok, Venice AI, and OpenRouter, as well as to Ollama and LM Studio.
As a full MCP (Model Context Protocol) server, Osaurus can also grant any MCP-compatible client access to its tools. Plus, it comes with more than 20 native plugins for Mail, Calendar, Vision, macOS Use, XLSX, PPTX, Browser, Music, Git, Filesystem, Search, Fetch, and more.
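MCP standardizes how a client discovers and calls a server’s tools, so any compliant client can enumerate plugins like these without custom integration. A minimal sketch using the official mcp Python SDK; the command is a hypothetical placeholder, since Osaurus’s actual connection details aren’t specified here:

```python
# Sketch: an MCP client listing a server's tools. The command below
# is a hypothetical placeholder, not Osaurus's real invocation.
import asyncio
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def main() -> None:
    server = StdioServerParameters(command="osaurus-mcp", args=[])  # placeholder
    async with stdio_client(server) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await session.list_tools()
            for tool in tools.tools:  # e.g. Mail, Calendar, Filesystem...
                print(tool.name, "-", tool.description)

asyncio.run(main())
```

Calling a tool is symmetric: `session.call_tool(name, arguments)` with a JSON-serializable argument dict.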
More recently, Osaurus was updated to also include voice capabilities.
Since the project went live almost a year ago, it has been downloaded more than 112,000 times, according to its website.
Currently, Osaurus’ founders (including co-founder Sam Yoo) are participating in the New York-based startup accelerator Alliance. They are also weighing next steps, which could include offering Osaurus to companies, such as those in the legal or healthcare space, where running local LLMs could address privacy concerns.
As the power of local AI models grows, the team believes it could reduce demand for AI data centers.
“We’re seeing this explosive growth in the AI space, where [cloud AI providers] have to scale using data centers and infrastructure, but we feel like people haven’t really seen the value of on-premise AI yet,” Pae said. “Instead of relying on the cloud, they can deploy a Mac Studio locally, and it should use substantially less power. You still have the cloud capabilities, but you won’t be relying on a data center to be able to run that AI,” he added.