OfflineLLM 5: Local AI Gets a Lot More Useful With MCP and Agents
Bilaal Rashid
07 June 2026
When people talk about local AI, the conversation usually revolves around privacy.
That’s understandable. Running an AI model on your own device means your chats, files, and prompts stay under your control. Nothing needs to be sent to a cloud server.
But privacy alone isn’t enough.
The biggest advantage of cloud AI services has never been the models themselves. It’s everything around them. They can search the web, access your calendar, interact with your tools, and automate tasks.
Local AI, by contrast, has traditionally been isolated from all of that.
With OfflineLLM 5, we’re closing that gap.
Today we’re releasing the biggest update in OfflineLLM’s history, introducing Model Context Protocol (MCP) support, new chat personalities, and significant performance improvements across our custom Apple Silicon execution engine.
MCP Support
The headline feature in OfflineLLM 5 is support for the Model Context Protocol, better known as MCP.
If you’ve been following the AI space recently, you’ve probably seen MCP becoming the standard way for AI models to connect to external tools and services.
In practical terms, MCP allows your local AI assistant to do much more than answer questions.
You can connect OfflineLLM to services such as:
- GitHub
- Zapier
- Atlassian
- Calendar
- Reminders
- Apple Music
- Web Search
- And any others that you choose to bring along
Instead of being limited to the information already inside a conversation, your AI assistant can interact with the tools you use every day.
The result is something that feels less like a chatbot and more like an assistant.
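For readers curious what a tool connection looks like on the wire, here is a minimal sketch in Python. MCP messages are JSON-RPC 2.0 objects: a client can ask a server which tools it offers with `tools/list` and invoke one with `tools/call`. The tool name `web_search` and its `query` argument are hypothetical, as is the sample reply. This only illustrates the message shapes and is not a description of how OfflineLLM implements MCP internally.

```python
import json
from itertools import count

# JSON-RPC request ids; any value that is unique per request works.
_ids = count(1)


def make_request(method, params=None):
    """Build a JSON-RPC 2.0 request object of the kind MCP uses."""
    request = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        request["params"] = params
    return request


def call_tool(name, arguments):
    """Build an MCP `tools/call` request for a named tool."""
    return make_request("tools/call", {"name": name, "arguments": arguments})


def extract_text(response):
    """Join the text parts of a `tools/call` result, raising on errors."""
    if "error" in response:
        # Protocol-level failure, e.g. an unknown method.
        raise RuntimeError(response["error"].get("message", "unknown error"))
    result = response["result"]
    text = "\n".join(
        part["text"]
        for part in result.get("content", [])
        if part.get("type") == "text"
    )
    if result.get("isError"):
        # The tool itself ran but reported a failure.
        raise RuntimeError(text or "tool reported an error")
    return text


# Ask the server which tools it exposes.
list_request = make_request("tools/list")

# Invoke a hypothetical tool; the name and arguments are assumptions.
search_request = call_tool("web_search", {"query": "MCP specification"})

# A hand-written reply in the shape a server would return for `tools/call`.
sample_reply = {
    "jsonrpc": "2.0",
    "id": search_request["id"],
    "result": {
        "content": [{"type": "text", "text": "Example result text"}],
        "isError": False,
    },
}

print(json.dumps(search_request, indent=2))
print(extract_text(sample_reply))
```

The model never talks to GitHub or a calendar directly. It emits a tool name and arguments, the app forwards them to the MCP server as a request like the one above, and the text in the reply is handed back to the model as extra context.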
From Chatbot to AI Agent
One of the most interesting things about MCP is that it changes what local AI can actually do.
You can ask a model to search for information, access project documentation, manage tasks, trigger automations, or interact with external services through connected tools.
For a long time, these kinds of workflows were mostly associated with cloud-based AI platforms.
Now they’re available in OfflineLLM while still giving users control over their models, data, and integrations.
As AI agents become more capable, we believe privacy will become even more important. The more responsibility you give an AI system, the more valuable it becomes to know where your data is going.
Faster Across the Board
Performance has always been a major focus for OfflineLLM.
Version 5 includes a number of improvements to our custom execution engine, resulting in faster generation speeds, lower latency, and a more responsive experience throughout the app.
OfflineLLM is built specifically for Apple Silicon and is designed to take full advantage of Apple’s GPU architecture. Every release includes ongoing optimisation work, and Version 5 continues that trend.
Whether you’re chatting with a model, analysing documents through RAG, using voice chat, or working with vision models, interactions should feel noticeably faster.
New Chat Personalities
Not every conversation needs the same kind of assistant.
OfflineLLM 5 introduces built-in chat personalities that make it easy to tailor your AI assistant to different tasks without manually editing system prompts. The initial options include:
- Assistant: a balanced general-purpose helper for productivity, research, and everyday questions
- Coder: tuned for programming, debugging, and technical problem-solving
- Storyteller: designed for creative writing, brainstorming, roleplay, and long-form content generation
We’ll continue expanding these personalities and capabilities in future updates.
Everything Else
While MCP is the star of this release, OfflineLLM continues to support everything users already rely on:
- Run AI models entirely on-device
- Install your own models, including DeepSeek, Llama, Qwen, Gemma, Phi, Mistral, and more
- Live Voice Chat with two-way conversations
- Vision models for image understanding
- Retrieval-Augmented Generation (RAG)
- Apple Intelligence integration
- OpenAI-compatible API support
- Siri Shortcuts
- Widgets
- Advanced model configuration
- Custom system prompts
- Beginner and Advanced modes
- No ads, tracking, or subscriptions
And, of course, everything continues to work with privacy as the default.
Why We Built OfflineLLM
We started OfflineLLM because we believed AI shouldn’t require handing over your data to a remote server.
That belief hasn’t changed.
What has changed is what’s possible on modern Apple hardware.
Every year, iPhones, iPads, Macs, and Vision Pro devices become more capable of running increasingly powerful AI models locally. Features that once required expensive cloud infrastructure can now run directly on consumer devices.
OfflineLLM exists to take advantage of that shift.
Our goal is simple: give users access to powerful AI without requiring them to trade away privacy, ownership, or control.
Available Now
OfflineLLM 5 is available now from the App Store for iPhone, iPad, Mac, and Apple Vision Pro.
If you’ve ever wanted the flexibility of modern AI tools without relying on the cloud, this is the biggest step forward we’ve made so far.
We’re excited to see what you build with it.