Local AI
Run language models and image generation on your own hardware. The platform AI watches and meters; the local AI runs in your house. For speed, privacy, and zero-marginal-cost workloads — not frontier reasoning.
13 tools · last updated 2026-06-12
AUTOMATIC1111
free · open sourceAUTOMATIC1111's Stable Diffusion web UI is the long-standing default interface for generating images locally with Stable Diffusion models. It runs in your browser against your own GPU, with a huge extension ecosystem and simpler controls than node-based tools like ComfyUI. Your prompts and images never leave your machine. Development has slowed, but it remains the easiest local image-generation on-ramp.
ComfyUI
free · open sourceComfyUI is an open-source, node-based editor for running image-generation models like Stable Diffusion and Flux on your own GPU. You wire models, prompts, and processing steps into reusable workflows, which makes it the professional's tool for local image work. Prompts and outputs never leave your machine. The node interface has a real learning curve compared to simpler front ends.
GPT4All
free · open sourceGPT4All is a free, open-source desktop application from Nomic that runs open-weight chat models on ordinary consumer hardware with essentially no setup. Install it, pick a model, and everything — prompts, documents, answers — stays on your machine, including a local-document chat feature. It will not match frontier cloud models for hard reasoning, but private everyday tasks cost nothing per use.
Hugging Face
freemium · open sourceHugging Face is the central registry for open-weight AI models — the place where models you can download and run on your own hardware are actually published. Every local tool on this list pulls from it directly or indirectly. It is a venture-backed platform, not sovereign infrastructure itself, but once the weights are on your disk, the platform's fate no longer affects you.
Jan
free · open sourceJan is an open-source desktop app that gives you a ChatGPT-style chat interface running entirely on your own computer, with no account and no telemetry. Download a model, start chatting; conversations never leave the machine. It suits people who want the familiar chat experience without a platform watching, and accepts the usual local-model tradeoff: less reasoning power than frontier cloud models.
llama.cpp
free · open sourcellama.cpp is the open-source C++ inference engine underneath much of the local-AI ecosystem, including Ollama and LM Studio. It runs open-weight models efficiently on ordinary CPUs and GPUs and gives you direct control over quantization, context size, and performance. It is a power tool: most people should start with the friendlier layers above it and drop down only when they need that control.
llamafile
free · open sourcellamafile, a Mozilla project, packs a language model and its runtime into a single executable file that runs on macOS, Windows, Linux, and the BSDs with no installation at all. Copy the file to any machine — even one with no internet — and you have working private AI. It is sovereignty by way of portability, better as a demo and utility than a daily driver.
LM Studio
freeLM Studio is a polished desktop app for running open-weight language models locally, with a built-in model browser and one-click downloads from Hugging Face. It is the easiest path for non-developers: no terminal required, and your prompts stay on your machine. The app itself is proprietary, which is a real if modest tradeoff against fully open alternatives like Jan or Ollama.
LocalAI
free · open sourceLocalAI is an open-source, drop-in replacement for the OpenAI API that runs entirely on your own hardware. One self-hosted endpoint serves chat models, embeddings, speech, and image generation, so software written for cloud AI can point at your server instead with no code changes. It is aimed at people running a home server or homelab rather than a single desktop app.
MLX
free · open sourceMLX is Apple's open-source machine-learning framework built specifically for M-series Macs, using unified memory to run open-weight models that would not fit in a typical graphics card's memory. For a Mac owner, it is the native fast path to serious local inference on hardware you already own. It is developer-oriented; non-coders will reach it indirectly through apps like LM Studio.
Ollama
free · open sourceOllama is the simplest way to run open-weight language models on your own computer: one command to install, one to run a model. It exposes an OpenAI-compatible API on localhost, so tools built for cloud AI can work against your own hardware instead. Prompts never leave your machine. Local models trail the frontier in reasoning, so match them to the right jobs.
Open WebUI
free · open sourceOpen WebUI is a self-hosted, ChatGPT-style web interface that sits on top of Ollama or any OpenAI-compatible endpoint. It adds the product layer local AI usually lacks — chat history, multiple users, document chat — while logging nothing to anyone else's servers. You run it yourself, typically in Docker, so it suits people one step past the beginner stage who want local AI to feel finished.
vLLM
free · open sourcevLLM is an open-source serving engine for running language models at high throughput on real GPU hardware, using techniques like PagedAttention and continuous batching to squeeze maximum performance from each card. It is the production tier of self-hosting: the right tool when you are serving a team, an app, or paying customers, and overkill for one person chatting on a laptop.