LocalAI
free · open sourceLocalAI is an open-source, drop-in replacement for the OpenAI API that runs entirely on your own hardware. One self-hosted endpoint serves chat models, embeddings, speech, and image generation, so software written for cloud AI can point at your server instead with no code changes. It is aimed at people running a home server or homelab rather than a single desktop app.
Trust shape
Trustless
Runs on your hardware with no third party in the loop — nobody can stop it, nobody can take it.
Facts
- Website: localai.io
- Source: github.com/mudler/LocalAI
- Platforms: linux, macos, server
- Self-hostable: yes
- Last updated: 2026-06-12
Editor's note
A drop-in OpenAI-compatible API on your own hardware — chat, embeddings, speech, and image gen from one endpoint.
Climbing the ladder?
This atlas tells you what exists. If you want the how — building with AI on infrastructure you control — that's what AI Captains Academy teaches, fellow builder to fellow builder.
AI Captains Academy →Build or maintain LocalAI? Claim this listing to keep its facts current.
Related in Local AI
AUTOMATIC1111
AUTOMATIC1111's Stable Diffusion web UI is the long-standing default interface for generating images locally with Stable Diffusion models. It runs in your browser against your own GPU, with a huge extension ecosystem and simpler controls than node-based tools like ComfyUI. Your prompts and images never leave your machine. Development has slowed, but it remains the easiest local image-generation on-ramp.
ComfyUI
ComfyUI is an open-source, node-based editor for running image-generation models like Stable Diffusion and Flux on your own GPU. You wire models, prompts, and processing steps into reusable workflows, which makes it the professional's tool for local image work. Prompts and outputs never leave your machine. The node interface has a real learning curve compared to simpler front ends.
Hugging Face
Hugging Face is the central registry for open-weight AI models — the place where models you can download and run on your own hardware are actually published. Every local tool on this list pulls from it directly or indirectly. It is a venture-backed platform, not sovereign infrastructure itself, but once the weights are on your disk, the platform's fate no longer affects you.
llama.cpp
llama.cpp is the open-source C++ inference engine underneath much of the local-AI ecosystem, including Ollama and LM Studio. It runs open-weight models efficiently on ordinary CPUs and GPUs and gives you direct control over quantization, context size, and performance. It is a power tool: most people should start with the friendlier layers above it and drop down only when they need that control.