Topic

AI & LLMs

Models you can run on your own hardware, prompt patterns that ship, agent frameworks that don't catch fire, and the awkward questions nobody answers in the breathless launch posts. Ollama, vLLM, llama.cpp, LocalAI, plus the quieter stuff — embeddings, RAG, evals, and figuring out when the cloud API is actually the right answer. If you'd rather understand the trade-offs than chase benchmarks, you'll feel at home here.

80 articles in this topic.

Featured posts

Local Coding Agents Need Less Context

Local coding agents don't fail because your 27B model is too small. They fail because you let 200K tokens of garbage pile up in the context window. Cap it low.

31 Jul, 2026
15 min read

LM Studio vs Jan vs GPT4All: Desktop LLM Clients

LM Studio, Jan, and GPT4All compared: the best desktop LLM clients for local AI chat on Mac, Windows, and Linux in 2026.

30 Jul, 2026
10 min read

RAGAS: Evaluating RAG Without Vibes

Stop guessing if your RAG pipeline works. RAGAS gives you reproducible metrics: faithfulness, answer relevance, context precision and recall.

28 Jul, 2026
9 min read

KV Cache Quantization: Free LLM Context, Almost

At long context, it's the KV cache that eats your VRAM, not the weights. Q8/Q4 KV quantization in llama.cpp and vLLM cuts it 2-4x with almost no quality hit.

25 Jul, 2026
9 min read

Aider & Cline: Terminal AI Coding That Actually Ships

Aider vs Cline: two agentic AI coding tools that go beyond autocomplete. Which terminal AI coding agent ships cleaner work?

22 Jul, 2026
9 min read

Mixture of Experts (MoE) for Self-Hosters, Demystified

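MoE LLMs like Mixtral and DeepSeek-V3 reach big-dense-model quality while activating only a fraction of their weights per token (about 13B of 47B for Mixtral 8x7B, 37B of 671B for DeepSeek-V3). Here's how sparse activation works and how to run it at home.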

19 Jul, 2026
10 min read

All AI & LLMs articles