CUDA vs ROCm vs CPU: Running AI on Whatever GPU You've Got
CUDA vs ROCm for AI on Linux: NVIDIA's easy path, AMD's emotional journey, and why CPU inference isn't dead yet. Real Docker setups included.
All the articles with the tag "machine learning".
LLaMA, Mistral, Falcon, GPT — the LLM landscape is crowded. Compare model families, sizes, licensing, and what each is actually good for.
Temperature, top-p, top-k, context length — LLM inference parameters explained so you stop guessing why the model gives weird output.
Confused by AI agent frameworks? Compare LangGraph, CrewAI, and AutoGen with real Python examples, a no-nonsense breakdown, and zero hype. Pick the right one.
GGUF, GGML, AWQ, GPTQ — LLM file formats and quantization levels explained: trade-offs between model quality, size, and inference speed.
Piper vs Coqui TTS compared: speed, voice quality, Docker setup, and Home Assistant integration. Run offline neural TTS on your own hardware, no cloud fees.
How tiny 7B and 8B models keep punching above their weight — knowledge distillation, the teacher-student trick that makes local AI actually usable on home hardware.
Self-supervised learning is the technique behind GPT, BERT, and modern LLMs. Learn how models teach themselves from unlabeled data.
RAG is the default answer for giving LLMs access to documents. But chunking, embedding, and retrieval introduce failure modes that a virtual filesystem sidesteps entirely.
Google's Gemma 4 is the best open model they've shipped yet. Here's how to pull it, run it, and actually use it for real work with Ollama on your own hardware.
1-bit models store weights as -1, 0, or 1. That sounds insane until you see them run a 100B-parameter model on a laptop CPU. Here's what's actually happening.
AMD finally has a fast, open-source local LLM server that uses both GPU and NPU. If you've been jealous of NVIDIA users, Lemonade is worth your time.