Home Assistant Voice: Wyoming vs Rhasspy

You Kicked Out Alexa. Now What?

You did it. You deregistered the Echo, unplugged the Google Home, gave them both a little speech about privacy, and dropped them in the donation bin. Felt great. Very principled.

Then your spouse tried to turn off the bedroom lights at midnight and had to fumble around in the dark for the physical switch like some kind of animal from the before-times.

This is the price of self-hosting voice assistants. It works — really well, actually — but you have to build it yourself and understand what you’re building. The two serious options in the Home Assistant ecosystem are Rhasspy (the OG community solution) and the Wyoming protocol stack (Nabu Casa’s modular successor that became mainstream around 2025). They’re not the same thing, they’re not interchangeable, and picking the wrong one for your situation is going to cost you a weekend.

Let’s sort this out.

The Brief History (Skip If You Don’t Care)

Rhasspy showed up around 2019 as the answer to “how do I do offline voice in Home Assistant without Alexa?” It was glorious for its time: a monolithic app with a web GUI, built-in support for a dozen STT engines, intent handling, wake words, the whole pipeline in one container. The community loved it.

The problem is that “one container with everything” is both its strength and its ceiling. Rhasspy 2.5 is stable but maintenance has slowed significantly. The Rhasspy 3 branch — which was supposed to modularize the whole thing — stalled in development. It exists, but it’s not ready for production use and there’s no clear ETA.

Meanwhile, in 2023, Nabu Casa (the company behind Home Assistant) announced the Wyoming protocol: a lightweight, streaming audio protocol for connecting modular voice services. Instead of one monolithic app, you get separate services for each piece of the pipeline:

faster-whisper: STT (speech-to-text), runs Whisper models locally
Piper: TTS (text-to-speech), fast neural voices that sound good
openWakeWord: wake word detection, runs before any STT happens
Wyoming Satellite: the “ears and mouth” component that runs on a Linux device in the room (a Pi Zero 2 W, for example); ESP32 boards such as the ESP32-S3 fill the same role with ESPHome firmware instead

Each service runs in its own container and speaks the Wyoming protocol, and Home Assistant orchestrates them through the Wyoming integration. As of 2025, this is the stack Nabu Casa is actively developing and shipping hardware for.

Hardware Reality Check

Before we dig into config, let’s be honest about what you’re running this on, because model choice and expected performance vary wildly.

Hardware	Recommended STT Model	Expected Latency
Raspberry Pi 4 (4GB)	faster-whisper tiny	2-4s (rough)
Pi 4 + Coral USB TPU	faster-whisper small	~1s
Intel N100 mini-PC	faster-whisper medium	<1s
Older x86 with 8GB RAM	faster-whisper medium	1-2s

Tiny on a Pi 4 without acceleration is frustrating. It transcribes correctly most of the time but the pause between “Hey Jarvis” and anything happening is long enough that your family will go back to light switches. If you’re on bare Pi 4 hardware, either add a Coral USB accelerator or accept that this is a personal project, not a household rollout.

N100 mini-PCs (Beelink EQ12, GMKtec M5 Plus, etc.) have become the homelab sweet spot. They’re around $150-180, run 24/7 on ~10W, and handle faster-whisper medium without breaking a sweat. Medium is where it actually feels responsive.

Wyoming Stack: The Self-Hosted Setup

Here’s a practical Docker Compose file that puts the core Wyoming services on a Pi or mini-PC. This assumes you’re running Home Assistant separately (HA OS, HA Container, whatever).

services:
  wyoming-whisper:
    image: rhasspy/wyoming-whisper:latest
    container_name: wyoming-whisper
    restart: unless-stopped
    volumes:
      - ./whisper-data:/data
    ports:
      - "10300:10300"
    command: >
      --uri tcp://0.0.0.0:10300
      --model medium-int8
      --language en
      --device cpu
    environment:
      - TZ=America/New_York

  wyoming-piper:
    image: rhasspy/wyoming-piper:latest
    container_name: wyoming-piper
    restart: unless-stopped
    volumes:
      - ./piper-data:/data
    ports:
      - "10200:10200"
    command: >
      --uri tcp://0.0.0.0:10200
      --voice en_US-lessac-medium
    environment:
      - TZ=America/New_York

  wyoming-openwakeword:
    image: rhasspy/wyoming-openwakeword:latest
    container_name: wyoming-openwakeword
    restart: unless-stopped
    ports:
      - "10400:10400"
    command: >
      --uri tcp://0.0.0.0:10400
      --preload-model ok_nabu
    environment:
      - TZ=America/New_York

A few things worth knowing here:

Model variants for faster-whisper: tiny, tiny-int8, base, base-int8, small, small-int8, medium, medium-int8. The -int8 quantized versions use roughly half the memory with minimal accuracy loss. Start with medium-int8 if your hardware can handle it.

Piper voices: Lessac medium is a solid default. There are dozens of voices at huggingface.co/rhasspy/piper-voices. If you want your HA to sound less like a GPS and more like a person, spend 10 minutes browsing them.

openWakeWord models: ok_nabu, hey_jarvis, alexa, hey_mycroft. You can load multiple with repeated --preload-model flags. They all run simultaneously and it’s cheap on CPU.
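All three of those knobs live in each service’s command line. As a rough sketch, here is how the commands might look on a bare Pi 4: a lighter Whisper model, a different Piper voice, and two wake words loaded at once. The tiny-int8 and en_GB-alan-medium picks are illustrative assumptions, not benchmarked recommendations; everything else mirrors the compose file above.

```yaml
# Fragment: only the command lines change.
# Image, ports, volumes, and restart policy stay as in the full compose file.
services:
  wyoming-whisper:
    command: >
      --uri tcp://0.0.0.0:10300
      --model tiny-int8
      --language en
      --device cpu

  wyoming-piper:
    command: >
      --uri tcp://0.0.0.0:10200
      --voice en_GB-alan-medium

  wyoming-openwakeword:
    # Repeating --preload-model loads several wake words side by side
    command: >
      --uri tcp://0.0.0.0:10400
      --preload-model ok_nabu
      --preload-model hey_jarvis
```

After changing a command, recreate the affected container so the new model or voice is downloaded into its data volume on first start.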

Adding Wyoming to Home Assistant

In HA: Settings → Devices & Services → Add Integration → Wyoming Protocol

Add it three times — once pointing at each service (whisper on 10300, piper on 10200, openWakeWord on 10400). Then go to Settings → Voice Assistants, create a pipeline, and wire them together. It takes about five minutes and it just works.

The Voice PE Puck and M5Stack Atom Echo

If you want actual always-on voice endpoints in rooms (not just talking to a tablet), you have two reasonable cheap options.

Home Assistant Voice Preview Edition

This is Nabu Casa’s official hardware: an ESP32-S3-based puck with a multi-mic array and speaker. It runs ESPHome firmware, detects the wake word on the device itself, connects to your HA instance over WiFi, and uses your self-hosted Wyoming services for STT/TTS. Everything stays local.

Flash it with ESPHome (the add-on handles this automatically), and in HA it shows up as a satellite device. Pick which voice pipeline it uses, done. It’s the path of least resistance.

M5Stack Atom Echo ($20)

If you don’t want to wait for stock on the Voice PE or you want to scatter several of these around cheaply, the Atom Echo is the classic choice. It’s a tiny ESP32 brick with a built-in microphone and speaker. Not great for noisy rooms, but fine for a bedroom or office.

Flash it with ESPHome’s voice assistant firmware. There’s a community-maintained config:

esphome:
  name: atom-echo-bedroom
  friendly_name: Bedroom Echo

esp32:
  board: m5stack-atom

wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password

api:
  encryption:
    key: !secret api_encryption_key

ota:
  platform: esphome
  password: !secret ota_password

i2s_audio:
  # One I2S bus, shared by the microphone and the speaker
  - id: i2s_bus
    i2s_lrclk_pin: GPIO33
    i2s_bclk_pin: GPIO19

microphone:
  - platform: i2s_audio
    i2s_audio_id: i2s_bus
    id: mic
    adc_type: external   # the Atom Echo has an external PDM microphone
    i2s_din_pin: GPIO23
    pdm: true

speaker:
  - platform: i2s_audio
    i2s_audio_id: i2s_bus
    id: spk
    dac_type: external
    i2s_dout_pin: GPIO22

voice_assistant:
  microphone: mic
  speaker: spk
  noise_suppression_level: 2
  auto_gain: 31dBFS
  volume_multiplier: 2.0
  use_wake_word: true   # the wake word itself is chosen in the HA voice pipeline
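  # Set all three channels each time so colors don't blend with the previous state
  on_wake_word_detected:
    - light.turn_on:
        id: led
        red: 0%
        green: 0%
        blue: 100%
  on_listening:
    - light.turn_on:
        id: led
        red: 0%
        green: 0%
        blue: 50%
  on_tts_start:
    - light.turn_on:
        id: led
        red: 0%
        green: 100%
        blue: 0%
  on_end:
    - light.turn_off:
        id: led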

light:
  - platform: neopixelbus
    id: led
    type: GRB
    pin: GPIO27
    num_leds: 1
    name: LED

In this setup the Atom Echo streams microphone audio to Home Assistant over your LAN, and the openWakeWord service from the compose file listens for the wake word. Only after it triggers does the audio go to Whisper for transcription. The Voice PE goes one step further and detects the wake word on the device itself. Either way, the privacy model holds: wake word, STT, TTS, and intent handling all run on hardware you control.
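If you would rather keep audio on the satellite until the wake word fires, ESPHome has an on-device engine configured through the micro_wake_word component. The fragment below is a minimal sketch under a few assumptions: a recent ESPHome release, the ESP-IDF framework, and an ESP32-S3 board (the esp32-s3-devkitc-1 board name is a stand-in, not the Atom Echo). It would replace server-side wake word detection in voice_assistant rather than sit alongside it, and an LED platform that depends on the Arduino framework would need swapping out.

```yaml
# Sketch only: on-device wake word detection with microWakeWord.
esp32:
  board: esp32-s3-devkitc-1   # assumed S3 board for illustration
  framework:
    type: esp-idf

micro_wake_word:
  models:
    - model: okay_nabu
  on_wake_word_detected:
    # Start the Assist pipeline only after the wake word fires on the device
    - voice_assistant.start:
        wake_word: !lambda return wake_word;
```

The trade-off is that the wake word model has to fit on the microcontroller, so the choice of wake words is narrower than what openWakeWord offers on the server.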

LLM Integration: Talking to Ollama

Out of the box, Wyoming + HA handles structured commands (“turn off the kitchen lights”, “set thermostat to 70”) through the built-in Conversation agent. For that, it’s excellent.

But if you want free-form conversation or more complex reasoning, you can replace the Conversation agent backend with a local LLM. The relevant integrations:

Home Assistant built-in: Settings → Voice Assistants → your pipeline → Conversation Agent. You can swap the default “Home Assistant” agent for any LLM integration.

Ollama integration: There’s a community integration called hass-ollama-conversation (available through HACS) that connects HA’s Conversation agent to a local Ollama instance. It sends your transcribed query to Ollama, gets a response, and passes it back to Piper for TTS.

Extended OpenAI Conversation: Another HACS integration that works with any OpenAI-compatible API — which includes Ollama, whose OpenAI-compatible API is available automatically at http://localhost:11434/v1/.

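Current versions of these integrations are set up through the HA UI rather than configuration.yaml, so read the following as a checklist of the values you’ll enter (host, model, prompt), not a block to paste: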

# After installing via HACS
ollama:
  host: http://192.168.1.50:11434
  model: llama3.2:3b
  prompt: >
    You are a smart home assistant. Answer concisely in 1-2 sentences.
    Control devices when asked. The current time is {{ now() }}.

Llama 3.2 3B is a good choice here — it’s fast enough on modest hardware, understands home automation context, and doesn’t require a GPU. The 7B and 8B models give better reasoning but add noticeable latency to what should feel like a snappy voice interaction.

Honest take: the LLM path adds latency and complexity. For “turn off lights” it’s overkill. For “is the front door locked and did I leave any windows open?” it’s actually useful because it can synthesize state from multiple entities rather than requiring you to ask three separate questions.

Rhasspy: When It’s Still the Right Call

Let’s be fair to Rhasspy, because “use Wyoming for everything new” doesn’t mean Rhasspy is dead for everyone.

Keep Rhasspy if:

You’ve got a working Rhasspy 2.5 setup with custom intents and slot programs that you’ve tuned for years. The migration pain is real and Wyoming’s intent handling is simpler by design.
You need the Rhasspy GUI intent builder. Wyoming assumes you’re comfortable with YAML and HA automations; Rhasspy had a proper web UI for building intents visually.
You’re running on hardware where Wyoming’s modular approach (multiple containers, multiple ports) adds complexity you don’t want.
Your STT requirements are unusual — Rhasspy 2.5 supports Kaldi, DeepSpeech, and other backends that Wyoming doesn’t.
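For a sense of what the YAML route looks like when you move a custom Rhasspy intent over, here is a minimal sketch of a Home Assistant custom sentence paired with an intent_script response. The intent name CheckFrontDoor, the file name, and the lock.front_door entity are all hypothetical; substitute your own.

```yaml
# File: config/custom_sentences/en/front_door.yaml (file name is arbitrary)
language: "en"
intents:
  CheckFrontDoor:            # hypothetical intent name
    data:
      - sentences:
          - "is the front door (locked|secure)"
          - "check the front door"
---
# File: configuration.yaml
intent_script:
  CheckFrontDoor:
    speech:
      # Spoken back through Piper; reads the lock's current state
      text: "The front door is {{ states('lock.front_door') }}."
```

It covers the simple cases well, but there is no visual builder and nothing like Rhasspy’s slot programs, which is the gap the list above is pointing at.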

Use Wyoming for:

Anything new. There’s no reason to start a fresh deployment on Rhasspy when Wyoming is what’s being actively developed.
Hardware integration. The Voice PE puck, the Atom Echo, and other ESPHome-based voice hardware plug into the same HA voice pipeline that your Wyoming services power.
Modular upgrades. Want to swap your wake word engine without touching STT? Wyoming’s architecture makes that a one-line change.
Long-term viability. Rhasspy 2.5 maintenance has slowed and Rhasspy 3 is stalled. Wyoming is where Nabu Casa is putting resources.

If you’re on Rhasspy 2.5 and it’s working, the migration isn’t urgent. But plan for it eventually — especially if you want to use the newer HA voice hardware.

Troubleshooting the Common Pain Points

Wake word false positives: openWakeWord’s sensitivity can be tuned. In the wyoming-openwakeword command, raise --threshold above its default of 0.5 (for example, --threshold 0.7). Higher = less sensitive = fewer false triggers.
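As a sketch of where that flag goes, the fragment below tightens the openWakeWord service from the compose file. The 0.7 and 2 values are illustrative starting points, not tested settings; --trigger-level is, as far as I know, the number of activations required before a detection is reported, so confirm both flags against the image’s --help output.

```yaml
# Fragment: only the command changes; the rest of the service stays as is.
services:
  wyoming-openwakeword:
    command: >
      --uri tcp://0.0.0.0:10400
      --preload-model ok_nabu
      --threshold 0.7
      --trigger-level 2
```

Change one value at a time and live with it for a day. If the assistant starts ignoring you, the threshold has gone too far.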

STT accuracy drops for specific words: faster-whisper runs Whisper models, which are trained on general speech. If your entity names are unusual (“turn on Zephyrus” for a PC), add aliases in HA for the spellings Whisper actually produces, or retrain with custom vocabulary if you’re ambitious. The simpler fix: rename the entity to something Whisper handles naturally.

Audio echo/feedback on the Atom Echo: The noise_suppression_level: 2 in the ESPHome config helps. Also make sure the speaker isn’t blasting, or the Atom Echo mic will pick up the HA response and try to transcribe it. Note that volume_multiplier scales the microphone signal rather than the speaker, so lowering it makes the mic less sensitive to that bleed.
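A hedged starting point for that tuning, reusing the mic and spk IDs from the Atom Echo config above. The specific values are guesses to iterate from, not measured optima.

```yaml
# Fragment: audio tuning options on the voice_assistant component.
voice_assistant:
  microphone: mic
  speaker: spk
  noise_suppression_level: 3   # 0 (off) to 4 (most aggressive)
  auto_gain: 15dBFS            # 0dBFS to 31dBFS
  volume_multiplier: 1.0       # scales the microphone signal sent to HA
```

Lower auto_gain and volume_multiplier first. Pushing noise suppression too hard can start to hurt transcription accuracy.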

Wyoming services not connecting: Check that your firewall isn’t blocking the Wyoming ports (10200, 10300, 10400) between the server running the containers and your HA instance. These aren’t HTTP — HA opens a persistent TCP connection to each service.

Piper voice sounds robotic: You’re using a low-quality voice model. Switch from en_US-lessac-low to en_US-lessac-medium or en_US-lessac-high. The high-quality models are around 60MB and noticeably better.

The Bottom Line

Wyoming is the future of local voice in Home Assistant. It’s modular, actively developed, and the official HA voice hardware is built around the same pipeline. If you’re starting fresh, use Wyoming, full stop.

Rhasspy isn’t dead, but it’s living in maintenance mode. If you have a working Rhasspy 2.5 deployment you like, no one’s forcing you to migrate today. But if you’re spinning up voice for the first time or adding new satellite devices, Wyoming is the clear choice.

On hardware: don’t underestimate the model size question. A Pi 4 with faster-whisper tiny will frustrate you and your family. Get an N100 box, add a Coral USB to your Pi, or accept that voice assistant latency is going to be a recurring topic at dinner. The $150 mini-PC investment genuinely transforms the experience from “cool experiment” to “thing the household actually uses.”

Your spouse wants to turn off the bedroom lights by voice. Wyoming on an N100 with a couple of Atom Echoes and medium-int8 can actually deliver that — no cloud, no Alexa, no data leaving your network. It just takes an afternoon to set up instead of 30 seconds with a commercial product.

That’s the trade you made when you donated those Echos. It was the right call.
