Replace Alexa: Build a 100% Offline "Jarvis" with Home Assistant & DeepSeek Voice

25 Min Read | 100% Private & Offline

A futuristic smart home living room at night

It's 2026. Are you still letting Amazon and Google listen to your kitchen conversations? Big Tech voice assistants are getting worse—more ads, missed commands, and constant privacy concerns.

It's time to build your own. An offline, private, highly intelligent "Jarvis" that controls your home and actually understands context, powered by Home Assistant and the DeepSeek AI model.

The Architecture: How It Works

Unlike Alexa, which sends your voice to the cloud, everything here happens on your local network. The data never leaves your house.

Mic Input

Wake Word
(openWakeWord)

STT
(Whisper)

The Brain
(DeepSeek API)

TTS
(Piper)

Action/Reply

Prerequisites: You need a running Home Assistant OS server (e.g., on a Raspberry Pi 5 or NUC) and a supported USB microphone or speakerphone connected to it.

1. Prepare the Brain (DeepSeek via Ollama)

Your smart home needs Intelligence. For this, we will use the external DeepSeek server you built in our previous guide. This separates the heavy AI lifting from your Home Assistant server.

Ensure your Ollama server is running and accessible on your local network (e.g., http://192.168.1.50:11434). Test it from your Home Assistant machine's terminal:

                        curl http://192.168.1.50:11434/api/generate -d '{"model": "deepseek-r1:8b", "prompt": "Are you online?", "stream": false}'
                    

If you get a JSON response, your brain is ready.

2. Install Home Assistant Add-ons

In Home Assistant, navigate to Settings > Add-ons > Add-on Store. We need to install three essential local services.

A. The Ears: Whisper (Speech-to-Text)

Install "Whisper". This converts your spoken audio into text. Using the "medium-int8" model is a good balance of speed and accuracy for 2026 hardware.

B. The Voice: Piper (Text-to-Speech)

Install "Piper". This generates the voice reply. It's incredibly fast and sounds surprisingly human. Pick a voice model you like.

C. The Trigger: openWakeWord

Install "openWakeWord". This listens 24/7 for a specific phrase (like "Hey Jarvis") to start recording. It's far more private than cloud options.

3. Connect the Brain to Home Assistant

Now we need to tell Home Assistant to send complex text queries to your DeepSeek server instead of trying to handle them itself.

Go to Settings > Devices & Services > Add Integration.
Search for and install "Ollama".
URL: Enter the IP of your separate AI server (e.g., http://192.168.1.50:11434).
Model: Choose the model you downloaded (e.g., deepseek-r1:8b).
Prompt Template: This is crucial. You need to tell DeepSeek it is a smart home assistant and provide it context about your devices.

                        You are Jarvis, an intelligent and helpful smart home assistant. You control a Home Assistant instance. You are concise. Here is the current state of the home:

                        {{ ha_state }}

4. Build the Voice Pipeline

This is where we glue everything together. Go to Settings > Voice Assistants.

Home Assistant Voice Pipeline Configuration Screen

Create a new Assistant called "Jarvis".
Language: English.
Conversation Agent: Select the Ollama (DeepSeek) integration you just added.
Speech-to-Text: Select Whisper.
Text-to-Speech: Select Piper.
Wake Word: Select openWakeWord and choose your trigger phrase (e.g., "hey_jarvis").

Ensure your microphone device is selected at the top of the Voice Assistants page. Say "Hey Jarvis, turn on the lights." It should respond instantly.

Troubleshooting & FAQ

Jarvis is slow to respond (5+ seconds)

The Fix: Your AI server hardware might be struggling. Try using a smaller quantized model (like a 4-bit quantization instead of 8-bit) on your Ollama server, or ensure your network connection between Home Assistant and the AI server is wired (Ethernet).

Wake word triggers randomly

The Fix: In the openWakeWord add-on configuration, increase the "threshold" value. A higher value means it needs to be more confident that you said the phrase before activating.

Does this really work if the internet goes down?

Yes. Absolutely everything—listening, processing text, thinking, and speaking—happens on your local network hardware. You could unplug your modem and Jarvis would still work.

Can it play music like Alexa?

Yes, but it requires more setup. You need to integrate Music Assistant or Spotify into Home Assistant first. Then DeepSeek can understand commands like "Play jazz in the kitchen" by calling those services.