One of my 2026 goals was to build a smart home system that integrates an LLM for controlling IoT devices. Imagine saying "turn off all lights downstairs" and having it just work, in plain natural language.
I wrote that in January. By March, it was running.
This is the story of how I got there — the decisions, the dead ends, the moments where it actually worked, and what I'd do differently.
The Starting Point: Why Not Just Use ChatGPT?
The obvious question. Why go through all this trouble when ChatGPT exists?
Three reasons:
Privacy. Every command you send to a cloud AI is logged somewhere. "Turn off the bedroom fan" seems harmless. But over time, those logs paint a detailed picture of your life — when you sleep, when you leave home, what devices you own. I didn't want that.
Cost. API calls add up. A smart home that queries an LLM dozens of times a day would rack up a real bill.
Independence. If OpenAI changes their pricing, their API, or their terms — my smart home breaks. I wanted something I owned completely.
So the goal was clear: run everything locally. No cloud, no subscriptions, no data leaving my home.
The Hardware Reality Check
My first instinct was to rent a VPS. I looked into DigitalOcean's GPU droplets, which start around $0.76 per GPU per hour for on-demand instances. Running one 24/7 works out to over US$500 a month, which is well past RM2,000. Far too expensive for a home project.
Then I realised: I already had the hardware.
A GMK NucBox M6 sitting on my desk. AMD Ryzen 5 6640H, 16GB DDR5 RAM. Not a powerhouse, but enough for what I needed.
The honest truth about running LLMs on CPU: it's slow. Phi-4 14B takes about 20 to 30 seconds to generate a response on my hardware. That's not ideal, but it's workable — especially for a smart home that isn't handling thousands of requests per day.
I considered adding an eGPU via the USB4 port. I looked at AMD RX 6700 XT configurations around RM1200-1600. But the Linux + AMD USB4 + eGPU combination has mixed community support, and this setup is just for apartment testing anyway.
Decision: accept the slow response time now. Build it right at the family house later with a proper desktop tower and PCIe GPU.
Sometimes the right engineering decision is knowing when good enough is actually good enough.
The Stack
Here's what I ended up running, all on one mini PC:
Ollama — runs the LLM models
Open WebUI — ChatGPT-like browser interface
Home Assistant — smart home hub, device control
Whisper — speech to text (installed, WIP)
Piper — text to speech (installed, WIP)
Tailscale — secure remote access, no public IP needed
All containerised with Docker, except Ollama which runs as a native service. Everything accessible from anywhere via Tailscale.
Total monthly cost: electricity. Roughly RM7-10 based on Malaysian tariffs at ~15-20W average draw.
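For anyone wiring up something similar, here's a minimal docker-compose sketch of the two containers I lean on most, Open WebUI and Home Assistant. The image tags are the upstream defaults; the timezone, volume paths, and service layout are assumptions to adjust for your own machine, and Ollama stays out of the file because it runs natively.

```yaml
# docker-compose.yml (sketch, not my exact file)
# Assumes Ollama is already running natively on the host at port 11434.
services:
  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    network_mode: host                       # lets the container reach Ollama on 127.0.0.1
    environment:
      - OLLAMA_BASE_URL=http://127.0.0.1:11434
    volumes:
      - open-webui:/app/backend/data
    restart: always

  homeassistant:
    image: ghcr.io/home-assistant/home-assistant:stable
    network_mode: host                       # needed for LAN device discovery
    environment:
      - TZ=Asia/Kuala_Lumpur                 # assumption: set your own timezone
    volumes:
      - ./homeassistant:/config              # assumption: config folder next to the compose file
    restart: unless-stopped

volumes:
  open-webui:
```

The open-webui service here is the same container as the docker run command in Stage 2 below; compose just makes it easier to bring everything back up after a reboot.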
Building It in Stages
I didn't try to build everything at once. That's how projects die — you get overwhelmed, nothing works, you give up.
Instead, I broke it into six stages with a clear "win" at each step.
Stage 1: Just Get It Running
Install Ollama. Pull a model. Chat with it in the terminal.
```bash
curl -fsSL https://ollama.com/install.sh | sh
ollama pull mistral
ollama run mistral
```
That's it. Three commands and I had a working local LLM. The goal wasn't impressive — it was just to prove the foundation worked before building anything on top of it.
Stage 2: Make It Accessible
Installed Open WebUI via Docker. Now I had a proper browser interface accessible from any device on my network — phone, laptop, anything.
Hit a snag immediately: Open WebUI couldn't find Ollama. The fix was explicitly setting the Ollama URL when creating the container:
```bash
docker run -d \
  --network=host \
  -e OLLAMA_BASE_URL=http://127.0.0.1:11434 \
  -v open-webui:/app/backend/data \
  --name open-webui \
  --restart always \
  ghcr.io/open-webui/open-webui:main
```
Lesson learned: don't assume services can find each other. Be explicit.
With Tailscale, the whole setup became accessible from anywhere. Not just my home network — anywhere in the world, securely, without a public IP or port forwarding.
Stage 3: Give It a Personality
A generic LLM is useful. A personalised one is actually yours.
I wrote a system prompt that told the LLM exactly who it was talking to — a family home in Malaysia, smart devices currently installed, server specs, future plans. Now instead of generic smart home advice, it gives advice relevant to my setup.
I also tested three models: Mistral 7B, Llama 3.2, and Phi-4 14B. Mistral and Llama were faster but chattier. Phi-4 was slower but more concise and accurate.
For a smart home assistant, accurate beats fast. Phi-4 became my default for Open WebUI.
Stage 4: Give It Memory
This is where it stopped feeling like a demo and started feeling like a real tool.
I wrote a markdown document describing my entire home setup — devices, IPs, room layout, future plans — and uploaded it to Open WebUI's knowledge base. Now when I ask "what smart devices do I have?" it answers based on my actual setup, not generic training data.
```markdown
### Smart Plug — TP-Link Tapo P304M (4-way power bar)
- IP: 192.168.100.74
- Slot 1: CCTV
- Slot 2: Blue Laptop
- Slot 3: Adapter / Mini PC
- Slot 4: Fan
```
One habit I'm building: update this document every time something changes. It's the single source of truth for my smart home.
Stage 5: Actually Control Things
Home Assistant was the bridge between the LLM and my physical devices. Install it, connect the TP-Link Tapo integration, and suddenly my smart plug and camera showed up as controllable entities.
I hit two interesting problems here.
Problem 1: Phi-4 doesn't support tool calling, which is what Home Assistant needs to control devices. The error was clear:
```
phi4:latest does not support tools (status code: 400)
```
Solution: use Llama 3.1:8b for Home Assistant's conversation agent, keep Phi-4 for Open WebUI chat. Different models for different jobs.
Problem 2: Even with the right model, natural language commands were unreliable. The LLM would say "Done!" but nothing happened physically.
The fix that actually worked: stop relying on the LLM for device control entirely. Use Home Assistant's built-in sentence triggers instead — explicit trigger phrases mapped directly to actions. Fast, reliable, zero LLM overhead for simple commands.
"Turn on the fan" → switch.turn_on → switch.fan ✅
The LLM handles conversation and questions. Home Assistant handles commands. Clean separation of concerns.
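For reference, a sentence trigger is just an automation with a conversation trigger. Here's a rough sketch of the fan one as it would look in automations.yaml; the entity ID matches my fan plug, while the alias and the extra phrasing are illustrative.

```yaml
# automations.yaml (sketch): explicit phrase mapped straight to a service call
- alias: "Voice: turn on the fan"
  trigger:
    - platform: conversation
      command:
        - "turn on the fan"
        - "switch on the fan"
  action:
    - service: switch.turn_on
      target:
        entity_id: switch.fan
```

Because the phrase maps directly to a service call, there's no LLM in the loop, which is exactly why it's reliable.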
Stage 6: Automate and Notify
With device control working, I added the layer that makes it actually smart: automations.
Time-based:
- Fan off at 23:00
- CCTV on at 07:00, off at 23:00
Location-based (using phone GPS via Companion App):
- Fan off when I leave home
- Warning notification if fan turns on while I'm away
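The location rules hinge on a zone trigger fed by the Companion App's device tracker. A sketch of the "fan off when I leave home" rule, where person.me is a placeholder for your own person entity:

```yaml
# Sketch: turn the fan off when my phone leaves the home zone
- alias: "Fan off when I leave home"
  trigger:
    - platform: zone
      entity_id: person.me        # placeholder: use your own person entity
      zone: zone.home
      event: leave
  action:
    - service: switch.turn_off
      target:
        entity_id: switch.fan
```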
Notifications via Telegram — every automation sends me a message when it triggers. I can sit at the office and know exactly what's happening at home.
One note on Telegram setup: the YAML configuration method is deprecated in current Home Assistant versions. Set it up through the UI integrations page instead — I wasted time on this one.
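Automations themselves are still perfectly happy in YAML; it's only the Telegram integration setup that moved to the UI. A sketch of the nightly fan shutdown with its Telegram ping, where notify.telegram_home is a placeholder for whatever notify service your integration creates:

```yaml
# Sketch: nightly shutdown plus a Telegram confirmation
- alias: "Fan off at 23:00"
  trigger:
    - platform: time
      at: "23:00:00"
  action:
    - service: switch.turn_off
      target:
        entity_id: switch.fan
    - service: notify.telegram_home   # placeholder: name depends on your integration
      data:
        message: "23:00: fan switched off for the night."
```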
What Broke Along the Way
Honest list, because the smooth version of this story isn't the real one:
Tapo camera authentication failed — cameras use a local device password, not your Tapo account credentials. Got locked out for 30 minutes from too many attempts. Read the docs before you try things repeatedly.
Phi-4 tool calling — spent time debugging before realising it's a model limitation, not a config issue. The error message was clear, I just didn't read it carefully enough.
Bluetooth errors in Home Assistant logs — alarming-looking errors that turned out to be completely harmless. Docker containers don't have Bluetooth permissions by default, and nothing smart-home-related is affected.
YAML vs UI configuration — Home Assistant is moving integrations from YAML to UI setup. What worked in tutorials from 2024 doesn't always work now.
Model Comparison: What I Actually Found
After testing several models on my hardware:
| Model | Speed | Quality | Tool Support | Best For |
|---|---|---|---|---|
| Phi-4 14B | Slow (~20s) | Best | ❌ | Smart questions, knowledge base |
| Llama 3.1 8B | Medium | Good | ✅ | Home Assistant device control |
| Mistral 7B | Fast | Good | ✅ | Quick chat, fallback |
| Llama 3.2 3B | Fastest | Okay | ✅ | Simple commands only |
The right model depends on the job. I stopped looking for one model to rule them all.
The Architecture That Actually Works
After all the iterations, here's what's running:
```
You (anywhere via Tailscale)
│
├── Open WebUI → Phi-4 14B
│   ├── Knowledge base (home-setup.md)
│   └── Smart questions, advice, context
│
└── Home Assistant → Llama 3.1 8B
    ├── Sentence triggers (reliable device control)
    ├── Automations (time + location based)
    ├── Telegram notifications
    │
    ├── CCTV plug ✅
    ├── Laptop plug ✅
    ├── Mini PC plug ✅
    └── Fan plug ✅
```
What's Next
This apartment setup is a testing ground, not the final product. When I move on to the family house, the upgrade list is clear:
Hardware:
- Proper desktop tower with PCIe GPU (RTX 3060 12GB is the target)
- USB microphone + speaker for voice control
- More devices: smart lights, door locks, temperature sensors
- Activate the TP-Link Deco X50 mesh properly
Software:
- Voice control pipeline (Whisper + Piper + mic) — installed but incomplete
- Home Assistant OS instead of Container for full add-on support
- Expanded knowledge base per room
- More sophisticated automations
The goal that started this: saying "turn off all lights downstairs" and having it work. I'm not there yet. But I can say "turn off the fan" and it works — reliably, locally, privately.
That's real progress.
What I'd Tell Myself at the Start
Don't start with the final vision. Start with "does Ollama run?" and build from there. Every stage taught me something that made the next stage easier.
Slow response times are fine for testing. 20 to 30 seconds feels painful at first. Then you stop noticing it. The functionality matters more than the speed at this stage.
Separate your concerns. LLM for intelligence, Home Assistant for reliability. Trying to make the LLM do everything leads to frustration.
Document as you go. The home-setup.md knowledge base started as a necessity and became genuinely useful. When something breaks at 11pm and you can't remember your device IPs, you'll be glad you wrote it down.
This is infrastructure, not a weekend project. It took multiple sessions, several debugging rabbit holes, and a lot of "let me check the logs again." That's normal. It's supposed to take time.
Final Thoughts
I have a private AI assistant running on hardware I own, accessible from anywhere, controlling physical devices in my home, with zero ongoing subscription cost and zero data leaving my network.
That sentence would have seemed ambitious in January. Now it's just Tuesday.
The smart home LLM goal from my 2026 resolutions isn't fully complete — voice control is still pending, the family house deployment hasn't started, and there are automations I haven't built yet. But the foundation is solid.
And unlike a lot of my 2025 projects, this one actually works.
Building something similar or have questions about the setup? Reach out at [email protected] or connect on LinkedIn. Always happy to talk infrastructure.