
Making TACOS: Ted's AI Chatbot & Obsidian Sync
From Obsidian sync to a full RAG-powered chatbot - the journey of building my own custom backend for my portfolio site.
If you've been following my blog, you know I started with a simple Next.js portfolio, then moved to Obsidian for content management. Well, I've taken it even further - I built my own AI backend called TACOS: Ted's AI Chatbot & Obsidian Sync! (funny acronym I know... 😂)
This is how I turned my self-hosted setup into a RAG-powered chatbot that actually knows what I do.
The Problem: Beyond Basic Sync
My previous post covered migrating from git commits to Obsidian + CouchDB. It was cleaner, but my site's AI chatbot still felt basic. It could answer simple questions but didn't really feel like an actual Ted Support.
The old setup used:
- AstraDB for vectors
- Upstash for caching
- OpenAI for everything AI
Every content update meant rebuilding the site, then deleting and regenerating the embeddings.
The Vision: Everything Mine
I wanted something that felt truly mine, where:
- Content updates instantly
- Everything runs on my homelab
- The AI understands ME
- I control the whole content pipeline
So I started making TACOS. 🌮
Building the Backend
Stack Choices
I went with FastAPI again: I'd used it before in my TradingView Telegram bot and wanted to get better at it. The auto-docs and async support are perfect for chat.
For databases:
- CouchDB: Already set up for Obsidian sync
- Postgres + pgvector: For vector search (bye AstraDB)
- No external caching: I'd handle it or just deal with latency
The Cool Part: CouchDB _changes
The best discovery was CouchDB's _changes feed. Instead of polling, I set up a continuous connection that streams changes in real-time.
When I edit a note in Obsidian, LiveSync pushes to CouchDB, which triggers an event. My FastAPI backend listens and automatically:
- Parses the new content
- Chunks it up
- Generates embeddings via OpenAI
- Stores everything in Postgres with pgvector
The result: edits show up near-instantly, anytime, anywhere.
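A minimal sketch of what that listener loop can look like, using only the standard library. The database URL, the `parse_change` helper, and the ingest step are my placeholders here, not the actual TACOS code; the `_changes` query parameters (`feed=continuous`, `include_docs=true`, `since=now`) are real CouchDB API:

```python
import json
import urllib.request

COUCH_URL = "http://localhost:5984/obsidian"  # hypothetical CouchDB database URL


def parse_change(line: bytes):
    """Parse one line of a continuous _changes feed.

    CouchDB emits one JSON object per changed doc, plus blank
    keep-alive lines, which we skip by returning None."""
    line = line.strip()
    if not line:
        return None
    change = json.loads(line)
    return change.get("id"), change.get("doc")


def listen(url: str = COUCH_URL):
    # feed=continuous keeps the HTTP response open and streams events;
    # include_docs=true inlines the changed document into each event.
    resp = urllib.request.urlopen(
        f"{url}/_changes?feed=continuous&include_docs=true&since=now"
    )
    for raw in resp:  # HTTPResponse iterates line by line
        parsed = parse_change(raw)
        if parsed is None:
            continue
        doc_id, doc = parsed
        # Real pipeline: parse `doc`, chunk it, embed, upsert into pgvector.
        print(f"changed: {doc_id}")
```

In production this runs as a background task alongside the FastAPI app, so ingestion never blocks request handling.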
Content Reconstruction Headache
CouchDB stores Obsidian notes in a weird way: it splits every note into parent-child leaf relationships. The main doc just points to child IDs, and the actual content lives in those children.
I had to write a ContentParser that walks through all the kids and rebuilds the full markdown. At first I kept getting empty posts and was totally confused until I figured out I needed to fetch and combine all the child docs.
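The core of that reassembly fits in a few lines. This is a simplified sketch of the idea, assuming a reduced document shape (a `children` list of leaf IDs on the parent, a `data` string on each leaf) rather than the full LiveSync schema; `fetch_doc` stands in for whatever does the CouchDB GET:

```python
def rebuild_note(parent: dict, fetch_doc) -> str:
    """Reassemble full markdown from a LiveSync-style parent doc.

    parent["children"] is an ordered list of leaf doc IDs; each leaf
    holds a slice of the note's text in its "data" field. fetch_doc is
    any callable that returns a doc by ID (e.g. a CouchDB lookup)."""
    pieces = []
    for child_id in parent.get("children", []):
        leaf = fetch_doc(child_id)
        pieces.append(leaf.get("data", ""))
    # Concatenating in children-order restores the original markdown.
    return "".join(pieces)


# Usage with an in-memory stand-in for CouchDB:
docs = {"leaf:1": {"data": "# Title\n"}, "leaf:2": {"data": "Body text"}}
note = rebuild_note({"children": ["leaf:1", "leaf:2"]}, docs.__getitem__)
```

The "empty posts" bug was exactly the case where only the parent doc was fetched: no children walked, so nothing but an empty string came back.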
Building the RAG System
The RAG (Retrieval-Augmented Generation) system was the most interesting part of the project. Here's the flow:
- Semantic Search: Your question gets converted to an embedding and finds similar content chunks
- Context Building: TACOS gathers relevant info from different sources: my blog, knowledge base, and portfolio
- Smart Responses: Ted Support uses this context to answer specifically about my work
I added query expansion too. If you ask "what school did Ted study at?", it automatically searches for university, college, polytechnic, etc. This makes the scoring way better.
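At its simplest, query expansion is just a synonym table applied before embedding. This sketch uses a hypothetical hand-written mapping (a real system might also ask an LLM to generate the variants):

```python
# Hypothetical synonym table; illustrative, not the actual TACOS mapping.
EXPANSIONS = {
    "school": ["university", "college", "polytechnic"],
    "job": ["work", "career", "role"],
}


def expand_query(query: str) -> str:
    """Append known synonyms so the embedding covers related phrasings."""
    extra = []
    for word in query.lower().split():
        extra.extend(EXPANSIONS.get(word.strip("?.,"), []))
    return query if not extra else f"{query} ({' '.join(extra)})"
```

Embedding the expanded string instead of the raw question pulls the query vector closer to chunks that use any of the related terms.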
The Evolution
- Basic FastAPI setup - Just serving posts from CouchDB
- Image permalink - So blog posts can reference images
- CouchDB listener - Real-time ingestion for content updates
- Postgres + pgvector - Vector search implementation
- RAG service - Chatbot with semantic search
- Query expansion - Improve semantic scores
Each step felt like unlocking a new level. The CouchDB listener was probably the biggest "wow" moment - seeing content update instantly right from my macOS and iOS Obsidian apps.
Architecture
Here's how everything fits:

Stuff I'm Happy With
Real-time ingestion: The CouchDB listener runs in the background and processes changes as they happen. No more manual re-ingestion!
Multiple data sources: The chatbot knows about my blog posts, personal notes, and portfolio site content.
Image serving: The FastAPI app serves images straight from CouchDB, so I don't mess with a separate public/ folder.
Streaming responses: The chatbot streams answers token by token, just like ChatGPT - feels way more responsive.
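The streaming pattern is an async generator handed to FastAPI's `StreamingResponse`. Stripped of the OpenAI call, it looks roughly like this; the token source here is faked, and the FastAPI wiring is described in the comments rather than imported:

```python
import asyncio


async def token_stream(tokens):
    """Yield tokens one at a time, as an LLM client would deliver them.

    In the real endpoint this generator wraps the model's streaming API
    and is returned via fastapi.responses.StreamingResponse, so the
    browser renders the answer word by word instead of waiting for the
    full completion."""
    for tok in tokens:
        await asyncio.sleep(0)  # yield control, like awaiting the next chunk
        yield tok


async def collect(tokens):
    """Drain the stream back into a string (handy for testing)."""
    return "".join([t async for t in token_stream(tokens)])
```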
What I Learned
The Wins
FastAPI is solid: Having used it before, I'm still impressed by the developer experience and how clean and simple everything feels.
CouchDB _changes feed is powerful: Real-time sync without polling is a game-changer.
RAG is addictive: Having a system that can reason with your own data feels like unlocking a new layer of intelligence. Once I got it working, I started seeing use-cases everywhere.
The Headaches
Semantic scoring is tough: My data’s mostly structured markdown, so RAG’s semantic search struggled to find context. It works best with natural sentences, which meant lots of tweaking.
Testing isn’t free: I still use OpenAI embeddings even in dev, so every experiment costs money. A local model would fix that, but that’s a later problem.
Streaming broke under self-hosting: Turns out token streaming needed proxy_buffering off in my Nginx config - took way too long to figure out.
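For anyone hitting the same wall, the fix is disabling response buffering on the chat route. A rough sketch of the relevant Nginx block; the location path and upstream address are placeholders for my setup:

```nginx
location /api/chat {
    proxy_pass http://127.0.0.1:8000;  # FastAPI/uvicorn upstream
    proxy_buffering off;               # forward tokens as they arrive
    proxy_cache off;                   # never cache streamed responses
    proxy_http_version 1.1;            # keep the upstream connection streamable
}
```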
What's Next?
Thinking about:
- Open source models: Switch to self-hosted local models when OpenAI credits run out, but man, GPUs are expensive
- Better chunking: Maybe semantic chunking instead of fixed sizes
- Conversation memory: Remember chat history, maybe with caching
- Analytics: See what people are asking about and improve the chatbot answers
Check out the GitHub repo if you want to dive into the code.
Final Thoughts
What started as a basic portfolio site has become this ecosystem where I write in Obsidian and those thoughts instantly become part of an intelligent chatbot for visitors. It feels like I've built a little piece of the future for myself.
The best part? It's all mine - no vendor lock-in, no monthly fees (beyond domain and minimal API costs), and I control the whole experience.
If you're thinking about it, just do it: create something fun and custom for yourself without thinking about how it makes money.
The satisfaction of actually using something you built yourself is totally worth the effort.
-- Ted