## How It Works
1. User submits natural language query
2. Semantic search retrieves relevant rule chunks from vector database
3. Retrieved context injected into prompt with grounding instructions
4. LLM generates response constrained to retrieved context
5. Response includes rule citation for user verification
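A minimal sketch of these five steps, written as plain TypeScript rather than the n8n nodes the project actually runs on. The `match_rules` RPC name, the shape of the returned chunks, and the prompt wording are assumptions for illustration, and the query embedding is passed in as an argument since the write-up doesn't name an embedding model.

```typescript
import { createClient } from "@supabase/supabase-js";
import Anthropic from "@anthropic-ai/sdk";

const supabase = createClient(process.env.SUPABASE_URL!, process.env.SUPABASE_KEY!);
const anthropic = new Anthropic(); // reads ANTHROPIC_API_KEY from the environment

// Steps 1-5: user query in, grounded and cited answer out.
async function answerRuleQuestion(query: string, queryEmbedding: number[]): Promise<string> {
  // Step 2: semantic search over rule chunks. `match_rules` is a hypothetical
  // Postgres function wrapping pgvector's nearest-neighbour search.
  const { data: chunks, error } = await supabase.rpc("match_rules", {
    query_embedding: queryEmbedding,
    match_count: 5,
  });
  if (error) throw error;

  // Step 3: inject the retrieved chunks into the prompt with grounding instructions.
  const context = (chunks as { citation: string; content: string }[])
    .map((c) => `[${c.citation}] ${c.content}`)
    .join("\n\n");

  // Steps 4-5: the model answers only from the provided context and cites the rule.
  const message = await anthropic.messages.create({
    model: "claude-sonnet-4-5", // illustrative model choice
    max_tokens: 1024,
    system:
      "Answer using ONLY the rule excerpts provided. Cite the rule for every claim. " +
      "If the excerpts do not contain the answer, say you don't know.",
    messages: [{ role: "user", content: `Rule excerpts:\n${context}\n\nQuestion: ${query}` }],
  });

  const first = message.content[0];
  return first.type === "text" ? first.text : "";
}
```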
## Product Decisions
| Component | Technology | Product Rationale |
|---|---|---|
| Vector DB | Supabase (pgvector) | Semantic search surfaces relevant rules even with imprecise queries |
| Generation | Claude (Anthropic) | Strong instruction-following ensures responses stay grounded in context |
| Memory | Postgres | Conversation persistence enables follow-up questions without re-explaining |
| Orchestration | n8n | Visual pipeline simplifies debugging; modular design allows component swaps |
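For the pgvector row specifically, retrieval boils down to a single ordered similarity query. The schema below (table name, column names, vector dimension) is an assumption sketched to match the hypothetical `match_rules` RPC used above; the project's real schema isn't shown in this write-up.

```typescript
// Hypothetical Supabase migration backing the `match_rules` RPC used above.
// Table name, column names, and the 1536-dim embedding size are assumptions;
// the dimension must match whatever embedding model the pipeline uses.
export const matchRulesMigration = `
  create extension if not exists vector;

  create table if not exists rules (
    id bigserial primary key,
    citation text not null,  -- rule number/section shown to the user
    content text not null,   -- the rule chunk injected into the prompt
    embedding vector(1536)
  );

  -- <=> is pgvector's cosine-distance operator: closest chunks first.
  create or replace function match_rules(query_embedding vector(1536), match_count int)
  returns table (citation text, content text)
  language sql stable
  as $$
    select citation, content
    from rules
    order by embedding <=> query_embedding
    limit match_count;
  $$;
`;
```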
Why RAG over fine-tuning: knowledge can be updated without retraining, citations let users verify answers, and constrained generation sharply reduces hallucination risk.
## What I Learned
- 💡 Without explicit instructions to say "I don't know," the model confidently makes up answers when the retrieved context doesn't have what it needs. Telling it when to admit uncertainty was key (see the prompt sketch after this list).
- 💡 For AI products where accuracy matters, users need to verify answers themselves. Citing the exact rule builds trust in a way that "the AI said so" never will.
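Both lessons end up as a short block of grounding instructions. The wording below is illustrative, not the project's actual prompt text: the first line covers admitting uncertainty, the second the rule citation that lets users verify the answer themselves.

```typescript
// Illustrative grounding instructions implied by the two lessons above
// (not the project's actual prompt text).
export const groundingInstructions = [
  // Lesson 1: tell the model when to admit uncertainty instead of guessing.
  'Answer using ONLY the rule excerpts provided. If they do not contain the answer, reply "I don\'t know based on the rules provided" rather than guessing.',
  // Lesson 2: cite the exact rule so the user can verify the answer themselves.
  "End every answer with the exact rule number(s) you relied on, so the user can check them in the rulebook.",
].join("\n");
```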