Show HN: Kai – a private, offline AI "second brain" that remembers you (no cloud)
oneeko.ai

I built Kai because I was tired of assistants that forget and cloud tools that mine my data.

Persistent memory: what you teach Kai sticks.
Local-first: data never leaves your device; no telemetry.
Transparent: a memory graph shows what Kai knows and why.

I'm looking for early users who care about local-first software, privacy, and explainability. The landing page has a simple waitlist; I'll onboard in small waves and share progress.
Local-first: offline by default; zero telemetry. Remote LLM API calls are optional and can be disabled; any API keys are stored locally.
Stack: Python 3.10, FastAPI (async), WebSockets UI.
Storage: SQLite for persistence, with sqlite-vec (384-d, cosine similarity) for vector search; ChromaDB serves as an in-memory hot cache.
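To make the warm store concrete, here's a minimal stdlib-only sketch of the shape of it: embeddings packed as BLOBs in SQLite and searched by cosine similarity. In Kai, sqlite-vec does this natively over 384-d vectors; the 4-d vectors, brute-force scan, and table/column names below are purely illustrative.

```python
import sqlite3, struct, math

DIM = 4  # the real store uses 384-d MiniLM vectors; 4-d keeps the sketch readable

def pack(vec):
    # serialize a float vector to a BLOB (sqlite-vec stores vectors similarly)
    return struct.pack(f"{len(vec)}f", *vec)

def unpack(blob):
    return list(struct.unpack(f"{len(blob) // 4}f", blob))

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE memories (id INTEGER PRIMARY KEY, text TEXT, embedding BLOB)")
rows = [("likes hiking", [1, 0, 0, 0]),
        ("prefers tea", [0, 1, 0, 0]),
        ("enjoys trail runs", [0.9, 0.1, 0, 0])]
db.executemany("INSERT INTO memories (text, embedding) VALUES (?, ?)",
               [(t, pack(v)) for t, v in rows])

def search(query_vec, k=2):
    # brute-force scan; sqlite-vec replaces this with an indexed KNN query
    scored = [(cosine(query_vec, unpack(e)), t)
              for t, e in db.execute("SELECT text, embedding FROM memories")]
    return sorted(scored, reverse=True)[:k]

print(search([1, 0, 0, 0]))
```

Same pattern, just with real MiniLM embeddings and sqlite-vec's virtual table doing the distance math in C.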
Embeddings: all-MiniLM-L6-v2 (384-d), CPU ~50ms/query.
LLM options (Ollama/local): TinyLlama 1.1B / Phi-2 2.7B / Gemma 2B (4GB); Mistral 7B Q4 / Llama-2 7B Q4 (8GB); Llama-3 8B / Mixtral 8×7B (16GB+). GPU optional for 5–10× speedup.
Memory tiers:
Hot: in-memory (Chroma), LRU ~5k items with score decay.
Warm: SQLite + sqlite-vec persistent store; promotes on read.
Cold: archived to disk; periodically compacted.
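The hot tier's behavior (LRU eviction, score decay, promotion on read) can be sketched in a few lines. This is a hypothetical stand-in, not Kai's actual code; Chroma holds the hot tier in practice, and the capacity, decay rate, and threshold below are illustrative.

```python
from collections import OrderedDict

class HotCache:
    """LRU hot tier with score decay (illustrative sketch).
    Evicted or decayed items would fall back to the warm (SQLite) tier;
    warm-tier reads would call put() to promote items back in."""

    def __init__(self, capacity=5000, decay=0.9):
        self.capacity = capacity
        self.decay = decay
        self.items = OrderedDict()  # key -> (value, score)

    def put(self, key, value, score=1.0):
        self.items[key] = (value, score)
        self.items.move_to_end(key)           # mark most recently used
        while len(self.items) > self.capacity:
            self.items.popitem(last=False)    # evict LRU -> demote to warm tier

    def get(self, key):
        if key not in self.items:
            return None                       # miss: caller reads warm tier, then put()
        value, score = self.items[key]
        self.items[key] = (value, score + 1.0)  # reads boost the score
        self.items.move_to_end(key)
        return value

    def sweep(self):
        # periodic decay; items whose score falls too low are dropped
        for key in list(self.items):
            value, score = self.items[key]
            score *= self.decay
            if score < 0.1:
                del self.items[key]           # demote to warm tier in the real system
            else:
                self.items[key] = (value, score)
```

Reads keep an item hot twice over: they bump its score and reset its LRU position, so only genuinely idle memories drift down to the warm tier.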
Graph/why: a NetworkX graph powers explainability; basic activation spreading, plus auto-linking when embedding similarity crosses a threshold.
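For a feel of what "basic activation spreading" means: energy starts at the queried nodes and fans out along weighted edges, decaying with each hop, until pulses fall below a threshold. The real graph lives in NetworkX; a plain adjacency dict keeps this sketch dependency-free, and the nodes, weights, and constants are made up.

```python
def spread_activation(graph, seeds, decay=0.5, threshold=0.05):
    """graph: node -> {neighbor: edge_weight}; seeds: node -> initial activation.
    Returns every node reached with an above-threshold activation."""
    activation = dict(seeds)
    frontier = list(seeds)
    while frontier:
        nxt = []
        for node in frontier:
            for nbr, weight in graph.get(node, {}).items():
                pulse = activation[node] * weight * decay
                # only propagate if the pulse is meaningful and improves the neighbor
                if pulse > threshold and pulse > activation.get(nbr, 0.0):
                    activation[nbr] = pulse
                    nxt.append(nbr)
        frontier = nxt
    return activation

g = {
    "coffee":   {"morning": 0.8, "caffeine": 0.9},
    "caffeine": {"sleep": 0.7},
    "morning":  {},
    "sleep":    {},
}
print(spread_activation(g, {"coffee": 1.0}))
```

The surviving activations double as the "why": each recalled memory can point back along the path that activated it.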
Security: data lives under ~/kai/ with user-only perms; no encryption at rest yet. Export (/memory/export) and full delete scripts included.
Roadmap (short): VS Code ext (MCP), Obsidian bridge via local REST, encryption-at-rest, and open-sourcing the graph/algorithms component. Pricing: free for personal use during early access; long-term pricing TBD.
Happy to go deeper on any of the above.