
Stack & decisions

The thinking behind each choice. Excerpted from ARCHITECTURE.md §10 (decision log) — see the full doc in the repo for context on each row.

Locked-in (from PRD)

| Layer | Choice | Why |
|---|---|---|
| Agent orchestration | LangGraph | Required by PRD §8 — graded multi-agent demo |
| RAG framework | LangChain + Qdrant | Qdrant is Docker-native, schema-less, free |
| Open data | Official data.gouv MCP | Mandatory deliverable in the PRD brief |
| CRM | HubSpot Service Keys | Free-tier-compatible CRM API |
| API | FastAPI (thin) | Only /query and /health — minimal attack surface |
| CLI | Typer + Rich | Rich renders the routing-trace tree the jury sees |
| Container | Docker Compose | One-host SMB stack |
| Security | secure + slowapi + Pydantic | Helmet equivalent + rate limiting + input validation |

Open decisions resolved during planning

LLM provider — OpenAI (with Mistral toggle)

Considered:

  • OpenAI ✅ — best out-of-the-box quality, native function calling, ~$10–20 lifetime cost
  • Anthropic Claude — strong reasoning, native citations
  • Mistral — French/EU data-sovereignty angle, kept as an LLM_PROVIDER=mistral toggle
  • Ollama (local) — €0 but slow, weaker quality, big Docker image

Decision: OpenAI by default, with a Mistral toggle via env var. The "your data never leaves Europe" pitch is genuinely valuable for a CGP-targeted product, but the demo ships on quality.

Embeddings — text-embedding-3-small

  • 1536-d, $0.02/1M tokens, multilingual (French OK)
  • Corpus indexing one-shot cost: ~$0.01
  • Open-source alternative: bge-m3 via sentence-transformers (toggle for full local mode)
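
The cost figures above imply the corpus size: at $0.02 per 1M tokens, a ~$0.01 one-shot indexing run corresponds to roughly 500k tokens. A trivial sanity check:

```python
def embedding_cost(tokens: int, price_per_million_usd: float = 0.02) -> float:
    """Embedding cost in USD at text-embedding-3-small's listed price."""
    return tokens / 1_000_000 * price_per_million_usd

# A ~500k-token corpus comes out to about $0.01, matching the estimate above.
```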

Vector DB — Qdrant

  • Free, self-hosted, Docker-native
  • Rejected: pgvector (extra Postgres complexity), Pinecone (paid, non-EU)

Web search — Tavily

  • Designed for AI agents (clean snippets, no HTML scraping)
  • Free tier 1,000 q/mo
  • Rejected: Brave (more setup), DuckDuckGo (lower quality)

Memory — Redis

Originally planned in-process. Switched to Redis once the 4GB droplet had headroom:

  • Tool result cache (DVF 24h, web 1h, INSEE 7d) — saves ~10× on repeat-query costs
  • LangChain RedisCache for LLM dedup (demo replays cost $0)
  • slowapi rate-limit backend (cluster-safe with multiple uvicorn workers)
  • LangGraph checkpointer (conversation survives docker compose restart)
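
The per-source TTL scheme can be sketched as a thin wrapper over a Redis client. The key format and wrapper function are assumptions; only the TTL values come from the list above, and any object exposing redis-py's `get`/`setex` interface works.

```python
import hashlib
import json

# TTLs from the decision above: DVF 24 h, web search 1 h, INSEE 7 d.
TTL_SECONDS = {
    "dvf": 24 * 3600,
    "web": 3600,
    "insee": 7 * 24 * 3600,
}

def cache_key(source: str, query: str) -> str:
    # Hash the query so arbitrary text makes a safe, bounded Redis key.
    digest = hashlib.sha256(query.encode()).hexdigest()[:16]
    return f"tool:{source}:{digest}"

def cached_call(client, source: str, query: str, fetch):
    """Return a cached tool result, calling `fetch(query)` only on a miss."""
    key = cache_key(source, query)
    hit = client.get(key)
    if hit is not None:
        return json.loads(hit)
    result = fetch(query)
    client.setex(key, TTL_SECONDS[source], json.dumps(result))
    return result
```

Repeat demo queries then hit Redis instead of DVF or Tavily, which is where the ~10× saving on repeat-query costs comes from.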

Observability — Langfuse v2 (self-hosted)

| Option | Tradeoff |
|---|---|
| Langfuse v2 self-hosted | ~2.5GB on the droplet; fits in 4GB |
| Langfuse v3 self-hosted ❌ | Needs ClickHouse, ~3.9GB on a 4GB droplet = first OOM |
| Langfuse Cloud | Free tier, 50k traces/mo — saves RAM but loses the self-hosting pitch |

Frontend — CLI primary + thin Streamlit

  • CLI is the demo surface (routing trace visible)
  • Streamlit page (~1h to build) makes the jury's experience much better than `tail -f`
  • Next.js rejected — eats hardening time for marginal demo value

Hosting — DO Droplet 4GB AMS3

  • $24/mo (Basic 4GB)
  • AMS3 region: ~25ms latency from Lyon vs ~90ms from NYC3
  • $200 GitHub Student credit must be spent before 2026-06-26 expiration

What we explicitly chose NOT to do (v1)

  • No domain — droplet IP only. Caddy auto-TLS waits for v2.
  • No DO Spaces — DVF cache lives on droplet disk (only ~500MB needed).
  • No multi-tenant — single-tenant deployment, scaling debate deferred.
  • No HubSpot OAuth — Service Key is enough for v1; OAuth in v2.
  • No Next.js — Streamlit page is good enough to demo.

Cost summary

| Item | Cost |
|---|---|
| DO Droplet 4GB (AMS3) | $24/mo |
| OpenAI tokens (lifetime, 44h project) | ~$10–20 |
| HubSpot Free | $0 |
| data.gouv MCP | $0 (public) |
| Tavily | $0 (free tier) |
| Langfuse | $0 (self-hosted) |
| GitHub | $0 |
| Total recurring | ~$24/mo |

Burns ~$24 of the GitHub Student DO credit before June 26 expiration → saves ~$176 from being wasted.

Built as a 44h student project — multi-agent AI for CGP firms.