- Why not just count RAG retrievals in my application code?
- Because the questions are aggregations (retrievals per source, average relevance, sources that never surface), and pulling rows back to tally them in app code gets fragile and slow as your retrieval log grows. Asking the LLM to count a log is worse: models are unreliable at arithmetic over long lists and can hallucinate totals. nlqdb runs the GROUP BY in Postgres and shows you the SQL it ran.
- How do the retrieval records get into the database?
- Write one row per retrieval (query id, source document, chunk id, relevance score, timestamp) with the deterministic `nlqdb_remember` MCP tool or a parameterised INSERT through `POST /v1/run` (`GLOBAL-015`). The row shape stays a trust boundary: it is built server-side, not guessed by the LLM. Then ask the retrieval-quality questions in English over the same table.
- Does nlqdb do the vector search or retrieval itself?
- No. The embedding and similarity search that picks which chunks to retrieve stays in your vector store (Pinecone, pgvector, Chroma, Weaviate). nlqdb is the database half: you log what got retrieved, and you get a SQL query planner over that log for 'per source / per week' questions. The two compose; nlqdb doesn't embed or rank your documents.
- Can I see the SQL behind the retrieval numbers?
- Always — every answer returns the result rows plus the compiled SQL under a trace toggle (`SK-WEB-005`), so you can check the grain (per retrieval vs per query) before trusting a usage figure. nlqdb never hides the SQL behind the answer.
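
To make the answers above concrete, here is a minimal sketch of a one-row-per-retrieval log and the kind of aggregate SQL such questions compile down to. It is illustrative only: the `retrievals` and `sources` tables, their column names, and the sample rows are assumptions, and Python's built-in `sqlite3` stands in for Postgres. It does not call nlqdb, `nlqdb_remember`, or `POST /v1/run`, and it is not the SQL nlqdb actually generates.

```python
import sqlite3

# Hypothetical schema: a catalogue of known sources plus one row per retrieval.
# SQLite is used here only as a local stand-in for Postgres.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sources (source_doc TEXT PRIMARY KEY);
    CREATE TABLE retrievals (
        query_id     TEXT NOT NULL,
        source_doc   TEXT NOT NULL REFERENCES sources(source_doc),
        chunk_id     TEXT NOT NULL,
        relevance    REAL NOT NULL,
        retrieved_at TEXT NOT NULL
    );
""")

conn.executemany(
    "INSERT INTO sources VALUES (?)",
    [("handbook.pdf",), ("faq.md",), ("changelog.md",)],
)

# Parameterised INSERT: values are bound as parameters, never spliced into SQL.
sample_rows = [
    ("q1", "handbook.pdf", "c1", 0.9, "2024-01-01T10:00:00Z"),
    ("q1", "handbook.pdf", "c2", 0.7, "2024-01-01T10:00:00Z"),
    ("q1", "faq.md",       "c9", 0.5, "2024-01-01T10:00:00Z"),
    ("q2", "handbook.pdf", "c1", 0.8, "2024-01-02T09:30:00Z"),
]
conn.executemany("INSERT INTO retrievals VALUES (?, ?, ?, ?, ?)", sample_rows)


def per_source_stats(conn):
    """Per source: retrieval count, distinct-query count, average relevance.

    The two counts are the two grains: one source can be retrieved several
    times (several chunks) within a single query.
    """
    return conn.execute("""
        SELECT s.source_doc,
               COUNT(r.chunk_id)          AS n_retrievals,
               COUNT(DISTINCT r.query_id) AS n_queries,
               AVG(r.relevance)           AS avg_relevance
        FROM sources s
        LEFT JOIN retrievals r ON r.source_doc = s.source_doc
        GROUP BY s.source_doc
        ORDER BY n_retrievals DESC, s.source_doc
    """).fetchall()


def never_surfaced(conn):
    """Sources in the catalogue that have no retrieval rows at all."""
    cursor = conn.execute("""
        SELECT s.source_doc
        FROM sources s
        LEFT JOIN retrievals r ON r.source_doc = s.source_doc
        WHERE r.source_doc IS NULL
        ORDER BY s.source_doc
    """)
    return [row[0] for row in cursor]


stats = per_source_stats(conn)
unused = never_surfaced(conn)
```

In this sample, `handbook.pdf` has three retrievals but appears in only two distinct queries, which is the per-retrieval versus per-query difference worth checking in the compiled SQL before trusting a usage figure. The "never surfaces" question also needs a list of known sources to join against, since a log alone cannot show rows that were never written.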