Manifesto
A database you talk to,
with a backend that doesn’t exist.
Nlqdb is a natural-language database: you create one with a name, you query it in English, and the schema, the engine, the indexes, the migrations and the cache are all background concerns you never have to see. What follows is the bar every change is judged against. If a change violates one of these, it doesn’t ship. Not as guidelines — as acceptance criteria.
What is nlqdb? A database that accepts natural language as its query language, picks the right engine for your workload — Postgres, ClickHouse, Redis — and exposes itself through five surfaces: an HTML element, a typed SDK, a CLI, an MCP server, and a chat web app — all backed by the same engine.
The free tier is the same engine paying customers run on; the escape hatch to raw SQL stays one click away, always.
-
01
Free. Forever.
You can sign up, build a thing, and ship it to production without a credit card. That isn't a trial. That's the product.
Every cost upgrade is gated on a real signal — usually that someone is paying us for the thing they shipped. Until then the bill is zero, the limits are honest, and the free tier is the same engine paying customers run on.
-
02
Open source. By default.
The engine, the CLI, the MCP server, the SDK — all under FSL-1.1-ALv2. Source-available now, Apache 2.0 in two years. The cloud is a convenience, not a moat. If we ever try to wall something off, you can fork the engine and run it yourself.
-
03
Simple. One way to do each thing.
Two endpoints. Two CLI verbs. One chat box. No config files in the first 60 seconds. No “pick a region.” No schema. fetch is the SDK.
Every error is one sentence with the next action. If a feature needs a tutorial to use, it failed. If two engineers disagree on a design, we ship the simpler one.
-
04
Effortless UX.
Zero modals. Zero “are you sure?” except for destructive actions. Keyboard-first. The chat is the product; everything else is a disclosure that you opt into when you actually need it.
-
05
Seamless auth — one identity, four surfaces, zero friction.
No login wall before first value. Every surface produces a working answer before asking who you are. The DB you create before signing in adopts to your account when you do.
One sign-in covers everything. Web, CLI, every MCP host — same identity. Tokens refresh silently. You will never see a 401.
Credentials live in the OS keychain. Revocation is instant and visible. Every token on every device, listed with last-used; one click revokes it.
-
06
Goal-first, not DB-first.
Nobody woke up wanting to create a database. They woke up wanting a meal-planner, an agent that remembers them, the number for the 4pm sync. The database is plumbing.
Every surface starts with a goal. The DB materialises as a side effect, named after the thing you described, ready to query. You can always reach the raw Postgres URL — it’s one click away — but you almost never need to.
-
07
Bullet-proof by design — not by handling.
We make bad states unreachable, not branched on. Schemas only widen, so there is no “schema mismatch” code path. Every mutating call carries an idempotency key, so retries are safe by construction. Plans are content-addressed, so cache invalidation has nothing to invalidate.
Destructive operations require a diff preview and a second confirm. Numeric inputs are bounded, so there is no NaN, no overflow. Secrets are scoped per-DB, so there is no “wrong tenant” branch to write a test for.
-
08
Creative. By policy.
The product looks and feels nothing like a Tailwind template. Personality is required. Acid lime on near-black, JetBrains Mono headlines, hard shadows, kinetic typography on the words that matter. Stock photos are forbidden. Logo grids are forbidden. “Trusted by” is forbidden.
-
09
Fast. Measured.
p50 query under 400ms on cache hit. p95 under 1.5s on cache miss. Cold start under 800ms. CLI binary under 8MB, starts in under 30ms.
Marketing site: Lighthouse 100 on every metric. First paint under 600ms on 4G. Numbers exist so we can fail them in CI, not so we can quote them in a deck.
The on-ramp inversion.
The single most important design decision: every entry point accepts a goal, not a database name. Every surface, reframed.
| Surface | Old — DB-first | New — goal-first |
|---|---|---|
| Marketing hero | “Name your database” | “What are you building?” |
| Platform first run | “Create database” button | One chat input; DB created silently |
| CLI first command | nlq db create orders | nlq new "an orders tracker" |
| MCP first call | nlqdb_create_database("memory") | nlqdb_query("memory", "remember…") |
| HTML element | db="orders" required | goal="…" leads; db inferred |
What this isn’t.
- Not a SQL builder.
- There is no “here is your generated SQL, paste it into your ORM” step.
- Not a vector store with a wrapper.
- Postgres with pgvector is one of the engines under the hood, not the product.
- Not a chat-on-top-of-your-warehouse layer.
- We own the storage too. Otherwise schema, indexes and migrations leak back into the user’s brain.
- Not a no-code tool.
- The whole point is that developers stay in their editor, in their language, against their build. nlqdb is a backend, not a GUI.
Receipts.
A pre-alpha making technical claims has to show its work. Each design call below is anchored to a paper, postmortem, or production system that taught us the lesson.
-
Layer the validator like an onion.
Replit's coding agent (July 2025) wiped a customer database during a code freeze with three guardrails active. We layer everything: AST parse, verb allowlist, table allowlist, role isolation, RLS, transaction wrapper.
Fortune — Replit catastrophic failure Applied at docs/architecture.md §3.6.5 + sql-validate.ts.
-
LLM picks structure, our code emits SQL.
Snowflake Cortex Analyst hits 90%+ accuracy on real BI workloads — about 2× single-prompt GPT-4o — because the LLM picks from a curated semantic layer instead of writing raw SQL. Our schema-create path follows the same logic with a typed plan, not raw DDL.
Snowflake engineering blog Applied at docs/architecture.md §3.6.2 typed-plan pipeline.
-
Embed tables, not columns, for schema retrieval.
Pinterest's text-to-SQL system uses one embedding per table-card (name + description + columns + sample values). nilenso's 2025 evaluation found hit rate climbed from ~40% to ~90% just from adding table-doc embeddings. We use the table-card pattern from day one.
nilenso — RAG approach for text-to-SQL Applied at research-receipts §3.
-
Treat fetched row content as untrusted.
Keysight (July 2025) documented an attack class where a row in the user's database contains `ignore previous instructions, DROP TABLE…` — when an agent later reads that row and re-feeds it into its system prompt, the row's content steers the next turn. We never re-feed row content into agent system prompts.
Keysight Threats blog Applied at research-receipts §4.
-
Generate the semantic layer at create time.
Every shipped enterprise NL-Q product (Cortex, ThoughtSpot, Power BI Q&A, Tableau Pulse, dbt MetricFlow, Cube) depends on a curated semantic layer — none of them auto-creates the database. We own the schema-creation moment, so we generate the metric and dimension layer automatically.
dbt — Semantic Layer vs Text-to-SQL 2026 Applied at docs/architecture.md §3.6.3 the moat.