Method

How I Build With AI

I build data applications with AI in the loop. That is not a disclaimer; it is the method. AI accelerates the parts of the work that are mechanical: boilerplate, syntax, first drafts of a component. It does not decide the data model, choose the validation strategy, or judge whether a number is correct. I do that. This page is the honest, complete account of how the two fit together, because how you work with AI is now as much a signal of engineering judgment as the code itself.

The short version: I own the architecture and the correctness. AI owns the typing speed. Nothing ships until the tests pass and the validation checks are green, and I verify both myself.

The logic model

I came up through social work before manufacturing and data, and one tool carried over intact: the logic model. It forces you to be honest about the chain from what you put in to what you actually get out. Here is mine for building an application with AI.

Inputs
  • A clearly defined problem and the real data (or a reproducible synthetic generator when the real data is proprietary)
  • A target architecture I have chosen: stack, schema, and the contract each layer exposes
  • AI tooling: Claude Code with current-documentation and language-server context, not a blank chat window

Activities
  • I write the spec and the data contract first, before any code
  • I break the build into phases, each gated by a passing test suite
  • AI drafts implementation against that spec; I review, correct, and reject
  • I define validation checks and wire them in as first-class features, not afterthoughts

Outputs
  • Working application with a test suite that runs on every change
  • A live validation layer that proves the data is internally consistent
  • A methodology panel where every figure traces back to a named SQL view

Outcomes
  • Data a senior reviewer can trust, with the provenance visible
  • A codebase I can explain line by line because I architected it
  • A repeatable process that survives being pointed at a new problem

The process, end to end

This is roughly how a project goes from idea to a live demo.

  • 1. Define the problem and the data contract. Before I open an editor I decide what question the app answers and what shape the data has to be in to answer it. For proprietary domains I build a seeded synthetic generator so the whole thing is reproducible and contains no employer data.
  • 2. Design the schema and the layers. I choose the tables, the keys, and the contract each layer exposes (data layer, API, frontend). This is the part I refuse to hand to AI, because every later decision inherits from it.
  • 3. Build in phases behind a branch. Each project moves on its own feature branch in numbered phases: data-quality cleanup, schema versioning, adapter layer, features. A phase is not done until its tests pass. On one recent rebuild that meant carrying a suite to 776 passing tests before I would merge.
  • 4. Let AI draft against the spec, then review hard. With the contract fixed, AI writes fast first drafts. I read every one. When it guesses at an API or a column name, I make it search and verify rather than accept the guess. Being blunt and specific in correction is part of the method.
  • 5. Wire in validation as a feature. Consistency checks, referential-integrity checks, freshness and reconciliation all get built into the app, not run once and forgotten.
  • 6. Verify, then deploy. I confirm every SQL view executes cleanly against real data before I trust its output. Backends deploy as systemd services behind nginx on the VPS; frontends are static-exported to Cloudflare Pages.
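
The seeded synthetic generator from step 1 can be illustrated with a minimal sketch. This is not the generator from any of the projects described here: the function name `generate_demand_rows`, the field names, the start date, and the value range are all hypothetical, chosen only to show how a fixed seed makes the data reproducible.

```python
import random
from datetime import date, timedelta


def generate_demand_rows(seed: int, days: int = 7) -> list[dict]:
    """Return one synthetic row per day; the same seed always yields the same rows."""
    rng = random.Random(seed)  # local generator, so global random state is untouched
    start = date(2024, 1, 1)   # hypothetical fixed start date
    rows = []
    for offset in range(days):
        rows.append({
            "day": (start + timedelta(days=offset)).isoformat(),
            # hypothetical plausible range for a daily demand figure
            "demand_mwh": round(rng.uniform(800.0, 1200.0), 1),
        })
    return rows
```

Because the generator takes the seed as an argument and keeps its own `random.Random` instance, publishing the seed alongside the dataset is enough for someone else to regenerate identical rows.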

How I prompt, and why

Prompting well is mostly about removing ambiguity and refusing to let the model paper over gaps.

  • Spec before code. I describe the contract and the desired end state precisely. Vague prompts produce plausible code that fails at the seams.
  • Self-contained instructions. No "see above" and no assumed context. Each instruction stands on its own so the model cannot fill a gap with a guess.
  • Current documentation over recall. I give the model live documentation and language-server context so it works against today's APIs instead of a stale memory of them. This kills a whole class of hallucinated method names.
  • Verify before you assert. On any technical specific (a node name, a connector's real column, a library's current signature) I require the model to check a source before answering. If it guessed wrong once, I say so directly and make it re-verify.
  • Tests as the gate, not my judgment alone. The test suite decides whether a phase is done. That keeps both of us honest.

The point of all of this is not clever wording. It is engineering the conditions under which AI produces correct work, and catching it fast when it does not.

How I ensure data integrity

This is the part that matters most, and the part generic dashboards skip. A chart is easy. Proving the chart is right is the work. Across my projects I treat validation as a layered discipline.

Live consistency checks, built into the app. These are rules that must always hold on live data: totals reconcile, shares sum to roughly one hundred percent, and no value is impossible, such as negative demand or a carbon-free share above one hundred percent. A panel shows each check green or red in real time. The panel makes the point that the app does not just display data; it continuously proves the data is internally consistent.
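
As a rough illustration of rules like these, the sketch below evaluates four checks against a single snapshot and returns one pass/fail result per rule, which is the shape a green-or-red panel would consume. The snapshot keys (`components_mwh`, `total_mwh`, `shares_pct`, `demand_mwh`, `carbon_free_pct`) and the 0.5 tolerance are assumptions made for the example, not the schema of any real project.

```python
from dataclasses import dataclass


@dataclass
class CheckResult:
    name: str
    passed: bool


def run_consistency_checks(snapshot: dict, tolerance: float = 0.5) -> list[CheckResult]:
    """Evaluate rules that must always hold; each result maps to one panel row."""
    component_sum = sum(snapshot["components_mwh"].values())
    share_sum = sum(snapshot["shares_pct"].values())
    return [
        # components must add up to the reported total, within tolerance
        CheckResult("total_reconciles",
                    abs(component_sum - snapshot["total_mwh"]) <= tolerance),
        # shares must sum to roughly one hundred percent
        CheckResult("shares_sum_to_100", abs(share_sum - 100.0) <= tolerance),
        # impossible values are rejected outright
        CheckResult("demand_non_negative", snapshot["demand_mwh"] >= 0),
        CheckResult("carbon_free_share_in_range",
                    0.0 <= snapshot["carbon_free_pct"] <= 100.0),
    ]
```

Returning named results rather than raising on the first failure lets the panel show every rule at once, including the ones that still pass.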

Referential integrity. Explicit checks for orphaned records and broken links between tables, surfaced as pass/fail rather than assumed.
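
An orphan check of this kind is commonly written as a left join that keeps only child rows with no matching parent. The sketch below uses Python's built-in `sqlite3` with an in-memory database purely so it runs anywhere; the `plants` and `readings` tables are hypothetical, and the same query shape should carry over to PostgreSQL.

```python
import sqlite3


def find_orphans(conn: sqlite3.Connection) -> list[tuple]:
    """Return (reading id, plant_id) for readings with no matching plant."""
    return conn.execute(
        """
        SELECT r.id, r.plant_id
        FROM readings AS r
        LEFT JOIN plants AS p ON p.id = r.plant_id
        WHERE p.id IS NULL
        ORDER BY r.id
        """
    ).fetchall()


def build_demo_db() -> sqlite3.Connection:
    """Create a tiny in-memory database containing one deliberate orphan."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE plants (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE readings (id INTEGER PRIMARY KEY, plant_id INTEGER, value REAL);
        INSERT INTO plants VALUES (1, 'north'), (2, 'south');
        INSERT INTO readings VALUES (10, 1, 5.0), (11, 2, 7.5), (12, 99, 3.2);
        """
    )
    return conn
```

An empty result is the pass condition. Note that a reading with a null `plant_id` would also be reported, which is usually the desired behavior for a critical link.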

Reconciliation and spot-checks. Where an authoritative source exists, I verify a sample of my values against it and document the result, so the claim is "verified against source for these dates within rounding," not "trust me."
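
A spot-check against an authoritative source can be reduced to a small comparison that records which keys were checked, which disagreed beyond a tolerance, and which could not be checked at all. The sketch below assumes both sides are available as plain dictionaries keyed by date string; the function name `reconcile` and the 0.01 tolerance are illustrative choices.

```python
def reconcile(mine: dict[str, float], source: dict[str, float],
              tolerance: float = 0.01) -> dict:
    """Compare sampled values against an authoritative source, key by key."""
    checked, mismatches, missing = [], [], []
    for key in sorted(source):
        if key not in mine:
            missing.append(key)  # the source has a value this dataset lacks
            continue
        checked.append(key)
        if abs(mine[key] - source[key]) > tolerance:
            mismatches.append((key, mine[key], source[key]))
    return {"checked": checked, "mismatches": mismatches, "missing": missing}
```

The returned `checked` list is what supports a statement like "verified against source for these dates within rounding": the claim covers exactly those keys and no others.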

Freshness and SLA surfacing. The app shows when each data source last updated, so stale inputs are visible instead of silently wrong.
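
One way to surface freshness is to compare each source's last-update timestamp against a per-source maximum age. The sketch below is a simplified illustration: the source names and thresholds in the usage note are hypothetical, and `now` is passed in explicitly so the function is deterministic and easy to test.

```python
from datetime import datetime, timedelta


def freshness_report(last_updated: dict[str, datetime],
                     max_age: dict[str, timedelta],
                     now: datetime) -> dict[str, dict]:
    """Report each source's age in whole minutes and whether it breaches its limit."""
    report = {}
    for source, updated_at in last_updated.items():
        age = now - updated_at
        report[source] = {
            "age_minutes": int(age.total_seconds() // 60),
            "stale": age > max_age[source],
        }
    return report
```

For example, with a hypothetical `grid_feed` source limited to 15 minutes and a `weather` source limited to one hour, a feed updated 10 minutes ago is reported as fresh while a weather file updated three hours ago is flagged stale. All timestamps should be timezone-aware and in the same zone, otherwise the subtraction raises or misleads.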

Structural tests in the pipeline. Range checks, non-null constraints on critical fields, and referential tests run on every build.
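
Row-level structural checks can be expressed as a function that returns every violation it finds, which a build step can then assert is empty. This is a minimal sketch under assumed inputs: rows as dictionaries, a list of required fields, and inclusive ranges per field. The field names in the usage note are hypothetical.

```python
def find_violations(rows: list[dict], required: list[str],
                    ranges: dict[str, tuple[float, float]]) -> list[tuple]:
    """Return (row index, field, problem) for each null or out-of-range value."""
    violations = []
    for index, row in enumerate(rows):
        for field in required:
            if row.get(field) is None:  # missing key and explicit None both count
                violations.append((index, field, "null"))
        for field, (low, high) in ranges.items():
            value = row.get(field)
            # nulls are reported by the required-field check, not here
            if value is not None and not (low <= value <= high):
                violations.append((index, field, "out_of_range"))
    return violations
```

In a pipeline, a test would call this with something like `required=["plant_id", "demand_mwh"]` and `ranges={"demand_mwh": (0.0, 5000.0)}` and fail the build if the returned list is non-empty.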

Provenance and reproducibility. Every headline number traces to a named SQL view, and synthetic datasets ship with their seed so anyone can regenerate the exact data. Nothing is a black box.

For the AI-facing projects the same principle drives the architecture directly: a retrieval agent answers from documents but verifies its numbers with a real SQL query against the structured data, and escalates when confidence is low rather than inventing an answer. Grounding and verification, not fluent guessing.
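
The verify-or-escalate pattern can be sketched in a few lines. This is not the agent's actual implementation: the function `verify_claim`, the `Answer` type, the confidence threshold, and the `sales` table are all hypothetical, and `sqlite3` stands in for the real database so the example is self-contained. The idea shown is only that a figure pulled from a document is returned when a SQL query confirms it, and escalated otherwise.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional


@dataclass
class Answer:
    status: str             # "answered" or "escalated"
    value: Optional[float]  # populated only when the figure was verified
    reason: str


def verify_claim(conn: sqlite3.Connection, claimed_value: float, sql: str,
                 confidence: float, min_confidence: float = 0.7,
                 tolerance: float = 0.01) -> Answer:
    """Return the figure only if confidence is sufficient and SQL confirms it."""
    if confidence < min_confidence:
        return Answer("escalated", None, "retrieval confidence below threshold")
    row = conn.execute(sql).fetchone()
    if row is None or row[0] is None:
        return Answer("escalated", None, "no structured data to verify against")
    actual = float(row[0])
    if abs(actual - claimed_value) > tolerance:
        return Answer("escalated", None, "document figure disagrees with database")
    return Answer("answered", actual, "verified against SQL")


def build_sales_db() -> sqlite3.Connection:
    """Create a tiny in-memory table for demonstration purposes."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE sales (amount REAL);
        INSERT INTO sales VALUES (100.0), (250.5);
        """
    )
    return conn
```

With the demo table, a claimed total of 350.5 checked against `SELECT SUM(amount) FROM sales` comes back as answered, while a claimed 400.0 or a low-confidence retrieval is escalated. The value returned is the database's number, not the document's, so the structured data stays the source of truth.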

The tooling

  • Claude Code as the primary build environment, with current-documentation and language-server plugins so the model works against real, present-day APIs.
  • Python and PostgreSQL / TimescaleDB for data and backend, FastAPI for services.
  • Static HTML for this portfolio, Next.js for the app frontends, Recharts for visualization.
  • A Hetzner VPS running each backend as a systemd service behind an nginx reverse proxy, with frontends on Cloudflare Pages.
  • Git, feature branches, and a test suite as the merge gate on every project.

The principle underneath all of it

AI made me faster. It did not make me the engineer. The judgment about what to build, how the data should be shaped, and whether a result is actually correct is mine, and the validation layer is where I prove it. I would rather show you a green-or-red integrity panel than ask you to take my word for anything.