stackbilder.com / production-readiness-checklist

Before you ship to real users

AI App Production Readiness Checklist

The 53-point checklist for taking an AI-generated app from demo to production. Cover auth, secrets, database access, billing, logging, tests, governance, and rollback before real users arrive.

✓ 53 checklist items ✓ 10 categories ✓ Auth · Billing · Observability · Governance

Authentication & Sessions · Secrets & Environment · Database & Data Access · API & Input Validation · Billing & Payments · Logging & Observability · Error Handling · Testing · Governance · Deployment & Rollback

Why this checklist exists

AI app builders — Lovable, Bolt, v0, Cursor, Claude Code — generate functional prototypes fast. They do not generate threat models, test plans, or deployment contracts. This checklist is the gap between "it works in my browser" and "it survives contact with real users."

How to use it

Work through it category by category before your first production deployment. If you cannot check an item, treat that as a decision you are consciously deferring and document it in your CHANGELOG. Do not skip items without writing down why.

What this does not cover

Physical security, network perimeter controls, and formal compliance audits (SOC 2, ISO 27001) are out of scope. This checklist addresses the application-layer controls that are prerequisites for any of those programs.

Authentication & Sessions

7 items

Auth middleware applied to all protected routes — not just the ones you remembered

Review every route file. Auth applied to /dashboard but not /api/user is a common miss.
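
One way to avoid depending on memory is to invert the default: every path requires a session unless it appears on a short public allowlist. A minimal sketch, assuming hypothetical route names and hypothetical helpers `requiresAuth` and `gate` that are not taken from any specific framework:

```typescript
// Hypothetical public allowlist; every other path is treated as protected.
const PUBLIC_PATHS: ReadonlySet<string> = new Set(["/", "/login", "/health"]);

// Default-deny: a route is public only if it is listed explicitly.
function requiresAuth(pathname: string): boolean {
  return !PUBLIC_PATHS.has(pathname);
}

// Returns the status a front-door middleware would answer with before routing.
function gate(pathname: string, hasValidSession: boolean): 200 | 401 {
  return requiresAuth(pathname) && !hasValidSession ? 401 : 200;
}
```

With this shape, a newly added /api/user route is protected even if nobody remembers to wire auth to it.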

Session revocation implemented — logout actually invalidates sessions server-side

Deleting a cookie client-side without invalidating the server-side session token is not logout.

JWT algorithm allowlisted (HS256 or RS256 only, never 'none' or dynamic)

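Libraries that take the verification algorithm from the JWT header are vulnerable to the alg:none bypass, where an unsigned token is accepted as valid. Pin the expected algorithm on the server. Severity: critical.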
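
The allowlist can be sketched as a check that runs before any signature verification. This assumes the token header has already been base64url-decoded and JSON-parsed; `assertAllowedAlg` and the choice of RS256 are illustrative, and most JWT libraries expose an equivalent option that should be preferred:

```typescript
// The server decides which algorithm it accepts; the token never does.
// RS256 is an assumption for this sketch; use whichever single algorithm your keys are for.
const ALLOWED_ALGS: readonly string[] = ["RS256"];

// `header` is the already-decoded header of an incoming token (hypothetical input shape).
function assertAllowedAlg(header: { alg?: unknown }): string {
  const alg = header.alg;
  if (typeof alg !== "string" || ALLOWED_ALGS.indexOf(alg) === -1) {
    throw new Error("JWT rejected: algorithm not allowlisted");
  }
  return alg;
}
```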

MFA available for high-privilege actions (billing changes, user deletions, role escalations)

Does not need to be enforced for all users — required for admin-level operations.

Password reset flow does not leak account existence

Both 'email found' and 'email not found' paths should return identical responses and timing.

OAuth redirect URI allowlisted explicitly — no wildcard or path-prefix matching

Wildcard or prefix-matched redirect URIs let an attacker send the authorization code or token to a URL they control. Each URI must be exact-match.
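
Exact matching can be as small as a set lookup on the full string. A sketch, assuming a hypothetical callback URL and a hypothetical `isAllowedRedirect` helper:

```typescript
// Hypothetical registered callback; compare full strings, never prefixes or patterns.
const REDIRECT_ALLOWLIST: ReadonlySet<string> = new Set([
  "https://app.example.com/auth/callback",
]);

function isAllowedRedirect(redirectUri: string): boolean {
  return REDIRECT_ALLOWLIST.has(redirectUri);
}
```

Lookalike hosts and path-traversal suffixes fail the lookup because nothing is parsed or normalized before the comparison.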

Session tokens use appropriate max-age and HttpOnly + Secure + SameSite flags

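HttpOnly keeps page scripts from reading the cookie, which blocks token theft via XSS. Secure prevents transmission over plain HTTP. SameSite=Strict keeps the cookie off cross-site requests, which mitigates CSRF.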
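
For illustration, a Set-Cookie value carrying all of these attributes might be built as follows. The cookie name `session` and the helper `sessionCookie` are assumptions for this sketch:

```typescript
// Builds a Set-Cookie header value with an explicit lifetime and all three flags.
// The cookie name "session" is hypothetical.
function sessionCookie(token: string, maxAgeSeconds: number): string {
  return [
    `session=${encodeURIComponent(token)}`,
    `Max-Age=${maxAgeSeconds}`,
    "Path=/",
    "HttpOnly",
    "Secure",
    "SameSite=Strict",
  ].join("; ");
}
```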

Secrets & Environment

6 items

All secrets in environment variables, not source code or config files

Search git history for common patterns: sk_live_, AKIA, password =, token =.

Secrets rotation plan documented — who does it, how, and on what schedule

A secret that has never been rotated is a secret that has probably leaked.

.env file in .gitignore — and git history audited for any prior commit

Use git log --all --full-history -- '**/.env' to check history.

Webhook secrets verified on every incoming request, not just on first setup

Stripe-Signature header must be validated with constructEvent() on every webhook POST.

API keys scoped to minimum required permissions

A Stripe key used only for checkout does not need refund or dispute permissions.

Separate secrets for dev, staging, and production environments

Using production Stripe keys in development is a compliance and financial risk.

Database & Data Access

6 items

Row-level security (RLS) policies on all tables with multi-tenant data

In multi-tenant apps without RLS, every query must manually filter by tenant_id. Both approaches require explicit enforcement.

Parameterized queries everywhere — no string concatenation in SQL

db.query('SELECT * FROM users WHERE email = ' + email) is a SQL injection vulnerability. No exceptions.
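
The parameterized alternative keeps the SQL text constant and passes user input only as bound values. A minimal sketch, where the `Query` shape and `findUserByEmail` are hypothetical names for illustration:

```typescript
// Hypothetical shape: constant SQL text plus the values to bind to its placeholders.
interface Query {
  sql: string;
  params: unknown[];
}

// The SQL string never contains user input; the email travels only in `params`.
function findUserByEmail(email: string): Query {
  return { sql: "SELECT id, email FROM users WHERE email = ?", params: [email] };
}
```

With D1, such a pair would typically be executed as db.prepare(q.sql).bind(...q.params), so a hostile value is treated as data rather than SQL.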

Database errors sanitized before reaching API responses

SQLite, Postgres, and D1 error messages often include table names, column names, and constraint details.

Migrations are reversible with a documented rollback path

DOWN migration or manual rollback script for every schema change. Test it before production.

No direct database access from client-side code

Even if your DB client 'works' from the browser, it means your credentials are in the bundle.

Backup strategy documented and tested — restore procedure verified at least once

An untested backup is not a backup. Restore from backup to a staging environment to confirm.

API & Input Validation

6 items

Input validation on all incoming request bodies — schema or runtime type checking

Zod, Valibot, or equivalent. Validate shape, type, length, and allowed values — not just presence.
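
To show what checking shape, type, length, and allowed values means in practice, here is a hand-rolled stand-in for a schema library. The `SignupInput` fields and the `parseSignup` function are hypothetical; a real project would more likely express the same rules as a Zod or Valibot schema:

```typescript
// Hypothetical request body for illustration.
interface SignupInput {
  email: string;
  plan: "free" | "pro";
}

// Rejects anything that is not exactly the expected shape; never trusts the caller.
function parseSignup(body: unknown): SignupInput {
  if (typeof body !== "object" || body === null) throw new Error("body must be an object");
  const { email, plan } = body as Record<string, unknown>;
  // Type, length, and a deliberately loose format check.
  if (typeof email !== "string" || email.length > 254 || !/^[^@\s]+@[^@\s]+$/.test(email)) {
    throw new Error("invalid email");
  }
  // Allowed values, not just presence.
  if (plan === "free" || plan === "pro") return { email, plan };
  throw new Error("invalid plan");
}
```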

Rate limiting on authentication endpoints (login, password reset, 2FA verify)

Unprotected login endpoints allow credential stuffing at scale. 5–10 requests per minute per IP is a reasonable default.
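
A fixed-window counter is one simple way to enforce such a limit. The sketch below is in-memory only and takes the clock as a parameter so it can be tested. `FixedWindowLimiter` is a hypothetical class, and a deployment with multiple instances would need shared storage instead of a local Map:

```typescript
// Fixed-window rate limiter keyed by client identifier (for example, an IP address).
// In-memory sketch: state is lost on restart and not shared between instances.
class FixedWindowLimiter {
  private readonly hits = new Map<string, { windowStart: number; count: number }>();
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true if a request from `key` is allowed at time `now` (milliseconds).
  allow(key: string, now: number): boolean {
    const entry = this.hits.get(key);
    if (!entry || now - entry.windowStart >= this.windowMs) {
      this.hits.set(key, { windowStart: now, count: 1 }); // start a new window
      return true;
    }
    if (entry.count >= this.limit) return false; // over the limit in this window
    entry.count += 1;
    return true;
  }
}
```

Passing the time in explicitly also makes it straightforward to test that the limit fires at the right count and resets when the window rolls over.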

Rate limiting on expensive operations (AI calls, file uploads, email sends)

A single user triggering 10,000 AI API calls will cost you money before you notice.

CORS configured explicitly — not wildcard (*) in production

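Access-Control-Allow-Origin: * lets any website read responses to non-credentialed requests. Browsers refuse the wildcard on credentialed requests, so the usual workaround is to echo the request's Origin header alongside Access-Control-Allow-Credentials: true, which exposes authenticated responses to every site. Allowlist exact origins instead.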
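
An explicit configuration can be sketched as an exact-match origin lookup that emits CORS headers only for known origins. The origin value and the `corsHeaders` helper are assumptions for illustration:

```typescript
// Hypothetical list of front-end origins allowed to call this API.
const ALLOWED_ORIGINS: ReadonlySet<string> = new Set(["https://app.example.com"]);

// Returns CORS headers for exact-match origins; unknown or missing origins get none.
function corsHeaders(origin: string | null): Record<string, string> {
  if (origin === null || !ALLOWED_ORIGINS.has(origin)) return {};
  return {
    "Access-Control-Allow-Origin": origin,
    "Access-Control-Allow-Credentials": "true",
    "Vary": "Origin", // the response differs per origin, so caches must key on it
  };
}
```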

Request size limits enforced — especially on file upload and JSON body endpoints

Without limits, a single 1GB POST can exhaust your worker's memory budget.

Content-Type validation on file uploads — MIME type checked server-side

Client-specified content type is trivially forged. Inspect magic bytes for file type verification.
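
Magic-byte inspection compares the first bytes of the upload against known file signatures. A small sketch covering only PNG and PDF, where `sniffMime` is a hypothetical helper and a production system would likely use a maintained detection library with far more formats:

```typescript
// Leading bytes ("magic numbers") for two common formats.
const SIGNATURES: Record<string, number[]> = {
  "image/png": [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a],
  "application/pdf": [0x25, 0x50, 0x44, 0x46], // "%PDF"
};

// Returns the detected MIME type, or null if no known signature matches.
function sniffMime(bytes: Uint8Array): string | null {
  for (const mime of Object.keys(SIGNATURES)) {
    const sig = SIGNATURES[mime];
    if (bytes.length >= sig.length && sig.every((b, i) => bytes[i] === b)) return mime;
  }
  return null;
}
```

The server would then reject the upload when the detected type is null or disagrees with the set of types the endpoint accepts, regardless of the Content-Type the client sent.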

Billing & Payments

5 items

Stripe webhook signature verified (not just parsing the JSON payload)

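Stripe.webhooks.constructEvent() must be called with the raw request body, the Stripe-Signature header, and STRIPE_WEBHOOK_SECRET before any processing. Verification needs the exact bytes Stripe sent; a body that has been parsed and re-serialized will not match the signature.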

Webhook handler is idempotent — safe to replay any event

Stripe retries webhooks that fail and may deliver the same event more than once. INSERT OR IGNORE keyed on event.id, or an equivalent idempotency check, prevents the same event from being applied twice.
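
The idea can be sketched with an in-memory set standing in for a table that has a unique constraint on the event id. `handleOnce` and `processedEventIds` are hypothetical names; in a real handler the duplicate check and the state change would belong in one database transaction:

```typescript
// In-memory stand-in for a table with a UNIQUE constraint on the Stripe event id.
const processedEventIds = new Set<string>();

// Runs `apply` at most once per event id. Replays return false and change nothing.
function handleOnce(eventId: string, apply: () => void): boolean {
  if (processedEventIds.has(eventId)) return false;
  apply(); // if this throws, the id is not recorded, so a retry can process it
  processedEventIds.add(eventId);
  return true;
}
```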

Failed payment handling does not leave database in inconsistent state

If the webhook fails halfway through updating subscription state, is the tenant in limbo? Use transactions.

Tier gating enforced server-side — not just in the UI

Checking the user's tier only in a React component means any API client bypasses the gate entirely.

Checkout flow has server-side pre-flight check for existing active subscriptions

Double-checkout creates two active subscriptions in Stripe. Check before creating a Checkout Session.

Logging & Observability

5 items

Error logging to a monitoring service — not just console.log or wrangler tail

console.log is not persistent. Sentry, Datadog, or structured D1/KV logging survives process restarts.

No PII in logs — email, phone, SSN, payment data scrubbed before logging

Log user IDs and tenant IDs, not email addresses. A log file is not a GDPR-compliant data store.
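
Scrubbing can be applied at the point where log fields are assembled. A minimal sketch, assuming a hypothetical list of sensitive field names and a hypothetical `scrub` helper that only inspects top-level keys:

```typescript
// Hypothetical field names treated as PII; extend to match your own payloads.
const PII_KEYS: ReadonlySet<string> = new Set(["email", "phone", "ssn", "cardNumber"]);

// Returns a shallow copy that is safe to log: listed keys are replaced with a marker.
// Nested objects are not traversed in this sketch.
function scrub(fields: Record<string, unknown>): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const key of Object.keys(fields)) {
    out[key] = PII_KEYS.has(key) ? "[redacted]" : fields[key];
  }
  return out;
}
```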

Request tracing enabled — trace IDs propagated through all downstream calls

Without trace IDs, debugging a production failure means correlating timestamps across disconnected logs.

Alert on anomalous error rates — not just on downtime

A 20% error rate on /api/users that doesn't take the service fully down will not trigger an uptime monitor.

Health check endpoint available and returns build hash or deploy timestamp

GET /health → {"status":"ok","build":"abc123"}. Useful for confirming which version is deployed.

Error Handling

4 items

Generic error messages to users in production — no stack traces in API responses

Stack traces reveal file paths, library versions, and internal variable names to attackers.

All async operations have error boundaries or catch blocks

An unhandled promise rejection in a Cloudflare Worker fails the request. Every awaited call must be covered by a try/catch or a .catch() handler, at minimum one wrapping the whole request handler.

Unhandled promise rejections logged and alerted on

addEventListener('unhandledrejection', handler) in the worker global scope catches rejections that escape, so they can be logged and alerted on.

Graceful degradation on dependency failures — external APIs, DB, KV

If your KV session store is unavailable, should all users get 500? Or can you degrade gracefully to read-only mode?

Testing

5 items

Auth flow tests: login, logout, session expiry, protected route rejection

The most important tests to have. Auth regressions are silent until a user reports data they shouldn't see.

Multi-tenant isolation test: confirm tenant A cannot read tenant B's data

Seed two tenants, cross-query from tenant A, assert 404 not 200. Run this test on every schema change.

Billing test: checkout, webhook receipt, tier upgrade, failed payment

Use the Stripe CLI (stripe listen to forward events, stripe trigger to generate them) to exercise your webhook handler with realistic payloads in your test environment.

Rate limit test: confirm limits are enforced at the specified threshold

Don't just test that rate limiting exists — test that it fires at the right count and resets correctly.

Coverage target documented and measured in CI — not just locally

A coverage gate that only runs on your laptop is not a coverage gate.

Governance

4 items

Threat model exists and has been reviewed — even informally

A one-page STRIDE analysis for your specific architecture is better than generic security advice from a blog post.

ADRs document key architectural decisions — auth strategy, data isolation approach, third-party dependencies

ADRs prevent the next developer (or AI agent) from undoing decisions that have non-obvious reasons.

Architectural constraints documented for AI coding agents in machine-readable form

A .ai/constraints.yaml file that Cursor or Claude Code is configured to read steers agents away from bypassing auth or adding insecure patterns.

CHANGELOG maintained for significant changes — what shipped, when, and what it changed

A CHANGELOG is not a luxury. It is how you debug production incidents that correlate with recent deploys.

Deployment & Rollback

5 items

Rollback plan documented for every deployment — not improvised in an outage

Before you deploy, know which wrangler rollback command, git revert, or database migration you would run to undo it.

Feature flags in place for high-risk changes — especially billing and auth changes

A feature flag lets you disable a broken feature without a full rollback.

Database migrations tested on staging before production — not skipped

A migration that drops a column and adds a new one is a multi-step operation that can fail midway.

CDN and cache invalidation strategy documented — stale assets after deploy are a user-visible bug

Deploy a new API contract, forget to bust the CDN cache, watch old JS call new endpoints with wrong payloads.

Zero-downtime deployment verified — especially for schema migrations that run while traffic is live

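Adding a NOT NULL column without a default fails outright on a populated table in SQLite and Postgres. Where it does succeed, inserts from the still-running old code fail until the deploy completes. Add the column with a default, or as nullable with a backfill.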

Field findings

What teams most commonly miss

These four gaps appear repeatedly in AI-generated SaaS apps. All four are non-obvious at build time and visible only when an attacker looks — or when your first enterprise prospect runs a security review.

01 critical

Webhook signature not verified

The Stripe webhook handler parses the event body as JSON and acts on it without calling constructEvent() with the webhook secret. Any attacker can POST a forged invoice.paid or customer.subscription.updated event and trigger an upgrade to a paid tier without paying. This is the single most common billing security gap in AI-generated apps.

fix: src/routes/api/webhooks.ts — call Stripe.webhooks.constructEvent() with raw body before any processing.
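
To make the check concrete, the sketch below follows the manual verification steps Stripe documents for its signature header: an HMAC-SHA256 over the timestamp, a dot, and the raw body. It is illustrative only and assumes a Node-compatible crypto module; `verifyStripeSignature` is a hypothetical function, and real handlers should call the Stripe library as described in the fix above:

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Illustrative only: shows what signature verification checks. Prefer the Stripe library.
// `header` is the Stripe-Signature value, e.g. "t=1700000000,v1=<hex digest>".
function verifyStripeSignature(
  rawBody: string,
  header: string,
  secret: string,
  nowSec: number,
  toleranceSec = 300, // assumed replay window for this sketch
): boolean {
  const parts = header.split(",").map((p) => p.split("="));
  const timestamp = parts.find(([k]) => k === "t")?.[1];
  const candidates = parts.filter(([k]) => k === "v1").map(([, v]) => v);
  if (!timestamp || candidates.length === 0) return false;
  // Reject stale timestamps so a captured request cannot be replayed later.
  if (Math.abs(nowSec - Number(timestamp)) > toleranceSec) return false;
  // The signed payload is the timestamp, a dot, and the unmodified request body.
  const expected = createHmac("sha256", secret).update(`${timestamp}.${rawBody}`).digest();
  return candidates.some((sig) => {
    const given = Buffer.from(sig, "hex");
    return given.length === expected.length && timingSafeEqual(given, expected);
  });
}
```

Because the digest covers the raw bytes, any change to the body, the timestamp, or the secret makes the comparison fail.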

02 critical

Missing tenant_id predicate in D1 queries

A multi-tenant app with a single shared D1 database and no RLS must filter every query by tenant_id. A single SELECT * FROM users WHERE id = ? without AND tenant_id = ? means any authenticated user can read any other tenant's data by guessing or enumerating UUIDs. AI builders generate the CRUD; they do not generate the isolation predicate.

fix: Audit every D1 query. Use a withTenant() helper that appends tenant_id binding to every query.
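
One possible shape for such a helper is shown below. The signature is an assumption, since the text names only `withTenant()`: it takes the SELECT clause and the caller's WHERE condition separately so the tenant predicate can be added without precedence surprises.

```typescript
// Hypothetical result shape: SQL text plus the values to bind.
interface TenantQuery {
  sql: string;
  params: unknown[];
}

// Wraps the caller's condition in parentheses and always appends the tenant predicate,
// so an OR inside `where` cannot widen the result beyond the current tenant.
function withTenant(
  select: string,
  where: string,
  params: unknown[],
  tenantId: string,
): TenantQuery {
  return {
    sql: `${select} WHERE (${where}) AND tenant_id = ?`,
    params: [...params, tenantId],
  };
}
```

A call such as withTenant("SELECT * FROM users", "id = ?", [userId], session.tenantId) yields a query that returns no row when the id belongs to another tenant. The tenant id should come from the authenticated session, never from the request body.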

03 high

Tier gate enforced only in the UI

The Pro feature check is a conditional in a React component: if (user.tier === 'pro') return <ProFeature />. The API endpoint behind the feature has no server-side tier check. A free user with the browser dev tools can call the endpoint directly. Gating only in the UI means the gate does not exist.

fix: Every API endpoint for a gated feature must read and verify the tier from the session or database before returning data.
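
The server-side check can be very small. A sketch, assuming a hypothetical `Session` shape loaded on the server and a hypothetical `proFeatureStatus` helper that the API handler consults before doing any work:

```typescript
type Tier = "free" | "pro";

// Hypothetical session record, read from the server-side store or database.
interface Session {
  userId: string;
  tier: Tier;
}

// Runs inside the API handler, independent of what the UI chose to render.
function proFeatureStatus(session: Session | null): 200 | 401 | 403 {
  if (session === null) return 401; // not authenticated
  if (session.tier !== "pro") return 403; // authenticated, but wrong tier
  return 200;
}
```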

04 high

Logout does not invalidate the session server-side

The logout handler deletes the session cookie client-side. The session token in KV is never invalidated. The cookie is gone — but anyone who captured the token value (via XSS, log file, or shoulder-surfing) can still authenticate with it for however long the TTL lasts. Real revocation requires deleting the server-side session record.

fix: POST /api/auth/logout → KV.delete(sessionToken) before clearing the cookie.
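
The order of operations can be sketched with an in-memory Map standing in for the KV session store. The names `sessions`, `logout`, and `isAuthenticated` are hypothetical, and the real KV calls are asynchronous:

```typescript
// In-memory stand-in for the server-side session store (KV in the text above).
const sessions = new Map<string, { userId: string }>();

// Revokes the server-side record first, then returns a Set-Cookie value that clears the cookie.
function logout(sessionToken: string): string {
  sessions.delete(sessionToken);
  return "session=; Max-Age=0; Path=/; HttpOnly; Secure; SameSite=Strict";
}

// Authentication consults the store, so a captured token stops working after logout.
function isAuthenticated(sessionToken: string): boolean {
  return sessions.has(sessionToken);
}
```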

The governance layer

Stackbilder generates the threat model, ADRs, and test plan that underpin this checklist.

The Governance section of this checklist — threat model, ADRs, architectural constraints — is generated automatically for every scaffold. The Testing section maps directly to the test plan. The Security items trace back to the STRIDE threat model.

Start free. Full governance suite with every scaffold. Cloudflare Workers scaffold with D1, KV, auth middleware, and Stripe webhook verification wired correctly from the first file.


What ships with every scaffold

.ai/threat-model.md

STRIDE-based security analysis for your specific architecture. Covers the security items of this checklist.

.ai/adr-001.md + adr-002.md

Architectural decisions documented with context, alternatives, and consequences. The Governance section, generated.

.ai/test-plan.md

Integration test specs for auth, billing, multi-tenant isolation, and rate limiting. Covers the Testing section exactly.

.ai/constraints.yaml

Machine-readable guardrails for AI coding agents. Prevents the architectural regressions that the checklist protects against.

wrangler.toml

D1, KV, secrets, and triggers configured correctly. Environment separation built in.

Generation time: ~20ms

Free tier scaffolds per month: 3

Credit card required: no

Questions about this checklist

How should I use this checklist?

Work through it category by category before your first production deployment with real users or real money. Treat it as a gate, not a guideline — if you cannot check an item, that is a decision you are consciously deferring, not an oversight. Document what you are skipping and why. The Governance section is the right place to record those decisions.

What do teams most commonly miss before shipping?

In order of frequency: (1) Stripe webhook signature verification: webhooks parsed without checking the signature allow forged billing events; (2) cross-tenant data access: missing tenant_id predicates in multi-tenant queries; (3) server-side tier gating: billing tier enforced only in the UI; (4) session revocation: logout does not invalidate the session token server-side. All four are non-obvious at build time and stay invisible until exploited.

Does Stackbilder automate all of this?

Stackbilder generates the governance layer — threat model, ADRs, test plan, and architectural constraints — that covers the Governance section and directly informs the Testing and Security items. It does not replace your deployment pipeline, secret rotation tooling, or observability stack. The generated test plan tells you which tests to write; you still write and run them. The generated threat model identifies the risks; you still implement the mitigations.

Does passing this checklist mean I am SOC 2 compliant?

No. SOC 2 requires an independent auditor to review your controls against the Trust Services Criteria, typically over an observation period. It cannot be achieved by self-attestation. This checklist covers the technical controls that are prerequisites for a SOC 2 engagement. Without them, an audit surfaces the gaps immediately. Completing this checklist puts you in a defensible starting position. For teams preparing for formal compliance, our bespoke services include dedicated security review engagements.
