
The Web Site is the Demo

An exhaustive technical description of the Deal ex Machina website: stack, architecture, code quality, CI/CD, security, privacy, GDPR, EU AI Act, performance scores, roadmap, and why the web is the demo.

This site is the demo. Not a side project with a separate portfolio: the website you are reading is the proof of technical depth, from infrastructure and security to front-end performance and developer experience. It was fully hand-coded with Cursor [1] (AI-assisted IDE), with no low-code or page builders. What follows is a CTO-level, bottom-to-top walkthrough of how it is put together: runtimes, data, APIs, quality gates, CI/CD, security, and the performance numbers we hold ourselves to.

Why not a classic showcase site (WordPress, Wix, Squarespace)? Because for a technical consultancy, the site is the product: every dependency, every API, every header and cookie is a statement. A showcase site built with themes and plugins hides the very thing we sell—architecture, security, performance—behind a black box. Here, there is no theme to blame, no plugin to patch: one codebase, full control, type-safe from DB to UI, and a deployment that we can explain line by line. The narrative is simple: what you see is what we build.


1. Runtimes and Foundation

Node.js [2]: The package.json engines field requires >=20.19.6 (and npm >=10.8.2). We track the LTS line and avoid floating majors in production.

TypeScript [3]: 5.9.x with strict mode (strict: true, noEmit: true, isolatedModules: true, moduleResolution: "bundler"). No any in production code; path aliases @/* for clean imports. The codebase is ESM-only where possible (see project conventions).

Framework: Next.js 16 [4] (App Router). We use the standalone output for the Docker/Koyeb deployment and support static export for Cloudflare Pages when NEXT_OUTPUT=export or CLOUDFLARE_PAGES=1. So: one codebase, two deployment targets (Node server vs static + edge).


2. Data and Backend

Database: PostgreSQL [5] accessed via Drizzle ORM [6] (drizzle-orm, drizzle-kit). Schema and migrations live in drizzle/; we use db:push, db:generate, db:migrate, and db:studio for development. Connection uses pooling (DATABASE_POOLING_URL_IP4 or equivalent); all writes go through a shared transaction helper (withTransaction) with rollback on failure and structured logging.
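The transaction helper can be sketched as follows. This is a simplified, self-contained version: the real withTransaction wraps Drizzle's transaction API, and the Tx interface and logging call here are illustrative stand-ins.

```typescript
// Minimal sketch of a withTransaction helper: BEGIN/COMMIT with rollback
// on failure and a structured log entry. The Tx interface is a stand-in
// for the real Drizzle-backed connection.
interface Tx {
  query(sql: string): Promise<void>;
}

async function withTransaction<T>(
  tx: Tx,
  fn: (tx: Tx) => Promise<T>,
): Promise<T> {
  await tx.query("BEGIN");
  try {
    const result = await fn(tx);
    await tx.query("COMMIT");
    return result;
  } catch (err) {
    // Any failure inside fn rolls the whole write back.
    await tx.query("ROLLBACK");
    console.error("transaction rolled back", { error: String(err) });
    throw err;
  }
}
```

The point of centralizing this is that no call site can forget the rollback path: every write goes through the same helper.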

Auth and storage: Supabase [7] (SSR-compatible client and server utilities) for 6-digit email OTP sign-in and any Supabase-backed features. Sessions rely on Supabase cookies (HttpOnly, Secure, SameSite per project configuration).

Content: Content Collections [8] (@content-collections/core, @content-collections/markdown, @content-collections/next) for the blog. Markdown lives in content/blog/ with a Zod [9] schema (title/en/fr, excerpt, slug, category, accessLevel, etc.). We use remark-gfm [10] for GitHub Flavored Markdown. Build-time compilation only; no runtime markdown parsing on the client.


3. APIs and Application Logic

API routes (Next.js App Router):

  • POST /api/chat — streaming chat with the AI (Wagmi); session limits, rate limiting, and content moderation. Contact with the team is through the chat only — no separate contact form.
  • GET/POST /api/chat/status — chat status/availability.
  • GET /api/llm/status — LLM provider status.
  • GET /api/health — health check; minimal payload in production (status only), richer in development (uptime, memory, env).
  • POST /api/contacts/classification/request — contact classification (used by Wagmi for role-based behaviour).
  • GET /api/auth/callback — Supabase auth callback.

AI/LLM: Vercel AI SDK [11] (ai, @ai-sdk/openai, @ai-sdk/openai-compatible, @ai-sdk/react) with Assistant UI [12] (and react-ai-sdk, react-markdown) for the chat UI. LLM config is validated with Zod (LLM_API_URL, LLM_MODEL, LLM_INTERFACE, LLM_API_KEY); production requires LLM_API_URL. We support OpenAI-compatible endpoints (e.g. local Ollama).
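The validation rule described above can be sketched without the Zod dependency. The real code uses a Zod schema; this hand-rolled version shows only the shape of the rule. The fallback URL and model name are illustrative defaults, not the project's actual values.

```typescript
// Sketch of LLM env validation: LLM_API_URL is mandatory in production,
// while development falls back to a local OpenAI-compatible endpoint.
// Defaults below are assumptions for illustration.
interface LlmConfig {
  apiUrl: string;
  model: string;
  apiKey?: string;
}

function parseLlmEnv(
  env: Record<string, string | undefined>,
  nodeEnv: string,
): LlmConfig {
  const apiUrl = env.LLM_API_URL ?? "";
  if (nodeEnv === "production" && apiUrl === "") {
    throw new Error("LLM_API_URL is required in production");
  }
  return {
    apiUrl: apiUrl || "http://localhost:11434/v1", // e.g. local Ollama
    model: env.LLM_MODEL ?? "qwen2.5:1.5b",
    apiKey: env.LLM_API_KEY,
  };
}
```

Failing fast at startup, rather than on the first chat request, is the whole value of validating env vars up front.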

Dual-tier LLM architecture: The chat operates two model tiers. Anonymous visitors are served by a small model (Qwen 2.5 1.5B on CPU via Koyeb) — fast, cheap, but limited in reasoning. Authenticated users unlock a larger GPU-backed model with deeper context handling. Fallback routing (GPU → CPU) ensures availability even when the GPU backend is down. Model selection is transparent: visitors see a notice that Wagmi is in "small model mode" and are nudged to authenticate.
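The GPU → CPU fallback amounts to a small wrapper around two completers. The completer signature here is a hypothetical stand-in for the real provider calls; the routing logic is the point.

```typescript
// Illustrative GPU → CPU fallback: try the large GPU-backed model first,
// degrade to the small CPU model when the GPU backend errors out.
type Completer = (prompt: string) => Promise<string>;

function withFallback(gpu: Completer, cpu: Completer): Completer {
  return async (prompt) => {
    try {
      return await gpu(prompt);
    } catch {
      // GPU backend down or erroring: the chat stays available.
      return cpu(prompt);
    }
  };
}
```
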

Local RAG for small-model grounding: Because a 1.5B model hallucinates easily, we built a lightweight BM25-style RAG (local-rag.ts). At query time, the user message is tokenized (with bilingual EN/FR stopword removal), scored against pre-segmented chunks from wagmi-skills.md and ai.txt, and the top 4 matching snippets are injected into the system prompt. No vector database, no embeddings — just token overlap scoring. This is enough to ground answers on verified company facts and prevent the small model from inventing services, people, or partners that don't exist.
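The retrieval step can be sketched in a few lines. This is a deliberately simplified version of what local-rag.ts does: the stopword list is tiny and the scoring is raw token overlap rather than BM25-style weighting, but the mechanism is the same.

```typescript
// Minimal token-overlap retrieval: tokenize, drop EN/FR stopwords, score
// each chunk by shared tokens with the query, return the top-k snippets.
// The stopword list and scoring are simplified for illustration.
const STOPWORDS = new Set(["the", "a", "is", "le", "la", "est", "de"]);

function tokenize(text: string): string[] {
  return text
    .toLowerCase()
    .split(/[^a-zà-ÿ0-9]+/)
    .filter((t) => t.length > 1 && !STOPWORDS.has(t));
}

function topChunks(query: string, chunks: string[], k = 4): string[] {
  const q = new Set(tokenize(query));
  return chunks
    .map((chunk) => ({
      chunk,
      score: tokenize(chunk).filter((t) => q.has(t)).length,
    }))
    .filter((c) => c.score > 0) // no match, no injection
    .sort((a, b) => b.score - a.score)
    .slice(0, k)
    .map((c) => c.chunk);
}
```

The top chunks are then prepended to the system prompt, which is what grounds the 1.5B model on verified facts.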

SFT dataset for fine-tuning: We generate a Supervised Fine-Tuning dataset (scripts/generate-wagmi-sft-dataset.ts) from the blog posts, the knowledge base, and ai.txt. The script produces 267 training examples and 47 eval examples in JSONL format, covering identity guardrails, service descriptions, authentication nudging, uncertainty expression, and conciseness requirements — all in both French and English. The dataset is designed for frameworks like Unsloth and targets the Qwen 1.5B model specifically.
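JSONL emission itself is trivial: one JSON object per line. The message shape below follows common SFT conventions (system/user/assistant turns); the field names are assumptions, not necessarily the script's actual schema.

```typescript
// Sketch of JSONL serialization for an SFT dataset: one chat-format
// training example per line. Field names are illustrative.
interface SftExample {
  messages: { role: "system" | "user" | "assistant"; content: string }[];
}

function toJsonl(examples: SftExample[]): string {
  return examples.map((e) => JSON.stringify(e)).join("\n");
}
```
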

Behavioural benchmark: A dedicated benchmark script (scripts/benchmark-rag-qwen15b.ts) runs 20+ test cases against the small model with and without RAG context, measuring factual accuracy, hallucination rate, auth-upsell compliance, and latency. This is the quality gate for the small model: if it fails the benchmark, it doesn't ship.

MaxTokens capping: Small LLMs (1.5B–3B) often fail to emit an end-of-sequence token, producing correct content followed by infinite repetition. We cap maxTokens at 300 for small models (~200 words, matching the prompt guidelines) and 1024 for larger models.
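As a pure function, the cap looks like this. The 3B threshold is an illustrative cutoff derived from the size range mentioned above; the cap values (300 and 1024) are the ones we use.

```typescript
// maxTokens cap by model size: small models (<= ~3B params) get 300 tokens
// to stop runaway repetition; larger models get 1024.
function capMaxTokens(modelParamsB: number): number {
  return modelParamsB <= 3 ? 300 : 1024;
}
```
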

Validation: Zod everywhere — request bodies, env vars, and content-collections schema. Invalid input fails fast with typed error responses.

Errors: Custom ApiError hierarchy (api-error.ts): ValidationError, NotFoundError, etc., with toJSON() for consistent API responses and integration with a structured logger (request id, session id, module). No raw stack traces to the client in production.
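The hierarchy can be sketched as follows. Class names and the toJSON() contract follow the description above; constructor details and error codes are illustrative.

```typescript
// Sketch of the ApiError hierarchy: typed status + code, toJSON() for
// consistent API responses, and no stack trace in the serialized payload.
class ApiError extends Error {
  constructor(
    message: string,
    public readonly status: number,
    public readonly code: string,
  ) {
    super(message);
    this.name = new.target.name;
  }
  toJSON() {
    // Only code and message reach the client; never the stack.
    return { error: { code: this.code, message: this.message } };
  }
}

class ValidationError extends ApiError {
  constructor(message: string) {
    super(message, 400, "VALIDATION_ERROR");
  }
}

class NotFoundError extends ApiError {
  constructor(message: string) {
    super(message, 404, "NOT_FOUND");
  }
}
```

A single catch-all in the route layer can then match on instanceof ApiError and serialize with toJSON(), which is what keeps error responses uniform across endpoints.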


4. Front End and UX

UI: React 19 [13] with Tailwind CSS [14] and Radix UI [15] primitives (Avatar, Dialog, Label, Slot, Tooltip). We use class-variance-authority and tailwind-merge for component variants. Icons: lucide-react (tree-shaken via Next.js optimizePackageImports). Zustand for client state where needed.

i18n: next-intl [16] (v4) for EN/FR: messages in src/i18n/messages/{en,fr}.json, locale in the path ([locale]), and shared metadata/alternates for SEO.

Routing: App Router with [locale] and route groups: (routes) for blog and pages, (legal) for legal pages. Canonical URLs and hreflang are generated in metadata (see metadata/utils.ts) so every page has correct <link rel="canonical"> and alternates.

Performance (front-end):

  • Critical CSS: Hero and above-the-fold CSS inlined; Critters [17] used in a postbuild script for static export to extract and inline critical CSS.
  • Code splitting: Chat and other heavy UI are loaded with dynamic(..., { ssr: false }) so the main bundle stays small; webpack splitChunks separate framework, lib (Radix, assistant-ui, lucide), vendor, and common.
  • Images: Next.js Image with AVIF/WebP, deviceSizes/imageSizes tuned, CSP for images; for static export we use unoptimized where required.
  • Fonts: next/font with adjustFontFallback to avoid CLS; display: swap and preload where appropriate.
  • Resource hints: preconnect/dns-prefetch for critical origins.

SEO and crawlers: robots.txt, llms.txt, and ai.txt for AI crawlers; Schema.org JSON-LD generated server-side; sitemaps for the site and blog. Auth/error pages are noindex.


5. Code Quality and DX

Linting and formatting: Biome [18] (v2). We use biome check and biome format; config enforces double quotes, 120 line width, LF, and organized imports. No ESLint/Prettier in this project.

Git hooks: simple-git-hooks + lint-staged. On pre-commit we run Biome on *.{js,jsx,ts,tsx} and actionlint on .github/workflows/*.{yml,yaml}, so unformatted or lint-breaking code never lands in a commit.

Tests:

  • Vitest [19] (v4) for unit and integration: src/__tests__/unit, src/__tests__/integration, src/__tests__/security, and src/components/__tests__. Coverage with @vitest/coverage-v8; we run test:run in CI and test:ci with coverage.
  • Playwright [20] for E2E: e2e/ (chat, hydration, error boundaries, contacts recognition). No flaky patterns; tests are part of the definition of done.
  • Lighthouse CI [21] (@lhci/cli): lighthouserc.js defines assertions (FCP, LCP, TBT, CLS, Speed Index, accessibility, best practices, SEO). Runs on PRs to main/dev and on demand.

Type checking: tsc --noEmit as a separate step (type-check). CI runs it so type safety is enforced before merge.

Dependencies: Dependabot is enabled (npm weekly, GitHub Actions [22] monthly) with grouped minor/patch updates. We run npm audit --audit-level=critical in CI before build. Overrides for known issues (glob, rimraf, tar, cross-spawn) are declared in package.json.


6. CI/CD and Deployment

GitHub Actions:

  1. Deploy Staging (Koyeb) [23] — on push to dev (and workflow_dispatch): install deps, critical audit, lint, test:run, then Docker [24] build and push to Docker Hub (jeanbapt/deal-ex-machina-web). Second job: Koyeb CLI to update the deal-ex-machina-staging/web service (Docker image, env, health check on /api/health, 60s grace period). We wait for deployment HEALTHY and then hit the staging URL for a final health check.
  2. Lighthouse CI — on PRs to main/dev (and manual): checkout, npm ci, build (with placeholder env), then npm run lighthouse; results uploaded as artifacts (and optionally to LHCI server).
  3. Deploy Production (Cloudflare Pages) [25] — manual only (workflow_dispatch): optional staging health check, then build with NEXT_OUTPUT=export and CLOUDFLARE_PAGES=1, deploy out/ via Wrangler to Cloudflare Pages. Production and preview environments; custom domain and redirects documented.

Docker: Multi-stage Dockerfile (Debian Bookworm slim base, security updates, non-root user nextjs). We copy only .next/standalone, .next/static, and public; start with node --max-old-space-size=512 server.js on port 8000. No healthcheck in the image so Koyeb can own health checks. .dockerignore keeps build context small and excludes dev artifacts and Lighthouse output.

Secrets: No secrets in repo. We use GitHub Actions secrets (e.g. DOCKER_HUB_TOKEN, KOYEB_API_TOKEN, CLOUDFLARE_API_TOKEN_PAGE, CLOUDFLARE_ACCOUNT_ID) and env at runtime (Koyeb env vars for the service). Docs (e.g. KOYEB_DOCKER_HUB_SECRET) describe how to configure registry and deployment.


7. Security

Headers (production only, in next.config.mjs): Content-Security-Policy (default-src 'self', script/style/img/font/connect tailored, frame-ancestors 'none', object-src 'none', upgrade-insecure-requests), Strict-Transport-Security (max-age=31536000; includeSubDomains; preload), X-Frame-Options: DENY, X-Content-Type-Options: nosniff, Referrer-Policy: strict-origin-when-cross-origin, Permissions-Policy (camera, microphone, geolocation disabled). poweredByHeader: false.
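The header set above can be expressed as a small helper feeding next.config.mjs's headers() hook. The exact factoring is illustrative; the values are the ones listed above (CSP omitted here because its directives are tailored per origin).

```typescript
// Production security headers as a data structure, matching the list above.
// A helper like this keeps next.config.mjs declarative and testable.
function securityHeaders(): { key: string; value: string }[] {
  return [
    {
      key: "Strict-Transport-Security",
      value: "max-age=31536000; includeSubDomains; preload",
    },
    { key: "X-Frame-Options", value: "DENY" },
    { key: "X-Content-Type-Options", value: "nosniff" },
    { key: "Referrer-Policy", value: "strict-origin-when-cross-origin" },
    {
      key: "Permissions-Policy",
      value: "camera=(), microphone=(), geolocation=()",
    },
  ];
}
```
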

CORS: Allowed origin is configurable; we validate Origin for state-changing requests and document allowed methods and headers. Security tests cover CSRF/CORS (origin allowlist, SameSite cookie requirements, POST-only for mutations).

Rate limiting: In-memory rate limiter (per identifier): chat 20 req/15 min. Applied to public APIs. (Scaling to Redis or platform rate limits is documented as a future step.)
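A minimal fixed-window version of that limiter, matching the chat policy (20 requests per 15 minutes per identifier), looks like this. Eviction of stale entries and clock handling are simplified for illustration.

```typescript
// Fixed-window in-memory rate limiter: first hit opens a window; requests
// beyond the limit inside the window are rejected. `now` is injected to
// keep the function testable.
function createRateLimiter(limit: number, windowMs: number) {
  const hits = new Map<string, { count: number; windowStart: number }>();
  return (id: string, now: number): boolean => {
    const entry = hits.get(id);
    if (!entry || now - entry.windowStart >= windowMs) {
      hits.set(id, { count: 1, windowStart: now }); // new window
      return true;
    }
    entry.count += 1;
    return entry.count <= limit;
  };
}
```
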

Input: Zod for all API inputs; @2toad/profanity for content moderation in chat. SQL is only via Drizzle (parameterized). React escaping and CSP mitigate XSS. We have dedicated tests for CSRF, CORS, session fixation, and request validation.

Health endpoint: In production the health response is { "status": "ok" } only — no stack traces, no internal details.
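The env-dependent payload is a one-branch function. The development fields mirror those mentioned in the API list above (uptime, memory, env); the exact shape is illustrative.

```typescript
// Health payload: minimal in production, richer in development.
function healthPayload(nodeEnv: string): Record<string, unknown> {
  if (nodeEnv === "production") return { status: "ok" };
  return {
    status: "ok",
    uptime: process.uptime(),
    memory: process.memoryUsage().rss,
    env: nodeEnv,
  };
}
```
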


8. Privacy, GDPR, and EU AI Act

We treat privacy and regulatory alignment as non-negotiable. The site is designed to be privacy-preserving by default and to comply with GDPR and the EU AI Act (Regulation 2024/1689).

Data minimization and consent: We collect only what is necessary. Chat: anonymous conversations are not persisted; we store messages only when the user submits an email and has given explicit consent (checkbox, logged with the submission). Stored chat data is retained for 7 days then removed. Technical data (e.g. IP, browser) is limited to what is needed for operation. All processing is justified under GDPR Article 6 (consent or legitimate interest). The chat email flow requires an explicit “I agree to data processing” and a link to the privacy policy; the API accepts a consent flag and rejects or does not persist when it is false.
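The two enforcement points — consent gating and the 7-day retention window — can be sketched as pure functions. Function and field names here are illustrative, not the API's actual contract; the consent requirement and retention period are the ones stated above.

```typescript
// Consent gate: identifiable chat data is persisted only when the
// submission carries an explicit consent flag.
interface ChatSubmission {
  email: string;
  messages: string[];
  consent: boolean;
}

function persistableMessages(sub: ChatSubmission): string[] | null {
  // No consent → nothing identifiable is stored.
  return sub.consent ? sub.messages : null;
}

// 7-day retention window for stored chat data.
const RETENTION_MS = 7 * 24 * 60 * 60 * 1000;

function shouldPurge(storedAt: number, now: number): boolean {
  return now - storedAt >= RETENTION_MS;
}
```
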

Rights and transparency: The privacy policy (EN/FR) describes what we collect, why, and for how long. Users are informed of their rights (access, rectification, erasure, restriction, portability, objection) and can contact the designated contact (DPO-style) to exercise them. We respond within GDPR timeframes. Cookies: we use only essential cookies (e.g. session, auth); no tracking cookies, no third-party analytics. Cookie preferences and a short explanation are exposed in the UI.

EU AI Act: We align with the Act’s transparency and risk obligations. Users are clearly informed when they interact with an AI system (the chat is presented as an assistant). We maintain (or can produce) technical documentation and human oversight; we do not use AI for prohibited practices (social scoring, manipulative or subliminal techniques, emotion recognition in sensitive contexts). The chatbot is treated as a limited-risk system and meets the transparency requirements (disclosure, no misleading anthropomorphism). Incident reporting is possible via the contact address. Terms and privacy policy both reference the AI Act and disclaim reliance on AI-generated content.

Implementation: Consent is required before persisting any identifiable chat data; the chat API and front end enforce this. Privacy and legal pages are linked from the footer and from consent flows. This is not a one-off compliance pass: any new feature that touches personal data or AI behaviour is designed with GDPR and the AI Act in mind from the start.


9. Performance Scores and Targets

We treat Lighthouse as a quality gate. Current targets and results (from local production builds and CI):

Lighthouse scores (typical):

  • Performance: 96% (target ≥95%).
  • Accessibility: 95–100% (target ≥90%).
  • Best Practices: 89–96% (target ≥90%; we’re close and improving).
  • SEO: 92% (target ≥90%); canonical and hreflang implemented.

Core Web Vitals:

  • LCP: 2.5–2.7 s (target <2.5 s; we’re just above on some runs).
  • FCP: ~907–911 ms (target <1.8 s).
  • TBT: 0 ms (target <200 ms).
  • CLS: 0 (target <0.1).
  • Speed Index: <3 s.

Improvements that got us here: lazy-loaded ChatSection (~100 KB deferred), server-side structured data, critical CSS inlining, preconnect/preload, font optimization, and aggressive code splitting. Details are in docs/ (e.g. PERFORMANCE_RESULTS_FINAL.md, LIGHTHOUSE_RESULTS_96_PERCENT.md).


10. What We Explicitly Do Not Use

No low-code or no-code site builders; no WordPress or generic CMS for the main site. No arbitrary any; no disabling strict TypeScript. No console in production (compiler removes it). No secrets in the repo or in client bundles. No health endpoint leaking internals in production. We do not ship without lint, type-check, and tests in CI.


11. Roadmap and Enhancements: Towards AI-Native, One Step at a Time

Direction matters as much as current state. The roadmap is explicit: evolve this site into an AI-native experience, where AI is not a widget bolted onto a classic page but the primary way the product thinks, assists, and adapts. That shift is done incrementally — one step at a time — so each change is shippable, measurable, and reversible.

What “AI-native” means here: The site and its flows are designed with AI as a first-class actor. Content, navigation, contact (via Wagmi, the chat — no separate form), and discovery are shaped so that an assistant can understand context, act on the user’s intent, and improve with usage — without replacing the existing, stable core. Today we have a chat with role-based behaviour and session limits; tomorrow we add richer context (e.g. page, locale, prior messages) and tool use; later we introduce proactive suggestions, summarisation, or guided flows. Each step is a discrete enhancement, with the same quality bar: type safety, tests, security, and performance.

Concrete enhancement steps (illustrative, not exhaustive):

  • Context and tool use: Feed the model structured context (current page, locale, blog slug) and expose safe tools (e.g. “open blog post”, “scroll to section”) so the assistant can act, not only answer. (Partially delivered: the local RAG now injects verified context from the knowledge base into every small-model query; tool use is next.)
  • Proactive assistance: Use page and scroll state to offer help or next steps when relevant, without being intrusive.
  • Content and discovery: Let the assistant recommend or summarise blog posts, or answer from site content in a grounded way. (Delivered: the RAG retrieves from blog-derived knowledge base; the SFT dataset trains on actual blog content.)
  • Contact and intent: Contact is already unified in Wagmi (the chat). Roadmap: tighten the path from “I want to talk to someone” to calendar or meeting, with the assistant guiding and pre-filling where appropriate.
  • Observability and iteration: Log usage and quality (within privacy constraints), run A/B or model variants where it makes sense, and keep rate limits and safety in place.

None of this requires a rewrite. The current stack — Vercel AI SDK, Assistant UI, Zod-validated APIs, and a clear separation between server and client — is built to absorb these steps. The roadmap is a sequence of such steps: each one merged, deployed, and validated before the next. The goal is a site that feels AI-native because every addition is designed for it, not retrofitted.


12. Why the Web Is the Demo

This stack is chosen so that the site itself demonstrates:

  • Architecture: Clear layers (app, components, lib, i18n), domain-oriented modules (chat, db, config, errors), and a single codebase for two deployment modes.
  • Reliability: Transactions, structured errors, health checks, and deployment gates.
  • Security: CSP, HSTS, validation, rate limiting, and tests that encode security expectations.
  • Performance: Measured and asserted (Lighthouse CI), with documented optimizations and trade-offs.
  • Maintainability: TypeScript strict, Biome, hooks, tests, and Dependabot.

The site is fully hand-coded with Cursor [1]: every route, component, and config described above was written and refined in the editor, with AI assistance for implementation speed and consistency, but without sacrificing control over architecture, security, or performance. If you are evaluating technical depth, the repo and the running site are the deliverables to inspect.


References

Links to the key components of the stack (official sites or GitHub):

  1. Cursor — AI-assisted IDE
  2. Node.js — JavaScript runtime
  3. TypeScript — Typed JavaScript
  4. Next.js — React framework (GitHub)
  5. PostgreSQL — Database
  6. Drizzle ORM — TypeScript ORM (GitHub)
  7. Supabase — Auth and backend (GitHub)
  8. Content Collections — Content layer for Next.js
  9. Zod — Schema validation (GitHub)
  10. remark-gfm — GitHub Flavored Markdown
  11. Vercel AI SDK — AI/LLM integration (GitHub)
  12. Assistant UI — Chat UI components (GitHub)
  13. React — UI library (GitHub)
  14. Tailwind CSS — Utility-first CSS (GitHub)
  15. Radix UI — Unstyled primitives (GitHub)
  16. next-intl — Internationalization (GitHub)
  17. Critters — Critical CSS inlining
  18. Biome — Linter and formatter (GitHub)
  19. Vitest — Unit test runner (GitHub)
  20. Playwright — E2E testing (GitHub)
  21. Lighthouse CI — Performance auditing
  22. GitHub Actions — CI/CD
  23. Koyeb — App deployment (staging)
  24. Docker — Containers
  25. Cloudflare Pages — Static/hybrid hosting (production)

Summary of key technologies: Node 20, TypeScript 5.9 (strict), Next.js 16 (App Router), React 19, Tailwind CSS, Drizzle ORM, PostgreSQL, Supabase, Content Collections, Zod, Vercel AI SDK, Assistant UI, next-intl, Radix UI, Biome, Vitest, Playwright, Lighthouse CI, Docker, Koyeb (staging), Cloudflare Pages (production), GitHub Actions.