Neural Oversight · Build notes

A daily editorial pipeline disguised as a web app

Forty-eight RSS feeds in. One curated, risk-scored, entity-graphed, sentiment-tracked, regulation-aware briefing out — every morning, across web, email, push, Slack, and the Play Store, with the same Claude call doing the editorial work a newsroom would do at four in the morning.

~14 min read · Next.js 14 · Supabase · Anthropic Claude · Resend · Vercel · Production · daily cron at 08:00 UTC

What this is

§01

Neural Oversight exists to answer one question, every morning, for AI governance professionals, policy makers, safety researchers, CTOs and venture capitalists:

What happened in AI today that I need to know about — and what does it mean?

Where a typical feed reader gives you a chronological list, Neural Oversight runs an editorial pipeline on your behalf. Every morning (and on demand) it pulls articles, de-duplicates them against everything it has already seen, asks Claude to act as a senior intelligence analyst, clusters related stories together, generates a structured newsletter, then broadcasts that across web, email, push and Slack, with a Trusted Web Activity wrapper shipping the same app to the Play Store.

Audience
AI governance professionals, policy makers, safety researchers, CTOs, and venture capitalists. People who need a 2-minute morning read, not a feed to scroll.
Business model
Free, no paywall. Hosted on Vercel Pro; Supabase + Resend on cloud tiers. Costs controlled by consolidating LLM work into a single selection call per ingest.
Stack
Next.js 14 (App Router) + TypeScript + Tailwind + Radix UI + Supabase (Postgres / Auth / Storage) + Anthropic Claude (Haiku 4.5) + Resend + Web Push (VAPID) + Bubblewrap TWA.
Position
Two products in one repo: the Next.js app (deployed to Vercel, hosted at app.neuraloversight.com + neuraloversight.com) and a Bubblewrap Android shell that wraps the PWA as a Trusted Web Activity for the Play Store.

The mental model that makes the rest make sense

§02

If you only remember one thing about Neural Oversight, remember this: it is a daily editorial pipeline disguised as a web app. Everything you see in the UI — the dashboard, the chat, the trends, the regulation tracker, the entity graph — is a view onto a single curated stream produced once a day by GET /api/ingest.

One endpoint runs the entire pipeline. Every other surface is downstream of that endpoint. Understand the cron, understand the product.

That single endpoint is gated by a timing-safe Bearer-token comparison against CRON_SECRET, declares maxDuration = 300 (Vercel Pro's hard cap), and is wired up to Vercel Cron at 08:00 UTC every day in vercel.json. It can also be triggered manually from a button on the dashboard via /api/ingest/trigger.
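
A minimal sketch of that gate, assuming Node's built-in crypto module. The helper name isAuthorizedCron is hypothetical, and hashing both sides before comparing is an assumption of this sketch (crypto.timingSafeEqual throws if the two buffers differ in length), not necessarily how the real route does it.

```typescript
import { createHash, timingSafeEqual } from "node:crypto";

// Hypothetical helper (not from the source): true only when the
// Authorization header is exactly "Bearer <secret>".
function isAuthorizedCron(authHeader: string | null, secret: string | undefined): boolean {
  if (!authHeader || !secret) return false;
  // timingSafeEqual throws when buffer lengths differ, so both sides are
  // hashed to fixed-length digests first (an assumption of this sketch).
  const expected = createHash("sha256").update(`Bearer ${secret}`).digest();
  const received = createHash("sha256").update(authHeader).digest();
  return timingSafeEqual(expected, received);
}
```

In a route handler this would run before any pipeline work, returning 401 when it yields false. Treating an unset CRON_SECRET as "always reject" avoids accidentally accepting the literal header "Bearer undefined".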

The ingestion pipeline, stage by stage

§03

The pipeline is twelve stages long, runs end-to-end in under five minutes, and is wrapped in try/catch at every fragile boundary so a flaky source, a bad AI response, or a 410-Gone push endpoint cannot block tomorrow's run.

  1. Load active sources

    Pulls every row from sources where is_active = true. Currently ~48 hand-picked AI feeds: Anthropic, OpenAI, DeepMind, MIT Tech Review, Wired, arXiv cs.AI, Stanford HAI, EU AI Act news, FLI, MIRI, Brookings, RAND, regulators, safety institutes, substacks.

  2. Fetch RSS in batches of 10

    rss-parser wrapped with an 8-second timeout, raced against a 10-second hard kill. HTML entities decoded; tags stripped into a clean snippet. Per-source errors are collected but never fail the job.

  3. Recency filter (48h)

    Articles older than 48 hours by RSS published_at are dropped. Missing dates pass through — let the AI decide.

  4. URL-level deduplication

    Queries articles in batches of 200 to find which URLs already exist. Only new URLs survive into the candidate pool.

  5. Single Claude call: selection + categorisation

    Up to 60 candidates sent in ONE prompt (title + first 200 chars + source + date). Claude (Haiku 4.5) acts as a senior AI intelligence analyst, returning up to 20 selected articles with exactly 5 marked as top stories, plus category, subcategory, 1–2 sentence summary, risk score (1–10), relevance score (1–10), sentiment score (-1.0…+1.0), up to 5 entities with salience, 1–3 topics, and a reasoning string.

  6. Upsert

    Each selected article is upserted into articles with onConflict: 'url', ignoreDuplicates: true. Idempotent — running the cron twice is safe.

  7. Post-processing (best-effort)

    Five passes run in parallel, each in its own try/catch: canonicalised entity extraction → article_entities link table; topic aggregation into topic_trends; sentiment-shift detection vs a 7-day baseline → sentiment_alerts; heuristic regulation detection (~20 keywords + jurisdiction + status) → regulations; trigram clustering of titles via find_similar_cluster RPC → story_clusters.

  8. Daily briefing generation

    Two parallel Claude calls. First — the legacy briefing (plain-text headline + 4–6 bullets + a 'Watch:' analysis). Second — full-text extraction of top-story URLs via Mozilla Readability (3000 chars max, 5 concurrent workers, 8s timeout), fed into generateStructuredBriefing to produce a 15-word headline, 200–300-word deep dive on the #1 story, top stories with why_it_matters, 5–8 quick hits, tools & launches, trend watch, and key themes.

  9. Email broadcast

    Paginates through all Supabase auth users; filters out anyone in email_preferences.unsubscribed. Per-recipient send (not BCC) via Resend, each with their own signed one-click unsubscribe URL. RFC 8058 List-Unsubscribe + List-Unsubscribe-Post headers so Gmail honours the one-click.

  10. Webhook + Web Push

    POSTs to every active Slack/webhook in notification_channels. Fans out to every row in push_subscriptions via web-push using VAPID. 404/410 endpoints auto-pruned in the same pass — no orphaned subscriptions.

  11. Weekly roundup (Mondays only)

    getPreviousWeekRange() returns null on non-Mondays. On Mondays: emerging/declining topics vs the previous week, entity movers, regulatory updates, AI-written executive summary and outlook. One row upserted into weekly_roundups, keyed by week_start.

  12. Response

    { success, new_articles, fetched, reasoning, errors, timestamp }. Errors collected but non-fatal — the response is the audit log for that run.
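
The fetch and recency stages above (batches of 10, a 10-second hard kill, a 48-hour window, errors collected rather than thrown) can be sketched roughly as follows. The types and function names are hypothetical, and the feed fetcher is passed in as a parameter so the sketch does not assume anything about rss-parser's API.

```typescript
// Hypothetical shapes for illustration; not the project's real types.
type FeedItem = { url: string; title: string; publishedAt?: Date };
type FetchFeed = (feedUrl: string) => Promise<FeedItem[]>;

// Reject if the wrapped promise has not settled within `ms` milliseconds.
function withTimeout<T>(work: Promise<T>, ms: number, label: string): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`${label} timed out after ${ms}ms`)), ms);
  });
  return Promise.race([work, timeout]).finally(() => clearTimeout(timer));
}

// Fetch feeds batch by batch; a failing or slow feed only adds to `errors`.
async function fetchAllFeeds(
  feedUrls: string[],
  fetchFeed: FetchFeed,
  batchSize = 10,
  hardKillMs = 10_000,
): Promise<{ items: FeedItem[]; errors: string[] }> {
  const items: FeedItem[] = [];
  const errors: string[] = [];
  for (let i = 0; i < feedUrls.length; i += batchSize) {
    const batch = feedUrls.slice(i, i + batchSize);
    const settled = await Promise.allSettled(
      batch.map((url) => withTimeout(fetchFeed(url), hardKillMs, url)),
    );
    settled.forEach((result, idx) => {
      if (result.status === "fulfilled") items.push(...result.value);
      else errors.push(`${batch[idx]}: ${String(result.reason)}`);
    });
  }
  return { items, errors };
}

// Articles with no publication date pass through, matching stage 3.
function isRecent(item: FeedItem, now: Date, maxAgeHours = 48): boolean {
  if (!item.publishedAt) return true;
  return now.getTime() - item.publishedAt.getTime() <= maxAgeHours * 3_600_000;
}
```

Note that Promise.race only stops waiting for the slow feed; it does not cancel the underlying request, which is presumably why the real pipeline also gives the parser its own shorter timeout.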

Why this design, not the obvious one

§04

There are four big architectural decisions in the pipeline, all of them counter-intuitive at first glance.

One AI call for selection, not one per article

The obvious design is per-article: send each candidate to the LLM, get back metadata, store. That's hundreds of calls a day with no shared context. Instead Neural Oversight sends up to 60 candidates in a single prompt and asks Claude to act as a global editor — selecting the top 20, marking exactly 5 as top stories, deduplicating across sources, balancing categories. One call is dramatically cheaper AND produces better editorial judgement because the model can see the whole front page at once.
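
Putting the whole selection in one response also means one place to check the editorial contract. The sketch below validates a parsed response against the limits described above (at most 20 selected, exactly 5 top stories, scores in range). The field names and the validateSelection helper are hypothetical; the source does not say whether the real pipeline validates this way.

```typescript
// Hypothetical shape of one selected article in the model's JSON response.
type SelectedArticle = {
  url: string;
  is_top_story: boolean;
  risk_score: number;       // 1 to 10
  relevance_score: number;  // 1 to 10
  sentiment_score: number;  // -1.0 to +1.0
};

const inRange = (value: number, lo: number, hi: number): boolean =>
  Number.isFinite(value) && value >= lo && value <= hi;

// Returns a list of human-readable problems; an empty list means the
// selection honours the contract described in the prompt.
function validateSelection(selected: SelectedArticle[], candidateUrls: Set<string>): string[] {
  const problems: string[] = [];
  if (selected.length > 20) {
    problems.push(`expected at most 20 articles, got ${selected.length}`);
  }
  const topCount = selected.filter((a) => a.is_top_story).length;
  if (topCount !== 5) {
    problems.push(`expected exactly 5 top stories, got ${topCount}`);
  }
  for (const article of selected) {
    // Guard against URLs the model invented rather than selected.
    if (!candidateUrls.has(article.url)) problems.push(`unknown url: ${article.url}`);
    if (!inRange(article.risk_score, 1, 10)) problems.push(`risk_score out of range: ${article.url}`);
    if (!inRange(article.relevance_score, 1, 10)) problems.push(`relevance_score out of range: ${article.url}`);
    if (!inRange(article.sentiment_score, -1, 1)) problems.push(`sentiment_score out of range: ${article.url}`);
  }
  return problems;
}
```

Any problems found could be appended to the errors array in the ingest response, keeping the run non-fatal.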

Best-effort post-processing, not transactional

Entity extraction, trend aggregation, sentiment alerts, regulation detection and clustering are each wrapped in their own try/catch. A flaky migration or a malformed entity row cannot break the whole ingest. The article gets in; the metadata gets enriched if it can; tomorrow's run picks up the slack.

Idempotent by construction

onConflict: 'url' on articles. onConflict: 'date' on briefings. onConflict: 'topic,date' on topic_trends. unique(week_start) on weekly_roundups. The cron can fire twice in a row, or be manually re-triggered, with no duplicates, no double-emails, no broken state. This matters more than it sounds — it makes the pipeline safe to retry under any failure.

Cluster summaries regenerated at thresholds, not every insert

When a new article joins an existing cluster, the cluster's weighted-average sentiment is recomputed cheaply in SQL. But the AI-written cluster summary is only regenerated when the cluster crosses size thresholds of 2, 5, 10 or 20. A cluster of 7 articles uses the summary written when it had 5. Caps cost; keeps summaries stable; users don't see the wording flicker.
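
The threshold rule is small enough to sketch directly. The function name is hypothetical; the thresholds are the ones listed above.

```typescript
const SUMMARY_THRESHOLDS = [2, 5, 10, 20];

// True when growing from previousSize to newSize reaches or passes a
// threshold, i.e. the moment the AI summary should be rewritten.
function shouldRegenerateSummary(previousSize: number, newSize: number): boolean {
  return SUMMARY_THRESHOLDS.some((t) => previousSize < t && newSize >= t);
}
```

Going from 4 to 5 articles triggers a rewrite; going from 6 to 7 does not, which is why a cluster of 7 still shows the summary written at 5. Comparing against the previous size (instead of checking newSize against the list) also handles a batch that adds several articles at once and jumps over a threshold.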

The data model

§05

Sixteen numbered migrations under supabase/migrations/. The Supabase database is the single source of truth — Next.js carries no ORM models, no shadow schema. The tables below are the ones load-bearing enough to know about.

articles
Core feed

id, title, url (unique), source, published_at, ingested_at, summary, raw_content, category, subcategory, risk_score, relevance_score, sentiment_score, sentiment_label, entities JSONB, topics TEXT[], cluster_id, is_top_story. Full-text GIN index on title+summary+content; partial indexes on is_top_story = true and risk_score >= 8; trigram via pg_trgm.

sources
RSS catalogue

type (rss/scrape), category, fetch_interval_hours, last_fetched_at, plus credibility_score, bias_label, factual_rating, credibility_notes (migration 011).

briefings
One row per day

date PK, content (legacy flat text), key_themes TEXT[], article_count, structured_content JSONB containing { headline, deep_dive, top_stories, quick_hits, tools_and_launches, trend_watch, key_themes }.

story_clusters
Related-story groups

Created by the find_similar_cluster RPC using pg_trgm trigram similarity against clusters created in the last 72 hours. Threshold 0.35. Carries weighted-average sentiment + an AI summary regenerated at size thresholds 2/5/10/20.

entities + article_entities
Canonical entity catalogue

entities deduplicates companies/people/orgs/technologies/legislation. article_entities is the many-to-many with per-occurrence salience and sentiment. entities.article_count + latest_sentiment are recomputed each ingest.

topic_trends
Daily roll-up

topic + date PK. Article count, avg sentiment, sample article IDs. Powers the trend dashboards and the sentiment-shift detector.

sentiment_alerts
Shift detection

Generated when a topic's sentiment shifts by >= 0.3 vs the 7-day baseline. Surfaced in the Alerts inbox.

regulations
Regulation tracker

Heuristic extraction (~20 keywords). Jurisdiction + status (proposed / committee / passed / enacted) + related article links.

weekly_roundups
One row per Monday

week_start unique. JSONB content payload with executive summary, top stories, emerging/declining topics, entity movers, regulatory updates, outlook.

flags + reading_queue
Per-user state

User saves with priority (low/normal/high/urgent) and status (open/reviewing/resolved/dismissed). Strict RLS — read public, write only your own.

chat_conversations + chat_messages
Ask the Feed persistence

Stores conversation history with cited_article_ids per assistant turn. Demo user gated by per-session + global daily caps.

push_subscriptions
Web Push endpoints

endpoint + p256dh + auth keyed to user_id. Daily cron fans out to all rows via web-push; 410/404 endpoints auto-pruned.
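
As an illustration of how topic_trends feeds sentiment_alerts, here is a rough sketch of the shift rule described above. It assumes the baseline is the plain mean of the previous seven daily averages, which the source does not specify; the type and function names are hypothetical.

```typescript
// Hypothetical row shape: one topic_trends entry for one topic on one day.
type DailySentiment = { date: string; avgSentiment: number };

type SentimentShift = {
  baseline: number;
  delta: number;
  direction: "positive" | "negative";
};

// Compare today's average sentiment with the mean of up to seven prior
// days (oldest first). Returns null when there is no baseline or the
// shift is smaller than the threshold.
function detectSentimentShift(
  today: number,
  previousDays: DailySentiment[],
  threshold = 0.3,
): SentimentShift | null {
  const recent = previousDays.slice(-7);
  if (recent.length === 0) return null;
  const baseline = recent.reduce((sum, d) => sum + d.avgSentiment, 0) / recent.length;
  const delta = today - baseline;
  if (Math.abs(delta) < threshold) return null;
  return { baseline, delta, direction: delta > 0 ? "positive" : "negative" };
}
```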

RLS — strict by default

Every table has RLS enabled. Reads are gated by authenticated; writes are gated by service_role (the pipeline) or auth.uid() = user_id (flags, queue, push subs). On top of that, migration 016 adds RESTRICTIVE deny policies for the demo user — defence in depth against any API path forgetting to check.

Two domains, one deployment

§06

One Next.js deployment serves both a marketing site and a gated app from different hosts. src/middleware.ts is the routing brain.

neuraloversight.com
Marketing. / is rewritten (not redirected) to /marketing — URL bar stays clean. /privacy, /terms, /about, /sources-directory rewrite to /marketing/*. Anything else redirects to app.<host>.
app.neuraloversight.com
The app. /marketing/* is blocked (redirects to /). Public paths: /login, /auth/*, /unsubscribed, /demo. Every other path runs Supabase SSR auth — no user, no entry, straight to /login.
Security headers
STRICT_HEADERS on /login, /auth, /marketing, /api/*: X-Frame-Options: DENY + CSP frame-ancestors 'none'. FRAMABLE_HEADERS elsewhere with frame-ancestors 'self' https://tomphillips.uk https://www.tomphillips.uk https://*.vercel.app — which is exactly what lets this portfolio embed Neural Oversight in an iframe.
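
The routing rules above reduce to a pure decision function, sketched here without any Next.js imports so it can be read (and tested) in isolation. The RouteDecision type, the decideRoute name and the "host starts with app." check are assumptions of this sketch; the real middleware presumably also exempts static assets and API routes, which are ignored here.

```typescript
type RouteDecision =
  | { kind: "rewrite"; to: string }   // serve another path, URL bar unchanged
  | { kind: "redirect"; to: string }  // send the browser elsewhere
  | { kind: "public" }                // app host, no session required
  | { kind: "requireAuth" };          // app host, Supabase session required

const MARKETING_PAGES = ["/privacy", "/terms", "/about", "/sources-directory"];
const PUBLIC_APP_PATHS = ["/login", "/auth", "/unsubscribed", "/demo"];

const matchesPrefix = (pathname: string, prefix: string): boolean =>
  pathname === prefix || pathname.startsWith(`${prefix}/`);

function decideRoute(host: string, pathname: string): RouteDecision {
  const isAppHost = host.startsWith("app.");
  if (!isAppHost) {
    // Marketing host: rewrite known pages, push everything else to the app.
    if (pathname === "/") return { kind: "rewrite", to: "/marketing" };
    if (MARKETING_PAGES.includes(pathname)) {
      return { kind: "rewrite", to: `/marketing${pathname}` };
    }
    return { kind: "redirect", to: `https://app.${host}${pathname}` };
  }
  // App host: marketing pages are not reachable here.
  if (matchesPrefix(pathname, "/marketing")) return { kind: "redirect", to: "/" };
  const isPublic = PUBLIC_APP_PATHS.some((p) => matchesPrefix(pathname, p));
  return isPublic ? { kind: "public" } : { kind: "requireAuth" };
}
```

In the real middleware each decision would map onto NextResponse.rewrite, NextResponse.redirect, or the Supabase SSR session check.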

The dashboard — fourteen queries in parallel

§07

The signed-in homepage is a server component that fires fourteen Supabase queries via Promise.all([...]) and assembles a rich editorial layout. Most queries use count: 'exact', head: true, so they return only a count with no row payload, and because all fourteen run concurrently the page waits about as long as the slowest single query rather than the sum of all of them.

  • Time-of-day-aware personalised greeting, picked deterministically from day-of-epoch so the same greeting sticks for the whole day. First name resolved from user_metadata.first_name → full_name → name → email local-part.
  • Colour temperature — the page subtly tints based on today's avgRiskScore: calm (emerald), moderate (blue), warm (amber), elevated (red). Border, glow, accent dot all shift together.
  • Breaking news banner — polls /api/breaking for any article with risk_score ≥ 8 ingested in the last 6 hours.
  • Ticker tape — articles today, delta vs yesterday, risk level, high-risk count, open flags, active sources, top source, all-time total. The 'Articles' cell carries a 7-day sparkline.
  • High-risk alert card — appears whenever any article scores ≥ 7. Lists the top 3 with inline score badges.
  • Editorial top-stories block — hero (story #1) + two secondary + a compact row of #4–5.
  • Daily Briefing card — renders the rich structured_content if present (headline → deep dive teaser → quick hits → tools & launches → trend watch → key themes), or falls back to the legacy flat text format.
  • Sentiment Pulse — live 7-day aggregate sentiment chart.
  • Latest articles — 20 most recent non-top-story articles, each rendered as an ArticleCard, with staggered fade-in animations.
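
The greeting logic in the first bullet can be sketched like this. The greeting strings, the three time-of-day slots and the use of UTC hours are all assumptions for illustration; only the day-of-epoch pick and the name fallback chain come from the description above.

```typescript
type UserMeta = { first_name?: string; full_name?: string; name?: string };

// first_name -> full_name -> name -> email local-part, taking the first
// word of whichever non-empty value is found.
function resolveFirstName(meta: UserMeta, email: string): string {
  const candidate = [meta.first_name, meta.full_name, meta.name].find(
    (value) => value !== undefined && value.trim().length > 0,
  );
  if (candidate) return candidate.trim().split(/\s+/)[0] ?? email;
  return email.split("@")[0] ?? email;
}

// Hypothetical greeting pools, keyed by time-of-day slot.
const GREETINGS: Record<"morning" | "afternoon" | "evening", string[]> = {
  morning: ["Good morning", "Morning", "Rise and shine"],
  afternoon: ["Good afternoon", "Afternoon"],
  evening: ["Good evening", "Evening"],
};

// The index depends only on the day number since the Unix epoch, so every
// render on the same day (within the same slot) picks the same greeting.
function pickGreeting(now: Date): string {
  const hour = now.getUTCHours();
  const slot = hour < 12 ? "morning" : hour < 18 ? "afternoon" : "evening";
  const dayOfEpoch = Math.floor(now.getTime() / 86_400_000);
  const options = GREETINGS[slot];
  return options[dayOfEpoch % options.length] ?? "Hello";
}
```

Deriving the index from the date instead of Math.random() is what keeps the greeting stable across refreshes without storing anything.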

The API surface — 27 routes

§08

All API responses carry Cache-Control: no-store and the strict CSP/X-Frame-Options. A small handful do the interesting work; the rest are thin Supabase views.

Method | Route | Purpose
GET | /api/ingest | Full pipeline (cron-authenticated, 5 min cap)
POST | /api/ingest/trigger | Authenticated manual ingest from dashboard
GET | /api/articles | Filtered feed; /export for CSV/JSON
GET | /api/breaking | risk_score ≥ 8, last 6h; drives the banner
GET | /api/briefing | Today's briefing (legacy + structured)
GET | /api/clusters | Story clusters with summaries
GET | /api/trends | Topic timeseries
GET | /api/entities | Catalogue + per-entity drill-down
GET | /api/graph | Entity co-occurrence graph data
GET | /api/regulatory | Regulation tracker
GET | /api/intelligence | Frontier-lab competitive aggregates
GET | /api/sentiment-pulse | 7-day sentiment for dashboard widget
GET | /api/credibility | Per-source credibility metadata
GET PATCH | /api/sources | List + toggle sources (admin)
POST PATCH DELETE | /api/flags | User flags (RLS-protected)
GET POST DELETE | /api/queue | Reading queue CRUD
GET PATCH | /api/alerts | Sentiment-shift alerts (read/mark-read)
GET | /api/weekly | Weekly roundup retrieval
GET | /api/videos | Curated YouTube videos
POST | /api/chat | Streaming SSE Ask the Feed
GET POST DELETE | /api/notifications | Slack/webhook channels
POST | /api/push/subscribe | Persist Web Push subscription
GET | /api/unsubscribe | One-click email unsubscribe (RFC 8058)
GET | /api/proxy | Server-side fetch proxy for cross-origin assets
GET | /api/proxy/extract | Readability full-text for in-app viewer
GET | /api/health | Liveness probe
POST | /api/account/delete | GDPR-compliant deletion

Ask the Feed — the chat that knows what was reported today

§09

The most interesting endpoint is /api/chat. It's a streaming SSE interface that turns today's articles into context and lets the user ask questions over them.

  1. Verify the user via Supabase SSR cookies.
  2. If demo user: enforce per-session cap (default 3 messages, tracked via demo-session-id cookie) AND global daily cap (default 500/day) using demo_chat_usage + demo_chat_daily_cap. Increment BEFORE the Anthropic call to prevent race conditions under concurrency.
  3. Validate the body: message required, ≤ 2000 chars.
  4. Pull today's 50 most recent articles. Build a context block per article: title, source, category, sentiment label, summary, entities, topics.
  5. Resolve or create chat_conversations for the user; append the user message to chat_messages.
  6. Stream Claude (Haiku 4.5) back to the browser as Server-Sent Events. Emit data: { conversation_id } first, then data: { text } chunks, then data: { done: true, full_text }. Persist the assistant turn after streaming completes.

Why SSE rather than WebSockets? One-direction streams, no handshake complexity, plays nicely with Vercel's serverless functions, and the Anthropic SDK already speaks it. WebSockets would have been overkill for what is functionally an unbounded HTTP response.
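
The event sequence in the last step can be sketched as an async generator that frames each payload as a Server-Sent Event. The ChatEvent type and function names are hypothetical, and the model's output is represented by a generic AsyncIterable of text chunks so the sketch assumes nothing about the Anthropic SDK's streaming API.

```typescript
type ChatEvent =
  | { conversation_id: string }
  | { text: string }
  | { done: true; full_text: string };

// One SSE message: a "data:" line followed by a blank line.
// JSON.stringify escapes newlines, so the payload always fits on one line.
function sseFrame(event: ChatEvent): string {
  return `data: ${JSON.stringify(event)}\n\n`;
}

// Emits the conversation id first, then each text chunk, then a final
// "done" event carrying the concatenated reply.
async function* chatEventStream(
  conversationId: string,
  chunks: AsyncIterable<string>,
): AsyncGenerator<string> {
  yield sseFrame({ conversation_id: conversationId });
  let fullText = "";
  for await (const text of chunks) {
    fullText += text;
    yield sseFrame({ text });
  }
  yield sseFrame({ done: true, full_text: fullText });
}
```

In a route handler the generator's output would be encoded and piped into a ReadableStream served with Content-Type: text/event-stream; the accumulated full_text is what gets persisted as the assistant turn once streaming completes.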

Demo mode — the shared, read-only sandbox

§10

The demo is a clever piece of plumbing that lets this portfolio embed a fully working Neural Oversight in an iframe, without giving every visitor an account. It's also the model used for the Narrate demo on this same site.

  1. A real Supabase Auth user exists with email demo@neuraloversight.com. Its UUID is stored in demo_config.demo_user_id and process.env.DEMO_USER_ID.
  2. Visiting /demo (directly or via iframe) triggers a server-side magic-link generation: supabaseAdmin.auth.admin.generateLink({ type: 'magiclink', email }) returns a token_hash without sending an email.
  3. supabase.auth.verifyOtp({ type: 'magiclink', token_hash }) exchanges the token for a session.
  4. All auth cookies are written with SameSite=None; Secure; HttpOnly; Partitioned so they survive cross-origin iframe requests. A demo-session-id UUID cookie is set for per-visitor rate limiting.
  5. Redirect to / with a valid session. Layout shows the DemoBanner. The user is in.

Write protection — three layers deep

  • App layer — lib/demo.ts exports isDemoUser() and helper response builders. Every mutating endpoint checks and returns 403 DEMO_READONLY if the demo user attempts a write.
  • Database layer — RLS RESTRICTIVE policies on flags and push_subscriptions use the is_demo_user(auth.uid()) function to block writes regardless of which API path called.
  • Chat — rate-limited per session AND per day, with the counter incremented BEFORE the Anthropic call to defeat concurrent demo attempts.
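
The "increment before the call" ordering in the chat layer is worth seeing in miniature. The sketch below uses in-memory maps in place of the demo_chat_usage and demo_chat_daily_cap tables; the function names and the 429 status are assumptions for illustration.

```typescript
// In-memory stand-in for the two counters. In production the check and
// increment would need to be one atomic database operation.
function createDemoLimiter(sessionCap = 3, dailyCap = 500) {
  const perSession = new Map<string, number>();
  const perDay = new Map<string, number>();
  return {
    // Check and increment in one synchronous step, with no await between
    // them, so concurrent requests cannot both see the same old count.
    tryReserve(sessionId: string, day: string): boolean {
      const usedBySession = perSession.get(sessionId) ?? 0;
      const usedToday = perDay.get(day) ?? 0;
      if (usedBySession >= sessionCap || usedToday >= dailyCap) return false;
      perSession.set(sessionId, usedBySession + 1);
      perDay.set(day, usedToday + 1);
      return true;
    },
  };
}

type DemoLimiter = ReturnType<typeof createDemoLimiter>;

// The slot is reserved BEFORE the slow model call, so five simultaneous
// requests from one session still produce at most three model calls.
async function handleDemoChat(
  limiter: DemoLimiter,
  sessionId: string,
  day: string,
  callModel: () => Promise<string>,
): Promise<{ status: number; body: string }> {
  if (!limiter.tryReserve(sessionId, day)) {
    return { status: 429, body: "Demo message limit reached" };
  }
  return { status: 200, body: await callModel() };
}
```

Incrementing after the model call would let every concurrent request pass the check before any of them recorded usage, which is the race the ordering is there to close.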

The demo is never reset. Writes are blocked, so seeded state (a few flagged articles, a sample conversation) lives forever and every visitor sees the same curated view.

The email pipeline

§11

The daily email goes out once per recipient (never BCC), each with their own signed one-click unsubscribe URL. Three files orchestrate it: email/send.ts (the loop), email/template.ts (HTML + plain text), and email/unsubscribe.ts (the signed URL mint).

  • Per-recipient send — each user gets a personalised unsubscribe URL embedded in their email. BCC would have been simpler; this is correct.
  • RFC 8058 one-click — List-Unsubscribe + List-Unsubscribe-Post: List-Unsubscribe=One-Click. Gmail honours this without a round-trip; deliverability stays high.
  • /api/unsubscribe flips email_preferences.unsubscribed = true and redirects to /unsubscribed.
  • Subject format: 'Neural Oversight - 20 May 2026'.
  • Templates are aware of both formats — structured_content if present (rich newsletter), legacy flat text otherwise.
  • Resend From address is configurable via RESEND_FROM_EMAIL; default is briefing@neuraloversight.com.
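
A signed unsubscribe link and its two headers could look roughly like this. The HMAC-SHA256 scheme, the uid and sig parameter names, and all function names are assumptions of this sketch; the source only says the URL is signed and per-recipient.

```typescript
import { createHmac, timingSafeEqual as safeEqual } from "node:crypto";

// Hypothetical signing scheme: HMAC-SHA256 over the user id.
function signUserId(userId: string, secret: string): string {
  return createHmac("sha256", secret).update(userId).digest("hex");
}

function buildUnsubscribeUrl(baseUrl: string, userId: string, secret: string): string {
  const url = new URL("/api/unsubscribe", baseUrl);
  url.searchParams.set("uid", userId);
  url.searchParams.set("sig", signUserId(userId, secret));
  return url.toString();
}

// Recompute the signature and compare in constant time. The length check
// comes first because timingSafeEqual throws on unequal lengths.
function verifyUnsubscribe(userId: string, sig: string, secret: string): boolean {
  const expected = Buffer.from(signUserId(userId, secret), "hex");
  const received = Buffer.from(sig, "hex");
  return expected.length === received.length && safeEqual(expected, received);
}

// The two headers RFC 8058 one-click unsubscribe relies on.
function unsubscribeHeaders(unsubscribeUrl: string): Record<string, string> {
  return {
    "List-Unsubscribe": `<${unsubscribeUrl}>`,
    "List-Unsubscribe-Post": "List-Unsubscribe=One-Click",
  };
}
```

Because the signature binds the link to one user id, a recipient cannot unsubscribe someone else by editing the uid in the URL. This is also why per-recipient sending is required: a BCC'd message would carry the same link for everyone.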

Push notifications, and the Android TWA

§12

VAPID keys live in NEXT_PUBLIC_VAPID_PUBLIC_KEY and VAPID_PRIVATE_KEY, generated via scripts/generate-vapid-keys.mjs.

  • PushNotifications component (in layout.tsx) registers public/sw.js and calls /api/push/subscribe to persist the { endpoint, p256dh, auth } triple.
  • Daily ingest calls sendBriefingPushNotification(articleCount, topStories) which fans out to every row via Promise.allSettled so one bad endpoint cannot fail the broadcast.
  • 404/410 endpoints are pruned in the same pass — no orphan subscriptions.
  • Android TWA inherits notifications because Bubblewrap proxies them straight to Android's native notification system.
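
The fan-out and prune behaviour in the bullets above can be sketched as one function. The sender is injected as a parameter so the sketch does not depend on the web-push package; it assumes that failed sends reject with an error carrying a numeric statusCode property. All names here are hypothetical.

```typescript
type PushSubscriptionRow = { endpoint: string; p256dh: string; auth: string };
type SendPush = (sub: PushSubscriptionRow, payload: string) => Promise<void>;

// Send to every subscription, never throwing; report how many were
// delivered and which endpoints are gone (HTTP 404 or 410) and can be
// deleted from push_subscriptions.
async function broadcastPush(
  subs: PushSubscriptionRow[],
  payload: string,
  send: SendPush,
): Promise<{ delivered: number; expired: string[] }> {
  const results = await Promise.allSettled(subs.map((sub) => send(sub, payload)));
  const expired: string[] = [];
  let delivered = 0;
  for (const [index, result] of results.entries()) {
    const sub = subs[index];
    if (!sub) continue;
    if (result.status === "fulfilled") {
      delivered += 1;
      continue;
    }
    // Assumption: the rejection reason exposes the push service's status.
    const status = (result.reason as { statusCode?: number } | undefined)?.statusCode;
    if (status === 404 || status === 410) expired.push(sub.endpoint);
  }
  return { delivered, expired };
}
```

Other failures (a 500 from the push service, a network error) are deliberately not pruned, since the subscription may still be valid tomorrow.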

The Android wrapper

neural-oversight-android/ is a Bubblewrap-generated Trusted Web Activity. It is a minimal Android shell that boots into a Chrome Custom Tab pointed at app.neuraloversight.com. The whole UX is the PWA. There is no native code. The wrapper exists for one reason — to ship to the Play Store.

  • twa-manifest.json defines package id (com.neuraloversight.app), host, theme colour (#37352F), icon URLs, signing key reference.
  • app-release-bundle.aab is the Play Store upload artifact. app-release-signed.apk is sideload-ready.
  • Digital Asset Links served at /.well-known/assetlinks.json by Next.js prove domain ownership to Chrome — that's what hides the URL bar in the TWA.
  • scripts/setup-android.sh regenerates the Bubblewrap project when the web manifest changes.

Security posture

§13

The app handles AI governance news, not customer data — but governance professionals don't trust tools that don't look like they understand security. Every layer is hardened deliberately.

  • Auth — Supabase magic-link, SSR cookies handled by @supabase/ssr. Middleware blocks anonymous access to every non-public path.
  • RLS — strict policies on every table. Reads gated by authenticated; writes by service_role or auth.uid() = user_id. Demo user has RESTRICTIVE deny.
  • CSP — frame-ancestors allowlists strictly limit which origins can iframe the app. Strict (no framing) on /login, /auth, /api, /marketing.
  • Headers everywhere — X-Content-Type-Options: nosniff, Referrer-Policy: strict-origin-when-cross-origin, Permissions-Policy: camera=(), microphone=(), geolocation=().
  • Cron auth — timing-safe (crypto.timingSafeEqual) Bearer-token comparison on /api/ingest. Header manipulation cannot bypass it.
  • API cache — Cache-Control: no-store on every /api/* response.
  • DOMPurify sanitises any AI-generated HTML before render.
  • Idle timeout signs users out after extended inactivity.
  • RFC 8058 one-click unsubscribe is respected BEFORE any send — not after.
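
Pulling the header rules from this section and the routing section together, a per-path header selection might look like the sketch below. The header values are the ones quoted in these notes; the function name and the prefix-matching logic are assumptions.

```typescript
// Headers applied to every response.
const BASE_HEADERS: Record<string, string> = {
  "X-Content-Type-Options": "nosniff",
  "Referrer-Policy": "strict-origin-when-cross-origin",
  "Permissions-Policy": "camera=(), microphone=(), geolocation=()",
};

// Paths that must never be framed.
const STRICT_PREFIXES = ["/login", "/auth", "/marketing", "/api"];

const FRAMABLE_CSP =
  "frame-ancestors 'self' https://tomphillips.uk https://www.tomphillips.uk https://*.vercel.app";

const hasPrefix = (pathname: string, prefix: string): boolean =>
  pathname === prefix || pathname.startsWith(`${prefix}/`);

function securityHeaders(pathname: string): Record<string, string> {
  const headers: Record<string, string> = { ...BASE_HEADERS };
  if (STRICT_PREFIXES.some((p) => hasPrefix(pathname, p))) {
    headers["X-Frame-Options"] = "DENY";
    headers["Content-Security-Policy"] = "frame-ancestors 'none'";
  } else {
    // Framable pages rely on the CSP allowlist alone; X-Frame-Options
    // cannot express a multi-origin allowlist, so it is omitted here.
    headers["Content-Security-Policy"] = FRAMABLE_CSP;
  }
  if (hasPrefix(pathname, "/api")) headers["Cache-Control"] = "no-store";
  return headers;
}
```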

Performance and reliability notes

§14
  • Dashboard fires ~14 queries in parallel; most are count-only (head: true), so they return a count and no row payload. The page waits on one concurrent batch, not fourteen sequential round trips.
  • articles has partial indexes on the hot paths — is_top_story = true and risk_score ≥ 8.
  • RSS fetches batch in groups of 10 with a 10-second hard timeout per source. One slow feed cannot block the whole job.
  • Full-text extractor caps at 3000 chars and 5MB response size, with an 8-second abort.
  • AI calls are deliberately consolidated — one selection call, one briefing call, one structured briefing call per ingest. Cluster summaries only regenerate at thresholds (2/5/10/20).
  • Push notifications use Promise.allSettled so one bad endpoint cannot fail the broadcast. 410/404 pruned in the same pass.
  • Every ingestion stage is wrapped in try/catch with errors collected and returned in the response — a flaky AI response does not block tomorrow's run.

The signed-in surfaces

§15

Beyond the dashboard, the sidebar gives access to fourteen views onto the same curated stream. They split into three groups.

Menu
  • Dashboard — editorial home
  • Feed — full feed + category filters
  • Flagged — your flagged articles
  • Digest — full daily briefing
  • Weekly — weekly roundup
  • Sources — manage RSS + credibility scores
  • Watch — saved watchlist / topic monitor
Intelligence
  • Trends — topic dashboards from topic_trends
  • Entities — catalogue + sentiment
  • Regulatory — by jurisdiction + status
  • Connections — entity co-occurrence graph
  • Competitive — frontier-lab aggregates
  • Alerts — sentiment-shift inbox
Tools
  • Ask the Feed — Claude chat over today's articles
  • Reading Queue — save for later

Deployment, cost, and the cron

§16
Hosting
Vercel Pro. Two domains point at one project: neuraloversight.com (marketing) and app.neuraloversight.com (the app). Middleware routes them to the right pages — one deployment, two products.
Cron
vercel.json registers one job: 0 8 * * * → GET /api/ingest. Vercel automatically attaches Authorization: Bearer $CRON_SECRET. The timing-safe comparison inside the endpoint accepts only that exact bearer.
Function limits
maxDuration = 300 on /api/ingest. maxDuration = 60 on /api/chat. Both rely on Vercel Pro's extended execution limits.
Database
Hosted Supabase (free or pro tier). All persistent state lives here. Migrations applied in order via Supabase SQL editor.
Email
Resend for all transactional + briefing email. Per-recipient send, RFC 8058 one-click, signed unsubscribe URLs.
Cost lever
The single AI selection call per ingest is the cost lever. Everything else is downstream of that call.

Where to make changes

§17

A working map. If the next change is X, the file to open is Y.

Add a new RSS source
supabase/migrations/008_add_more_sources.sql (or a new migration)
Change the AI selection prompt
src/lib/ai/categorise.ts
Change the briefing format
src/lib/ai/categorise.ts (generateStructuredBriefing) + src/lib/email/template.ts + src/app/page.tsx
Add a new dashboard widget
src/app/page.tsx + src/components/
Add a sidebar item
src/components/Sidebar.tsx + src/app/<route>/page.tsx
Add an API endpoint
src/app/api/<name>/route.ts
Adjust the database schema
New numbered migration in supabase/migrations/
Adjust the ingest schedule
vercel.json
Change demo rate limits
DEMO_CHAT_PER_SESSION / DEMO_CHAT_PER_DAY env vars
Update the Android wrapper
neural-oversight-android/twa-manifest.json + re-run Bubblewrap