🩺DDx Daily

A daily clinical-reasoning game for USMLE Step 1 and Step 2 prep. Instead of multiple-choice recall, you interview an AI standardized patient grounded on a hidden case fact sheet, order labs and imaging, then name the diagnosis. You are scored on accuracy and efficiency: fewer, sharper questions earn more points. Built with Astro 7, React 19, and Turso, plus a Gemini-powered patient, a MedQA case-import pipeline, and real PubMed-grounded references.


Live Demo: ddx-daily.vercel.app · Code: github.com/prahlaadr/ddx-daily


What it is

DDx Daily is a daily clinical-reasoning game for USMLE prep. Each day there is one shared case. You meet a standardized patient, take a history in plain language, order exams, labs, and imaging, then commit to a single diagnosis from a multiple-choice list. The twist: you are scored on accuracy and efficiency. Reaching the discriminating findings in fewer, sharper questions earns a higher score. After you lock in, the game tears the case down: why the answer is right, why each distractor is wrong, the teaching point, and real literature references. Streaks and a per-case leaderboard keep you coming back.

Think of it as the difference between a flashcard and a real clinic encounter, compressed into a two-minute daily ritual.


The problem it solves

USMLE Step 1 and Step 2 prep is dominated by static question banks. You read a finished vignette where someone else already did the history and the workup, then you pick A through E. That trains pattern recall, but it skips the actual skill of diagnosis: deciding what to ask next, building a differential, and narrowing it efficiently without ordering everything.

Real diagnosis is an active, sequential process. DDx Daily makes that process the game:

  • You start with only a one-line chief complaint, not the full vignette.
  • You have to gather the findings yourself, by asking.
  • The score rewards hypothesis-driven workup, not shotgunning every test.
  • The teardown closes the loop on why, not just what.

It turns passive recognition into active reasoning.
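
The exact scoring rule is not spelled out here, so the following is only a minimal sketch of one way accuracy and efficiency could combine. The `scoreAttempt` name, the field names, the constants, and the formula itself are all assumptions for illustration, not the game's actual scoring.

```typescript
// Hypothetical scoring sketch: constants and formula are assumptions,
// not the game's real rules.
interface Attempt {
  correct: boolean;         // did the locked-in diagnosis match the answer key?
  questionsAsked: number;   // history questions plus exam, lab, and imaging orders
  keyFindingsFound: number; // discriminating findings the player uncovered
  keyFindingsTotal: number; // discriminating findings the case contains
}

const BASE_POINTS = 100;     // awarded for a correct diagnosis
const FREE_QUESTIONS = 5;    // questions allowed before any penalty applies
const PENALTY_PER_EXTRA = 5; // points lost per question beyond the free ones
const FINDING_BONUS = 10;    // points per discriminating finding uncovered

function scoreAttempt(a: Attempt): number {
  if (!a.correct) return 0; // in this sketch a wrong diagnosis earns nothing
  const extra = Math.max(0, a.questionsAsked - FREE_QUESTIONS);
  const found = Math.min(a.keyFindingsFound, a.keyFindingsTotal);
  const raw = BASE_POINTS + found * FINDING_BONUS - extra * PENALTY_PER_EXTRA;
  return Math.max(0, raw); // never go negative
}

// A focused workup versus shotgunning every test on the same case.
const focused = scoreAttempt({ correct: true, questionsAsked: 4, keyFindingsFound: 3, keyFindingsTotal: 3 });
const shotgun = scoreAttempt({ correct: true, questionsAsked: 20, keyFindingsFound: 3, keyFindingsTotal: 3 });
console.log(focused, shotgun); // 130 55
```

Under these assumed constants, both players reach the right diagnosis and the same findings, but the focused workup scores higher because it stays within the free-question budget.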


Who it is built for

  • Medical students preparing for USMLE Step 1 and Step 2 CK, the core audience. Cases carry a Step level and a specialty tag.
  • More broadly, anyone learning clinical reasoning: PA and NP students, IMGs, and early residents who want low-stakes daily differential practice.
  • It is explicitly a study aid, not a clinical tool. Every case shows a safety banner until a clinician has reviewed it.

The flow of UI screens (architecture)

The whole game lives in one React island, CaseConsole.tsx, that moves through phases. The screen flow:

[1] Daily Case Console  (home, /)
      patient header: name, age, sex, specialty, Step level
      one-line opening stem (the only thing you see first)
                   |
                   v
[2] Interview phase
      chat with the standardized patient (free text):
        history / exam / labs / imaging
      right rail: questions asked, findings uncovered,
        one-tap suggested clues, Reveal answer
                   |  "I am ready to diagnose"
                   v
[3] Diagnose phase
      "What is the diagnosis?" multiple-choice options
      shown ALONGSIDE the live chat, so you can keep
        interviewing while you narrow the list
      Lock in diagnosis
                   |
                   v
[4] Result / teardown
      correct or not, points, streak, key findings found
      why this is the diagnosis (teaching point)
      why each alternative is wrong
      real PubMed references (clickable)
      reviewed:false safety banner if not clinician-signed
      Next case
                   |
                   v
[5] Leaderboard  (/leaderboard?date=...)
      per-case ranking by points
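
The phase flow above can be sketched as a small reducer. This is a hedged illustration, not the contents of CaseConsole.tsx: the `Phase`, `Action`, and `ConsoleState` names and the action types are assumptions. The one behavior it takes from the flow is that asking stays available during the diagnose phase.

```typescript
// Hypothetical phase reducer; type and action names are assumptions.
type Phase = "interview" | "diagnose" | "result";

type Action =
  | { type: "ask"; question: string }
  | { type: "readyToDiagnose" }
  | { type: "lockIn"; choice: string };

interface ConsoleState {
  phase: Phase;
  questionsAsked: number;
  choice: string | null;
}

const initialState: ConsoleState = { phase: "interview", questionsAsked: 0, choice: null };

function reducer(state: ConsoleState, action: Action): ConsoleState {
  if (action.type === "ask") {
    // Asking works in both interview and diagnose, but not after the result.
    if (state.phase === "result") return state;
    return { ...state, questionsAsked: state.questionsAsked + 1 };
  }
  if (action.type === "readyToDiagnose") {
    return state.phase === "interview" ? { ...state, phase: "diagnose" } : state;
  }
  // action.type === "lockIn": only valid once the options are on screen.
  return state.phase === "diagnose"
    ? { ...state, phase: "result", choice: action.choice }
    : state;
}

// Walk one attempt: ask, move to diagnose, ask once more, then lock in.
const actions: Action[] = [
  { type: "ask", question: "When did the pain start?" },
  { type: "readyToDiagnose" },
  { type: "ask", question: "Any fever?" },
  { type: "lockIn", choice: "Acute appendicitis" },
];
const finalState = actions.reduce(reducer, initialState);
console.log(finalState.phase, finalState.questionsAsked);
```

Invalid transitions return the state unchanged, so a stray "lock in" during the interview phase cannot skip the diagnosis screen.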

Behind the screens are three API seams:

  • POST /api/ask drives the interview. It is the single seam the future voice version plugs into.
  • POST /api/submit scores the attempt and reveals the teardown.
  • GET /api/leaderboard serves per-case rankings.
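
As an illustration of the /api/submit seam, here is a minimal sketch of its core logic with the database injected as a function. The request and response shapes, the `handleSubmit` and `SaveAttempt` names, and the flat 100-point score are assumptions. The sketch also treats persistence as best-effort, so a database failure does not fail the request.

```typescript
// Hypothetical /api/submit core; all names and shapes are assumptions.
interface SubmitRequest {
  userId: string;
  caseId: string;
  choice: string;         // the diagnosis the player locked in
  questionsAsked: number;
}

interface SubmitResult {
  correct: boolean;
  points: number;
  recorded: boolean; // false when the attempt could not be saved
}

type SaveAttempt = (row: SubmitRequest & { points: number }) => Promise<void>;

async function handleSubmit(
  req: SubmitRequest,
  answerKey: Record<string, string>, // caseId -> correct diagnosis
  save: SaveAttempt,
): Promise<SubmitResult> {
  const answer = answerKey[req.caseId];
  if (answer === undefined) throw new Error(`unknown case: ${req.caseId}`);

  const correct = req.choice === answer;
  const points = correct ? 100 : 0; // placeholder scoring for this sketch

  // Best-effort persistence: swallow storage errors, report them in the result.
  let recorded = true;
  try {
    await save({ ...req, points });
  } catch {
    recorded = false;
  }
  return { correct, points, recorded };
}

// Simulate a database outage: the player still gets a scored result.
const failingSave: SaveAttempt = async () => {
  throw new Error("database unavailable");
};
const pending = handleSubmit(
  { userId: "u1", caseId: "case-1", choice: "Acute appendicitis", questionsAsked: 4 },
  { "case-1": "Acute appendicitis" },
  failingSave,
);
pending.then((r) => console.log(r));
```

Injecting `save` keeps the handler testable without a live database, and the `recorded` flag lets the UI say when an attempt did not reach the leaderboard.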

A recent UX change merged screens [2] and [3]: the diagnosis options now render below the live chat input instead of replacing it, so you can ask one more clarifying question without leaving the diagnosis view.


The databases and data sources behind it

Four distinct data stores support the app, each with a clear job:

1. Turso / libSQL (SQLite): the persistence layer. Stores users, attempts, leaderboard rows, and streaks. It is a local SQLite file in development and Turso cloud in production, so data survives Vercel's serverless runtime. Persistence is best-effort: if the database is unavailable, the game still plays, but your attempt is not recorded.

2. The Astro Content Collection (src/data/cases/*.json): the case database. Each case is a git-versioned JSON document holding the hidden fact sheet (the single ground truth the patient is grounded on), the answer, the distractors, key findings, and the teaching point. Because it lives in git, every case and every generated explanation is reviewable in a pull request before a student ever sees it. This is the medical-accuracy safety net.

3. MedQA-USMLE (HuggingFace datasets-server): the source corpus. Board-exam vignettes are pulled from the MedQA-USMLE-4-options dataset over the HuggingFace REST API, filtered to single-best-diagnosis questions, and reshaped by Gemini into the fact-sheet schema. This is how the case library scales beyond hand-authoring. Imports land in a drafts/ folder and never go live until a human spot-checks and promotes them.

4. Europe PMC / PubMed: the reference layer. For each case, a grounding step queries Europe PMC (a free aggregator over PubMed and MEDLINE, no API key) by title for on-topic systematic reviews, and writes back real, verifiable citations with PMID, DOI, journal, year, and a resolvable URL. This replaced the earlier pipeline that had the model invent plausible-looking but fake references.
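
The reference-grounding step in item 4 might look roughly like the sketch below. The search endpoint is Europe PMC's public REST search. The query wording, the `buildSearchUrl` and `toCitation` helpers, and the subset of response fields read here are assumptions that should be checked against the Europe PMC documentation. The PMID in the usage example is a placeholder, not a real article.

```typescript
// Hedged sketch of reference grounding; helper names and the exact query
// wording are assumptions, not the project's actual pipeline.
const EUROPE_PMC_SEARCH = "https://www.ebi.ac.uk/europepmc/webservices/rest/search";

function buildSearchUrl(title: string, pageSize = 5): string {
  const params = new URLSearchParams({
    query: `TITLE:"${title}" AND "systematic review"`,
    format: "json",
    resultType: "lite",
    pageSize: String(pageSize),
  });
  return `${EUROPE_PMC_SEARCH}?${params.toString()}`;
}

// Assumed subset of one search result; every field is treated as optional.
interface EpmcResult {
  pmid?: string;
  doi?: string;
  title?: string;
  journalTitle?: string;
  pubYear?: string;
}

interface Citation {
  pmid: string;
  doi: string | null;
  title: string;
  journal: string | null;
  year: number | null;
  url: string; // resolvable PubMed link
}

// Keep only results that carry a PMID and a title, so each citation is verifiable.
function toCitation(r: EpmcResult): Citation | null {
  if (!r.pmid || !r.title) return null;
  const year = r.pubYear ? Number.parseInt(r.pubYear, 10) : Number.NaN;
  return {
    pmid: r.pmid,
    doi: r.doi ?? null,
    title: r.title,
    journal: r.journalTitle ?? null,
    year: Number.isNaN(year) ? null : year,
    url: `https://pubmed.ncbi.nlm.nih.gov/${r.pmid}/`,
  };
}

// Placeholder record for illustration only.
const sample = toCitation({ pmid: "00000000", title: "Placeholder title", pubYear: "2020" });
console.log(buildSearchUrl("acute appendicitis"));
console.log(sample?.url);
```

Dropping any result without a PMID is the design choice that matters: a citation that cannot be resolved is treated the same as no citation.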

A fifth component, Gemini, is the reasoning engine rather than a data store. It powers the fluent patient (through the Gemini API in production) and reshapes imported vignettes (through the gemini CLI, which uses Google-account OAuth and needs no API key). It is not the source of truth: the correct diagnosis always comes from the MedQA answer key, set in code and never trusted from the model.
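
The import filter and the "answer key set in code" rule can be sketched together. The row shape below (`question`, `options`, `answer_idx`) is an assumption about how the MedQA rows arrive, the "most likely diagnosis" pattern is just one plausible filter, and `isDiagnosisQuestion` and `finalizeDraft` are hypothetical names.

```typescript
// Hypothetical import helpers; row shape, regex, and names are assumptions.
interface MedQaRow {
  question: string;
  options: Record<string, string>; // e.g. { A: "...", B: "...", C: "...", D: "..." }
  answer_idx: string;              // letter of the keyed answer
}

// What the model hands back after reshaping a vignette.
interface ModelDraft {
  stem: string;
  answer: string; // the model's own guess, which is never trusted
}

interface DraftCase {
  stem: string;
  answer: string;
  distractors: string[];
  reviewed: false; // drafts always start unreviewed
}

const DIAGNOSIS_STEM = /most likely diagnosis/i;

// Keep only single-best-diagnosis questions.
function isDiagnosisQuestion(row: MedQaRow): boolean {
  return DIAGNOSIS_STEM.test(row.question);
}

// Build the draft case, taking the answer from the answer key, not the model.
function finalizeDraft(row: MedQaRow, draft: ModelDraft): DraftCase {
  const keyed = row.options[row.answer_idx];
  if (keyed === undefined) throw new Error("answer key does not match any option");
  const distractors = Object.entries(row.options)
    .filter(([letter]) => letter !== row.answer_idx)
    .map(([, text]) => text);
  return { stem: draft.stem, answer: keyed, distractors, reviewed: false };
}

// Made-up row for illustration: the model's guess disagrees with the key.
const row: MedQaRow = {
  question: "A 24-year-old presents with right lower quadrant pain. What is the most likely diagnosis?",
  options: { A: "Gastroenteritis", B: "Acute appendicitis", C: "Ovarian torsion", D: "Nephrolithiasis" },
  answer_idx: "B",
};
const draftCase = finalizeDraft(row, { stem: "My belly hurts.", answer: "Gastroenteritis" });
console.log(isDiagnosisQuestion(row), draftCase.answer);
```

Because `finalizeDraft` never reads `draft.answer`, a model that reshapes the vignette incorrectly can change the wording of a case but not its keyed diagnosis.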


The core architecture idea: a transport-agnostic patient

The patient is a service grounded entirely on the hidden case fact sheet (patient-sim.ts). It answers your questions truthfully but only from that sheet, never invents abnormal findings, and never names the diagnosis. Text is today's transport. The next iteration is voice (speech to text, the same patientRespond, text to speech). The game core does not change; only the thin transport layer wrapping it does. /api/ask is the one seam the voice layer will plug into.
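
A minimal sketch of the transport-agnostic idea follows. The signature given to `patientRespond` here is an assumption (the real one in patient-sim.ts may differ), and `makeVoicePatient`, `SpeechToText`, and `TextToSpeech` are hypothetical names. The speech stubs just encode and decode text so the example runs without any speech service.

```typescript
// Hypothetical voice wrapper; all signatures are assumptions.
type PatientRespond = (question: string) => Promise<string>;
type SpeechToText = (audio: Uint8Array) => Promise<string>;
type TextToSpeech = (text: string) => Promise<Uint8Array>;

// The voice transport is only a wrapper: the patient core is passed in untouched.
function makeVoicePatient(
  stt: SpeechToText,
  respond: PatientRespond,
  tts: TextToSpeech,
): (audio: Uint8Array) => Promise<Uint8Array> {
  return async (audio) => {
    const question = await stt(audio);
    const answer = await respond(question);
    return tts(answer);
  };
}

// Stubs: "audio" is UTF-8 text bytes, standing in for real speech services.
const stubStt: SpeechToText = async (audio) => new TextDecoder().decode(audio);
const stubTts: TextToSpeech = async (text) => new TextEncoder().encode(text);
const stubPatient: PatientRespond = async (q) =>
  q.toLowerCase().includes("pain") ? "It started last night." : "I'm not sure.";

const voicePatient = makeVoicePatient(stubStt, stubPatient, stubTts);
const voiceReply = voicePatient(new TextEncoder().encode("When did the pain start?"))
  .then((bytes) => new TextDecoder().decode(bytes));
voiceReply.then((text) => console.log(text));
```

The text transport is the same composition with identity functions in place of the speech steps, which is why the game core would not need to change.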

Behind that seam, production runs on the Gemini API (a free AI Studio key, called directly from Vercel), so the deployed patient is fluent 24/7 with no server to keep awake. The other backends:

  • gemini CLI (Google-account OAuth, no key): powers local dev and the build-time content pipeline.
  • Deterministic keyword matcher: a no-LLM fallback that keeps the game playable if the API is ever unreachable. It matches your words against each finding's topic keywords.
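
The deterministic fallback can be illustrated with a small keyword matcher. This is a sketch under assumptions: the `Finding` shape, the whole-word matching rule, the first-match tie-breaking, and the default reply are all hypothetical, and the real matcher may score differently.

```typescript
// Hypothetical no-LLM fallback; shapes and matching rule are assumptions.
interface Finding {
  topic: string;
  keywords: string[]; // single lowercase words in this sketch
  answer: string;     // what the patient says when this topic is hit
}

const NO_MATCH = "I'm not sure what you mean. Could you ask that another way?";

// Pick the finding whose keywords overlap most with the words in the question.
function matchFinding(question: string, findings: Finding[]): string {
  const words = new Set(question.toLowerCase().match(/[a-z]+/g) ?? []);
  let best: Finding | null = null;
  let bestHits = 0;
  for (const f of findings) {
    const hits = f.keywords.filter((k) => words.has(k.toLowerCase())).length;
    if (hits > bestHits) {
      best = f;
      bestHits = hits;
    }
  }
  return best ? best.answer : NO_MATCH;
}

// Try the LLM backend first; fall back to the matcher if it throws.
async function respondWithFallback(
  llm: (question: string) => Promise<string>,
  question: string,
  findings: Finding[],
): Promise<string> {
  try {
    return await llm(question);
  } catch {
    return matchFinding(question, findings);
  }
}

// Made-up findings for illustration.
const findings: Finding[] = [
  { topic: "fever", keywords: ["fever", "temperature", "chills"], answer: "I've felt hot and shivery since yesterday." },
  { topic: "pain", keywords: ["pain", "hurt", "hurts"], answer: "It hurts low on my right side." },
];
console.log(matchFinding("Have you had any fever or chills?", findings));
```

Because the matcher only ever returns answers stored on the fact sheet, it keeps the same guarantee as the LLM patient: it cannot invent a finding or name the diagnosis.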

Medical safety (non-negotiable)

  • Cases are hand-authored or human-promoted JSON, git-versioned so every change is reviewable.
  • Each case stays reviewed: false with a visible warning banner until a clinician signs off.
  • Imported and model-generated content (distractor teardowns, reshaped vignettes) is reviewed in the diff before publishing. Model output is never auto-published as fact.
  • References are retrieved from PubMed via Europe PMC, not invented.
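
To make the case document and the reviewed: false banner concrete, here is a hedged sketch of a case shape and the view a client might receive. Every field name and the `toPublicCase` helper are assumptions drawn from the description of a case (hidden fact sheet, answer, distractors, key findings, teaching point), and the sample case is made up for illustration, not taken from the library.

```typescript
// Hypothetical case schema; field names are assumptions, not the real JSON.
interface Finding {
  topic: string;
  keywords: string[];
  answer: string; // what the patient says or the test shows
  key: boolean;   // true for a discriminating finding
}

interface CaseDoc {
  id: string;
  step: 1 | 2;
  specialty: string;
  stem: string;          // the one-line chief complaint
  factSheet: Finding[];  // hidden ground truth
  answer: string;
  distractors: { label: string; whyWrong: string }[];
  teachingPoint: string;
  reviewed: boolean;     // stays false until a clinician signs off
}

// What the browser may see before the player locks in.
interface PublicCase {
  id: string;
  step: 1 | 2;
  specialty: string;
  stem: string;
  options: string[];
  showSafetyBanner: boolean;
}

function toPublicCase(c: CaseDoc): PublicCase {
  // Sort options so the answer's position does not leak which one is correct.
  const options = [c.answer, ...c.distractors.map((d) => d.label)]
    .sort((a, b) => a.localeCompare(b));
  return {
    id: c.id, step: c.step, specialty: c.specialty, stem: c.stem,
    options, showSafetyBanner: !c.reviewed,
  };
}

// Made-up example case.
const exampleCase: CaseDoc = {
  id: "example-001", step: 2, specialty: "Surgery",
  stem: "My stomach has been hurting since last night.",
  factSheet: [{ topic: "pain", keywords: ["pain"], answer: "It moved to my lower right side.", key: true }],
  answer: "Acute appendicitis",
  distractors: [
    { label: "Gastroenteritis", whyWrong: "Illustrative placeholder explanation." },
    { label: "Ectopic pregnancy", whyWrong: "Illustrative placeholder explanation." },
  ],
  teachingPoint: "Illustrative placeholder teaching point.",
  reviewed: false,
};
const publicCase = toPublicCase(exampleCase);
console.log(publicCase.options, publicCase.showSafetyBanner);
```

Deriving the banner from `reviewed` in one place means a case cannot lose its warning by accident: only flipping the flag in the git-versioned JSON removes it.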

Tech stack

  • Framework: Astro 7 (server output) with React 19 islands
  • Styling: Tailwind 4
  • Data: Turso / libSQL (SQLite)
  • AI: Gemini (API with a free AI Studio key in production; CLI with OAuth and no API key for local dev and the content pipeline)
  • Content pipeline: HuggingFace MedQA import, Europe PMC reference grounding
  • Deployment: Vercel

What is next

  • Usage scaling (paid Gemini tier) if play ever outgrows the free API limits.
  • Voice patient over the same core (the architecture is already built for it).
  • FSRS spaced repetition to resurface the reasoning patterns you miss.
  • Clinician review to flip cases from reviewed: false to signed-off.

Built at the intersection of clinical reasoning, AI, and product thinking.