
📋 FDA CRL Analyzer

Analyzes 202 FDA Complete Response Letters (2020-2024) to extract regulatory intelligence: deficiency patterns, therapeutic area trends, and sentiment. Uses DuckDB for fast SQL analytics and Parquet for efficient storage. Letters are classified into 8 deficiency categories and 7 therapeutic areas.

Tags: Project, Featured, Healthcare, Data
202 FDA Complete Response Letters analyzed for regulatory intelligence — deficiency patterns, therapeutic area trends, and sentiment scoring.

GitHub: prahlaadr/fda-crl-analyzer


What It Does

Parses FDA Complete Response Letters (CRLs), the letters FDA sends to a drug sponsor when an application cannot be approved in its present form, and extracts structured data: deficiency categories, therapeutic areas, severity, and sentiment. This enables trend analysis across five years of FDA decisions (2020-2024).
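
To make the extraction step concrete, here is a minimal sketch of turning letter text into a structured record. It assumes a simple keyword match, which may not be how the project actually classifies letters; the keyword lists, the `CrlRecord` type, the `tag_deficiencies` helper, and the sample text are all hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical keyword lists, one per deficiency category.
# The project's real extraction rules are not documented on this page.
CATEGORY_KEYWORDS = {
    "Clinical": ("efficacy", "clinical trial", "endpoint"),
    "CMC": ("manufacturing", "facility inspection", "drug substance"),
    "Safety": ("adverse event", "safety signal", "hepatotoxicity"),
    "Labeling": ("labeling", "prescribing information"),
    "Statistical": ("statistical analysis", "multiplicity", "missing data"),
    "Nonclinical": ("nonclinical", "toxicology", "carcinogenicity"),
    "Regulatory": ("user fee", "pediatric study plan"),
}


@dataclass
class CrlRecord:
    """One analyzed letter (hypothetical schema)."""
    letter_id: str
    therapeutic_area: str
    deficiencies: list[str] = field(default_factory=list)


def tag_deficiencies(text: str) -> list[str]:
    """Return every category whose keywords appear in the text, else ["Other"]."""
    lowered = text.lower()
    hits = [
        category
        for category, keywords in CATEGORY_KEYWORDS.items()
        if any(keyword in lowered for keyword in keywords)
    ]
    return hits or ["Other"]


# Made-up letter excerpt, for illustration only.
sample_text = (
    "The primary endpoint was not met, and the facility inspection "
    "found deviations from good manufacturing practice."
)
record = CrlRecord(
    letter_id="example-001",  # made-up identifier
    therapeutic_area="Oncology",
    deficiencies=tag_deficiencies(sample_text),
)
print(record)
```

A letter can carry several deficiencies at once, which is why the sketch stores a list rather than a single label.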

Key Findings

  • 8 deficiency categories — Clinical, CMC, Safety, Labeling, Statistical, Nonclinical, Regulatory, Other
  • 7 therapeutic areas — Oncology, Neurology, Infectious Disease, Cardiology, Metabolic, Rare Disease, Other
  • Clinical deficiencies are the #1 reason for CRLs across all therapeutic areas

Stack

  • DuckDB — SQL analytics over Parquet files
  • Polars — Data transformation and aggregation
  • Parquet — Columnar storage for 202 analyzed letters