📋 FDA CRL Analyzer
Analyzes 202 FDA Complete Response Letters (2020-2024) to extract regulatory intelligence: deficiency patterns, therapeutic area trends, and sentiment. Uses DuckDB for fast SQL analytics and Parquet for efficient storage. Letters are classified into 8 deficiency categories and 7 therapeutic areas.
Project · Featured · Healthcare · Data
202 FDA Complete Response Letters analyzed for regulatory intelligence — deficiency patterns, therapeutic area trends, and sentiment scoring.
GitHub: prahlaadr/fda-crl-analyzer
What It Does
Parses FDA Complete Response Letters (CRLs), which the FDA sends to drug sponsors when an application cannot be approved in its present form, and extracts structured data from each letter: deficiency categories, therapeutic areas, severity, and sentiment. This enables trend analysis across five years of FDA decisions (2020-2024).
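As a rough illustration of the extraction step, the sketch below tags a letter's text with deficiency categories using keyword patterns. The category names are the ones listed on this page, but the patterns and the `classify_deficiencies` helper are hypothetical; the project's actual parsing rules are not described here and may work quite differently.

```python
import re

# Hypothetical keyword patterns. The category names come from this page;
# the keywords themselves are illustrative assumptions only.
CATEGORY_PATTERNS = {
    "Clinical": r"\b(clinical trial|efficacy|adequate and well-controlled)\b",
    "CMC": r"\b(manufacturing|facility inspection|drug substance|stability data)\b",
    "Safety": r"\b(safety|adverse event|risk)\b",
    "Labeling": r"\b(labeling|prescribing information)\b",
    "Statistical": r"\b(statistical|missing data|multiplicity)\b",
    "Nonclinical": r"\b(nonclinical|toxicology|carcinogenicity)\b",
    "Regulatory": r"\b(user fee|patent certification|regulatory)\b",
}

# Compile once, case-insensitively, so each letter is scanned cheaply.
COMPILED = {
    name: re.compile(pattern, re.IGNORECASE)
    for name, pattern in CATEGORY_PATTERNS.items()
}


def classify_deficiencies(letter_text):
    """Return every category whose pattern matches, or ["Other"] if none do."""
    found = [name for name, rx in COMPILED.items() if rx.search(letter_text)]
    return found or ["Other"]


if __name__ == "__main__":
    sample = (
        "The application lacks substantial evidence of efficacy. "
        "Deficiencies were observed during the facility inspection."
    )
    print(classify_deficiencies(sample))
```

A letter can match several categories at once, so the helper returns a list rather than a single label; "Other" is used only as a fallback when nothing matches.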
Key Findings
- 8 deficiency categories — Clinical, CMC, Safety, Labeling, Statistical, Nonclinical, Regulatory, Other
- 7 therapeutic areas — Oncology, Neurology, Infectious Disease, Cardiology, Metabolic, Rare Disease, Other
- Clinical deficiencies are the #1 reason for CRLs across all therapeutic areas
Stack
- DuckDB — SQL analytics over Parquet files
- Polars — Data transformation and aggregation
- Parquet — Columnar storage for 202 analyzed letters