FDA CRL Analyzer

Analyzes 202 FDA Complete Response Letters (2020-2024) to extract regulatory intelligence, deficiency patterns, therapeutic area trends, and sentiment. Uses DuckDB for fast SQL analytics and Parquet for efficient storage. Letters are classified into 8 deficiency categories and 7 therapeutic areas.

GitHub: prahlaadr/fda-crl-analyzer


What It Does

Parses FDA Complete Response Letters (CRLs), the letters FDA sends to drug sponsors when an application cannot be approved in its current form, and extracts structured data from each one: deficiency categories, therapeutic area, severity, and sentiment. The structured output enables trend analysis across five years of FDA decisions (2020-2024).
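
A minimal sketch of the extraction step, assuming a simple keyword match. The project's actual parsing method is not described here, so the `KEYWORDS` map and the `tag_deficiencies` helper are hypothetical, and only a few of the eight categories are covered.

```python
import re

# Hypothetical keyword map; only a few of the eight categories are shown.
KEYWORDS = {
    "Clinical": ["clinical trial", "efficacy", "adequate and well-controlled"],
    "CMC": ["manufacturing", "facility inspection", "drug substance"],
    "Safety": ["adverse event", "safety"],
    "Labeling": ["labeling", "prescribing information"],
}

def tag_deficiencies(text: str) -> list[str]:
    """Return the categories whose keywords appear in the letter text."""
    lowered = text.lower()
    found = []
    for category, phrases in KEYWORDS.items():
        # Whole-phrase match so that short keywords do not hit inside longer words.
        if any(re.search(r"\b" + re.escape(p) + r"\b", lowered) for p in phrases):
            found.append(category)
    # Fall back to the catch-all category when nothing matches.
    return found or ["Other"]

sample = (
    "The application lacks substantial evidence of efficacy. "
    "A facility inspection is also required."
)
print(tag_deficiencies(sample))
```

A keyword map like this is easy to audit, but it will miss paraphrased deficiencies; a real pipeline would likely need something more robust.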

Key Findings

  • 8 deficiency categories: Clinical, CMC, Safety, Labeling, Statistical, Nonclinical, Regulatory, Other
  • 7 therapeutic areas: Oncology, Neurology, Infectious Disease, Cardiology, Metabolic, Rare Disease, Other
  • Clinical deficiencies are the #1 reason for CRLs across all therapeutic areas
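
One way to check a claim like the last finding is to count categories within each therapeutic area and compare the leaders. The sketch below uses made-up rows purely for illustration (they are not the project's data), and `top_category_by_area` is a hypothetical helper.

```python
from collections import Counter, defaultdict

# Illustrative (area, category) pairs, one per cited deficiency; not real CRL data.
rows = [
    ("Oncology", "Clinical"), ("Oncology", "Clinical"), ("Oncology", "CMC"),
    ("Neurology", "Clinical"), ("Neurology", "Safety"), ("Neurology", "Clinical"),
    ("Cardiology", "CMC"), ("Cardiology", "Clinical"), ("Cardiology", "Clinical"),
]

def top_category_by_area(pairs):
    """Map each therapeutic area to its most frequently cited category."""
    counts = defaultdict(Counter)
    for area, category in pairs:
        counts[area][category] += 1
    return {area: c.most_common(1)[0][0] for area, c in counts.items()}

leaders = top_category_by_area(rows)
print(leaders)
# True only if the same category leads in every area.
print(len(set(leaders.values())) == 1)
```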

Stack

  • DuckDB: SQL analytics over Parquet files
  • Polars: data transformation and aggregation
  • Parquet: columnar storage for the 202 analyzed letters