Welcome to the DATASCI w266 project repository for John Bang and Mohammad Paracha!
In this paper, we classify medical exam questions from the MedMCQA dataset, which contains over 193,000 expert-authored questions from AIIMS and NEET PG exams across 21 medical subjects. We evaluate five models: Naive Bayes, logistic regression, BERT, BioBERT, and Gemini 1.5 Flash. Unlike previous work that focuses on answer prediction or open-domain medical reasoning, our study reframes the problem as subject-level classification, an underexplored yet critical task for downstream applications. Our results show that BioBERT achieves high precision scores, highlighting the effectiveness of domain-specific pretraining for understanding medical language.
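
To make the subject-level classification task concrete, here is a minimal sketch of a multinomial Naive Bayes baseline in plain Python. It is not the code used in this project: the class name `NaiveBayesSubjectClassifier`, the whitespace tokenizer, the smoothing value, and the four toy questions with their subject labels are all assumptions made up for illustration, and none of them come from MedMCQA.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Hypothetical tokenizer: lowercase and split on whitespace.
    return text.lower().split()


class NaiveBayesSubjectClassifier:
    """Illustrative multinomial Naive Bayes with add-alpha smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                      # smoothing strength (assumed value)
        self.class_counts = Counter()           # number of questions per subject
        self.word_counts = defaultdict(Counter) # token counts per subject
        self.vocab = set()

    def fit(self, questions, subjects):
        for question, subject in zip(questions, subjects):
            self.class_counts[subject] += 1
            for token in tokenize(question):
                self.word_counts[subject][token] += 1
                self.vocab.add(token)
        return self

    def predict(self, question):
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        best_subject, best_score = None, -math.inf
        for subject, n_docs in self.class_counts.items():
            # Log prior for the subject.
            score = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[subject].values())
            for token in tokenize(question):
                if token not in self.vocab:
                    continue  # ignore tokens never seen in training
                count = self.word_counts[subject][token]
                # Smoothed log likelihood of the token given the subject.
                score += math.log(
                    (count + self.alpha) / (total_words + self.alpha * vocab_size)
                )
            if score > best_score:
                best_subject, best_score = subject, score
        return best_subject


# Toy data invented for this example (not drawn from MedMCQA).
train_questions = [
    "which nerve supplies the deltoid muscle",
    "which artery supplies the femoral head",
    "drug of choice for anaphylaxis is",
    "mechanism of action of the drug metformin is",
]
train_subjects = ["Anatomy", "Anatomy", "Pharmacology", "Pharmacology"]

clf = NaiveBayesSubjectClassifier().fit(train_questions, train_subjects)
print(clf.predict("which nerve supplies the biceps muscle"))
print(clf.predict("drug of choice for hypertension"))
```

The sketch takes a question's text as input and returns one subject label, which is the framing described above. A realistic baseline would likely add TF-IDF weighting and a held-out evaluation split, and the transformer models would replace the bag-of-words features with learned representations.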