PubMed TopicModeling

Latent Dirichlet Allocation (LDA) applied to 2,000 PubMed abstracts using collapsed Gibbs sampling to discover 20 latent biomedical topics.
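For readers unfamiliar with the sampler, the following is a minimal, standard-library-only sketch of collapsed Gibbs sampling for LDA. It is not the code from LDA.ipynb or the CS179 material; the function name `collapsed_gibbs_lda`, the hyperparameter defaults (`alpha`, `beta`, `iters`), and the toy corpus are all assumptions made for illustration. Documents are assumed to be lists of integer word ids.

```python
import random


def collapsed_gibbs_lda(docs, vocab_size, num_topics,
                        alpha=0.1, beta=0.01, iters=50, seed=0):
    """Illustrative collapsed Gibbs sampler for LDA (hypothetical helper).

    docs: list of documents, each a list of integer word ids in [0, vocab_size).
    Returns topic assignments and the three count tables.
    """
    rng = random.Random(seed)
    doc_topic = [[0] * num_topics for _ in docs]                  # n_{d,k}
    topic_word = [[0] * vocab_size for _ in range(num_topics)]    # n_{k,w}
    topic_total = [0] * num_topics                                # n_k

    # Random initialisation of the topic assignment of every token.
    assignments = []
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_d.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assignments.append(z_d)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the current token from the counts.
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1

                # Unnormalised conditional p(z = t | all other assignments).
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]

                # Add the token back under its newly sampled topic.
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    return assignments, doc_topic, topic_word, topic_total


# Toy corpus (made up): word ids 0-1 and 2-3 tend to co-occur.
toy_docs = [[0, 1, 0, 1], [2, 3, 2, 3], [0, 1, 1, 0], [3, 2, 2, 3]]
z, doc_topic, topic_word, topic_total = collapsed_gibbs_lda(
    toy_docs, vocab_size=4, num_topics=2)
```

The parameters are integrated out analytically, so only the count tables and the per-token assignments are stored; topic-word distributions can be estimated afterwards by normalising `topic_word[k][w] + beta` over `w`.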

Output quality is evaluated using topic coherence metrics (c_v and c_npmi); the scores indicate that the learned biomedical topics are interpretable.
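To give a sense of what an NPMI-style coherence score measures, here is a simplified sketch. It estimates probabilities from whole-document co-occurrence, whereas the usual c_npmi definition uses a sliding window over the text, and it does not cover c_v at all. The function name `npmi_coherence`, the handling of edge cases, and the toy documents are assumptions for illustration, not the evaluation code used in the notebook.

```python
import math
from itertools import combinations


def npmi_coherence(top_words, documents):
    """Mean pairwise NPMI of a topic's top words (simplified, document-level).

    top_words: list of at least two words describing one topic.
    documents: list of tokenised documents (lists of words).
    """
    if len(top_words) < 2:
        raise ValueError("need at least two words to form a pair")

    doc_sets = [set(doc) for doc in documents]
    n_docs = len(doc_sets)

    def prob(*words):
        # Fraction of documents containing all of the given words.
        return sum(all(w in s for w in words) for s in doc_sets) / n_docs

    scores = []
    for w_i, w_j in combinations(top_words, 2):
        p_ij = prob(w_i, w_j)
        if p_ij == 0.0:
            scores.append(-1.0)   # never co-occur: NPMI lower bound
        elif p_ij == 1.0:
            scores.append(0.0)    # assumption: treat the degenerate case as neutral
        else:
            pmi = math.log(p_ij / (prob(w_i) * prob(w_j)))
            scores.append(pmi / -math.log(p_ij))
    return sum(scores) / len(scores)


# Made-up toy documents, not PubMed data.
toy_documents = [
    ["gene", "protein"],
    ["gene", "protein"],
    ["patient", "trial"],
    ["patient", "trial"],
]
coherent = npmi_coherence(["gene", "protein"], toy_documents)
incoherent = npmi_coherence(["gene", "patient"], toy_documents)
```

Scores lie in [-1, 1]: word pairs that always appear together approach 1, and pairs that never share a document are assigned -1.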

See the full implementation and results in LDA.ipynb.

Learned Topics

Word Clouds

Acknowledgment

Collapsed Gibbs sampling implementation borrowed & adapted from CS179 course material by Prof. Alexander Ihler.