This repository contains two Natural Language Processing (NLP) projects for analyzing medical text and improving information retrieval.
In this section, the GPT-2 model is fine-tuned on specialized skin cancer articles to enhance its ability to understand medical texts.
✅ Data collection from PubMed and other scientific sources
✅ Preprocessing of medical texts to optimize model learning
✅ Fine-tuning the GPT-2 model using Hugging Face Transformers
✅ Model evaluation and testing for generating medical text responses
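The preprocessing step above could look roughly like the following. This is a minimal sketch, not the notebook's actual code: the function names `clean_abstract` and `chunk_words`, the cleaning rules, and the window sizes are all assumptions chosen for illustration, and word windows stand in for the tokenizer-based chunking a real GPT-2 pipeline would likely use.

```python
import re


def clean_abstract(text: str) -> str:
    """Normalize one raw abstract before tokenization (hypothetical helper)."""
    # Replace leftover HTML/XML tags with a space.
    text = re.sub(r"<[^>]+>", " ", text)
    # Drop numeric citation markers such as [3] or [1, 2], plus the space before them.
    text = re.sub(r"\s*\[\d+(?:\s*[,-]\s*\d+)*\]", "", text)
    # Collapse runs of whitespace (including newlines) into single spaces.
    text = re.sub(r"\s+", " ", text)
    return text.strip()


def chunk_words(text: str, max_words: int = 128, overlap: int = 16) -> list[str]:
    """Split text into overlapping word windows (assumed sizes, not the notebook's)."""
    if not 0 <= overlap < max_words:
        raise ValueError("overlap must be non-negative and smaller than max_words")
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        # Stop once the current window has reached the end of the text.
        if start + max_words >= len(words):
            break
    return chunks
```

Overlapping windows keep some context across chunk boundaries, which may help when abstracts are longer than the model's context length.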
Libraries used in this project:
- torch
- transformers
- numpy
- pandas
- matplotlib
- datasets
- biopython
- scikit-learn
- tqdm
First, install the necessary packages:
pip install torch transformers datasets biopython
Then, run the notebook in Jupyter Notebook:
jupyter notebook fine-tuning-gpt-2-on-skin-cancer-articles.ipynb
In this section, the Retrieval-Augmented Generation (RAG) method is combined with a graph-based structure to extract more accurate and relevant information for text generation.
✅ Intelligent Information Retrieval: Using graphs to optimize knowledge search
✅ Stronger Generative Models: Combining RAG and GPT to improve response accuracy
✅ Processing Complex Data: Suitable for scientific and medical search systems
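The idea of combining retrieval with a graph can be sketched as follows. This is a toy illustration under stated assumptions, not the notebook's implementation: a bag-of-words cosine similarity stands in for a learned embedding with a FAISS index, a plain adjacency dictionary stands in for a networkx graph, and the names `embed`, `graph_retrieve`, and `build_prompt`, as well as the sample documents, are hypothetical.

```python
import math
from collections import Counter


def embed(text):
    """Bag-of-words vector; a stand-in for a learned sentence embedding."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def graph_retrieve(query, docs, edges, top_k=1, hops=1):
    """Pick the top_k most similar documents, then add their graph neighbors."""
    q = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(docs[d])), reverse=True)
    selected = list(ranked[:top_k])
    frontier = list(selected)
    for _ in range(hops):
        next_frontier = []
        for node in frontier:
            for neighbor in edges.get(node, []):
                if neighbor not in selected:
                    selected.append(neighbor)
                    next_frontier.append(neighbor)
        frontier = next_frontier
    return selected


def build_prompt(query, docs, doc_ids):
    """Assemble the retrieved passages into a prompt for a generative model."""
    context = "\n".join(f"- {docs[d]}" for d in doc_ids)
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"


# Toy corpus and graph (illustrative data only).
docs = {
    "d1": "melanoma is a skin cancer arising from melanocytes",
    "d2": "braf mutations are frequent in melanoma",
    "d3": "basal cell carcinoma rarely metastasizes",
}
edges = {"d1": ["d2"], "d2": ["d1"], "d3": []}

query = "what is melanoma"
doc_ids = graph_retrieve(query, docs, edges, top_k=1, hops=1)
prompt = build_prompt(query, docs, doc_ids)
```

Here the similarity search alone returns only `d1`, while the graph hop also pulls in the linked document `d2`, so the generator sees related context that the query did not match as strongly.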
Libraries used in this project:
- torch
- transformers
- faiss-cpu
- networkx
- numpy
- scipy
- pandas
- matplotlib
First, install the libraries listed above with pip. Then, run the notebook in Jupyter Notebook:
jupyter notebook graphrag.ipynb
✅ Medical Applications: Analyzing scientific articles, aiding disease diagnosis
✅ Search Systems: Optimizing scientific and medical search engines
✅ Research Assistance: Gathering more accurate information for medical research
✅ Improving Chatbots: Enhancing medical chatbots for more precise responses
- If you have suggestions for improving the project, please submit a Pull Request.
- To report issues, please open an Issue.