Data Mining Project

Context Aware Category Prediction Using Semantic Article Graphs

How to Run Demo

  1. Download the dataset from Hugging Face.
  2. Place the downloaded dataset in the demo/input directory.
  3. Download the models:
    • GCN model: GCN
    • GCN without edges: GCN_no_edges
    Place the downloaded model folders in the demo/models/ directory and rename them to undirected_gnn and no_edge_gnn, respectively.
  4. Set PYTHONPATH to the demo directory (the parent of src/):
    export PYTHONPATH=demo/
    
  5. Navigate to the demo directory in your terminal.
  6. Install the required dependencies using:
    pip install -r requirements.txt
    
  7. Run the Streamlit app using:
    streamlit run src/demo.py
    

Alternatively, you can visit the hosted demo: Demo
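
Before launching the app locally, it can help to confirm that the files and folders named in the steps above are where the demo expects them. The following is a minimal sketch, not part of the repository: the function name find_missing_paths and the constant EXPECTED_PATHS are hypothetical, and the path list is simply collected from the instructions in this README. It only checks that each path exists; it does not verify that the dataset or model files inside are complete.

```python
from pathlib import Path

# Paths named in the steps above, relative to the repository root.
# This list is an assumption assembled from the README, not from the code.
EXPECTED_PATHS = (
    "demo/input",
    "demo/models/undirected_gnn",
    "demo/models/no_edge_gnn",
    "demo/requirements.txt",
    "demo/src/demo.py",
)


def find_missing_paths(repo_root="."):
    """Return the expected demo paths that do not exist under repo_root."""
    root = Path(repo_root)
    return [p for p in EXPECTED_PATHS if not (root / p).exists()]


if __name__ == "__main__":
    missing = find_missing_paths()
    if missing:
        print("Missing before launch:")
        for path in missing:
            print(f"  {path}")
    else:
        print("Layout looks complete.")
```

Run the script from the repository root, or pass the root path to find_missing_paths. An empty result means every path listed above was found.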

How to Train the GNN model or Run Heuristic Methods

  1. Download the dataset from Hugging Face.
  2. Place the downloaded dataset in the modeling/ directory.
  3. Install the required dependencies using:
    pip install -r demo/requirements.txt
    
  4. Navigate to the modeling directory in your terminal.
  5. Run gnn.ipynb or heuristic_methods.ipynb using Jupyter Notebook or Jupyter Lab.
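
If an interactive Jupyter session is not convenient (for example on a remote machine), the notebooks can probably also be executed from a script with jupyter nbconvert. The sketch below is an assumption about how one might do that, not something the repository provides: the helper execute_notebook and the _executed output suffix are hypothetical, and it presumes that nbconvert is installed alongside Jupyter and that the notebooks run top to bottom without manual input.

```python
import subprocess
from pathlib import Path

# Notebook names taken from the step above.
NOTEBOOKS = ("gnn.ipynb", "heuristic_methods.ipynb")


def execute_notebook(notebook, dry_run=True):
    """Build (and optionally run) an nbconvert command for one notebook.

    With dry_run=True the command is only returned, nothing is executed.
    """
    # Hypothetical output name so the original notebook is left untouched.
    output_name = f"{Path(notebook).stem}_executed.ipynb"
    command = [
        "jupyter", "nbconvert",
        "--to", "notebook",
        "--execute", notebook,
        "--output", output_name,
    ]
    if not dry_run:
        # Raises CalledProcessError if the notebook fails to run.
        subprocess.run(command, check=True)
    return command


if __name__ == "__main__":
    # Intended to be run from inside the modeling directory.
    for name in NOTEBOOKS:
        print(" ".join(execute_notebook(name)))
```

Calling execute_notebook("gnn.ipynb", dry_run=False) from the modeling directory would run the notebook and write the executed copy next to it.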

How to View Preprocessing Steps

  1. Navigate to the preprocessing directory in your terminal and open the notebooks:
    • preprocess.ipynb: Contains all the preprocessing steps to create the dataset from raw Wikipedia articles collected using the Wikipedia API.
    • generate_embeddings.ipynb: Contains the steps to generate document embeddings for the articles using a pre-trained embedding model.

  • Metin Usta
  • 504251504