
SWE-Python-AI

Description

A pipeline for collecting AI-related Python projects that use PyTest and turning them into SWE-bench-style task instances: scraping, task-instance creation, versioning, validation, harness evaluation, and LLM inference.

Installation

Follow these steps to install the project locally:

  1. Clone the repository:

    git clone https://github.com/manhtdd/SWE-Python-AI.git
    cd SWE-Python-AI
  2. Install dependencies:

    pip install -r requirements.txt

    🛑 Note:
    ./requirements.txt covers only the data-processing steps
    ./7_inference/requirements.txt is required to run the LLMs
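    If you want an isolated environment, a minimal setup sketch (the venv name is arbitrary) is:

    ## Create and activate a virtual environment (optional)
    python -m venv .venv
    source .venv/bin/activate
    ## Data-processing dependencies
    pip install -r requirements.txt
    ## LLM-inference dependencies (needed only for the inference steps)
    pip install -r 7_inference/requirements.txt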

Usage

  1. Collect AI-related Python projects that use PyTest

    (cd 0_scrape-python-ai && python -m main)
  2. Create task instances for new projects

    bash setup.sh <your-github-token>
    export GITHUB_TOKENS=<your-github-token>
    bash collect.sh <your-github-token> <original-owner>/<project-name> <default-branch>

    Example:

    bash setup.sh <your-github-token>
    export GITHUB_TOKENS=<your-github-token>
    bash collect.sh <your-github-token> langgenius/dify main

    🛑 Note:
    Replace <your-github-token> with your GitHub token

    🛑 Note:
    The above steps cover both 1_mirror-repo and 2_create-task-instances
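    As an optional sanity check (not part of the pipeline), you can verify the token against GitHub's rate-limit endpoint before collecting:

    ## A valid token returns your remaining API quota
    curl -s -H "Authorization: token $GITHUB_TOKENS" https://api.github.com/rate_limit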

  3. Versioning - match each task instance with the project version it needs for installation

    cd 3_versioning
    python -m main
    python -m postprocess_versioning
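    To spot-check the result, you can count instances per assigned version with jq (this assumes SWE-bench-style JSONL with a "version" field; point it at your actual output file):

    jq -r '.version' <path/to/your/task/instances.jsonl> | sort | uniq -c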
  4. Specification (done manually)

    Please check out SWE-bench's official documentation on collection

  5. Get the FAIL-2-PASS (F2P) & PASS-2-PASS (P2P) tests of your task instances

    cd 5_F2P-P2P
    mkdir logs
    mkdir temp
    export GITHUB_TOKEN=<your-github-token>
    bash run_validation.sh <path/to/your/task/instance> <path/to/logs/dir> <path/to/temp/dir>

    Example:

    cd 5_F2P-P2P
    mkdir logs
    mkdir temp
    export GITHUB_TOKEN=<your-github-token>
    bash run_validation.sh input/dify-task-instances.jsonl ./logs ./temp
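    To see how many instances came out with usable tests, you can filter the validated file with jq (assuming the SWE-bench field name FAIL_TO_PASS; adjust the path to your output):

    ## Instances with at least one FAIL-2-PASS test
    jq -c 'select((.FAIL_TO_PASS | length) > 0)' <path/to/validated/instances.jsonl> | wc -l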
  6. Evaluate the outputs of step 5

    Run the notebook 5_F2P-P2P-evaluate/validation.ipynb
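    If you prefer a non-interactive run, nbconvert can execute the notebook from the shell:

    ## Writes an executed copy of the notebook (add --inplace to overwrite it)
    jupyter nbconvert --to notebook --execute 5_F2P-P2P-evaluate/validation.ipynb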

  7. Harness - evaluate the whole installation and run PyTest inside a Docker container (this evaluates steps 2 through 4)

    First, process the output of step 5 with ./notebooks/6_harness-postprocess-input.ipynb, then run the following command:

    ## This script should be run from the repository root ('.')
    bash bash_scripts/6_harness

    To parse the output of the command above, check ./notebooks/6_harness-parse-output.ipynb.
    Because each output covers a single project, you will end up with one output file per project; use ./notebooks/6_harness-merge-jsonl.ipynb to merge them into a single file.
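    Since JSONL files are line-oriented, plain concatenation is a shell alternative to the merge notebook (adjust the glob to wherever your per-project outputs live):

    cat <path/to/harness/outputs>/*.jsonl > merged.jsonl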

  8. Inference

    First, build the dataset and upload it to HuggingFace with ./notebooks/7_make-dataset.ipynb, then execute the following commands:

    ## These scripts should be run from the repository root ('.')
    
    ## Create HuggingFace-style dataset from your task instances
    bash bash_scripts/7_make-dataset.sh
    
    ## Tokenize the dataset
    bash bash_scripts/7_tokenize.sh
    
    ## Inference
    bash bash_scripts/7_inference.sh
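    As a quick check that the upload succeeded, you can try loading the dataset back (the dataset name is a placeholder for whatever you published in 7_make-dataset.ipynb):

    ## Should print the splits and row counts
    python -c "from datasets import load_dataset; print(load_dataset('<your-username>/<your-dataset>'))"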
  9. Evaluate the outputs of step 8


    bash bash_scripts/8_evaluation.sh

🛑 Note
Every script, whether a .sh or a .py file, can be customized by editing it directly.