Follow these steps to install the project locally:
- Clone the repository:

  ```bash
  git clone https://github.com/manhtdd/SWE-Python-AI.git
  cd SWE-Python-AI
  ```

- Install dependencies:

  ```bash
  pip install -r requirements.txt
  ```

  🛑 Note:
  - `./requirements.txt` is only for processing the data
  - `./7_inference/requirements.txt` is required to run LLMs
- Collect AI-related Python projects that use PyTest:

  ```bash
  cd 0_scrape-python-ai && python -m main
  ```
- Create task instances for new projects:

  ```bash
  bash setup.sh <your-github-token>
  export GITHUB_TOKENS=<your-github-token>
  bash collect.sh <your-github-token> <original-owner>/<project-name> <default-branch>
  ```

  Example:

  ```bash
  bash setup.sh <your-github-token>
  export GITHUB_TOKENS=<your-github-token>
  bash collect.sh <your-github-token> langgenius/dify main
  ```

  🛑 Note: Replace `<your-github-token>` with your GitHub token.

  🛑 Note: The steps above cover both `1_mirror-repo` and `2_create-task-instances`.
- Versioning - match your task instances with their version for installation:

  ```bash
  cd 3_versioning
  python -m main
  python -m postprocess_versioning
  ```
- Specification (manually)

  Please check out SWE-bench's official documentation about collection.
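For orientation, each task instance in this pipeline is one JSON object per line (JSONL). The sketch below shows the shape of a SWE-bench-style record; the field names follow SWE-bench's conventions and the concrete values are hypothetical, so the exact schema produced by this repository may differ.

```python
import json

# A sketch of one SWE-bench-style task instance. Field names follow
# SWE-bench conventions; all values here are hypothetical placeholders.
instance = {
    "repo": "langgenius/dify",
    "instance_id": "langgenius__dify-1234",   # hypothetical instance ID
    "base_commit": "abc123",                  # commit the patch applies to
    "patch": "diff --git a/... b/...",        # gold fix patch from the PR
    "test_patch": "diff --git a/tests/...",   # tests introduced by the PR
    "problem_statement": "Issue text describing the bug",
    "version": "0.1",                         # filled in by the versioning step
}

# Serialize as one JSONL line and read it back.
line = json.dumps(instance)
print(json.loads(line)["repo"])
```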
- Get FAIL-2-PASS (F2P) & PASS-2-PASS (P2P) of your task instances:

  ```bash
  cd 5_F2P-P2P
  mkdir logs
  mkdir temp
  export GITHUB_TOKEN=<your-github-token>
  bash run_validation.sh <path/to/your/task/instance> <path/to/logs/dir> <path/to/temp/dir>
  ```

  Example:

  ```bash
  cd 5_F2P-P2P
  mkdir logs
  mkdir temp
  export GITHUB_TOKEN=<your-github-token>
  bash run_validation.sh input/dify-task-instances.jsonl ./logs ./temp
  ```
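Validation classifies each test as FAIL-2-PASS (fails before the gold patch, passes after) or PASS-2-PASS (passes both before and after). As a quick sanity check on the validated instances, a sketch like the following can tally both sets from a JSONL file; the `FAIL_TO_PASS` / `PASS_TO_PASS` field names follow SWE-bench conventions and are an assumption about this repo's output.

```python
import io
import json

# Hypothetical sample standing in for a validated-instances JSONL file;
# the FAIL_TO_PASS / PASS_TO_PASS field names are assumed, not confirmed.
sample = io.StringIO(
    '{"instance_id": "a-1", "FAIL_TO_PASS": ["test_x"], "PASS_TO_PASS": ["test_y", "test_z"]}\n'
    '{"instance_id": "a-2", "FAIL_TO_PASS": [], "PASS_TO_PASS": ["test_y"]}\n'
)

f2p = p2p = 0
for line in sample:
    record = json.loads(line)
    f2p += len(record["FAIL_TO_PASS"])  # tests flipped from fail to pass
    p2p += len(record["PASS_TO_PASS"])  # tests that pass in both states

print(f2p, p2p)  # → 1 3
```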
- Evaluate the outputs of step 4 by running the notebook `5_F2P-P2P-evaluate/validation.ipynb`.
- Harness - evaluate the whole installation and run PyTest inside a Docker container (evaluates steps 2 to 4).

  First, process the output of phase 5 with `./notebooks/6_harness-postprocess-input.ipynb`, then run:

  ```bash
  ## Script should be run from the '.' directory
  bash bash_scripts/6_harness
  ```

  To parse the output of the command above, see `./notebooks/6_harness-parse-output.ipynb`. Because each run covers a single project, you will end up with one output per project; use `./notebooks/6_harness-merge-jsonl.ipynb` to merge them into a single file.
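The merge step is essentially JSONL concatenation. A minimal sketch of what `6_harness-merge-jsonl.ipynb` presumably does is below; the notebook itself may additionally deduplicate or reorder records, and the file paths are illustrative.

```python
import json

def merge_jsonl(inputs, output):
    """Concatenate several per-project JSONL files into one.

    A minimal sketch under the assumption that each input file holds
    one JSON object per line; blank lines are skipped and every line
    is parsed once to catch malformed records early.
    """
    with open(output, "w") as out:
        for path in inputs:
            with open(path) as f:
                for line in f:
                    line = line.strip()
                    if line:
                        json.loads(line)  # validate before writing
                        out.write(line + "\n")

# Illustrative usage (paths are hypothetical):
# merge_jsonl(["outputs/dify.jsonl", "outputs/langchain.jsonl"], "merged.jsonl")
```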
- Inference

  First, build the dataset and upload it to HuggingFace with `./notebooks/7_make-dataset.ipynb`, then run:

  ```bash
  ## These scripts should be run from the '.' directory
  ## Create a HuggingFace-style dataset from your task instances
  bash bash_scripts/7_make-dataset.sh
  ## Tokenize the dataset
  bash bash_scripts/7_tokenize.sh
  ## Inference
  bash bash_scripts/7_inference.sh
  ```
- Evaluate the outputs of step 7:

  ```bash
  bash bash_scripts/8_evaluation.sh
  ```
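Evaluation of this kind typically reports the fraction of instances whose F2P tests pass (and P2P tests keep passing) after applying a model's patch. The sketch below computes such a resolution rate from per-instance results; the `resolved` field name is an assumption for illustration, not this repo's actual output schema.

```python
# Hypothetical per-instance evaluation results; the "resolved" flag is
# an assumed field name, not confirmed against 8_evaluation.sh's output.
results = [
    {"instance_id": "a-1", "resolved": True},
    {"instance_id": "a-2", "resolved": False},
    {"instance_id": "a-3", "resolved": True},
]

# Resolution rate: share of instances the model fixed.
rate = sum(r["resolved"] for r in results) / len(results)
print(f"resolved {rate:.1%} of {len(results)} instances")
```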
🛑 Note: Every script, whether a `.sh` or `.py` file, can be customized by editing it directly.