Clone tools locally #697
Open
paulzierep wants to merge 21 commits into
- clone_repositories() clones/pulls repos locally instead of using the PyGithub API
- Proper XML macro expansion via galaxy.util.xml_macros
- Supports non-GitHub URLs (GitLab, self-hosted)
- Shallow clones (--depth 1) by default for CI efficiency
- Repo URL deduplication prevents wasted clones
- --workers N for parallel parsing
- No --api flag or GITHUB_API_KEY required
- Fixes: 74 previously missed tools found, 36 conda packages gained
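A minimal sketch of the deduplication and shallow-clone steps listed above. This is not the code from this PR: the helper names `unique_urls`, `clone_or_pull_command`, and `sync_repo` are hypothetical, and the choice of `git pull --ff-only` for existing checkouts is an assumption.

```python
import subprocess
from pathlib import Path
from typing import Iterable, List


def unique_urls(urls: Iterable[str]) -> List[str]:
    """Drop duplicate repo URLs while keeping the first-seen order."""
    # Assumption: surrounding whitespace and a trailing slash are the only
    # differences that need to be ignored when comparing URLs here.
    cleaned = (url.strip().rstrip("/") for url in urls)
    return list(dict.fromkeys(url for url in cleaned if url))


def clone_or_pull_command(url: str, dest: Path, depth: int = 1) -> List[str]:
    """Build the git command for one repo: pull if already cloned, else shallow clone."""
    if (dest / ".git").is_dir():
        return ["git", "-C", str(dest), "pull", "--ff-only"]
    return ["git", "clone", "--depth", str(depth), url, str(dest)]


def sync_repo(url: str, dest: Path) -> None:
    """Run the command built above; raises CalledProcessError if git fails."""
    subprocess.run(clone_or_pull_command(url, dest), check=True)
```

Building the command separately from running it keeps the clone/pull decision easy to unit test without network access.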
This reverts commit 41fb91c.
This reverts commit b2094c2.
- Suite parsed folder: use GitHub repo URL + tree/master + relative path instead of local filesystem path
- Suite version: resolve combined macro tokens (e.g. @TOOL_VERSION@) using values from macro XML files before stripping +galaxy suffix
- Suite first commit date: deepen shallow clones (git fetch --deepen 1000) when no commit history is found for a tool folder
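The suite version step above could look roughly like the following. This is an illustrative sketch, not the PR's implementation: `collect_tokens` and `resolve_version` are hypothetical names, and the token pattern (`@UPPER_CASE@`) and the sample macro file are assumptions.

```python
import re
import xml.etree.ElementTree as ET
from typing import Dict

# Assumed token shape: @NAME@ with upper-case letters, digits, and underscores.
TOKEN_RE = re.compile(r"@[A-Z0-9_]+@")


def collect_tokens(macro_xml: str) -> Dict[str, str]:
    """Read <token name="@X@">value</token> entries from a macro XML string."""
    root = ET.fromstring(macro_xml)
    return {
        token.get("name"): (token.text or "").strip()
        for token in root.iter("token")
        if token.get("name")
    }


def resolve_version(raw: str, tokens: Dict[str, str]) -> str:
    """Substitute known tokens, then strip the +galaxy suffix."""
    # Repeat a few times so tokens whose values contain other tokens resolve too.
    for _ in range(10):
        resolved = TOKEN_RE.sub(lambda m: tokens.get(m.group(0), m.group(0)), raw)
        if resolved == raw:
            break
        raw = resolved
    return raw.split("+galaxy")[0]


macros = """<macros>
  <token name="@TOOL_VERSION@">0.23.2</token>
  <token name="@VERSION_SUFFIX@">1</token>
</macros>"""
version = resolve_version("@TOOL_VERSION@+galaxy@VERSION_SUFFIX@", collect_tokens(macros))
```

Resolving tokens before stripping the suffix matters because the suffix itself is often written as a token.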
The comparison using IUC looks good. @scorreard, I think this one is ready; we can only fully test it in production because of the PAT.
- _normalize_repo_url: trailing slash, .git, whitespace
- _repo_name_from_url: org-repo name extraction
- get_first_commit_for_local_folder: shallow clone deepening, empty output
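For readers who have not opened the diff, the two URL helpers tested above might behave roughly as follows. This is a guess at their behavior based only on the commit message, not the PR's code: the function bodies are hypothetical, and the `org-repo` output format (last two path segments joined by a hyphen) is an assumption.

```python
from urllib.parse import urlparse


def normalize_repo_url(url: str) -> str:
    """Strip whitespace, a trailing slash, and a .git suffix from a repo URL."""
    url = url.strip().rstrip("/")
    if url.endswith(".git"):
        url = url[: -len(".git")]
    return url


def repo_name_from_url(url: str) -> str:
    """Build a local folder name such as 'org-repo' from the URL path."""
    path = urlparse(normalize_repo_url(url)).path
    parts = [part for part in path.split("/") if part]
    # Assumption: only the last two path segments (owner and repo) are used.
    return "-".join(parts[-2:])
```

Because the name comes from the URL path rather than from a GitHub-specific pattern, the same helper also handles GitLab or self-hosted URLs.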
- Install scholarly (already in requirements.txt) instead of try/except hack
- Fix test_extract_galaxy_workflows.py import path to work without PYTHONPATH
- Replace synthetic test fixtures with real wrappers from CI test repo
- Test get_tool_metadata_from_local with fastp (macros) and 2d_auto_threshold (no macros)
- Add tests for 22 previously untested functions
- Fix lint: ruff, black, isort, mypy all passing
327d5e6 to d34deca
This PR changes the extract_all_tools logic: instead of using the GitHub API, it clones each repo and crawls the local checkout.
This way we do not need a GitHub API secret and can run the full pipeline --test for every PR, which makes it much harder to break things.
It also reduces the time needed for the full tool fetching to less than 10 minutes instead of hours.
It also uses the galaxy-utils macro expansion, which improves the information for many tools (for example, better inputs/outputs) and finds 70 more tools overall.
I am currently running the full workflow to check whether there are any negative effects. Only merge when we are sure.
Full description in the new changelog.md
https://github.com/galaxyproject/galaxy_codex/actions/runs/28222587334
I made the PR from the galaxyproject repo so that I can run the CI quickly; my own repo is hitting a limit :)