Add explain flag and merged config dump by nuwang · Pull Request #184 · galaxyproject/total-perspective-vortex

nuwang · 2026-02-19T16:59:24Z

This PR adds support for:

- a tpv dump command to view the merged config
- an --explain flag for tpv dry-run, so admins can trace how a particular decision was made

closes: #153

coveralls · 2026-02-20T07:20:45Z

Pull Request Test Coverage Report for Build 22773699455

Warning: This coverage report may be inaccurate.

This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

For more information on this, see Tracking coverage changes with pull request builds.
To avoid this issue with future PRs, see these Recommended CI Configurations.
For a quick fix, rebase this PR at GitHub. Your next report should be accurate.

Details

355 of 360 (98.61%) changed or added relevant lines in 8 files are covered.
No unchanged relevant lines lost coverage.
Overall coverage increased (+0.1%) to 95.681%

Changes Missing Coverage	Covered Lines	Changed/Added Lines	%
tpv/core/entities.py	15	16	93.75%
tpv/core/explain.py	117	119	98.32%
tpv/core/mapper.py	61	63	96.83%

Totals
Change from base Build 22251655819:	0.1%
Covered Lines:	1726
Relevant Lines:	1757

💛 - Coveralls

mvdbeek · 2026-02-26T14:09:50Z

@pauldg is also interested in this

cat-bro · 2026-03-02T02:10:26Z

Hi @nuwang, this is going to be very useful!

tpv dump is perfect.

For tpv dry-run --explain I have some notes. I’m going to paste the output into the next comment with line numbers so that I can comment on parts of it.

cat-bro · 2026-03-02T02:27:19Z

     1	tpv dry-run --tool=toolshed.g2.bx.psu.edu/repos/bgruening/antismash/antismash/6.1.1+galaxy1
        --input-size=6 --job-conf tpv_check/explain_job_conf.yml --explain
     2
     3	========================================================================
     4	TPV SCHEDULING DECISION TRACE
     5	========================================================================
     6
     7	--- Configuration Loading ---
     8	  [1] Loaded config: https://gxy.io/tpv/db-v2.yml
     9
    10	  [2] Loaded config:
        /Users/cat/dev/infrastructure/tpv_check/total_perspective_vortex/default_tool.yml
    11
    12	  [3] Loaded config:
        /Users/cat/dev/infrastructure/tpv_check/total_perspective_vortex/tools.yml
    13
    14	  [4] Loaded config:
        /Users/cat/dev/infrastructure/tpv_check/total_perspective_vortex/tool_pulsar_scores.yml
    15
    16	  [5] Loaded config:
        /Users/cat/dev/infrastructure/tpv_check/total_perspective_vortex/users.yml
    17
    18	  [6] Loaded config:
        /Users/cat/dev/infrastructure/tpv_check/total_perspective_vortex/destinations.yml
    19
    20	--- Entity Matching ---
    21	  [7] Tool 'toolshed.g2.bx.psu.edu/repos/bgruening/antismash/antismash/6.1.1+galaxy1':
        matched entity 'toolshed.g2.bx.psu.edu/repos/bgruening/antismash/antismash/*'
    22
    23	  [8] No user specified
    24
    25	--- Entity Combining ---
    26	  [9] Combining entities: Tool(toolshed.g2.bx.psu.edu/repos/bgruening/antismash/antismash/*)
    27	        cores=10, mem=24, gpus=0
    28	        scheduling: require=[], prefer=[], reject=['offline']
    29
    30	--- Rule Evaluation ---
    31	  [10] Rule 'login_required_rule' (if: require_login and user is None) -> not matched
    32
    33	  [11] Rule 'minimum_singularity_version_positive_rule' (if: minimum_singularity_version is
        not None and helpers.tool_version_gte(tool, minim) -> not matched
    34
    35	  [12] Rule 'minimum_singularity_version_negative_rule' (if: minimum_singularity_version is
        not None and helpers.tool_version_lt(tool, minimu) -> not matched
    36
    37	  [13] Rule 'max_concurrent_job_count_for_tool_rule' (if: total_limit_exceeded = False
    38	user_limit_exceeded = False
    39	if max_concurrent_job_c) -> not matched
    40
    41	  [14] Rule 'pulsar_score_prefer_pulsar_rule' (if: result = pulsar_score is not None and (
    42	  helpers.tag_values_match(entity, ['pul) -> MATCHED
    43
    44	  [15] Rule 'pulsar_score_prefer_slurm_rule' (if: from numbers import Number
    45	result = pulsar_score is not None and isinstance(enti) -> not matched
    46
    47	--- Resource Evaluation ---
    48	  [16] Evaluated resource expressions
    49	        cores=10, mem=24, gpus=0
    50
    51	--- Destination Matching ---
    52	  [17] tpvdb_local: REJECTED
    53	        destination is abstract
    54
    55	  [18] tpvdb_slurm: REJECTED
    56	        destination is abstract
    57
    58	  [19] default: REJECTED
    59	        destination is abstract
    60
    61	  [20] _slurm_destination: REJECTED
    62	        destination is abstract
    63
    64	  [21] _pulsar_destination: REJECTED
    65	        destination is abstract
    66
    67	  [22] slurm: MATCHED
    68	        capacity: max_cores=32, max_mem=125
    69
    70	  [23] slurm-training: REJECTED
    71	        tag mismatch - entity requires [], rejects ['offline'] dest tags are ['training',
        'docker', 'singularity', 'slurm', 'gtdbtk_database', 'bakta_database', 'funannotate',
        'eggnog', 'verkko_venv', 'phastest', 'medaka_venv_211', 'tool_type_user_defined']
    72
    73	  [24] interactive_pulsar: REJECTED
    74	        tag mismatch - entity requires [], rejects ['offline'] dest tags are
        ['interactive_pulsar', 'docker', 'singularity', 'tool_type_user_defined']
    75
    76	  [25] pulsar-mel2: REJECTED
    77	        cores 10 exceeds max_accepted_cores 8
    78
    79	  [26] pulsar-mel3: MATCHED
    80	        capacity: max_cores=32, max_mem=62.5
    81
    82	  [27] pulsar-high-mem1: REJECTED
    83	        mem 24 below min_accepted_mem 62.51
    84
    85	  [28] pulsar-high-mem2: REJECTED
    86	        tag mismatch - entity requires [], rejects ['offline'] dest tags are ['pulsar',
        'pulsar-high-mem2', 'docker', 'singularity', 'cvmfs_cache_100plus', 'cvmfs_cache_800plus',
        'phastest', 'tool_type_user_defined']
    87
    88	  [29] pulsar-mel-blast: REJECTED
    89	        tag mismatch - entity requires [], rejects ['offline'] dest tags are ['pulsar',
        'pulsar-blast', 'offline', 'docker', 'singularity', 'pulsar-mel-blast',
        'cvmfs_cache_100plus', 'cvmfs_cache_800plus', 'tool_type_user_defined']
    90
    91	  [30] pulsar-qld-high-mem0: REJECTED
    92	        mem 24 below min_accepted_mem 400
    93
    94	  [31] pulsar-qld-high-mem1: REJECTED
    95	        mem 24 below min_accepted_mem 62.51
    96
    97	  [32] pulsar-qld-high-mem2: REJECTED
    98	        mem 24 below min_accepted_mem 58
    99
   100	  [33] pulsar-nci-training: REJECTED
   101	        tag mismatch - entity requires [], rejects ['offline'] dest tags are ['pulsar',
        'training', 'docker', 'singularity', 'pulsar-nci-training', 'cvmfs_cache_100plus',
        'pulsar-blast', 'bakta_database', 'funannotate', 'eggnog', 'medaka_venv_211',
        'tool_type_user_defined']
   102
   103	  [34] pulsar-qld-blast: REJECTED
   104	        tag mismatch - entity requires [], rejects ['offline'] dest tags are ['pulsar',
        'pulsar-blast', 'docker', 'singularity', 'pulsar-qld-blast', 'cvmfs_cache_100plus',
        'cvmfs_cache_800plus', 'tool_type_user_defined']
   105
   106	  [35] pulsar-QLD: MATCHED
   107	        capacity: max_cores=16, max_mem=62.5
   108
   109	  [36] pulsar-azure: REJECTED
   110	        tag mismatch - entity requires [], rejects ['offline'] dest tags are ['pulsar',
        'pulsar-azure', 'offline', 'docker', 'singularity', 'tool_type_user_defined']
   111
   112	  [37] pulsar-azure-gpu: REJECTED
   113	        tag mismatch - entity requires [], rejects ['offline'] dest tags are ['pulsar',
        'pulsar-azure-gpu', 'offline', 'docker', 'singularity', 'tool_type_user_defined']
   114
   115	  [38] pulsar-azure-1-gpu: REJECTED
   116	        tag mismatch - entity requires [], rejects ['offline'] dest tags are ['pulsar',
        'pulsar-azure-1-gpu', 'offline', 'docker', 'singularity', 'tool_type_user_defined']
   117
   118	  [39] _pulsar_qld_gpu: REJECTED
   119	        destination is abstract
   120
   121	  [40] pulsar-qld-gpu1: REJECTED
   122	        tag mismatch - entity requires [], rejects ['offline'] dest tags are ['pulsar',
        'pulsar-qld-gpu', 'docker', 'singularity', 'pulsar-qld-gpu1', 'pulsar-qld-gpu-alphafold',
        'tool_type_user_defined']
   123
   124	  [41] pulsar-qld-gpu2: REJECTED
   125	        tag mismatch - entity requires [], rejects ['offline'] dest tags are ['pulsar',
        'pulsar-qld-gpu', 'docker', 'singularity', 'pulsar-qld-gpu2', 'pulsar-qld-gpu-alphafold',
        'tool_type_user_defined']
   126
   127	  [42] pulsar-qld-gpu3: REJECTED
   128	        tag mismatch - entity requires [], rejects ['offline'] dest tags are ['pulsar',
        'pulsar-qld-gpu', 'docker', 'singularity', 'pulsar-qld-gpu3', 'pulsar-qld-gpu-alphafold',
        'tool_type_user_defined']
   129
   130	  [43] pulsar-qld-gpu4: REJECTED
   131	        tag mismatch - entity requires [], rejects ['offline'] dest tags are ['pulsar',
        'pulsar-qld-gpu', 'docker', 'singularity', 'pulsar-qld-gpu4', 'pulsar-qld-gpu-other',
        'tool_type_user_defined']
   132
   133	  [44] pulsar-qld-gpu5: REJECTED
   134	        tag mismatch - entity requires [], rejects ['offline'] dest tags are ['pulsar',
        'pulsar-qld-gpu', 'docker', 'singularity', 'pulsar-qld-gpu5', 'pulsar-qld-gpu-other',
        'tool_type_user_defined']
   135
   136	--- Destination Ranking ---
   137	  [45] #1 slurm (score: -9)
   138
   139	  [46] #2 pulsar-mel3 (score: -1)
   140
   141	  [47] #3 pulsar-mel3 (score: -1)
   142
   143	--- Destination Evaluation ---
   144	  [48] Evaluating destination 'slurm'
   145
   146	--- Rule Evaluation ---
   147	  [49] Rule 'slurm_destination_singularity_rule' (if:
        entity.params.get('singularity_enabled')) -> MATCHED
   148
   149	  [50] Rule 'slurm_destination_docker_rule' (if: entity.params.get('docker_enabled')) -> not
        matched
   150
   151	--- Final Result ---
   152	  [51] Destination: slurm
   153	        runner: slurm
   154	        cores: 10, mem: 24, gpus: 0
   155	        params: {'singularity_enabled': True, 'tpv_cores': '10', 'tpv_gpus': '0', 'tpv_mem':
        '24', 'nativeSpecification': '--nodes=1 --ntasks=10 --ntasks-per-node=10 --mem=24576
        --partition=main', 'metadata_strategy': 'extended', 'singularity_volumes':
        '$job_directory:rw,$galaxy_root:ro,$tool_directory:ro,/mnt/user-data-volA:ro,/mnt/user-data-
        volB:ro,/mnt/user-data-volD:ro,/mnt/user-data-qld:ro,/mnt/custom-indices:ro,/cvmfs/data.gala
        xyproject.org:ro,/tmp:rw', 'singularity_default_container_id':
        '/cvmfs/singularity.galaxyproject.org/all/python:3.8.3'}
   156	        env: [{'name': 'HDF5_USE_FILE_LOCKING', 'value': 'FALSE'}, {'name':
        'SINGULARITYENV_HDF5_USE_FILE_LOCKING', 'value': 'FALSE'}, {'name': '_JAVA_OPTIONS',
        'value': '-Xmx24G -Xms1G'}, {'name': 'SINGULARITYENV__JAVA_OPTIONS', 'value': '-Xmx24G
        -Xms1G'}]
   157
   158	========================================================================
   159	!!python/object:galaxy.jobs.JobDestination
   160	converted: false
   161	env:
   162	- {name: HDF5_USE_FILE_LOCKING, value: 'FALSE'}
   163	- {name: SINGULARITYENV_HDF5_USE_FILE_LOCKING, value: 'FALSE'}
   164	- {name: _JAVA_OPTIONS, value: -Xmx24G -Xms1G}
   165	- {name: SINGULARITYENV__JAVA_OPTIONS, value: -Xmx24G -Xms1G}
   166	id: slurm
   167	legacy: false
   168	params: {metadata_strategy: extended, nativeSpecification: --nodes=1 --ntasks=10
   169	    --ntasks-per-node=10 --mem=24576 --partition=main,
   170	    singularity_default_container_id:
   171	    /cvmfs/singularity.galaxyproject.org/all/python:3.8.3, singularity_enabled:
   172	    true, singularity_volumes:
        '$job_directory:rw,$galaxy_root:ro,$tool_directory:ro,/mnt/user-data-volA:ro,/mnt/user-data-
        volB:ro,/mnt/user-data-volD:ro,/mnt/user-data-qld:ro,/mnt/custom-indices:ro,/cvmfs/data.gala
        xyproject.org:ro,/tmp:rw',
   173	  tpv_cores: '10', tpv_gpus: '0', tpv_mem: '24'}
   174	resubmit: []
   175	runner: slurm
   176	shell: null
   177	tags: [registered_user_concurrent_jobs_12]
   178	url: null
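
The capacity rejections in the trace above (pulsar-mel2 at trace line 77, and the high-mem destinations at trace lines 83 to 98) read like simple bound checks. Below is a minimal Python sketch under that assumption. `capacity_rejection` is a hypothetical helper written for illustration and is not TPV's code; only the field names max_accepted_cores and min_accepted_mem and the message wording are taken from the trace.

```python
from typing import Optional


def capacity_rejection(
    cores: float,
    mem: float,
    *,
    max_accepted_cores: Optional[float] = None,
    min_accepted_mem: Optional[float] = None,
) -> Optional[str]:
    """Return a rejection reason, or None if the (assumed) bounds are satisfied."""
    # Upper bound on cores: the entity asks for more than the destination accepts.
    if max_accepted_cores is not None and cores > max_accepted_cores:
        return f"cores {cores} exceeds max_accepted_cores {max_accepted_cores}"
    # Lower bound on memory: the job is too small for a high-memory destination.
    if min_accepted_mem is not None and mem < min_accepted_mem:
        return f"mem {mem} below min_accepted_mem {min_accepted_mem}"
    return None


# Values from the trace: the entity requests cores=10, mem=24.
print(capacity_rejection(10, 24, max_accepted_cores=8))    # pulsar-mel2
print(capacity_rejection(10, 24, min_accepted_mem=62.51))  # pulsar-high-mem1
```

Returning the reason as a string rather than a boolean keeps the explanation next to the check that produced it, which is the kind of information the --explain output surfaces.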

cat-bro · 2026-03-02T02:40:58Z

Line 25 (entity combining): the combined entity has accept: ['pulsar'], which is left out of the trace and is important for later matchmaking.

Lines 52-68: Maybe the abstract destinations could be left out. There is a lot of other good info in here and the fact that entities do not match with abstract destinations is not interesting.

Line 70: slurm-training is rejected for a tag mismatch, but it's not clear why. It would be better if all tag categories (accept/prefer/require/reject) were listed for the entity, and if the destination's tags were also separated into categories. The mismatch arises because slurm-training requires the 'training' tag and the entity does not have it, but that is not obvious from this explanation.

Lines 137-141: There is something odd here: pulsar-mel3 is listed twice, while pulsar-QLD, which also matched, does not appear in the ranking. The ranking function used by the job conf in this case is weighted_random_sampling.

Everything else looks fantastic!
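
The slurm-training case can be sketched in Python as follows. This assumes a simplified matching rule: a side's required tags must appear among the other side's require/prefer/accept tags, and neither side may carry a tag the other rejects. `SchedulingTags` and `explain_tag_mismatch` are hypothetical names, not TPV's API, and the destination's tag categories other than the required 'training' tag are guesses.

```python
from dataclasses import dataclass, field


@dataclass
class SchedulingTags:
    """Hypothetical holder for the four tag categories (not a TPV class)."""

    require: set = field(default_factory=set)
    prefer: set = field(default_factory=set)
    accept: set = field(default_factory=set)
    reject: set = field(default_factory=set)

    def positive(self) -> set:
        # Tags this side is willing to be matched on.
        return self.require | self.prefer | self.accept


def explain_tag_mismatch(entity: SchedulingTags, dest: SchedulingTags) -> list:
    """Return one readable reason per violated rule (empty list means a match)."""
    reasons = []
    missing = dest.require - entity.positive()
    if missing:
        reasons.append(f"destination requires {sorted(missing)}, entity lacks them")
    missing = entity.require - dest.positive()
    if missing:
        reasons.append(f"entity requires {sorted(missing)}, destination lacks them")
    clash = entity.reject & dest.positive()
    if clash:
        reasons.append(f"entity rejects {sorted(clash)}, destination has them")
    clash = dest.reject & entity.positive()
    if clash:
        reasons.append(f"destination rejects {sorted(clash)}, entity has them")
    return reasons


# Entity from the trace, including the accept tag noted for line 25.
entity = SchedulingTags(accept={"pulsar"}, reject={"offline"})
# 'training' is required per the review; the accept set is an assumption.
slurm_training = SchedulingTags(
    require={"training"}, accept={"docker", "singularity", "slurm"}
)
print(explain_tag_mismatch(entity, slurm_training))
```

Reporting the violated rule by category, rather than printing one flat tag list, is what makes the reason for the rejection visible.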

nuwang · 2026-03-06T17:10:56Z

Thanks @cat-bro, that was super useful feedback. I think all the issues you highlighted have been addressed now. The last one was particularly interesting, because it's a consequence of using weighted random sampling without weights being defined. As a result, the same destination is considered again when making the next random choice. I've changed it so that, if weights are not defined, it falls back to standard random sampling (without replacement).
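
The duplicate at trace lines 139 and 141 is what sampling with replacement looks like. Below is a minimal sketch using Python's standard random module; it is not TPV's ranking code, and both function names are hypothetical.

```python
import random


def sample_with_replacement(destinations, k, rng):
    # random.choices draws independently each time, so a name can repeat.
    return rng.choices(destinations, k=k)


def sample_without_replacement(destinations, rng):
    # random.sample never picks the same element twice, so each name appears once.
    return rng.sample(destinations, k=len(destinations))


# The three destinations reported as MATCHED in the trace.
matched = ["slurm", "pulsar-mel3", "pulsar-QLD"]
rng = random.Random(0)  # seeded only to make the demo repeatable

print(sample_with_replacement(matched, 3, rng))  # may contain duplicates
print(sample_without_replacement(matched, rng))  # always a permutation of `matched`
```

With replacement, a ranking of three slots can show one destination twice and omit another entirely. Without replacement, every matched destination appears exactly once.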

cat-bro

Thank you @nuwang!

nuwang requested a review from cat-bro February 19, 2026 17:21

nuwang added the enhancement New feature or request label Feb 21, 2026

nuwang added 12 commits February 21, 2026 11:41

Add explainability support

370bd18

Handle scheduling exceptions during explain

e70c361

Resolve configs relative to job_conf

31b27d8

Refactor code and move to TPVConfigDumper

4d9f8ac

Improve dump output

028ebd9

Better handling of multiline fields

42bd502

Add docs on explain and dump

f46d81a

Refactor dumper and explaincollector

262f8af

Fix mypy errors

9310130

Refactor dry-run return type and add tests

1171af1

Increase test coverage

ae1884c

Fix some type errors during rebase

f2e27f5

nuwang force-pushed the add_explainability branch from d5bdeaf to f2e27f5 Compare February 21, 2026 06:44

nuwang added 5 commits March 6, 2026 18:39

Make sure all scheduling tags are shown

3db734f

Show a summarization of skipped abstract destinations

df04c8b

Improve mismatch explanation on tag mismatch

8633cf1

Make sure abstract destinations are listed first

7cdc2a0

Use weighted random sampling without replacement if no weights are defined

4429eca

cat-bro approved these changes Mar 11, 2026

nuwang merged commit 06d65a1 into main Mar 11, 2026
3 checks passed

nuwang deleted the add_explainability branch March 11, 2026 08:07

nuwang merged 17 commits into main from add_explainability
