Include and Exclude optional inputs not handled properly #13

@kanika-arora

Description of the bug

As of August 18, 2025, the handling of the optional include and exclude inputs in the main and dev branches is incorrect.

The intended design was to:

  1. Use patient IDs to go through directories and key files and collect a list of all ACCESS research (CMO) sample IDs, ACCESS clinical sample IDs, and IMPACT clinical sample IDs.
  2. If an "include file" is provided, use its list of CMO sample IDs (research ACCESS) to subset the ACCESS research samples to only those in the list. This is why I called it "access research samples to include" in the Lucidchart flowchart; "access research samples to keep" would perhaps have been a better name.
  3. If an exclude list is provided, use its sample IDs (research or clinical) to exclude any matching research or clinical ACCESS or IMPACT sample.
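
The include and exclude steps above can be sketched roughly as follows. This is a minimal illustration of the intended behavior, not the code in bin/infer_samples.py; the function name `select_samples`, its arguments, and the returned dictionary keys are all hypothetical.

```python
from typing import Dict, Iterable, List, Optional


def select_samples(
    research_access: Iterable[str],
    clinical_access: Iterable[str],
    clinical_impact: Iterable[str],
    include_ids: Optional[Iterable[str]] = None,
    exclude_ids: Optional[Iterable[str]] = None,
) -> Dict[str, List[str]]:
    """Apply the optional include and exclude lists to the discovered samples.

    All names here are hypothetical; None means "list not provided".
    """
    research = list(research_access)
    access_clin = list(clinical_access)
    impact_clin = list(clinical_impact)

    # Step 2: the include list, if provided, restricts research ACCESS samples only.
    if include_ids is not None:
        keep = set(include_ids)
        research = [s for s in research if s in keep]

    # Step 3: the exclude list, if provided, removes matching samples of any type.
    if exclude_ids is not None:
        drop = set(exclude_ids)
        research = [s for s in research if s not in drop]
        access_clin = [s for s in access_clin if s not in drop]
        impact_clin = [s for s in impact_clin if s not in drop]

    return {
        "research_access": research,
        "clinical_access": access_clin,
        "clinical_impact": impact_clin,
    }
```

With placeholder IDs, `select_samples(["R1", "R2", "R3"], ["A1"], ["P1", "P2"], include_ids=["R1", "R2"], exclude_ids=["R2", "P1"])` would keep only `R1` among the research samples, leave `A1` untouched, and drop `P1` from the IMPACT samples. When both lists are `None`, every discovered sample passes through unchanged.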

In the current version of the pipeline:

  • These optional input and output parameters cannot be set to null.
  • Even when the include list is provided, the infer samples step does not restrict the CMO samples to that list and continues to use all of the found samples.
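
One way the null case could be handled on the Python side is sketched below. It assumes the optional file paths may arrive either as a missing argument or as a literal string such as "null"; the helper name `load_id_list`, the `NULL_TOKENS` set, and the one-ID-per-line file format are assumptions for illustration, not the actual interface of bin/infer_samples.py.

```python
from typing import List, Optional

# Assumption: strings the pipeline might pass when an optional parameter is unset.
NULL_TOKENS = {"", "null", "none"}


def load_id_list(path: Optional[str]) -> Optional[List[str]]:
    """Return the sample IDs listed in `path`, or None if no file was provided.

    Hypothetical helper: assumes one sample ID per line, blank lines ignored.
    """
    if path is None or path.strip().lower() in NULL_TOKENS:
        # No list provided: the caller should skip this filtering step entirely.
        return None
    with open(path) as handle:
        return [line.strip() for line in handle if line.strip()]
```

Returning `None` rather than an empty list keeps "no include file" distinct from "an include file with no IDs"; under this sketch the first leaves all research samples in place, while the second would keep none of them.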

In fix/pipeline_enhancements_and_fixes_20250818 I have replaced bin/infer_samples.py and made additional changes to handle the include file correctly and to allow these optional parameters to be set to null. @shguturu, please test and finalize.

Command used and terminal output

Relevant files

No response

System information

No response
