
Using the demonstration_data, gives error: ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all() #19

@sbridgett

Thank you for developing and publishing IsoTools 2.0.

I downloaded the demonstration dataset and ran isotools using the command given in the Tutorial -> Command Line Interface (CLI) section. It processes the data but then stops with this error:

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

In more detail:

I downloaded all the demonstration data from: https://nc.molgen.mpg.de/cloud/index.php/s/Mf2zMePGBzFWFk8

and unzipped its files into a "demonstration_dataset" subdirectory (which contains the "encode_samples.tsv", "GRCh38.p13.genome_chr8.fa", "gencode.v42.chr_patch_hapl_scaff.annotation_sorted_chr8.gff3.gz", "ENCFF450VAU_aligned_mm2_chr8.bam", etc).

Then I ran the isotools2 "Tutorial" -> "Command Line Interface (CLI)", command (numbered [3]) from: https://isotools.readthedocs.io/en/latest/notebooks/02_api_vs_cli.html#Command-Line-Interface-(CLI)
as follows (I copied and pasted it from the above page, so it should be identical):

cd demonstration_dataset
samples='encode_samples.tsv'
anno='gencode.v42.chr_patch_hapl_scaff.annotation_sorted_chr8.gff3.gz'
genome='GRCh38.p13.genome_chr8.fa'

run_isotools \
    --anno $anno \
    --log INFO \
    --progress_bar \
    --genome $genome \
    --samples $samples \
    --file_prefix ./PacBio_isotools_substantial \
    --custom_filter_tag "COVERED=any(gene.coverage[:,transcript_id] > 2)"  "HIGH_COVER=gene.coverage.sum(0)[transcript_id] >= 7" \
    --filter_query "(COVERED and FSM) or (HIGH_COVER and SUBSTANTIAL and not INTERNAL_PRIMING)" \
    --gtf_out --transcript_table

However, I get the following exception (full output is further below):

2025-06-06 13:55:02 INFO: replaced existing filter rule HIGH_COVER in transcript context
2025-06-06 13:55:02 INFO: writing transcript table to ./PacBio_isotools_substantial_transcripts.csv
  0%|  | 0/10705 [00:00<?, ?genes/s]2025-06-06 13:55:02 ERROR: error when evaluating filter COVERED with arguments {'gene': <isotools.gene.Gene object at 0x7fd4f90d5190>, 'trid': 0, 'exons': [[15417190, 15417305], [15483385, 15483483], [15486626, 15486764], [15623079, 15623249], [15650696, 15650814], [15659506, 15659647], [15662155, 15662296], [15673746, 15673836], [15730665, 15730729], [15743537, 15743612], [15748374, 15748465], [15764202, np.int64(15764645)]], 'strand': '+', 'coverage': {'GM12878_b': 1}, 'TSS': {'GM12878_b': {15417190: 1}}, 'PAS': {'GM12878_b': {15764476: 1}}, 'annotation': (3, {'novel exon': [[15486626, 15486764]]}), 'novel_splice_sites': [3, 4], 'TSS_unified': {'GM12878_b': {15417190: 1}}, 'PAS_unified': {'GM12878_b': {np.int64(15764645): 1}}, 'direct_repeat_len': [2, 4, 4, 6, 6, 3, 4, 5, 4, 6, 4], 'downstream_A_content': 0.3333333333333333}: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
  0%|    | 3/10705 [00:00<00:02, 4366.03genes/s]
.....etc.....

I'm using python 3.12.11 and am running inside conda, using:

conda create -n IsoTools_py3.12 python=3.12
conda activate IsoTools_py3.12
pip install isotools

conda list output includes the following versions:

....
isotools                   2.0.0
...
numpy                      2.2.6
....
pandas                     2.3.0

etc...

(I got the same error when I used Python 3.10.17 with an "IsoTools_py3.10" environment.)

I assume this error is related to the option: --custom_filter_tag "COVERED=any(gene.coverage[:,transcript_id] > 2)"
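
To illustrate why I suspect that expression, here is a minimal NumPy-only sketch that reproduces the same message outside IsoTools. The `coverage` array is a hypothetical stand-in for `gene.coverage` (rows as samples, columns as transcripts), and `slice(None)` is just one example of a non-scalar index. I have not confirmed what `transcript_id` actually resolves to inside the filter; the log only warns that it is not present in the transcript context, while the arguments contain `trid`.

```python
import numpy as np

# Hypothetical stand-in for gene.coverage: 2 samples (rows) x 3 transcripts (columns).
coverage = np.array([[0, 3, 1],
                     [1, 5, 0]])


def builtin_any_error(index):
    """Return the ValueError message raised by built-in any(), or None if it succeeds."""
    try:
        any(coverage[:, index] > 2)
    except ValueError as err:
        return str(err)
    return None


# Scalar column index: the mask is 1-D, so built-in any() sees plain booleans.
scalar_case = builtin_any_error(1)

# Non-scalar index (assumption for illustration): the mask is 2-D, so built-in
# any() iterates over rows and calls bool() on each multi-element row array.
non_scalar_case = builtin_any_error(slice(None))

# The array method reduces over all elements and works for either shape.
method_case = bool((coverage[:, slice(None)] > 2).any())

print(scalar_case)
print(non_scalar_case)
print(method_case)
```

If something similar is happening here, the filter would fail whenever the column index is not a single integer. That would fit the "attributes not present in transcript context" warning for `transcript_id`, but it is only a guess on my part.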

The full output is:

2025-06-06 13:53:40 INFO: This is isotools version 2.0.0
2025-06-06 13:53:40 INFO: loading transcriptome from ./PacBio_isotools_substantial_isotools.pkl
2025-06-06 13:53:40 INFO: importing reference from gff3 file gencode.v42.chr_patch_hapl_scaff.annotation_sorted_chr8.gff3.gz
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▉| 2.82M/2.82M [00:01<00:00, 2.71MB/s]
2025-06-06 13:53:41 INFO: skipped the following categories: dict_keys(['CDS', 'five_prime_UTR', 'three_prime_UTR'])
2025-06-06 13:53:41 INFO: collapsed 0 immunoglobulin loci and 0 T-cell receptor loci
2025-06-06 13:53:41 INFO: adding sample GM12878_a from file ENCFF417VHJ_aligned_mm2_chr8.bam
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 53.0k/53.0k [00:06<00:00, 8.45kreads/s, chr=KI270757.1]
2025-06-06 13:53:47 INFO: skipped 110 reads aligned fraction of less than 0.75.
2025-06-06 13:53:47 INFO: skipped 10972 secondary alignments (0x100), alignment that failed quality check (0x200) or PCR duplicates (0x400)
2025-06-06 13:53:47 WARNING: ignored 533 chimeric alignments with only one part aligned to specified chromosomes.
2025-06-06 13:53:47 INFO: ignoring 2231 chimeric alignments with less than 2 reads
2025-06-06 13:53:47 INFO: imported 40182 nonchimeric reads (including  14 chained chimeric alignments) and 73 chimeric reads with coverage of at least 2.
2025-06-06 13:53:47 INFO: adding sample GM12878_b from file ENCFF450VAU_aligned_mm2_chr8.bam
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 68.4k/68.4k [00:06<00:00, 10.7kreads/s, chr=KI270757.1]
2025-06-06 13:53:54 INFO: skipped 71 reads aligned fraction of less than 0.75.
2025-06-06 13:53:54 INFO: skipped 12700 secondary alignments (0x100), alignment that failed quality check (0x200) or PCR duplicates (0x400)
2025-06-06 13:53:54 WARNING: ignored 484 chimeric alignments with only one part aligned to specified chromosomes.
2025-06-06 13:53:54 INFO: ignoring 1273 chimeric alignments with less than 2 reads
2025-06-06 13:53:54 INFO: imported 54853 nonchimeric reads (including  12 chained chimeric alignments) and 7 chimeric reads with coverage of at least 2.
2025-06-06 13:53:54 INFO: adding sample GM12878_c from file ENCFF694DIE_aligned_mm2_chr8.bam
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 90.7k/90.7k [00:07<00:00, 12.1kreads/s, chr=KI270757.1]
2025-06-06 13:54:01 INFO: skipped 85 reads aligned fraction of less than 0.75.
2025-06-06 13:54:01 INFO: skipped 17261 secondary alignments (0x100), alignment that failed quality check (0x200) or PCR duplicates (0x400)
2025-06-06 13:54:01 WARNING: ignored 455 chimeric alignments with only one part aligned to specified chromosomes.
2025-06-06 13:54:01 INFO: ignoring 1410 chimeric alignments with less than 2 reads
2025-06-06 13:54:01 INFO: imported 72451 nonchimeric reads (including  38 chained chimeric alignments) and 12 chimeric reads with coverage of at least 2.
2025-06-06 13:54:01 INFO: adding sample K562_a from file ENCFF429VVB_aligned_mm2_chr8.bam
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 107k/107k [00:10<00:00, 10.1kreads/s, chr=KI270757.1]
2025-06-06 13:54:12 INFO: skipped 297 reads aligned fraction of less than 0.75.
2025-06-06 13:54:12 INFO: skipped 23990 secondary alignments (0x100), alignment that failed quality check (0x200) or PCR duplicates (0x400)
2025-06-06 13:54:12 WARNING: ignored 2160 chimeric alignments with only one part aligned to specified chromosomes.
2025-06-06 13:54:12 INFO: ignoring 7445 chimeric alignments with less than 2 reads
2025-06-06 13:54:12 INFO: imported 76692 nonchimeric reads (including  57 chained chimeric alignments) and 415 chimeric reads with coverage of at least 2.
2025-06-06 13:54:12 INFO: adding sample K562_b from file ENCFF696GDL_aligned_mm2_chr8.bam
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 78.0k/78.0k [00:07<00:00, 9.78kreads/s, chr=KI270757.1]
2025-06-06 13:54:20 INFO: skipped 165 reads aligned fraction of less than 0.75.
2025-06-06 13:54:20 INFO: skipped 15026 secondary alignments (0x100), alignment that failed quality check (0x200) or PCR duplicates (0x400)
2025-06-06 13:54:20 WARNING: ignored 1142 chimeric alignments with only one part aligned to specified chromosomes.
2025-06-06 13:54:20 INFO: ignoring 4530 chimeric alignments with less than 2 reads
2025-06-06 13:54:20 INFO: imported 59118 nonchimeric reads (including  43 chained chimeric alignments) and 284 chimeric reads with coverage of at least 2.
2025-06-06 13:54:20 INFO: adding sample K562_c from file ENCFF634YSN_aligned_mm2_chr8.bam
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 117k/117k [00:10<00:00, 11.4kreads/s, chr=KI270757.1]
2025-06-06 13:54:30 INFO: skipped 294 reads aligned fraction of less than 0.75.
2025-06-06 13:54:30 INFO: skipped 30231 secondary alignments (0x100), alignment that failed quality check (0x200) or PCR duplicates (0x400)
2025-06-06 13:54:30 WARNING: ignored 2528 chimeric alignments with only one part aligned to specified chromosomes.
2025-06-06 13:54:30 INFO: ignoring 8019 chimeric alignments with less than 2 reads
2025-06-06 13:54:30 INFO: imported 80343 nonchimeric reads (including  46 chained chimeric alignments) and 371 chimeric reads with coverage of at least 2.
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 10705/10705 [00:31<00:00, 341.75genes/s]
2025-06-06 13:55:02 INFO: adding new filter rule COVERED in transcript context
2025-06-06 13:55:02 WARNING: Some attributes not present in transcript context, please make sure there is no typo: transcript_id
This can happen for correct filters when there are no or only a few transcripts loaded into the model.
2025-06-06 13:55:02 WARNING: Some attributes not present in transcript context, please make sure there is no typo: transcript_id
This can happen for correct filters when there are no or only a few transcripts loaded into the model.
2025-06-06 13:55:02 INFO: replaced existing filter rule HIGH_COVER in transcript context
2025-06-06 13:55:02 INFO: writing transcript table to ./PacBio_isotools_substantial_transcripts.csv
  0%|                                                                                                                                                                                                                       | 0/10705 [00:00<?, ?genes/s]2025-06-06 13:55:02 ERROR: error when evaluating filter COVERED with arguments {'gene': <isotools.gene.Gene object at 0x7fd4f90d5190>, 'trid': 0, 'exons': [[15417190, 15417305], [15483385, 15483483], [15486626, 15486764], [15623079, 15623249], [15650696, 15650814], [15659506, 15659647], [15662155, 15662296], [15673746, 15673836], [15730665, 15730729], [15743537, 15743612], [15748374, 15748465], [15764202, np.int64(15764645)]], 'strand': '+', 'coverage': {'GM12878_b': 1}, 'TSS': {'GM12878_b': {15417190: 1}}, 'PAS': {'GM12878_b': {15764476: 1}}, 'annotation': (3, {'novel exon': [[15486626, 15486764]]}), 'novel_splice_sites': [3, 4], 'TSS_unified': {'GM12878_b': {15417190: 1}}, 'PAS_unified': {'GM12878_b': {np.int64(15764645): 1}}, 'direct_repeat_len': [2, 4, 4, 6, 6, 3, 4, 5, 4, 6, 4], 'downstream_A_content': 0.3333333333333333}: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
  0%|                                                                                                                                                                                                             | 3/10705 [00:00<00:02, 4366.03genes/s]
Traceback (most recent call last):
  File "/home/stephen/.local/bin/run_isotools", line 10, in <module>
    sys.exit(main())
             ^^^^^^
  File "/mnt/DATA/miniconda2/envs/IsoTools_py3.12/lib/python3.12/site-packages/isotools/run_isotools.py", line 192, in main
    df = isoseq.transcript_table(
         ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/DATA/miniconda2/envs/IsoTools_py3.12/lib/python3.12/site-packages/isotools/_transcriptome_io.py", line 2293, in transcript_table
    for gene, transcript_ids, transcripts in self.iter_transcripts(
                                             ^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/DATA/miniconda2/envs/IsoTools_py3.12/lib/python3.12/site-packages/isotools/_transcriptome_filter.py", line 455, in iter_transcripts
    filter_result = tuple(
                    ^^^^^^
  File "/mnt/DATA/miniconda2/envs/IsoTools_py3.12/lib/python3.12/site-packages/isotools/_transcriptome_filter.py", line 612, in _filter_transcripts
    tag: _eval_filter_fun(f, tag, gene=gene, trid=i, **filter_transcript)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/DATA/miniconda2/envs/IsoTools_py3.12/lib/python3.12/site-packages/isotools/_transcriptome_filter.py", line 576, in _eval_filter_fun
    return fun(**args)
           ^^^^^^^^^^^
  File "<string>", line 1, in <lambda>
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

By contrast, the expected output shown in the tutorial's Command-Line-Interface-(CLI) section, which starts with "This is isotools version 0.3.5rc10" rather than version 2.0.0, shows neither the above ValueError exception nor the warnings: "WARNING: Some attributes not present in transcript context, please make sure there is no typo: transcript_id. This can happen for correct filters when there are no or only a few transcripts loaded into the model."

2024-08-13 17:14:37 INFO: This is isotools version 0.3.5rc10
....
....
2024-08-13 17:16:19 INFO: ignoring 8023 chimeric alignments with less than 2 reads
2024-08-13 17:16:19 INFO: imported 80338 nonchimeric reads (including  46 chained chimeric alignments) and 369 chimeric reads with coverage of at least 2.
100%|██████████| 10801/10801 [01:12<00:00, 149.26genes/s]
2024-08-13 17:17:31 INFO: adding new filter rule COVERED in transcript context
2024-08-13 17:17:32 INFO: replaced existing filter rule HIGH_COVER in transcript context
2024-08-13 17:17:32 INFO: writing transcript table to ./PacBio_isotools_substantial_transcripts.csv
100%|██████████| 10801/10801 [00:02<00:00, 4251.79genes/s]
2024-08-13 17:17:34 INFO: writing gtf file to ./PacBio_isotools_substantial_transcripts.gtf
100%|██████████| 10801/10801 [00:02<00:00, 5087.24genes/s]
2024-08-13 17:17:37 INFO: saving transcripts as pickle file
2024-08-13 17:17:37 INFO: saving transcriptome to ./PacBio_isotools_substantial_isotools.pkl
