Thank you for developing and publishing IsoTools 2.0.
I've downloaded the demonstration_dataset and run isotools using the command given in the Tutorial -> Command Line Interface (CLI). It processes the data but then stops with the error:
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
In more detail:
I downloaded all the demonstration data from: https://nc.molgen.mpg.de/cloud/index.php/s/Mf2zMePGBzFWFk8
and unzipped its files into a "demonstration_dataset" subdirectory (which contains the "encode_samples.tsv", "GRCh38.p13.genome_chr8.fa", "gencode.v42.chr_patch_hapl_scaff.annotation_sorted_chr8.gff3.gz", "ENCFF450VAU_aligned_mm2_chr8.bam", etc).
Then I ran the isotools2 "Tutorial" -> "Command Line Interface (CLI)" command (numbered [3]) from: https://isotools.readthedocs.io/en/latest/notebooks/02_api_vs_cli.html#Command-Line-Interface-(CLI)
as follows (copied and pasted from the above page, so it should be identical):
cd demonstration_dataset
samples='encode_samples.tsv'
anno='gencode.v42.chr_patch_hapl_scaff.annotation_sorted_chr8.gff3.gz'
genome='GRCh38.p13.genome_chr8.fa'
run_isotools \
--anno $anno \
--log INFO \
--progress_bar \
--genome $genome \
--samples $samples \
--file_prefix ./PacBio_isotools_substantial \
--custom_filter_tag "COVERED=any(gene.coverage[:,transcript_id] > 2)" "HIGH_COVER=gene.coverage.sum(0)[transcript_id] >= 7" \
--filter_query "(COVERED and FSM) or (HIGH_COVER and SUBSTANTIAL and not INTERNAL_PRIMING)" \
--gtf_out --transcript_table
However, I get the following exception (full output is further below):
2025-06-06 13:55:02 INFO: replaced existing filter rule HIGH_COVER in transcript context
2025-06-06 13:55:02 INFO: writing transcript table to ./PacBio_isotools_substantial_transcripts.csv
0%| | 0/10705 [00:00<?, ?genes/s]2025-06-06 13:55:02 ERROR: error when evaluating filter COVERED with arguments {'gene': <isotools.gene.Gene object at 0x7fd4f90d5190>, 'trid': 0, 'exons': [[15417190, 15417305], [15483385, 15483483], [15486626, 15486764], [15623079, 15623249], [15650696, 15650814], [15659506, 15659647], [15662155, 15662296], [15673746, 15673836], [15730665, 15730729], [15743537, 15743612], [15748374, 15748465], [15764202, np.int64(15764645)]], 'strand': '+', 'coverage': {'GM12878_b': 1}, 'TSS': {'GM12878_b': {15417190: 1}}, 'PAS': {'GM12878_b': {15764476: 1}}, 'annotation': (3, {'novel exon': [[15486626, 15486764]]}), 'novel_splice_sites': [3, 4], 'TSS_unified': {'GM12878_b': {15417190: 1}}, 'PAS_unified': {'GM12878_b': {np.int64(15764645): 1}}, 'direct_repeat_len': [2, 4, 4, 6, 6, 3, 4, 5, 4, 6, 4], 'downstream_A_content': 0.3333333333333333}: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
0%| | 3/10705 [00:00<00:02, 4366.03genes/s]
.....etc.....
I'm using Python 3.12.11 inside a conda environment created with:
conda create -n IsoTools_py3.12 python=3.12
conda activate IsoTools_py3.12
pip install isotools
conda list output includes the following versions:
....
isotools 2.0.0
...
numpy 2.2.6
....
pandas 2.3.0
etc...
(I got the same error when I used Python 3.10.17 with an "IsoTools_py3.10" environment.)
I assume this error is related to the option: --custom_filter_tag "COVERED=any(gene.coverage[:,transcript_id] > 2)"
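For reference, here is a minimal NumPy-only sketch of how the built-in any() can produce this exact message. It does not call IsoTools: the coverage matrix below is made-up data, and the list-style column index is only an assumption about what might make the selection two-dimensional. I have not confirmed that this is what happens inside the COVERED filter.

```python
import numpy as np

# Made-up coverage matrix (rows: samples, columns: transcripts).
# This is illustrative data, not the real gene.coverage object.
coverage = np.array([[0, 3, 1],
                     [5, 0, 2]])

# Scalar column index: the comparison is 1-D, so the built-in any()
# iterates over single booleans and works as expected.
scalar_result = any(coverage[:, 1] > 2)

# 2-D selection (assumed here via a list index): the built-in any()
# iterates over rows, and bool() of a multi-element row raises ValueError.
try:
    any(coverage[:, [0, 1]] > 2)
    error_message = None
except ValueError as err:
    error_message = str(err)

# The array method .any() reduces over all elements regardless of shape.
reduced = (coverage[:, [0, 1]] > 2).any()

print(scalar_result)
print(error_message)
print(reduced)
```

If the filter expression in IsoTools 2.0 ends up comparing a two-dimensional selection, replacing the built-in any(...) with the array's .any() method might avoid the exception, but that is a guess on my part.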
The full output is:
2025-06-06 13:53:40 INFO: This is isotools version 2.0.0
2025-06-06 13:53:40 INFO: loading transcriptome from ./PacBio_isotools_substantial_isotools.pkl
2025-06-06 13:53:40 INFO: importing reference from gff3 file gencode.v42.chr_patch_hapl_scaff.annotation_sorted_chr8.gff3.gz
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▉| 2.82M/2.82M [00:01<00:00, 2.71MB/s]
2025-06-06 13:53:41 INFO: skipped the following categories: dict_keys(['CDS', 'five_prime_UTR', 'three_prime_UTR'])
2025-06-06 13:53:41 INFO: collapsed 0 immunoglobulin loci and 0 T-cell receptor loci
2025-06-06 13:53:41 INFO: adding sample GM12878_a from file ENCFF417VHJ_aligned_mm2_chr8.bam
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 53.0k/53.0k [00:06<00:00, 8.45kreads/s, chr=KI270757.1]
2025-06-06 13:53:47 INFO: skipped 110 reads aligned fraction of less than 0.75.
2025-06-06 13:53:47 INFO: skipped 10972 secondary alignments (0x100), alignment that failed quality check (0x200) or PCR duplicates (0x400)
2025-06-06 13:53:47 WARNING: ignored 533 chimeric alignments with only one part aligned to specified chromosomes.
2025-06-06 13:53:47 INFO: ignoring 2231 chimeric alignments with less than 2 reads
2025-06-06 13:53:47 INFO: imported 40182 nonchimeric reads (including 14 chained chimeric alignments) and 73 chimeric reads with coverage of at least 2.
2025-06-06 13:53:47 INFO: adding sample GM12878_b from file ENCFF450VAU_aligned_mm2_chr8.bam
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 68.4k/68.4k [00:06<00:00, 10.7kreads/s, chr=KI270757.1]
2025-06-06 13:53:54 INFO: skipped 71 reads aligned fraction of less than 0.75.
2025-06-06 13:53:54 INFO: skipped 12700 secondary alignments (0x100), alignment that failed quality check (0x200) or PCR duplicates (0x400)
2025-06-06 13:53:54 WARNING: ignored 484 chimeric alignments with only one part aligned to specified chromosomes.
2025-06-06 13:53:54 INFO: ignoring 1273 chimeric alignments with less than 2 reads
2025-06-06 13:53:54 INFO: imported 54853 nonchimeric reads (including 12 chained chimeric alignments) and 7 chimeric reads with coverage of at least 2.
2025-06-06 13:53:54 INFO: adding sample GM12878_c from file ENCFF694DIE_aligned_mm2_chr8.bam
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 90.7k/90.7k [00:07<00:00, 12.1kreads/s, chr=KI270757.1]
2025-06-06 13:54:01 INFO: skipped 85 reads aligned fraction of less than 0.75.
2025-06-06 13:54:01 INFO: skipped 17261 secondary alignments (0x100), alignment that failed quality check (0x200) or PCR duplicates (0x400)
2025-06-06 13:54:01 WARNING: ignored 455 chimeric alignments with only one part aligned to specified chromosomes.
2025-06-06 13:54:01 INFO: ignoring 1410 chimeric alignments with less than 2 reads
2025-06-06 13:54:01 INFO: imported 72451 nonchimeric reads (including 38 chained chimeric alignments) and 12 chimeric reads with coverage of at least 2.
2025-06-06 13:54:01 INFO: adding sample K562_a from file ENCFF429VVB_aligned_mm2_chr8.bam
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 107k/107k [00:10<00:00, 10.1kreads/s, chr=KI270757.1]
2025-06-06 13:54:12 INFO: skipped 297 reads aligned fraction of less than 0.75.
2025-06-06 13:54:12 INFO: skipped 23990 secondary alignments (0x100), alignment that failed quality check (0x200) or PCR duplicates (0x400)
2025-06-06 13:54:12 WARNING: ignored 2160 chimeric alignments with only one part aligned to specified chromosomes.
2025-06-06 13:54:12 INFO: ignoring 7445 chimeric alignments with less than 2 reads
2025-06-06 13:54:12 INFO: imported 76692 nonchimeric reads (including 57 chained chimeric alignments) and 415 chimeric reads with coverage of at least 2.
2025-06-06 13:54:12 INFO: adding sample K562_b from file ENCFF696GDL_aligned_mm2_chr8.bam
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 78.0k/78.0k [00:07<00:00, 9.78kreads/s, chr=KI270757.1]
2025-06-06 13:54:20 INFO: skipped 165 reads aligned fraction of less than 0.75.
2025-06-06 13:54:20 INFO: skipped 15026 secondary alignments (0x100), alignment that failed quality check (0x200) or PCR duplicates (0x400)
2025-06-06 13:54:20 WARNING: ignored 1142 chimeric alignments with only one part aligned to specified chromosomes.
2025-06-06 13:54:20 INFO: ignoring 4530 chimeric alignments with less than 2 reads
2025-06-06 13:54:20 INFO: imported 59118 nonchimeric reads (including 43 chained chimeric alignments) and 284 chimeric reads with coverage of at least 2.
2025-06-06 13:54:20 INFO: adding sample K562_c from file ENCFF634YSN_aligned_mm2_chr8.bam
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 117k/117k [00:10<00:00, 11.4kreads/s, chr=KI270757.1]
2025-06-06 13:54:30 INFO: skipped 294 reads aligned fraction of less than 0.75.
2025-06-06 13:54:30 INFO: skipped 30231 secondary alignments (0x100), alignment that failed quality check (0x200) or PCR duplicates (0x400)
2025-06-06 13:54:30 WARNING: ignored 2528 chimeric alignments with only one part aligned to specified chromosomes.
2025-06-06 13:54:30 INFO: ignoring 8019 chimeric alignments with less than 2 reads
2025-06-06 13:54:30 INFO: imported 80343 nonchimeric reads (including 46 chained chimeric alignments) and 371 chimeric reads with coverage of at least 2.
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 10705/10705 [00:31<00:00, 341.75genes/s]
2025-06-06 13:55:02 INFO: adding new filter rule COVERED in transcript context
2025-06-06 13:55:02 WARNING: Some attributes not present in transcript context, please make sure there is no typo: transcript_id
This can happen for correct filters when there are no or only a few transcripts loaded into the model.
2025-06-06 13:55:02 WARNING: Some attributes not present in transcript context, please make sure there is no typo: transcript_id
This can happen for correct filters when there are no or only a few transcripts loaded into the model.
2025-06-06 13:55:02 INFO: replaced existing filter rule HIGH_COVER in transcript context
2025-06-06 13:55:02 INFO: writing transcript table to ./PacBio_isotools_substantial_transcripts.csv
0%| | 0/10705 [00:00<?, ?genes/s]2025-06-06 13:55:02 ERROR: error when evaluating filter COVERED with arguments {'gene': <isotools.gene.Gene object at 0x7fd4f90d5190>, 'trid': 0, 'exons': [[15417190, 15417305], [15483385, 15483483], [15486626, 15486764], [15623079, 15623249], [15650696, 15650814], [15659506, 15659647], [15662155, 15662296], [15673746, 15673836], [15730665, 15730729], [15743537, 15743612], [15748374, 15748465], [15764202, np.int64(15764645)]], 'strand': '+', 'coverage': {'GM12878_b': 1}, 'TSS': {'GM12878_b': {15417190: 1}}, 'PAS': {'GM12878_b': {15764476: 1}}, 'annotation': (3, {'novel exon': [[15486626, 15486764]]}), 'novel_splice_sites': [3, 4], 'TSS_unified': {'GM12878_b': {15417190: 1}}, 'PAS_unified': {'GM12878_b': {np.int64(15764645): 1}}, 'direct_repeat_len': [2, 4, 4, 6, 6, 3, 4, 5, 4, 6, 4], 'downstream_A_content': 0.3333333333333333}: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
0%| | 3/10705 [00:00<00:02, 4366.03genes/s]
Traceback (most recent call last):
File "/home/stephen/.local/bin/run_isotools", line 10, in <module>
sys.exit(main())
^^^^^^
File "/mnt/DATA/miniconda2/envs/IsoTools_py3.12/lib/python3.12/site-packages/isotools/run_isotools.py", line 192, in main
df = isoseq.transcript_table(
^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/DATA/miniconda2/envs/IsoTools_py3.12/lib/python3.12/site-packages/isotools/_transcriptome_io.py", line 2293, in transcript_table
for gene, transcript_ids, transcripts in self.iter_transcripts(
^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/DATA/miniconda2/envs/IsoTools_py3.12/lib/python3.12/site-packages/isotools/_transcriptome_filter.py", line 455, in iter_transcripts
filter_result = tuple(
^^^^^^
File "/mnt/DATA/miniconda2/envs/IsoTools_py3.12/lib/python3.12/site-packages/isotools/_transcriptome_filter.py", line 612, in _filter_transcripts
tag: _eval_filter_fun(f, tag, gene=gene, trid=i, **filter_transcript)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/DATA/miniconda2/envs/IsoTools_py3.12/lib/python3.12/site-packages/isotools/_transcriptome_filter.py", line 576, in _eval_filter_fun
return fun(**args)
^^^^^^^^^^^
File "<string>", line 1, in <lambda>
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
In contrast, the expected output shown in the Command Line Interface (CLI) tutorial (which starts with "This is isotools version 0.3.5rc10" rather than version 2.0.0) shows neither the above ValueError exception nor the warnings: "WARNING: Some attributes not present in transcript context, please make sure there is no typo: transcript_id. This can happen for correct filters when there are no or only a few transcripts loaded into the model."
2024-08-13 17:14:37 INFO: This is isotools version 0.3.5rc10
....
....
2024-08-13 17:16:19 INFO: ignoring 8023 chimeric alignments with less than 2 reads
2024-08-13 17:16:19 INFO: imported 80338 nonchimeric reads (including 46 chained chimeric alignments) and 369 chimeric reads with coverage of at least 2.
100%|██████████| 10801/10801 [01:12<00:00, 149.26genes/s]
2024-08-13 17:17:31 INFO: adding new filter rule COVERED in transcript context
2024-08-13 17:17:32 INFO: replaced existing filter rule HIGH_COVER in transcript context
2024-08-13 17:17:32 INFO: writing transcript table to ./PacBio_isotools_substantial_transcripts.csv
100%|██████████| 10801/10801 [00:02<00:00, 4251.79genes/s]
2024-08-13 17:17:34 INFO: writing gtf file to ./PacBio_isotools_substantial_transcripts.gtf
100%|██████████| 10801/10801 [00:02<00:00, 5087.24genes/s]
2024-08-13 17:17:37 INFO: saving transcripts as pickle file
2024-08-13 17:17:37 INFO: saving transcriptome to ./PacBio_isotools_substantial_isotools.pkl