Hello! Thanks for writing such an excellent tool!
Currently, Heterogenesis represents haplotypes in generated VCFs using letters (A, B, etc.), but the VCF header declares the data type of this field as "Integer", like so:
##FORMAT=<ID=HS,Number=1,Type=Integer,Description="Haplotypes">
This header line is written in heterogenesis_varincorp.py, line 149.
Because the declared type does not match the letter values, type-aware parsers (e.g. vcfR) read in the haplotypes incorrectly. I've submitted a pull request to address this, which merely changes "Integer" to "Character" (i.e. ##FORMAT=<ID=HS,Number=1,Type=Character,Description="Haplotypes">), which avoids the issue.
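To illustrate the mismatch, here is a minimal Python sketch that checks a sample value against the type declared in a FORMAT header line. The helper names (declared_type, value_matches_type) are hypothetical and not part of Heterogenesis or vcfR. It only approximates what a type-aware parser might do.

```python
import re

# Hypothetical helpers for illustration only; not part of Heterogenesis or vcfR.
FORMAT_RE = re.compile(
    r'##FORMAT=<ID=([^,]+),Number=([^,]+),Type=([^,]+),Description="([^"]*)">'
)

def declared_type(header_line, field_id):
    """Return the Type declared for field_id in a ##FORMAT line, or None."""
    match = FORMAT_RE.match(header_line.strip())
    if match and match.group(1) == field_id:
        return match.group(3)
    return None

def value_matches_type(value, vcf_type):
    """Rough check that a single sample value fits the declared VCF type."""
    if value == ".":  # "." marks a missing value for any type
        return True
    if vcf_type == "Integer":
        try:
            int(value)
            return True
        except ValueError:
            return False
    if vcf_type == "Character":
        return len(value) == 1
    if vcf_type == "String":
        return True
    return False  # other types are not handled in this sketch

old_header = '##FORMAT=<ID=HS,Number=1,Type=Integer,Description="Haplotypes">'
new_header = '##FORMAT=<ID=HS,Number=1,Type=Character,Description="Haplotypes">'

# A haplotype label such as "A" fails the Integer declaration but fits Character.
old_ok = value_matches_type("A", declared_type(old_header, "HS"))
new_ok = value_matches_type("A", declared_type(new_header, "HS"))
print(old_ok, new_ok)
```

With the current header, the haplotype label "A" fails the check. With the header from the pull request, it passes.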
Thanks!