Skip to content

AraTha map read in with zero rate #903

@apragsdale

Description

@apragsdale

Short description of the problem:
Creating a contig with AraTha Salome map has zero rate, because the cM column of the map is all zeros even though the rates are positive.

In [118]: cat ~/.cache/stdpopsim/genetic_maps/AraTha/SalomeAveraged_TAIR7/arab_chr4_map_loess.txt
Chromosome Position(bp) Rate(cM/Mb) Map(cM)
chr4 208650 10.379124 0
chr4 434712 7.015134 0
chr4 945976 5.078784 0
chr4 1512987 8.739854 0
chr4 1782389 2.500659 0
...

How to reproduce the problem:

In [115]: species = stdpopsim.get_species("AraTha")

In [116]: contig = species.get_contig("4", genetic_map="SalomeAveraged_TAIR7")

In [117]: contig.recombination_map.rate
Out[117]: 
array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
       0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
       0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
       0., 0., 0.])

Suggested fixes:
The Map(cM) column needs to be populated. But more generally, we need to make sure all maps are properly formatted, and test for that.

Now in 1.0, msprime.RateMap.read_hapmap() by default uses the Map column, it looks like. The rate column can be used by specifying rate_col=2. One possible test could be to load a map based on rate_col and then load it again based on map_col and ensure that they are equivalent.

E.g.

In [127]: species = stdpopsim.get_species("HomSap")

In [128]: m1 = msprime.RateMap.read_hapmap("~/.cache/stdpopsim/genetic_maps/HomSap/HapMapII_GRCh37/genetic_map_GRCh37_chr22.txt", rate_col=2)

In [129]: m2 = msprime.RateMap.read_hapmap("~/.cache/stdpopsim/genetic_maps/HomSap/HapMapII_GRCh37/genetic_map_GRCh37_chr22.txt", map_col=3)

In [131]: m1.rate
Out[131]: 
array([0.000000e+00, 8.096992e-08, 8.131520e-08, ..., 6.439380e-09,
       6.487070e-09, 6.552390e-09])

In [132]: m2.rate
Out[132]: 
array([0.00000000e+00, 8.09677419e-08, 8.13220676e-08, ...,
       6.43979058e-09, 6.48666233e-09, 6.55155642e-09])

In [133]: m1.total_mass
Out[133]: 0.7410956239614878

In [134]: m2.total_mass
Out[134]: 0.74109562

In [141]: np.allclose(m1.rate, m2.rate)
Out[141]: True

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't working

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions