Skip to content

phase_parent issues #7

@yangjl

Description

@yangjl

There are two other issues with phase_parent() that just came to my attention.

Issue 1: See the following errors:

set.seed(13567)
GBS.array <- sim.array(size.array=50, numloci=100, hom.error=0.02, het.error=0.8, rec=0.25, selfing=0.1, imiss=0.5, misscode=3)
GBS.array <- get_true_GBS(GBS.array)
probs <- error_mx(hom.error=0.02, het.error=0.8, imiss=0.5)
system.time(phase <- phase_parent(GBS.array, win_length=2, join_length=2, verbose=F))
user system elapsed
0.31 0.00 0.31

set.seed(13567)
GBS.array <- sim.array(size.array=50, numloci=500, hom.error=0.02, het.error=0.8, rec=0.25, selfing=0.1, imiss=0.5, misscode=3)
GBS.array <- get_true_GBS(GBS.array)
probs <- error_mx(hom.error=0.02, het.error=0.8, imiss=0.5)
system.time(phase <- phase_parent(GBS.array, win_length=2, join_length=2, verbose=F))

>>> join chunks [ 1 and 2, total:2] ...

Error in log(tem) : non-numeric argument to mathematical function
Timing stopped at: 2.02 0 2.02

The only difference between the two runs is the numloci, and I think that whenever there is one or more heterozygous sites that cannot be phased, phase_parent() automatically results in an error. When we increase numloci, we start to get more simulations that result in heterozygous sites that cannot be phased.

Issue 2: Phasing runtime
I originally thought that phasing is quick, but I forgot that I was dealing with small number of sites. When I extrapolated the number of sites to 500,000, the number of hours seems astronomical. See following:

win_length=2, size_array=50, numloci=100, time=0.98s, extrapolate into real data: 1.5hours
win_length=5, size_array=50, numloci=100, time=6.89s, extrapolate into real data: 11.5hours
win_length=10, size_array=50, numloci=100, time=434s, extrapolate into real data: 30days

The only win_length that seems feasible to me is 2. But I am not sure if win_length=2 is good enough.

Thanks,
CJ

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions