
Regarding Optimizing Meta-MultiSKAT for Large Genome-Wide Genotype Files #1

@neeteshpandey

Description


Hi Diptavo,
I hope you're doing well. I have been using Meta-MultiSKAT for meta-analysis on a large-scale genome-wide dataset. To improve computational efficiency, I have already chunked each chromosome into 150 smaller parts. However, I am still facing issues with the chunks whose genotype files are large: these runs are significantly slow and, in some cases, they fail outright.
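To make the chunking concrete: splitting each chromosome into a fixed 150 parts gives very uneven chunk sizes, since gene-dense regions produce much wider genotype matrices. What I have in mind instead is splitting by a target variant count per chunk, so memory per run stays roughly constant. A rough sketch of that idea (the function name and the target size are illustrative, not part of the Meta-MultiSKAT pipeline):

```python
def chunk_by_count(positions, max_variants=5000):
    # Split a sorted list of variant positions into chunks of at most
    # max_variants each, so every chunk's genotype matrix has a bounded
    # width regardless of local variant density.
    return [positions[i:i + max_variants]
            for i in range(0, len(positions), max_variants)]

# Example: 12,000 variants with a 5,000-variant cap per chunk.
regions = chunk_by_count(list(range(12000)), max_variants=5000)
print([len(r) for r in regions])  # [5000, 5000, 2000]
```

With a cap like this, the slow or failing oversized chunks would be subdivided automatically rather than by hand.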

I came across the computation time estimates in your paper (attached) and wanted to ask if you have any suggestions on optimizing runtime for large datasets. Specifically:

Time Adjustment Strategies – Based on the CPU times reported in the paper, how should we expect runtime to scale with significantly larger genotype files?

Memory and Computational Load – Are there optimal hardware configurations (e.g., memory allocation, high-performance computing settings) that you recommend?

Further Parallelization – Would increasing the number of parallel jobs or adjusting certain parameters help improve speed?

Alternative Workarounds – Are there preprocessing steps or modifications to the Meta-MultiSKAT pipeline that could make it more scalable for genome-wide analysis?
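To make the parallelization question concrete, here is roughly the dispatch pattern I have in mind (a minimal sketch only: `run_chunk` is a placeholder standing in for one Meta-MultiSKAT invocation, and none of these names come from the package):

```python
from concurrent.futures import ThreadPoolExecutor

def run_chunk(chunk_id):
    # Placeholder for one Meta-MultiSKAT run on a single genotype chunk;
    # in practice this would launch the R analysis (e.g. via a subprocess)
    # on that chunk's file. Here it just returns a label for the chunk.
    return chunk_id, f"chr1_part{chunk_id:03d} done"

def run_all(n_chunks, max_workers=4):
    # Chunks are independent, so they can run concurrently. Threads
    # suffice here because the real work would happen in external R
    # processes; max_workers caps concurrent runs so total memory
    # (one genotype matrix per active run) stays bounded.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(run_chunk, range(n_chunks)))

print(run_all(8, max_workers=3)[2])  # chr1_part002 done
```

My question is essentially whether raising `max_workers` (i.e., the number of concurrent chunk runs) is safe for Meta-MultiSKAT, or whether memory pressure from concurrent genotype matrices makes a lower cap advisable.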

I would greatly appreciate any insights you could provide on making the workflow more efficient. I have attached a section from your paper for reference. Looking forward to your thoughts.

Sincerely,
Neetesh

[Attachment: excerpt of the computation-time estimates from the Meta-MultiSKAT paper]
