bc.tl.rc.recluster() seems to calculate HVG from previous ones #384

@llumdi

When reclustering a subset of cells with bc.tl.rc.recluster(), the highly variable genes (HVGs) appear to be selected only from the previously identified HVGs (adata.var) rather than from all genes (adata.raw.var). Going back to the full gene set is needed so that the algorithm has a chance to find the genes that are most variable specifically within the subset of cells.
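
To illustrate the concern with a toy example: the gene names and values below are made up, and plain variance stands in for whatever HVG method is actually used. It only shows how ranking within the previous HVGs can miss a gene that is variable inside the subset:

```python
from statistics import pvariance

# Hypothetical expression values for four cells of the subset (one list per gene).
subset_expr = {
    "geneA": [5.0, 5.4, 4.6, 5.0],  # previous HVG, nearly flat within the subset
    "geneB": [2.0, 2.1, 2.0, 1.9],  # previous HVG, nearly flat within the subset
    "geneC": [0.0, 6.0, 0.0, 7.0],  # not a previous HVG, variable within the subset
}
previous_hvgs = ["geneA", "geneB"]


def top_variable(genes, n=1):
    """Return the n genes with the highest variance across the subset cells."""
    return sorted(genes, key=lambda g: pvariance(subset_expr[g]), reverse=True)[:n]


restricted = top_variable(previous_hvgs)  # candidates limited to previous HVGs
unrestricted = top_variable(subset_expr)  # candidates are all genes

print(restricted, unrestricted)
```

With the restricted candidate list, geneC can never be selected, however variable it is within the subset.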

The problematic part of the function seems to be this one, in line 97:
cluster_subset.raw = cluster_subset

Suggested fix:

# Create a new AnnData object for the subcluster analysis.
# IMPORTANT: We use .raw to get the expression data for ALL genes.
cluster_subset = cluster_subset.raw.to_adata()
# Re-create .raw before running HVG selection: cluster_subset.raw.var will
# contain all genes, whereas cluster_subset.var will contain only the newly
# identified HVGs.
cluster_subset.raw = cluster_subset

Thanks for checking and fixing it
Best,
Llucia
