Code is available under the MIT license.
Title: Investigation of downstream pathways involved in decision-making between neurogenesis and gliogenesis
Many thanks to the supervisors, Dr. rer. nat. Anne Gregor and Prof. Dr. med. Christiane Zweier, and to the people in the Human Genetics lab.
This project aims to investigate the downstream pathways involved in decision-making between neurogenesis (differentiation from neural progenitor cells into neurons) and astrogenesis (differentiation into astrocytes).
The data come from organoids derived from human induced pluripotent stem cells and were generated by single-cell sequencing. Organoids are also called "mini-organs" because of their ability to mimic human organs, here the human brain (Organoids, 2017).
LHX2 (LIM-homeobox gene 2) is a well-known gene present in the brain and conserved across species (Bachy et al., 2001; Chou & Tole, 2019) because of its critical role during brain development (Roy et al., 2014). This gene is also responsible for a neurodevelopmental disorder (NDD) causing severe phenotypes such as microcephaly.
- Mapping to reference genome, gene-level counts using Space Ranger
- Quality control step
- Data integration, clustering, cell-type annotation using SCINA
- Downstream analyses: DEG, trajectory analyses using Monocle3 with timepoint 0 = radial glial cells
Monocle3 is a pipeline developed by Trapnell et al. (2014).
This pipeline depends on other packages that need to be installed first.
- Trajectory reconstruction analysis: analyses were performed with default parameters for PCA-based pre-processing and UMAP dimensionality reduction. Cells were clustered with default resolution parameters, and the trajectory was then learned using learn_graph(). This function is time-consuming.
- Pseudo-time analysis: it assigns an artificial temporal metric to each cell reflecting its relative position along the reconstructed trajectory. Timepoint 0 was manually set to the neural progenitor populations. Pseudo-time ordering was performed using the order_cells() function.
- Downstream analyses: Moran's I spatial autocorrelation test with a threshold of 0.05, then GO analyses on the significant genes (at BP and MF levels). Cross-condition comparisons were also assessed by comparing the test statistics, e.g., Moran's I and q-values. A paired Wilcoxon signed-rank test was also performed with alpha = 0.05. Visualizations used dot plots, cnetplots, bar plots, histograms and quadrant plots. Statistical summary tables were also created.
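To make the Moran's I statistic used above more concrete, here is a minimal pure-Python sketch of the statistic for a single gene. It only illustrates the formula and is not Monocle3's implementation; the function name `morans_i` and the dense neighbour-weight matrix are assumptions made for this example.

```python
def morans_i(values, weights):
    """Moran's I for one gene (illustrative sketch).

    values  -- expression of the gene in each cell
    weights -- weights[i][j] > 0 when cells i and j are neighbours
               (hypothetical dense matrix, e.g. from a kNN graph)
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    total_weight = sum(sum(row) for row in weights)
    denom = sum(d * d for d in dev)
    if total_weight == 0 or denom == 0:
        # Constant expression or no neighbours: return 0 by convention here.
        return 0.0
    num = sum(
        weights[i][j] * dev[i] * dev[j]
        for i in range(n)
        for j in range(n)
    )
    return (n / total_weight) * (num / denom)


# Four cells connected in a chain: 0-1, 1-2, 2-3.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
clustered = morans_i([1, 1, 5, 5], chain)    # similar neighbours, positive I
alternating = morans_i([1, 5, 1, 5], chain)  # dissimilar neighbours, negative I
```

Values near +1 indicate that neighbouring cells have similar expression (the gene varies smoothly along the trajectory), values near 0 indicate no structure, and negative values indicate that neighbours tend to differ.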
The pySCENIC pipeline is the Python implementation of the SCENIC tool (Aibar et al., 2017; Bravo González-Blas et al., 2023; Van de Sande et al., 2020).
This pipeline contains three main steps:
- Co-expression module detection: identification of TFs and their target genes (together defining a regulon). Time-consuming step. Importantly, set the number of workers to 8 to speed up the process. Tool: GRNBoost2
- Regulon refinement: pruning of targets that show no enrichment for a corresponding motif of the TF. Tool: RcisTarget.
- Cellular activity scoring: calculates regulon activity scores. It returns an AUC matrix, an important output that can be analyzed further. Tool: AUCell.
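The idea behind the activity score in the last step can be sketched as follows: genes are ranked by expression within each cell, and the score is the area under the recovery curve of the regulon's genes among the top-ranked genes. This is a simplified illustration only; the function `regulon_auc`, its `top_fraction` parameter, the tie handling, and the normalization are assumptions for this example and may differ from what AUCell actually does.

```python
def regulon_auc(expression, regulon, top_fraction=0.05):
    """Simplified recovery-curve AUC for one cell and one regulon.

    expression   -- dict mapping gene name to expression in this cell
    regulon      -- iterable of gene names (TF targets)
    top_fraction -- fraction of the ranking to inspect (assumed value)
    """
    # Rank genes from highest to lowest expression.
    ranked = sorted(expression, key=lambda g: expression[g], reverse=True)
    max_rank = max(1, int(len(ranked) * top_fraction))
    targets = set(regulon) & set(ranked)
    if not targets:
        return 0.0

    hits = 0
    area = 0
    for gene in ranked[:max_rank]:
        if gene in targets:
            hits += 1
        area += hits  # height of the recovery curve at this rank

    # Best case: all targets sit at the very top of the ranking.
    best_area = sum(min(r, len(targets)) for r in range(1, max_rank + 1))
    return area / best_area


cell = {"A": 10, "B": 9, "C": 1, "D": 0}
high = regulon_auc(cell, {"A", "B"}, top_fraction=0.5)  # targets at the top
low = regulon_auc(cell, {"C", "D"}, top_fraction=0.5)   # targets at the bottom
mid = regulon_auc(cell, {"B"}, top_fraction=0.5)        # target found late
```

Applying such a function to every cell and every regulon yields a cells x regulons matrix, which is the shape of the AUC matrix used in the downstream analyses.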
Parameters used (per condition): 24 CPUs, 192G of memory, time limit of 4 days. The pipeline was run inside an Apptainer container.
Before running the pipeline, make sure that the input file is in .mtx format, that the necessary files (motif2TF annotation, motif score databases, human TF list) are downloaded, and that the input has then been converted to a loom file. These steps and the visualization step require several Python libraries such as NumPy, pandas, Scanpy, and loompy.
Downstream analyses: visualization using UMAP representation, top cluster ranking, violin plots and dot plots with hierarchical clustering. Differential activity between genotypes (WT vs KO, WT vs HET) was computed using Seurat::FindMarkers() with a threshold of 0.05.
The LochNESS score calculation and analysis is part of a broader study, MMCA (Mouse Mutant Cell Atlas, GitHub: https://github.com/shendurelab/MMCA.git), as described by Huang et al. (2023). This score quantifies the local phenotypic divergence between two genotypes by evaluating how the genotype composition of each cell's nearest neighbors deviates from the global distribution. Several steps:
- Compute the PCA reduction and L2-normalize it
- Calculate k = 0.5 * sqrt(D) with D the number of PCA dimensions after L2-normalization. Here k = 114
- For each genotype comparison, the k-nearest neighbors of each cell were identified based on Euclidean distance, using the FNN::get.knnx() function
- Calculate the LochNESS score as follows: LochNESS score = (p_local / p_global) - 1, where p_local is the proportion of neighbors belonging to the mutant genotype (KO or HET) and p_global corrects for the global proportion of that genotype in the compared groups. This formula follows the same logic as in the paper (Huang et al., 2023) but is not exactly the same formula.
- Artificial assignment of scores: 0 is the reference, 1 is the other tested condition.
- Downstream analyses: the distribution of LochNESS scores was plotted, cell-type-specific summary statistics were computed, and cells were classified into low-, moderate- and high-impact groups based on their mean score. Two-sided one-sample t-tests were computed using a threshold of 0.05.
- According to Schmid et al. (2023), LHX2 deficiency promotes astrogenesis and represses neurogenesis, which is consistent with the observations of Subramanian et al. (2011)
- Intermediate phenotype (HET) is more similar to KO than WT, suggesting a concentration-dependent influence of LHX2
- LHX2 seems to be essential for the preservation of neuronal identity
- Without LHX2, the immune system is activated and more cells adopt non-neuronal fates (enrichment of fibroblasts and macrophages)
- Some factors (e.g., SOX, LHX3, PITX3, HOX and RA signalling pathway) are specifically expressed in KO organoids.
