Hi CHESS,
First of all, thanks so much for keeping users updated about this beautiful software! Like many others, I have recently been trying to use CHESS to analyze Hi-C data that our lab generated from mutagenized zebrafish embryos, but today I came across this bioRxiv preprint, which seems to raise a major concern about the software (Lee, H., Blumberg, B., Lawrence, M. S., and Shoida, T. "Revisiting the Use of Structural Similarity Index in Hi-C." bioRxiv (2021)).
"...here we show that the primary outputs of CHESS–namely, the structural similarity index (SSIM) profiles–are nearly identical regardless of the input matrices, even when query and reference reads were shuffled to destroy any significant differences. This issue stems from the dominance of the regional counting noise arising from stochastic sampling in chromatin-contact maps, reflecting a fundamentally incorrect assumption of the CHESS algorithm. Therefore, biological interpretation of SSIM profiles generated by CHESS requires considerable caution."
I am not a bioinformatician and therefore do not fully understand the technical details presented in their preprint...
Should users be concerned about this problem? Issues #34 and #48 seem closely related to the concerns raised by the authors of the preprint, but my impression is that the authors are arguing that SSIM is unable to measure similarities between Hi-C matrices from the same genomic locus, and that it worsens the differential contact analysis, which is actually done by the signal-to-noise ratio instead.
Is there any way for users to use this software without running into the concern raised by H. Lee et al.? Or might we have to wait for further major updates to either the software or the manuscript?
Thanks in advance,