Commit 4d4ec69

Adjust the skiver post; add some interns
1 parent e6d72e9 commit 4d4ec69

10 files changed

Lines changed: 48 additions & 8 deletions

README.md

Lines changed: 3 additions & 3 deletions
@@ -2,10 +2,10 @@

 ### Documentation

-Refer to [this documentation](https://csb5-page.github.io/documentation/) for guidelines on contributing to the website, including how to add blog posts and people, and how to preview the website before deployment.
+Refer to [this documentation](https://csb5.github.io/documentation/) for guidelines on contributing to the website, including how to add blog posts and people, and how to preview the website before deployment.

-If you are a website developer, refer to [this documentation for developers](https://csb5-page.github.io/documentation_developer/) on steps to update content/styles of the pages.
+If you are a website developer, refer to [this documentation for developers](https://csb5.github.io/documentation_developer/) on steps to update content/styles of the pages.

 ### Blog post elements

-Refer to [this page](https://csb5-page.github.io/elements/) for examples on how to format the text, images and videos in the blog posts.
+Refer to [this page](https://csb5.github.io/elements/) for examples on how to format the text, images and videos in the blog posts.

_authors/seah-kah-yen.md

Lines changed: 10 additions & 0 deletions
@@ -0,0 +1,10 @@
+---
+username: seah-kah-yen
+name: Seah Kah Yen
+position: Intern
+is_intern: true
+image: '/images/members/seah-kah-yen.jpg'
+linkedin: https://www.linkedin.com/in/seah-kah-yen/?originalSubdomain=sg
+---
+
+Kah Yen did an internship with us in mid 2024, where she worked with Rafael on machine learning for viral classification.

_authors/sheryl-li-lynn-wong.md

Lines changed: 10 additions & 0 deletions
@@ -0,0 +1,10 @@
+---
+username: sheryl-li-lynn-wong
+name: Sheryl Li Lynn Wong
+position: Intern
+is_intern: true
+image: '/images/members/sheryl-wong.jpg'
+linkedin: https://www.linkedin.com/in/sheryl-wong-3bab85230/?originalSubdomain=sg
+---
+
+Sheryl did an internship with us in mid 2024, where she worked with Rafael on machine learning for viral classification.

_authors/steven-huang.md

Lines changed: 10 additions & 0 deletions
@@ -0,0 +1,10 @@
+---
+username: steven-huang
+name: Steven Huang
+position: Intern
+is_intern: true
+image: '/images/members/steven-huang.jpg'
+linkedin: https://www.linkedin.com/in/stevenhuang73/
+---
+
+Steven did an internship with us in late 2025, where he worked with Rafael on a machine learning tool for viral binning and taxonomic classification.

_authors/taha-jalali.md

Lines changed: 10 additions & 0 deletions
@@ -0,0 +1,10 @@
+---
+username: taha-jalali
+name: Taha Jalali
+position: Intern
+is_intern: true
+image: '/images/members/taha-jalali.jpg'
+linkedin: https://www.linkedin.com/in/taha-jalali-170a302ab/?originalSubdomain=ir
+---
+
+Taha did an internship with us in late 2025, where he worked with Rafael on adaptive sampling and machine learning methods to improve viral classification.

_posts/2026-01-20-estimating-sequencing-error.markdown

Lines changed: 5 additions & 5 deletions
@@ -10,18 +10,18 @@ code: https://github.com/GZHoffie/skiver
 author: [gu-zhenhao, niranjan-nagarajan]
 ---

-Given a set of sequenced reads, how to determine if the sequencing run is good or not? Finding the quality of the reads, which includes estimating sequencing error rates and bias, has been an important first step in numerous Bioinformatics pipelines.
+Given a set of sequenced reads, how can we determine if the sequencing run is good or not? Finding the quality of the reads, which includes estimating sequencing error rates and bias, has been an important first step in numerous Bioinformatics pipelines.

-Previous ways of estimating sequencing error rates include **mapping the reads** to reference genomes and inferring error rates from **Phred quality scores**. Unfortunately, the reference genomes may be missing or different from the genomes that are actually sequenced, especially in metagenomic samples. On the other hand, Phred quality scores can produce biased estimates if they are uncalibrated.
+Previous ways of estimating sequencing error rates include **mapping the reads** to reference genomes and inferring error rates from **Phred quality scores**. Reference genomes may however be missing or different from the genomes that are actually sequenced, especially in metagenomic samples. On the other hand, Phred quality scores can produce biased estimates if they are uncalibrated.

-We therefore propose a new framework of estimating sequencing error and bias, called *skiver*, which works without the need for reference genome or relying on Phred scores.
+We therefore propose a new framework for estimating sequencing error and bias, called *skiver*, which works without the need for reference genome or relying on Phred scores.

 ![](/images/posts/2026-02-14-estimating-sequencing-error/workflow.png)
 *Workflow of skiver.*

-The key ideas of *skiver* is to use (*k*, *v*)-mer sketches to represent the large amount of sequencing reads. A (*k*, *v*)-mer is a segment of length *k*+*v*, where the first *k* bases are the *key* and the last *v* bases are the *value*. By grouping the (*k*, *v*)-mers with the same key together, we can identify the consensus value, as well as estimate the frequency of sequencing errors.
+The key ideas in *skiver* is to use (*k*, *v*)-mer sketches to represent the large amount of sequencing reads. A (*k*, *v*)-mer is a segment of length *k*+*v*, where the first *k* bases are the *key* and the last *v* bases are the *value*. By grouping the (*k*, *v*)-mers with the same key together, we can identify the consensus value, as well as estimate the frequency of sequencing errors.

-Experiments on various real datasets show that skiver is able to accurately estimate the sequencing error rate and infer the percentage of *k*-mers in the read set that are free of sequencing errors. In addition, skiver can estimate the substitution, insertion, and deletion rates, revealing the bias of various sequencing platforms.
+Experiments on various real datasets show that skiver is able to accurately estimate the sequencing error rate and infer the percentage of *k*-mers in the read set that are free of sequencing errors. In addition, skiver can estimate the substitution, insertion, and deletion rates, revealing the biases of various sequencing platforms.

 ![](/images/posts/2026-02-14-estimating-sequencing-error/results.png)
 *Skiver's estimation of error rates and error spectra on various metagenomic samples.*
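
Note on the (*k*, *v*)-mer idea in the post above: the grouping step can be illustrated with a small Python sketch. This is not skiver's code. The function names (`kv_mers`, `group_by_key`, `consensus_and_mismatch_fraction`) and the toy reads are hypothetical, there is no sketching or subsampling, and the "mismatch fraction" is only a crude stand-in for the error estimator the post describes.

```python
from collections import Counter, defaultdict


def kv_mers(read, k, v):
    """Yield (key, value) pairs for every window of length k+v in the read.

    The key is the first k bases of the window, the value is the last v bases.
    """
    for i in range(len(read) - k - v + 1):
        yield read[i:i + k], read[i + k:i + k + v]


def group_by_key(reads, k, v):
    """Count how often each value follows each key across all reads."""
    groups = defaultdict(Counter)
    for read in reads:
        for key, value in kv_mers(read, k, v):
            groups[key][value] += 1
    return groups


def consensus_and_mismatch_fraction(groups):
    """Pick the most frequent value per key as the consensus.

    Returns the consensus map and the fraction of observed (k, v)-mers whose
    value differs from the consensus (an illustrative proxy, not skiver's
    actual estimator).
    """
    consensus = {}
    total = 0
    mismatched = 0
    for key, counts in groups.items():
        value, count = counts.most_common(1)[0]
        consensus[key] = value
        n = sum(counts.values())
        total += n
        mismatched += n - count
    fraction = mismatched / total if total else 0.0
    return consensus, fraction


# Toy reads (made up): the third read differs from the others at its last base.
reads = ["ACGTACGT", "ACGTACGT", "ACGTACGA"]
groups = group_by_key(reads, k=4, v=2)
consensus, fraction = consensus_and_mismatch_fraction(groups)
```

In this toy input the key `GTAC` is followed by `GT` twice and `GA` once, so `GT` becomes the consensus value and the single `GA` observation counts as a putative error.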

images/members/seah-kah-yen.jpg

105 KB

images/members/sheryl-wong.jpg

14 KB

images/members/steven-huang.jpg

141 KB

images/members/taha-jalali.jpg

179 KB
