A fast trajectory-based sequence aligner for long reads (ONT, PacBio) and short reads (Illumina). freemap uses a novel DP-free CIGAR generation approach that adds only ~2-3% overhead versus ~200%+ for traditional dynamic programming methods.
cargo build --release

For maximum performance on your CPU:

RUSTFLAGS="-C target-cpu=native" cargo build --release

The binary is at target/release/freemap.

# ONT long reads
freemap -x map-ont -a ref.fa reads.fq out.sam
# PacBio HiFi
freemap -x map-hifi -a ref.fa reads.fq out.sam
# PacBio CLR
freemap -x map-pb -a ref.fa reads.fq out.sam
# Illumina short reads (single-end)
freemap -x sr -a ref.fa reads.fq out.sam
# Illumina short reads (paired-end)
freemap -x sr -a -1 R1.fq -2 R2.fq ref.fa out.sam
# PAF output (default)
freemap -x map-ont ref.fa reads.fq out.paf

| Preset | Technology | Description |
|---|---|---|
| map-ont | ONT | k=15, w=10, trajectory CIGAR |
| map-hifi | PacBio HiFi | k=19, w=19, trajectory CIGAR |
| map-pb | PacBio CLR | k=19, w=10, trajectory CIGAR |
| sr | Illumina | k=21, w=11, polish mode |
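
When the sequencing technology is only known at run time (for example in a pipeline script), the preset table above can be encoded as a small lookup. The sketch below is illustrative only: `preset_for` and the technology labels (`ont`, `hifi`, `clr`, `illumina`) are hypothetical names, not part of freemap.

```shell
# Hypothetical helper (not part of freemap): map a technology label to a preset name.
preset_for() {
  case "$1" in
    ont)      echo map-ont ;;
    hifi)     echo map-hifi ;;
    clr)      echo map-pb ;;
    illumina) echo sr ;;
    *)        echo "unknown technology: $1" >&2; return 1 ;;
  esac
}

# Example use with the documented -x and -a flags:
# freemap -x "$(preset_for ont)" -a ref.fa reads.fq out.sam
```

An unknown label makes the helper return a non-zero status, so a script run with `set -e` stops before launching an alignment with the wrong preset.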

| Flag | Description |
|---|---|
| -1 FILE | First read file (R1) |
| -2 FILE | Second read file (R2) |
| -I MIN:MAX | Expected insert size range [0:1000] |
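
To check after a paired-end run how many pairs fall inside the range given to `-I`, one option is to inspect the TLEN field (column 9) of the SAM output. The following is a minimal sketch, assuming freemap fills TLEN as described in the SAM specification; `pairs_in_range` is a hypothetical helper, not a freemap command.

```shell
# Hypothetical helper (not part of freemap): count pairs whose template
# length (SAM column 9, TLEN) lies within MIN..MAX.
# Usage: pairs_in_range MIN MAX < out.sam
pairs_in_range() {
  awk -v min="$1" -v max="$2" -F '\t' '
    /^@/ { next }              # skip SAM header lines
    $9 > 0 {                   # the mate with positive TLEN represents the pair once
      total++
      if ($9 >= min && $9 <= max) ok++
    }
    END { printf "%d/%d\n", ok, total }
  '
}

# Example: pairs_in_range 0 1000 < out.sam
```

The output has the form `in_range/total`. Records with TLEN 0 (unpaired reads or mates on different references) are not counted.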

| Flag | Description |
|---|---|
| -k INT | k-mer size [19] |
| -w INT | Minimizer window size [19] |
| -f INT | Max k-mer frequency [200] |
| -L INT | Chaining lookback limit [16] |
| -G INT | Max gap difference for penalty [50] |
| -S INT | Gap penalty scaling factor [5] |
| -t INT | Threads [all] |

| Flag | Description |
|---|---|
| -g | Trajectory mode: CIGAR from geometry (no DP) |
| -p, --polish | Polish mode: base-level indel detection (DP-free) |
| -H | Homopolymer compression (recommended for ONT) |
| -r | Refine boundaries with micro-anchors |
| -R | Short-read mode |
| -u | Ultralong mode: relaxed chaining for ONT ultralong reads |
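
For readers unfamiliar with homopolymer compression (the `-H` flag), the general idea is to collapse each run of identical bases into a single base before seeding, which makes seeds less sensitive to run-length errors. The sketch below illustrates the concept with `tr -s`. It is not freemap's internal implementation, and `hpc` is a hypothetical helper name.

```shell
# Illustration only (not freemap's implementation): collapse every run of
# identical bases into one base, e.g. TTTT -> T.
hpc() {
  printf '%s\n' "$1" | tr -s 'ACGTNacgtn'
}

hpc GATTTTACAAAG   # prints GATACAG
```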

| Flag | Description |
|---|---|
| -a | SAM output (default: PAF) |
| -c | Generate detailed CIGAR (heuristic gap alignment) |
| --multi | Output secondary and supplementary alignments |
| --max-secondary N | Max secondary alignments per read [5] |
| -q | Quiet mode |
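
Since PAF is the default output, a quick sanity check is to compute a per-record identity estimate from it. The sketch below assumes freemap writes the standard twelve mandatory PAF columns, where column 10 is the number of matching bases and column 11 is the alignment block length. `paf_identity` is a hypothetical helper, not part of freemap.

```shell
# Hypothetical helper (not part of freemap): print query name, target name,
# and matches / block length for each PAF record read from stdin.
# Assumes standard PAF columns: $1 query, $6 target, $10 matches, $11 block length.
paf_identity() {
  awk -F '\t' '$11 > 0 { printf "%s\t%s\t%.4f\n", $1, $6, $10 / $11 }'
}

# Example: paf_identity < out.paf | head
```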

| Flag | Description |
|---|---|
| -d FILE | Save pre-built index to disk |
| -i FILE | Load pre-built index from disk |
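
A common pattern with save/load flags like these is to build the index on the first run and reuse it afterwards. The wrapper below is a sketch under stated assumptions: it assumes `-d` and `-i` can be combined with the same positional arguments as a normal run, which this README does not confirm (check `freemap -h`). The `.idx` suffix, the `FREEMAP` variable, and `map_with_index` are hypothetical names.

```shell
# Hypothetical wrapper (not part of freemap): reuse a saved index if present.
# ASSUMPTION: -d / -i accept the same positional arguments as a normal run.
# Usage: map_with_index ref.fa reads.fq out.paf
map_with_index() {
  ref=$1; reads=$2; out=$3
  idx="${ref}.idx"                      # hypothetical index file name
  if [ -f "$idx" ]; then
    # Index already on disk: load it with -i
    "${FREEMAP:-freemap}" -x map-ont -i "$idx" "$ref" "$reads" "$out"
  else
    # First run: build and save the index with -d
    "${FREEMAP:-freemap}" -x map-ont -d "$idx" "$ref" "$reads" "$out"
  fi
}
```

Setting `FREEMAP=echo` before calling the function prints the command that would be run, which is a convenient dry run.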

| Flag | Description |
|---|---|
| --ransac-threshold FLOAT | RANSAC inlier threshold for trajectory regression [25.0] |
| --polish-max-indel INT | Max indel size for CIGAR polishing [4] |
| --band INT | Custom chaining band width |
| --lookback INT | Override chaining lookback limit |
| --no-tiebreaker | Disable primary-chromosome tiebreaker |

Run freemap -h for the full flag reference.
To verify your installation, you can run freemap on publicly available E. coli K-12 data:
# Download E. coli K-12 MG1655 reference
wget -O ecoli.fa.gz "https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/005/845/GCF_000005845.2_ASM584v2/GCF_000005845.2_ASM584v2_genomic.fna.gz"
gunzip ecoli.fa.gz
# Simulate reads at 5x depth with pbsim3 (optional; or use your own reads)
pbsim --strategy wgs --method qshmm --qshmm QSHMM-RSII.model \
--depth 5 --genome ecoli.fa --prefix ecoli_test
# Build index and align
freemap -x map-hifi -a ecoli.fa ecoli_test_0001.fastq out.sam

Expected output: out.sam, a standard SAM file with header lines (@SQ, @PG) followed by one alignment record per read, including trajectory-based CIGAR strings. You can verify with:

# Check mapped reads
samtools flagstat out.sam
# View coverage
samtools sort out.sam -o out.bam && samtools index out.bam
samtools depth -a out.bam | head

| Dataset | Source | Accession / URL |
|---|---|---|
| E. coli K-12 MG1655 | NCBI | NC_000913.3 |
| Human GRCh38 | NCBI | GCF_000001405.40 |
| GIAB HG002 (HiFi, ONT) | GIAB | HG002 data |
| GIAB HC regions v4.2.1 | GIAB | v4.2.1 benchmark |
Simulated reads were generated with pbsim3 v3.0.0 using fixed seeds for reproducibility. Generation scripts and parameters are documented in benchmark/scripts/.
# Reproduce all tables and figures
bash benchmark/scripts/reproduce_paper.sh /path/to/data
# Reproduce a single section
bash benchmark/scripts/reproduce_paper.sh /path/to/data table1

See benchmark/README.md for data layout and requirements.

If you use freemap in your research, please cite:
de Ruvo A., Radomski N., Flammini M., Di Pasquale A. (2026). freemap: DP-free CIGAR generation for long reads via trajectory inference. Bioinformatics (submitted).
See CITATION.cff for machine-readable citation metadata.