A zero-cost Rust interface to the Dunbrack 2010 backbone-dependent rotamer library.
Provides bilinearly interpolated side-chain rotamer probabilities, mean χ angles, and standard deviations for 22 amino acid types at any (φ, ψ) backbone conformation. All 740,629 source rows are baked into .rodata at compile time; queries touch zero heap memory and link zero runtime dependencies.
Features • Installation • Usage • Residue Types • Performance • Verification • License
- Zero startup latency. The entire ~28 MB rotamer database is embedded in `.rodata` at compile time via `build.rs`. No file I/O, no deserialization, no lazy initialization.
- Zero heap allocation. Every query returns a `RotamerIter<N, R>`: a stack-allocated array of exactly `R` `Rotamer<N>` values. No `Vec`, no `Box`, no allocator required.
- `#![no_std]` compatible. No standard library, no libm linkage. Usable in embedded firmware, OS kernels, and WASM environments.
- Type-safe χ dimensionality. The number of χ angles per residue is a compile-time constant `N` encoded in `Rotamer<N>` and `RotamerIter<N, R>`. There are no padding zeros, no runtime bounds checks, no wrong-length arrays.
- Bilinear interpolation with circular χ means. `Residue::rotamers(phi, psi)` bilinearly interpolates across the four surrounding grid cells. χ means are computed via circular weighted mean (sin/cos decomposition), correctly handling the ±180° wraparound. Probabilities are re-normalized to Σ = 1.0 after interpolation.
- Precomputed (sin χ, cos χ) in the static table. `build.rs` stores sin/cos pairs rather than raw angles, eliminating 8N trigonometric calls per query (one sin and one cos per χ angle at each of the four corner cells).
- Custom branchless `atan2f`. A two-stage argument-reduction + degree-7 Taylor polynomial implementation with zero conditional branches and ±0.002° maximum error, 25× more accurate than the 0.05° precision requirement, with no libm dependency.
- Compile-time data integrity. `build.rs` asserts seven invariants before emitting any code: rotamer count, per-row non-negative probabilities, probability sums, per-χ positive standard deviations, φ = ±180° periodicity, ψ = ±180° periodicity, and bin index consistency across all 1,369 grid cells. Compilation fails loudly on data corruption.
- `for_all_residues!` macro. A generated declarative macro for writing generic code over all 22 residue types without runtime dispatch.
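The interpolation step described in the list above can be sketched in a few lines. This is a minimal standalone illustration, not the crate's code: the `Corner` struct, the function names, and the choice to weight the circular mean by the bilinear weights alone are assumptions, and it uses `std` trigonometry rather than the crate's own `atan2f`.

```rust
/// One rotamer at one grid corner, for a residue with a single χ angle.
/// Hypothetical layout: the χ mean is stored as a (sin, cos) pair.
#[derive(Clone, Copy)]
pub struct Corner {
    pub prob: f32,
    pub sin_chi: f32,
    pub cos_chi: f32,
}

/// Builds a corner from a probability and a χ mean in degrees.
pub fn corner_from_degrees(prob: f32, chi_deg: f32) -> Corner {
    let (sin_chi, cos_chi) = chi_deg.to_radians().sin_cos();
    Corner { prob, sin_chi, cos_chi }
}

/// Bilinear weights for the corners (φ0ψ0, φ1ψ0, φ0ψ1, φ1ψ1), given the
/// fractional offsets `tx`, `ty` in [0, 1] inside the grid cell.
pub fn bilinear_weights(tx: f32, ty: f32) -> [f32; 4] {
    [
        (1.0 - tx) * (1.0 - ty),
        tx * (1.0 - ty),
        (1.0 - tx) * ty,
        tx * ty,
    ]
}

/// Interpolates one rotamer across four corners.
/// Returns (probability, circular mean of χ in degrees).
pub fn interpolate(corners: &[Corner; 4], tx: f32, ty: f32) -> (f32, f32) {
    let w = bilinear_weights(tx, ty);
    let mut prob = 0.0f32;
    let mut sin_sum = 0.0f32;
    let mut cos_sum = 0.0f32;
    for (wi, corner) in w.iter().zip(corners.iter()) {
        prob += wi * corner.prob;
        sin_sum += wi * corner.sin_chi;
        cos_sum += wi * corner.cos_chi;
    }
    // Averaging sin and cos separately keeps the mean correct across ±180°.
    (prob, sin_sum.atan2(cos_sum).to_degrees())
}

/// Rescales probabilities in place so that they sum to 1.
pub fn renormalize(probs: &mut [f32]) {
    let total: f32 = probs.iter().sum();
    if total > 0.0 {
        for p in probs.iter_mut() {
            *p /= total;
        }
    }
}
```

With corner χ means of 170°, 170°, −170° and −170° and `tx = ty = 0.5`, `interpolate` returns a mean at ±180°, whereas a plain arithmetic mean of the angles would give 0°.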
Add the crate to `Cargo.toml`:

    [dependencies]
    dunbrack = "0.1.0"

Note: `build.rs` reads `data/dunbrack-2010.lib.csv` (740,629 rows) and generates ~28 MB of static Rust source. Initial compilation takes 15–30 seconds depending on hardware.

    use dunbrack::{Residue, Val};

    // Bilinearly interpolated rotamers for Val at an α-helical backbone conformation.
    for rot in Val::rotamers(-60.0, -40.0) {
        // rot.r:         [u8; 1]   rotamer bin index (1-based)
        // rot.prob:      f32       probability (Σ = 1.0 across all rotamers)
        // rot.chi_mean:  [f32; 1]  mean χ angle in degrees, ±180° range
        // rot.chi_sigma: [f32; 1]  standard deviation in degrees
        println!("r={:?} p={:.4} χ₁={:.1}°±{:.1}°",
            rot.r, rot.prob, rot.chi_mean[0], rot.chi_sigma[0]);
    }

Output (Val at φ=−60°, ψ=−40°):

    r=[1] p=0.0414 χ₁=68.0°±7.0°
    r=[2] p=0.9391 χ₁=171.5°±5.0°
    r=[3] p=0.0194 χ₁=-61.0°±9.6°
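The loop above consumes a stack-allocated iterator. How such an iterator can be built with const generics is sketched below. The field names `r`, `prob`, `chi_mean` and `chi_sigma` come from the example above; everything else (the `items`/`pos` internals, `new`, and the `example_val` helper that hard-codes the printed Val values) is hypothetical and not taken from the crate.

```rust
/// A rotamer with `N` χ angles; field names follow the usage example.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Rotamer<const N: usize> {
    pub r: [u8; N],
    pub prob: f32,
    pub chi_mean: [f32; N],
    pub chi_sigma: [f32; N],
}

/// Iterator over exactly `R` rotamers held inline, with no heap allocation.
pub struct RotamerIter<const N: usize, const R: usize> {
    items: [Rotamer<N>; R],
    pos: usize,
}

impl<const N: usize, const R: usize> RotamerIter<N, R> {
    pub fn new(items: [Rotamer<N>; R]) -> Self {
        Self { items, pos: 0 }
    }
}

impl<const N: usize, const R: usize> Iterator for RotamerIter<N, R> {
    type Item = Rotamer<N>;

    fn next(&mut self) -> Option<Self::Item> {
        // One array read plus an index increment; `None` once exhausted.
        let item = self.items.get(self.pos).copied();
        self.pos += usize::from(item.is_some());
        item
    }

    fn size_hint(&self) -> (usize, Option<usize>) {
        let left = R - self.pos;
        (left, Some(left))
    }
}

impl<const N: usize, const R: usize> ExactSizeIterator for RotamerIter<N, R> {}

/// Hypothetical helper: the three Val rotamers printed in the output above.
pub fn example_val() -> RotamerIter<1, 3> {
    RotamerIter::new([
        Rotamer { r: [1], prob: 0.0414, chi_mean: [68.0], chi_sigma: [7.0] },
        Rotamer { r: [2], prob: 0.9391, chi_mean: [171.5], chi_sigma: [5.0] },
        Rotamer { r: [3], prob: 0.0194, chi_mean: [-61.0], chi_sigma: [9.6] },
    ])
}
```

Because `N` and `R` are part of the type, a `RotamerIter<1, 3>` cannot be confused with the iterator of a residue that has a different χ count or rotamer count.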
The `Residue` trait exposes compile-time constants usable in fully generic code:

    use dunbrack::Residue;

    fn rotamer_count<R: Residue>() -> usize {
        R::N_ROTAMERS
    }

    fn residue_name<R: Residue>() -> &'static str {
        R::NAME
    }

Accessing rotamer fields (`.prob`, `.chi_mean`, `.chi_sigma`, `.r`) requires a concrete type or a monomorphized context, since `Residue::Rot` carries no field bounds:

    use dunbrack::{Residue, Val};

    // Find the most probable rotamer for Val.
    let best = Val::rotamers(-60.0, -40.0)
        .max_by(|a, b| a.prob.partial_cmp(&b.prob).unwrap())
        .unwrap();

The `for_all_residues!` macro invokes `$callback!(Type, N_CHI, N_ROTAMERS)` for all 22 residue types. It drives generic infrastructure like benchmarks, coverage tests, and per-type dispatch with zero boilerplate.

    use dunbrack::*;

    macro_rules! print_info {
        ($Res:ident, $n_chi:literal, $n_rot:literal) => {
            println!("{}: {} χ angles, {} rotamers",
                <$Res as Residue>::NAME, $n_chi, $n_rot);
        };
    }

    for_all_residues!(print_info);

All 22 residue types from the Dunbrack 2010 library, including separate cysteine and proline variants:
| Type | `N_CHI` | `N_ROTAMERS` | Notes |
|---|---|---|---|
| `Arg` | 4 | 75 | |
| `Asn` | 2 | 36 | |
| `Asp` | 2 | 18 | |
| `Gln` | 3 | 108 | Largest table |
| `Glu` | 3 | 54 | |
| `His` | 2 | 36 | |
| `Ile` | 2 | 9 | |
| `Leu` | 2 | 9 | |
| `Lys` | 4 | 73 | |
| `Met` | 3 | 27 | |
| `Phe` | 2 | 18 | |
| `Ser` | 1 | 3 | |
| `Thr` | 1 | 3 | |
| `Trp` | 2 | 36 | |
| `Tyr` | 2 | 18 | |
| `Val` | 1 | 3 | |
| `Cyh` | 1 | 3 | Free (non-disulfide) cysteine |
| `Cyd` | 1 | 3 | Disulfide-bonded cysteine |
| `Cys` | 1 | 3 | Combined cysteine pool (CYH + CYD) |
| `Tpr` | 3 | 2 | Trans-proline |
| `Cpr` | 3 | 2 | Cis-proline |
| `Pro` | 3 | 2 | Combined proline pool (TPR + CPR) |
Each type implements Residue + Copy + PartialEq + Eq + Hash + Debug.
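The callback-macro pattern behind `for_all_residues!` can be illustrated without the crate. The sketch below is hand-written and covers only two types; the trait shape, the `NAME` strings, and the `summaries` function are hypothetical stand-ins, while the `(Type, N_CHI, N_ROTAMERS)` argument order and the `Val`/`Leu` counts follow the text and table above.

```rust
/// Hypothetical stand-in for the crate's `Residue` trait.
pub trait Residue {
    const NAME: &'static str;
    const N_CHI: usize;
    const N_ROTAMERS: usize;
}

pub struct Val;
pub struct Leu;

impl Residue for Val {
    const NAME: &'static str = "VAL"; // assumed spelling
    const N_CHI: usize = 1;
    const N_ROTAMERS: usize = 3;
}

impl Residue for Leu {
    const NAME: &'static str = "LEU"; // assumed spelling
    const N_CHI: usize = 2;
    const N_ROTAMERS: usize = 9;
}

/// Invokes `$callback!(Type, N_CHI, N_ROTAMERS)` once per listed type.
/// The real macro is generated by the build script and covers all 22 types.
macro_rules! for_all_residues {
    ($callback:ident) => {
        $callback!(Val, 1, 3);
        $callback!(Leu, 2, 9);
    };
}

/// Builds one summary line per residue type; each line is produced by
/// code expanded at compile time, so there is no runtime dispatch.
pub fn summaries() -> Vec<String> {
    let mut out = Vec::new();
    macro_rules! push_summary {
        ($Res:ident, $n_chi:literal, $n_rot:literal) => {
            out.push(format!(
                "{}: {} chi, {} rotamers",
                <$Res as Residue>::NAME,
                $n_chi,
                $n_rot
            ));
        };
    }
    for_all_residues!(push_summary);
    out
}
```

Passing the counts as literals lets a callback use them in const-generic positions, which an associated constant on a generic parameter would not allow on stable Rust.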
Benchmarked with Criterion.rs on an Intel® Core™ i7-13620H (Raptor Lake, 4.90 GHz turbo, AVX2), Linux, opt-level=3, lto=true, codegen-units=1.
Single-point query — time to call Residue::rotamers(phi, psi) and consume the full iterator:
| Residue | `N_CHI` | `N_ROTAMERS` | Time | Throughput |
|---|---|---|---|---|
| `Val` | 1 | 3 | 31.1 ns | 32.1 MOps/s |
| `Ser` | 1 | 3 | 32.0 ns | 31.2 MOps/s |
| `Pro` | 3 | 2 | 43.7 ns | 22.9 MOps/s |
| `Leu` | 2 | 9 | 143.5 ns | 6.97 MOps/s |
| `Phe` | 2 | 18 | 268.0 ns | 3.73 MOps/s |
| `Met` | 3 | 27 | 358.0 ns | 2.79 MOps/s |
| `Asn` | 2 | 36 | 551.7 ns | 1.81 MOps/s |
| `Glu` | 3 | 54 | 673.4 ns | 1.49 MOps/s |
| `Arg` | 4 | 75 | 1,142.6 ns | 0.88 MOps/s |
| `Lys` | 4 | 73 | 1,158.9 ns | 0.86 MOps/s |
| `Gln` | 3 | 108 | 1,316.7 ns | 0.76 MOps/s |
Query time scales roughly linearly with `N_ROTAMERS`, at ~12–16 ns per rotamer for residues with nine or more rotamers, dominated by `atan2f` calls (one per χ angle per rotamer).
Full grid sweep (all 37×37 = 1,369 cells, sustained throughput):
| Residue | Time | Per-query | Table size |
|---|---|---|---|
| `Val` | 38.2 µs | 27.9 ns | 64 KiB |
| `Gln` | 1,834.8 µs | 1,340 ns | 5,776 KiB |
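For Val, per-query time drops ~10% in sweep mode (31.1 ns → 27.9 ns) due to cache warmth across adjacent cells; for Gln it is essentially unchanged (1,316.7 ns single-point vs. 1,340 ns in the sweep).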
For full data including all 22 residues and methodology, see BENCHMARKS.md.
| Optimization | Savings |
|---|---|
| Precomputed (sin χ, cos χ) in table | Eliminates 8N trig calls per query (e.g. 32 calls → 4 for Arg, N=4) |
| Custom branchless `atan2f` | Eliminates libm overhead; zero branch-prediction penalties |
| Compile-time static tables (`build.rs`) | Zero startup cost; OS can share read-only pages across processes |
| `KEYS` deduplication | Bin indices stored once per residue (in `KEYS`), not per cell; saves ~401 KiB for Arg alone |
| Stack-only `RotamerIter<N, R>` | No allocator, no pointer indirection; `next()` is a single array read + increment |
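The ~401 KiB figure for Arg can be reproduced with simple arithmetic, assuming one `u8` bin index per χ angle per rotamer (consistent with the `[u8; N]` type of `rot.r`) and the 37×37 grid. The function names below are hypothetical, and the crate's actual `KEYS` layout may differ.

```rust
/// Number of (φ, ψ) grid cells: 37 × 37 = 1,369.
pub const GRID_CELLS: usize = 37 * 37;

/// Bytes needed if every cell stored its own `[u8; n_chi]` key per rotamer.
pub const fn keys_per_cell_bytes(n_chi: usize, n_rotamers: usize) -> usize {
    GRID_CELLS * n_rotamers * n_chi
}

/// Bytes needed if the keys are stored once per residue.
pub const fn keys_shared_bytes(n_chi: usize, n_rotamers: usize) -> usize {
    n_rotamers * n_chi
}

/// Saving from deduplication, in KiB.
pub fn savings_kib(n_chi: usize, n_rotamers: usize) -> f64 {
    let saved = keys_per_cell_bytes(n_chi, n_rotamers) - keys_shared_bytes(n_chi, n_rotamers);
    saved as f64 / 1024.0
}
```

For Arg (`N_CHI` = 4, `N_ROTAMERS` = 75) this gives 410,700 bytes stored per cell versus 300 bytes shared, a saving of about 400.8 KiB.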
The library is verified at three levels:
Compile time (build.rs assertions) — compilation aborts if any of the following fail:
- Rotamer count per cell matches the registered `N_ROTAMERS`
- Every rotamer probability ≥ 0
- Probability sum per cell ∈ [0.99, 1.01]
- Per-χ standard deviation > 0 for every rotamer in every cell
- φ = −180° and φ = +180° cells are bitwise identical (periodic boundary)
- ψ = −180° and ψ = +180° cells are bitwise identical
- bin index key sets are identical across all 1,369 cells for each residue
Unit tests (21 tests in src/):
- `atan2f` accuracy: maximum error 3.5×10⁻⁵ rad (±0.002°) over a dense grid
- Circular mean with ±180° wraparound
- `angle_to_grid` at boundaries, midpoints, and out-of-range inputs
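To make the boundary cases concrete, here is a minimal sketch of an angle-to-grid mapping for a 37-point axis spanning −180° to +180° in 10° steps. It is not the crate's `angle_to_grid`: the signature, the return type, and the decision to wrap out-of-range inputs periodically are all assumptions, and it relies on `std` float methods that a `no_std` build would not have.

```rust
/// Grid spacing in degrees, assuming 37 points from -180° to +180° inclusive.
pub const STEP_DEG: f32 = 10.0;
/// Number of grid points per axis.
pub const GRID_POINTS: usize = 37;

/// Maps a finite angle in degrees to (lower grid index, fractional offset).
/// The index is in 0..=35 and the offset in [0, 1], so the pair addresses
/// the cell between grid points `idx` and `idx + 1`.
pub fn angle_to_grid(angle_deg: f32) -> (usize, f32) {
    // Wrap any finite angle into [0, 360]; +180° lands on the same cell as -180°.
    let shifted = (angle_deg + 180.0).rem_euclid(360.0);
    let pos = shifted / STEP_DEG;
    // Clamp so that rounding at the upper edge cannot index past the last cell.
    let idx = (pos.floor() as usize).min(GRID_POINTS - 2);
    (idx, pos - idx as f32)
}
```

Under these assumptions −180° and +180° map to the same cell, −55° falls halfway between grid points 12 and 13, and 300° is treated as −60°.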
Integration tests (140 tests in tests/):
| File | Tests | What is verified |
|---|---|---|
| `accuracy.rs` | 8 | Full 740,629-row CSV round-trip; \|prob_err\| < 1e-5, \|chi_mean_err\| < 0.05° at every grid point |
| `coverage.rs` | 44 | Every (residue, φ, ψ) combination on the 37×37 grid: correct count, Σprob ≈ 1.0, valid ranges; 10,000 random (φ, ψ) fuzz inputs per residue type (220,000 total) |
| `interpolation.rs` | 88 | Determinacy, continuity (Δprob < 0.05 per 0.1° step), circular χ wrap correctness, normalization at 32×32 off-grid angles |
The atan2f error of ±0.002° is 25× below the 0.05° accuracy threshold, meaning the precision ceiling is the source data (CSV values are stored to one decimal place), not the implementation.
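One way to obtain an error of this size from a degree-7 Taylor polynomial is sketched below. It is not the crate's implementation: the two reduction stages shown (an octant fold, then a π/4 shift) are an assumption, chosen because the series then has a truncation error of about 3.5×10⁻⁵ rad at |t| = tan(π/8). "Branchless" here means only that the source contains no `if` or `match`; the compiler may still emit branches. The sketch returns radians and does not handle NaN or infinite inputs.

```rust
use std::f32::consts::{FRAC_PI_2, FRAC_PI_4, PI};

/// tan(π/8) = √2 − 1, the threshold for the second reduction stage.
const TAN_PI_8: f32 = 0.414_213_56;

/// Approximates `atan2(y, x)` in radians for finite inputs.
pub fn atan2_approx(y: f32, x: f32) -> f32 {
    let ax = x.abs();
    let ay = y.abs();
    // Stage 1: fold into the first octant so that 0 <= r <= 1.
    let mx = ax.max(ay).max(f32::MIN_POSITIVE); // avoids 0/0 at the origin
    let r = ax.min(ay) / mx;
    // Stage 2: for r > tan(π/8), use atan(r) = π/4 + atan((r - 1) / (r + 1)),
    // which keeps |t| <= tan(π/8). `shift` is 1.0 or 0.0.
    let shift = (r > TAN_PI_8) as u8 as f32;
    let t = (r - shift) / (1.0 + shift * r);
    // Degree-7 Taylor polynomial of atan in Horner form: t - t³/3 + t⁵/5 - t⁷/7.
    let t2 = t * t;
    let p = t * (1.0 + t2 * (-1.0 / 3.0 + t2 * (1.0 / 5.0 - t2 / 7.0)));
    let mut a = shift * FRAC_PI_4 + p;
    // Undo the octant fold with arithmetic selects.
    let swapped = (ay > ax) as u8 as f32;
    a += swapped * (FRAC_PI_2 - 2.0 * a); // a -> π/2 - a when |y| > |x|
    let negative_x = (x < 0.0) as u8 as f32;
    a += negative_x * (PI - 2.0 * a); // a -> π - a when x < 0
    a.copysign(y)
}

/// Largest absolute deviation from std's `atan2` over `n` directions
/// spread evenly around the unit circle.
pub fn max_error_rad(n: u32) -> f32 {
    (0..n)
        .map(|i| {
            let theta = -PI + 2.0 * PI * (i as f32 + 0.5) / n as f32;
            let (y, x) = theta.sin_cos();
            (atan2_approx(y, x) - y.atan2(x)).abs()
        })
        .fold(0.0, f32::max)
}
```

Calling `max_error_rad` with a large sample count is a quick way to compare an approximation like this against the 0.05° (about 8.7×10⁻⁴ rad) accuracy threshold.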
Run the full suite:

    cargo test

Licensed under MIT; see LICENSE.