trntensor: tensor contractions for AWS Trainium via NKI.
Einstein summation with contraction planning, plus CP and Tucker decompositions. It expresses scientific tensor workloads directly as contractions instead of decomposing them into GEMM calls by hand. Part of the trnsci scientific computing suite (github.com/trnsci).
trntensor follows the trnsci 5-phase roadmap. Active work is tracked in phase-labeled GitHub issues:
- Phase 1 — correctness (active): matmul + batched-matmul NKI kernels in place; awaiting hardware validation + additional `@pytest.mark.neuron` coverage.
- Phase 2 — precision: precision-aware contraction path selection (depends on trnblas#22 double-double GEMM).
- Phase 3 — perf: opt_einsum-style path planner, plan cache reuse.
- Phase 4 — multi-chip: sharded tensor contractions.
- Phase 5 — generation: trn2 fused multi-contraction paths.
Suite-wide tracker: trnsci/trnsci#1.
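To make the motivation for the Phase 3 path planner concrete, here is a minimal sketch of how pairwise contraction order changes the cost of a three-operand expression such as `"ij,jk,k->i"`. It is plain Python and does not use trntensor; `matmul_flops` is a hypothetical helper introduced only for this illustration, and the FLOP model (one multiply plus one add per term) is an assumption.

```python
def matmul_flops(m, k, n):
    """FLOPs for an (m x k) @ (k x n) product, counting one multiply and one add."""
    return 2 * m * k * n


n = 256  # A and B are n x n, v is a length-n vector (illustrative sizes)

# Order 1: contract A with B first, then apply the result to v.
cost_matrix_first = matmul_flops(n, n, n) + matmul_flops(n, n, 1)

# Order 2: contract B with v first, then apply A to the resulting vector.
cost_vector_first = matmul_flops(n, n, 1) + matmul_flops(n, n, 1)

print("(A @ B) @ v:", cost_matrix_first)
print("A @ (B @ v):", cost_vector_first)
print("ratio:", cost_matrix_first / cost_vector_first)
```

Both orders produce the same result, but the second never forms the intermediate n x n matrix, which is the kind of saving a path planner searches for.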
```bash
pip install trntensor
```

```python
import torch
import trntensor

# Illustrative inputs; the shapes are arbitrary placeholders
A, B = torch.randn(64, 32), torch.randn(32, 16)
B_i, B_j = torch.randn(8, 40), torch.randn(8, 40)
C_occ, eri = torch.randn(10, 4), torch.randn(10, 10, 40)
tensor = torch.randn(20, 20, 20)

# Einsum: drop-in for torch.einsum with contraction planning
C = trntensor.einsum("ij,jk->ik", A, B)            # matmul
T = trntensor.einsum("ap,bp->ab", B_i, B_j)        # DF-MP2 contraction
X = trntensor.einsum("mi,mnP->inP", C_occ, eri)    # AO→MO transform

# Contraction planning
plan = trntensor.plan_contraction("ij,jk->ik", A, B)
flops = trntensor.estimate_flops("ij,jk->ik", A, B)

# CP decomposition (tensor hypercontraction)
factors, weights = trntensor.cp_decompose(tensor, rank=10)
reconstructed = trntensor.cp_reconstruct(factors, weights)

# Tucker decomposition (HOSVD)
core, factors = trntensor.tucker_decompose(tensor, ranks=(5, 5, 5))
```

| Category | Operation | Description |
|---|---|---|
| Contraction | `einsum` | General tensor contraction |
| Contraction | `multi_einsum` | Multiple contractions (fusion-ready) |
| Planning | `plan_contraction` | Analyze and select strategy |
| Planning | `estimate_flops` | FLOPs for a contraction |
| Decomposition | `cp_decompose` | CP/PARAFAC via ALS |
| Decomposition | `tucker_decompose` | Tucker via HOSVD |
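As a rough illustration of what a FLOP estimate for a single contraction can look like, the sketch below counts one multiply and one add for every point in the full index space of an einsum string. `naive_einsum_flops` is a hypothetical helper written for this README; it is not the trntensor implementation, and `estimate_flops` may use a different cost model.

```python
from math import prod


def naive_einsum_flops(subscripts, *shapes):
    """Hypothetical estimate: 2 * product of the extents of all distinct indices."""
    inputs, _, _output = subscripts.partition("->")
    sizes = {}
    for labels, shape in zip(inputs.split(","), shapes):
        if len(labels) != len(shape):
            raise ValueError(f"subscript {labels!r} does not match shape {shape}")
        for label, extent in zip(labels, shape):
            # Every occurrence of an index must agree on its extent.
            if sizes.setdefault(label, extent) != extent:
                raise ValueError(f"inconsistent extent for index {label!r}")
    return 2 * prod(sizes.values())


# Matmul "ij,jk->ik" with A of shape (64, 32) and B of shape (32, 16)
print(naive_einsum_flops("ij,jk->ik", (64, 32), (32, 16)))

# AO→MO-style transform "mi,mnP->inP" with illustrative shapes
print(naive_einsum_flops("mi,mnP->inP", (10, 4), (10, 10, 50)))
```

This model fits a single pairwise contraction evaluated as one loop nest. For expressions with three or more operands it is only an upper bound, because a planner can split the work into cheaper pairwise steps.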
- Einsum with matmul/bmm/torch dispatch
- Contraction planner
- CP decomposition (ALS)
- Tucker decomposition (HOSVD)
- DF-MP2 einsum example
- NKI fused contraction kernels
- Multi-contraction fusion
- Optimal contraction ordering (like opt_einsum)
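For readers unfamiliar with the Tucker (HOSVD) entry in the list above, here is a minimal sketch of truncated HOSVD in plain PyTorch. It is an illustration under stated assumptions, not the trntensor code: `unfold`, `mode_product`, `hosvd`, and `reconstruct` are hypothetical names, and `tucker_decompose` may differ in conventions and return layout.

```python
import torch


def unfold(tensor, mode):
    """Mode-n unfolding: move `mode` to the front and flatten the other axes."""
    return torch.movedim(tensor, mode, 0).reshape(tensor.shape[mode], -1)


def mode_product(tensor, matrix, mode):
    """Contract axis `mode` of `tensor` with the columns of `matrix`."""
    out = torch.tensordot(matrix, tensor, dims=([1], [mode]))
    return torch.movedim(out, 0, mode)  # put the new axis back in place


def hosvd(tensor, ranks):
    """Truncated HOSVD: leading left singular vectors of each unfolding."""
    factors = []
    for mode, rank in enumerate(ranks):
        U, _, _ = torch.linalg.svd(unfold(tensor, mode), full_matrices=False)
        factors.append(U[:, :rank])
    core = tensor
    for mode, U in enumerate(factors):
        core = mode_product(core, U.T, mode)  # project onto each factor basis
    return core, factors


def reconstruct(core, factors):
    """Expand the core back to the original space, one mode at a time."""
    out = core
    for mode, U in enumerate(factors):
        out = mode_product(out, U, mode)
    return out


# Build a tensor with exact multilinear rank (2, 3, 2), then recover it.
torch.manual_seed(0)
G = torch.randn(2, 3, 2, dtype=torch.float64)
mats = [torch.randn(n, r, dtype=torch.float64) for n, r in [(6, 2), (7, 3), (5, 2)]]
X = reconstruct(G, mats)

core, factors = hosvd(X, ranks=(2, 3, 2))
print(core.shape, torch.allclose(reconstruct(core, factors), X))
```

When the requested ranks are below the true multilinear rank, the reconstruction becomes an approximation rather than an exact recovery.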
| Project | What |
|---|---|
| trnfft | FFT + complex ops |
| trnblas | BLAS operations |
| trnsolver | Linear solvers |
| trnrand | Random number generation |
| trnsparse | Sparse operations |
Apache 2.0 — Copyright 2026 Scott Friedman