GPU-friendly truncation implementations #349

lkdvos · 2026-01-08T14:41:19Z

This is an attempt to replace the scalar-indexing-oriented approach with more global, array-level operations.
It is definitely still WIP, and on CPU there are various optimizations that can be applied if needed.
I do wonder about the performance a bit: for a large number of sectors, I would actually expect this to simply be faster.

Some possible optimizations:

- for UniqueFusion, finding the nth value is simply `partialsortperm(values, n; by, rev)`, avoiding the need to allocate the full permutation vector
- for CPU, `cumsum` + `findlast` can be replaced by a loop to avoid some intermediate allocations
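
As a rough illustration of the two optimizations listed above, here is a minimal sketch using only Base Julia. The data, the tolerance, and the helper names `ndiscard_cumsum` and `ndiscard_loop` are hypothetical and do not come from the PR. The truncation criterion (discard the smallest values while their accumulated squared weight stays within `tol^2`) is an assumption about what the `cumsum` + `findlast` combination computes.

```julia
# Hypothetical values collected over all sectors (not from the PR).
vals = [0.9, 0.1, 0.5, 0.7, 0.3]

# (1) Index of the n-th largest value by magnitude via a partial sort,
#     instead of computing the full `sortperm`.
n = 3
idx = partialsortperm(vals, n; by = abs, rev = true)
cutoff = abs(vals[idx])

# A global, broadcast-based mask (no scalar indexing into the data).
keep_mask = abs.(vals) .>= cutoff

# (2) Count how many of the smallest values can be discarded within `tol`.
# `sorted_vals` is assumed to be in decreasing order of magnitude.
function ndiscard_cumsum(sorted_vals, tol)
    # Accumulated squared weight, starting from the smallest value.
    tail = cumsum(abs2.(reverse(sorted_vals)))
    k = findlast(<=(tol^2), tail)
    return k === nothing ? 0 : k
end

# Same count with an explicit loop, avoiding the intermediate arrays.
function ndiscard_loop(sorted_vals, tol)
    acc = zero(real(eltype(sorted_vals)))
    k = 0
    for i in reverse(eachindex(sorted_vals))
        acc += abs2(sorted_vals[i])
        acc > tol^2 && break
        k += 1
    end
    return k
end

sorted_vals = sort(vals; by = abs, rev = true)
```

The array-level version is the kind of formulation that maps onto GPU arrays, while the loop version is a CPU-only specialization. The two can differ in the last bits of the accumulated sum, so values that sit exactly at the tolerance may be counted differently.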

codecov · 2026-01-08T16:34:15Z

Codecov Report

❌ Patch coverage is 91.56627% with 7 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/tensors/sectorvector.jl | 50.00% | 4 Missing ⚠️ |
| src/factorizations/adjoint.jl | 75.00% | 3 Missing ⚠️ |

| Files with missing lines | Coverage Δ | |
|---|---|---|
| ext/TensorKitCUDAExt/TensorKitCUDAExt.jl | `100.00% <ø> (ø)` | |
| ext/TensorKitCUDAExt/truncation.jl | `100.00% <100.00%> (ø)` | |
| src/factorizations/diagonal.jl | `72.72% <100.00%> (+6.06%)` | ⬆️ |
| src/factorizations/truncation.jl | `94.83% <100.00%> (+6.90%)` | ⬆️ |
| src/factorizations/adjoint.jl | `73.07% <75.00%> (-1.35%)` | ⬇️ |
| src/tensors/sectorvector.jl | `40.57% <50.00%> (-7.81%)` | ⬇️ |

... and 1 file with indirect coverage changes

Jutho · 2026-01-12T10:16:29Z

I think that implementation is very clean. Not sure I can explain all the errors. Some seem to originate from the eigenvectors of a DiagonalTensorMap also being diagonal, which is being changed, right? But other errors I cannot directly explain without running the code locally.

lkdvos · 2026-01-12T11:11:19Z

I will have a look later. I already have a local branch that fixes the diagonal implementations, so we'll see what remains after that.
I'll also check whether it actually works on GPU, and get a similar implementation going for the truncerror version.

kshyatt · 2026-01-22T08:17:17Z

I think this one still needs a little work to get it over the finish line, but then we should be able to really try CUDA with the TNR and PEPS stuff

github-actions · 2026-01-22T15:36:22Z

Your PR no longer requires formatting changes. Thank you for your contribution!

lkdvos · 2026-01-22T19:55:41Z

wait please don't I'm working on this 😆

kshyatt · 2026-01-22T19:57:11Z

I'll stop I just wanted to fix formatting and rebase!!!

lkdvos · 2026-01-22T19:58:11Z

Hahah but the rebase screws up the git history, now I gotta see how much git magic I know to fix my local branch

kshyatt · 2026-01-22T19:59:38Z

Oh no I'm sorry!

kshyatt force-pushed the ld-truncation branch from a9bb7f6 to 228fdcf on January 8, 2026 19:51

lkdvos force-pushed the ld-truncation branch from 228fdcf to dd38bfb on January 10, 2026 12:22

kshyatt force-pushed the ld-truncation branch 2 times, most recently from 499360d to 77f0ffa, on January 20, 2026 10:27

kshyatt force-pushed the ld-truncation branch from 5e5d87b to f8d4dd7 on January 22, 2026 08:17

lkdvos mentioned this pull request Jan 22, 2026

convert(TensorMap, t) retains storagetype #357

Merged

lkdvos force-pushed the ld-truncation branch from 9e16472 to 1b3ae78 on January 22, 2026 15:15

kshyatt force-pushed the ld-truncation branch from 1b3ae78 to 1190fcd on January 22, 2026 19:32

lkdvos and others added 13 commits January 22, 2026 17:03

try to make truncation GPU-friendly

8df520e

Temporarily fix StridedViews version

1481184

Revert "Temporarily fix StridedViews version"

1e46b0d

This reverts commit 77f0ffa.

Small update for diagonal pullbacks

8ef0425

Fix last error

8423ce8

Reenable truncated CUDA tests

848f0cc

make truncation run on GPU

7ae9b05

bypass scalar indexing by specializing

b1fe3bd

convenience overloads

94ecfca

gpu-friendly copies

7395b8a

retain storagetype in extended_S

9af19b7

avoid GPU issues with truncated adjoint tensormaps

180afe6

various utility improvements

efbe088

lkdvos added 4 commits January 22, 2026 17:03

complete rewrite of implementation

eafd7a8

GPU doesn't like trues

f4892cf

remove CUDA specializations and temporarily add missing MatrixAlgebraKit thingies

3f273a1

better dimension testing

6842a70

lkdvos force-pushed the ld-truncation branch from c6a82dd to 6842a70 on January 22, 2026 22:03

lkdvos commented Jan 22, 2026

ext/TensorKitCUDAExt/truncation.jl (resolved)

lkdvos linked an issue Jan 22, 2026 that may be closed by this pull request

GPU + truncation support collation issue #346

Open

1 task

lkdvos added 4 commits January 22, 2026 17:07

fix unbound type parameter

5bb2a23

add missing import

ddd0ed6

be careful about double method definitions

f3b45ef

disable diagonal test

f26cffe

lkdvos marked this pull request as ready for review January 23, 2026 12:58

lkdvos commented Jan 23, 2026

src/factorizations/diagonal.jl (resolved)

Jutho reviewed Jan 23, 2026

src/factorizations/diagonal.jl (outdated, resolved)

lkdvos added 4 commits January 23, 2026 13:47

bump MatrixAlgebraKit dependency

6ff9ac8

Revert "disable diagonal test"

6666459

This reverts commit f26cffe.

remove unnecessary specializations

ebbdb84

specialize CPU implementations

2d7338a

4 participants
