GPU-friendly truncation implementations #349
I think that implementation is very clean. Not sure I can explain all the errors. Some seem to originate from the eigenvectors of a DiagonalTensorMap also being diagonal, which is being changed, right? But other errors I cannot directly explain without running the code locally. |
|
I will have a look later, I have a local branch to fix the diagonal implementations already, we'll see what remains after. |
499360d to 77f0ffa
|
I think this one still needs a little work to get it over the finish line, but then we should be able to really try CUDA with the TNR and PEPS stuff |
5e5d87b to f8d4dd7
|
Your PR no longer requires formatting changes. Thank you for your contribution! |
1b3ae78 to 1190fcd
|
wait please don't I'm working on this 😆 |
|
I'll stop I just wanted to fix formatting and rebase!!! |
|
Hahah but the rebase screws up the git history, now I gotta see how much git magic I know to fix my local branch |
|
Oh no I'm sorry! |
This reverts commit 77f0ffa.
This is an attempt to get rid of the scalar-indexing oriented approach, and instead do more global operations.
Still WIP, and on CPU there are various optimizations that can be applied if needed.
I do wonder about the performance a bit, as I would actually expect that for a large number of sectors this might just be faster.
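To make the contrast concrete, here is a rough sketch of the two styles on a plain Julia vector. This is not the code in this PR: `vals`, `atol`, and the threshold rule are hypothetical stand-ins chosen only to show element-by-element access next to a single broadcast over the whole array.

```julia
# Hypothetical stand-in data: a plain vector, not a TensorKit object.
vals = [0.9, 0.5, 0.1, 0.05, 0.01]
atol = 0.08  # assumed truncation threshold, for illustration only

# Scalar-indexing style: read and write one element at a time.
keep_loop = falses(length(vals))
for i in eachindex(vals)
    keep_loop[i] = vals[i] > atol
end

# Global style: one broadcast over the whole array, no per-element access.
keep_mask = vals .> atol
```

Both produce the same mask. The broadcast form is the kind of operation that maps onto a single kernel on a GPU array, whereas the loop reads elements one by one.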
Some possible optimizations:
- For `UniqueFusion`, finding the `n`th value is simply `partialsortperm(values, n; by, rev)`, avoiding the need to allocate the full permutation vector
- `cumsum` + `findlast` can be replaced by a loop to avoid some intermediate allocations
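
A minimal sketch of both ideas on plain Julia vectors, assuming nothing about the actual implementation in this PR. The data, the `by = abs` ordering, the `budget` condition, and the helper `last_within` are all hypothetical and only serve to show the pattern.

```julia
# Hypothetical stand-in data; plain vectors, not TensorKit objects.
vals = [0.3, 0.9, 0.1, 0.7, 0.5]
n = 2

# Full sort: allocates the whole permutation just to read one entry.
idx_full = sortperm(vals; by = abs, rev = true)[n]

# Partial sort: returns only the index of the n-th element in sorted order.
idx_partial = partialsortperm(vals, n; by = abs, rev = true)

# Hypothetical weights and budget (assumed condition, for illustration only).
weights = [4, 3, 2, 1]
budget = 8

# Allocating version: builds the full cumulative-sum vector first.
last_alloc = findlast(<=(budget), cumsum(weights))

# Loop version: same result without the intermediate vector.
function last_within(weights, budget)
    total = zero(eltype(weights))
    idx = nothing
    for i in eachindex(weights)
        total += weights[i]
        if total <= budget
            idx = i  # remember the last position still within budget
        end
    end
    return idx
end

last_loop = last_within(weights, budget)
```

The loop deliberately does not exit early so that it matches `findlast` for arbitrary inputs. If the weights are known to be non-negative, breaking at the first overshoot would give the same answer with less work.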