[NDTensors] Avoid threadid in block sparse multithreading code #1650

Merged: mtfishman merged 9 commits into main from NDTensors_avoid_threadid on Apr 21, 2025

mtfishman (Member) commented Apr 20, 2025

Follow-up to #1648.

To-do:

  • Check the performance.
  • More efficient filtering of blocks that aren't being contracted when building the block contraction list.
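
For readers unfamiliar with the issue in the title: indexing per-thread buffers with `Threads.threadid()` is fragile because a Julia task can migrate between threads while it runs, so two tasks may end up touching the same slot. A common alternative is to split the work into chunks and give each spawned task its own local accumulator. The following is only a minimal sketch of that general pattern, not the code in this PR; `chunked_sum` is a hypothetical helper, not an NDTensors function.

```julia
using Base.Threads: @spawn, nthreads

# Hypothetical helper (not part of NDTensors): sum f(x) over xs with one task
# per chunk. Each task owns its accumulator, so nothing is indexed by threadid().
function chunked_sum(f, xs::AbstractVector; ntasks::Int=nthreads())
    isempty(xs) && return 0.0
    chunksize = cld(length(xs), ntasks)
    chunks = Iterators.partition(xs, chunksize)
    tasks = map(chunks) do chunk
        # The result lives inside the task and is retrieved with fetch below.
        @spawn sum(f, chunk; init=0.0)
    end
    return sum(fetch, tasks; init=0.0)
end

# Usage: sum of squares of 1..100.
total = chunked_sum(abs2, collect(1:100))
println(total)
```

The pattern this replaces would look like `buffers[Threads.threadid()] += f(x)` inside a `Threads.@threads` loop, which relies on a task staying on one thread for its whole lifetime.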

mtfishman (Member, Author) commented Apr 20, 2025

This should be good to go once tests pass. There is a small (roughly 10%) slowdown from this PR whose source I haven't been able to track down, but I think that is acceptable.

For posterity, running https://github.com/ITensor/ITensorMPS.jl/blob/v0.3.17/examples/dmrg/2d_hubbard_conserve_momentum.jl before this PR gives:

julia> energy, H, psi = main(; Nx=8, Ny=4, U=4.0, t=1.0, nsweeps=5, maxdim=3000, threaded_blocksparse=true);
Threads.nthreads() = 4
ITensors.using_threaded_blocksparse() = true
nnz(H[end ÷ 2]) = 83
nnzblocks(H[end ÷ 2]) = 80
4×8 Matrix{String}:
 ""  ""  ""  ""  ""  ""  ""  ""
 ""  ""  ""  ""  ""  ""  ""  ""
 ""  ""  ""  ""  ""  ""  ""  ""
 ""  ""  ""  ""  ""  ""  ""  ""
inner(psi0', H, psi0) = 25.6931471817727
  0.004488 seconds (23.80 k allocations: 7.051 MiB)
After sweep 1 energy=-8.237374080471689  maxlinkdim=100 maxerr=1.12E-05 time=1.088
After sweep 2 energy=-19.898330380819594  maxlinkdim=200 maxerr=8.70E-04 time=11.542
After sweep 3 energy=-25.874269858341727  maxlinkdim=400 maxerr=2.29E-04 time=25.872
After sweep 4 energy=-26.53067160826987  maxlinkdim=800 maxerr=1.44E-04 time=21.346
After sweep 5 energy=-26.66437390383375  maxlinkdim=2000 maxerr=2.39E-05 time=42.468
102.321462 seconds (869.09 M allocations: 274.549 GiB, 22.87% gc time, 0.15% compilation time)
(Nx, Ny) = (8, 4)
(t, U) = (1.0, 4.0)
flux(psi) = QN(("Ky",0,4),("Nf",32,-1),("Sz",0))
maxlinkdim(psi) = 2000
energy = -26.66437390383375

while this PR gives:

julia> energy, H, psi = main(; Nx=8, Ny=4, U=4.0, t=1.0, nsweeps=5, maxdim=3000, threaded_blocksparse=true);
Threads.nthreads() = 4
ITensors.using_threaded_blocksparse() = true
nnz(H[end ÷ 2]) = 83
nnzblocks(H[end ÷ 2]) = 80
4×8 Matrix{String}:
 ""  ""  ""  ""  ""  ""  ""  ""
 ""  ""  ""  ""  ""  ""  ""  ""
 ""  ""  ""  ""  ""  ""  ""  ""
 ""  ""  ""  ""  ""  ""  ""  ""
inner(psi0', H, psi0) = 25.6931471817727
  0.004372 seconds (24.32 k allocations: 7.572 MiB)
After sweep 1 energy=-8.237374080540086  maxlinkdim=100 maxerr=1.12E-05 time=1.162
After sweep 2 energy=-19.89833038088842  maxlinkdim=200 maxerr=8.70E-04 time=12.989
After sweep 3 energy=-25.874269858341243  maxlinkdim=400 maxerr=2.29E-04 time=29.443
After sweep 4 energy=-26.530671608297425  maxlinkdim=800 maxerr=1.44E-04 time=22.788
After sweep 5 energy=-26.664373903839163  maxlinkdim=2000 maxerr=2.39E-05 time=44.923
111.313039 seconds (936.05 M allocations: 295.924 GiB, 22.80% gc time, 0.14% compilation time)
(Nx, Ny) = (8, 4)
(t, U) = (1.0, 4.0)
flux(psi) = QN(("Ky",0,4),("Nf",32,-1),("Sz",0))
maxlinkdim(psi) = 2000
energy = -26.664373903839163
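
As a quick check of the size of the slowdown, the totals from the two `@time` lines above can be compared directly. This is just arithmetic on the numbers already printed in the logs; the variable names and the `rel_increase` helper are made up for illustration.

```julia
# Totals copied from the two runs above (before and after this PR).
before = (time_s=102.321462, allocs_M=869.09, mem_GiB=274.549)
after = (time_s=111.313039, allocs_M=936.05, mem_GiB=295.924)

# Relative increase in percent, going from a to b.
rel_increase(a, b) = 100 * (b - a) / a

for k in keys(before)
    println(k, ": +", round(rel_increase(before[k], after[k]); digits=1), "%")
end
```

This gives roughly a 9% increase in wall time and a bit under 8% more allocations and allocated memory, while the reported gc fraction is essentially unchanged (22.87% vs 22.80%).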

EDIT: Note that the discrepancy appears only when block sparse multithreading is enabled, and in particular when the logic for determining the block contraction list (as opposed to performing the actual contractions) takes a significant share of the runtime, which is the case for hybrid real-space and momentum-space 2D DMRG calculations. I don't see any discrepancy for systems like the 1D Heisenberg model when conserving Sz.
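
To make "determining the block contraction list" concrete, here is a toy sketch for the matrix case. It is not how NDTensors implements it. It assumes each block-sparse matrix is described only by the `(row, column)` labels of its stored blocks, and `contraction_plan` is a hypothetical name. The plan lists which pairs of blocks multiply and which output block they accumulate into; no numerical work happens at this stage.

```julia
# Hypothetical sketch: blocks are labelled by (row, column) block coordinates.
# A block (i, k) of A contracts with a block (k, j) of B into block (i, j) of C.
function contraction_plan(blocksA, blocksB)
    # Index B's blocks by row label so each block of A only visits its partners
    # instead of scanning all of B.
    by_row = Dict{Int,Vector{Tuple{Int,Int}}}()
    for b in blocksB
        push!(get!(by_row, b[1], Tuple{Int,Int}[]), b)
    end
    plan = Tuple{Tuple{Int,Int},Tuple{Int,Int},Tuple{Int,Int}}[]
    for a in blocksA
        for b in get(by_row, a[2], Tuple{Int,Int}[])
            push!(plan, (a, b, (a[1], b[2])))
        end
    end
    return plan
end

# Usage: (1, 3) in A and (4, 4) in B have no partner and are filtered out.
blocksA = [(1, 1), (2, 2), (1, 3)]
blocksB = [(1, 2), (2, 1), (4, 4)]
plan = contraction_plan(blocksA, blocksB)
println(plan)
```

When there are many small blocks, as in the momentum-conserving 2D case, a planning step like this can become a noticeable part of the total time relative to the block multiplications themselves.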

codecov bot commented Apr 20, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 80.91%. Comparing base (3583101) to head (751aa95).
Report is 1 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #1650   +/-   ##
=======================================
  Coverage   80.91%   80.91%           
=======================================
  Files          59       59           
  Lines        4626     4626           
=======================================
  Hits         3743     3743           
  Misses        883      883           

@mtfishman mtfishman merged commit 96c2b51 into main Apr 21, 2025
15 checks passed
@mtfishman mtfishman deleted the NDTensors_avoid_threadid branch April 21, 2025 01:37