WIP Parallelize gdist calculation #69

Open
wants to merge 1 commit into
base: trunk

maedoc
Member

@maedoc maedoc commented Apr 7, 2021

This enables parallelization of the gdist calculation across CPU cores (on a 6-core CPU, the expected runtime drops from 8 hours to 1h30). It also adds a progress bar and keyboard interrupt support.

@maedoc changed the title from "Parallelize gdist calculation" to "WIP Parallelize gdist calculation" on Apr 7, 2021
@codecov

codecov bot commented Apr 7, 2021

Codecov Report

Merging #69 (235d48f) into trunk (943c0e0) will decrease coverage by 1.57%.
The diff coverage is n/a.


@@            Coverage Diff             @@
##            trunk      #69      +/-   ##
==========================================
- Coverage   75.31%   73.74%   -1.58%     
==========================================
  Files           9        8       -1     
  Lines        1203     1131      -72     
==========================================
- Hits          906      834      -72     
  Misses        297      297              
Flag Coverage Δ
cpp 73.74% <ø> (ø)
python ?

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@ayan-b
Member

ayan-b commented Apr 13, 2021

FWIW, there is a parallelization implementation in #65, although there the heavy lifting is done in the C++ file rather than the pyx one.

@maedoc
Member Author

maedoc commented Apr 13, 2021

Thanks for mentioning that. I still prefer to split the work into big chunks instead of applying OpenMP to the inner loop, since this minimizes overhead and avoids any contention or lack of reentrancy in the gdist algorithm itself. In any case, I ran out of memory and could not compute the whole thing in one go, which is yet another reason for the functions in this PR (they allow computing a block of rows at a time).
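
To illustrate the chunked approach, here is a minimal sketch, not the code in this PR. It assumes the distance matrix can be computed one block of rows at a time, with each block independent of the others. The names `row_blocks`, `toy_distance_rows`, and `compute_in_blocks` are hypothetical, and `toy_distance_rows` uses plain Euclidean distance as a stand-in for the real per-block gdist call.

```python
import math
from concurrent.futures import ThreadPoolExecutor, as_completed


def row_blocks(n_rows, block_size):
    """Yield (start, stop) pairs that cover range(n_rows) in order."""
    for start in range(0, n_rows, block_size):
        yield start, min(start + block_size, n_rows)


def toy_distance_rows(points, start, stop):
    """Hypothetical stand-in for the per-block gdist call.

    Returns rows start..stop-1 of a Euclidean (not geodesic) distance matrix.
    """
    return [[math.dist(points[i], q) for q in points]
            for i in range(start, stop)]


def compute_in_blocks(points, block_size, executor, on_block_done=None):
    """Compute the full matrix, submitting one block of rows per task."""
    n = len(points)
    # Map each submitted task back to the row range it covers.
    futures = {
        executor.submit(toy_distance_rows, points, start, stop): (start, stop)
        for start, stop in row_blocks(n, block_size)
    }
    rows = [None] * n
    done_rows = 0
    try:
        for future in as_completed(futures):
            start, stop = futures[future]
            rows[start:stop] = future.result()
            done_rows += stop - start
            if on_block_done is not None:
                on_block_done(done_rows, n)  # hook for a progress bar
    except KeyboardInterrupt:
        # Drop blocks that have not started yet, then re-raise.
        for future in futures:
            future.cancel()
        raise
    return rows


if __name__ == "__main__":
    points = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0), (0.0, 1.0), (1.0, 0.0)]
    with ThreadPoolExecutor(max_workers=2) as pool:
        dist = compute_in_blocks(points, block_size=2, executor=pool)
    print(dist[0][1])
```

The executor is passed in so the same function works with any `concurrent.futures` executor. The demo uses threads only to stay self-contained; for CPU-bound pure-Python work a `ProcessPoolExecutor` would be the natural choice. The sketch assembles all rows in memory, so to address the out-of-memory case each finished block would instead be written to disk as it completes.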

2 participants