-
Notifications
You must be signed in to change notification settings - Fork 224
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add common alias cores (x) for grdlandmask #2944
Conversation
@@ -49,7 +49,7 @@ def test_grdlandmask_no_outgrid(expected_grid): | |||
""" | |||
Test grdlandmask with no set outgrid. | |||
""" | |||
result = grdlandmask(spacing=1, region=[125, 130, 30, 35]) | |||
result = grdlandmask(spacing=1, region=[125, 130, 30, 35], cores=2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Linux runner for GitHub Actions has 2 CPU cores according to https://docs.github.com/en/actions/using-github-hosted-runners/about-github-hosted-runners/about-github-hosted-runners#supported-runners-and-hardware-resources, but not sure if we should set cores to 2 or 1 for consistent benchmarking.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using two cores is faster and can help check the OpenMP feature in GMT, so I prefer to use two cores if possible.
BTW, the macOS runner for GitHub Actions has 3 cores, so instead of cores=2
, cores=True
will use all available cores.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is only 1 core being used if we don't set cores
, or use cores=None
? I'm reading https://docs.generic-mapping-tools.org/6.4/gmt.html#core-full which says [default is to use all available cores]
, but is that only with cores=True
(i.e. -x
in GMT)?
The benchmark.yml
workflow is running on Linux, but I guess we could set cores=True
since this test would also run on macOS in ci_test.yml
. My question was more around whether multi-threading with multiple-cores (2 or more) would provide a consistent benchmark compared to using only 1 core, but I guess we need to merge this PR to find out!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is only 1 core being used if we don't set
cores
, or usecores=None
? I'm reading https://docs.generic-mapping-tools.org/6.4/gmt.html#core-full which says[default is to use all available cores]
, but is that only withcores=True
(i.e.-x
in GMT)?
Yes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is only 1 core being used if we don't set
cores
, or usecores=None
? I'm reading https://docs.generic-mapping-tools.org/6.4/gmt.html#core-full which says[default is to use all available cores]
, but is that only withcores=True
(i.e.-x
in GMT)?Yes.
Actually it uses all available cores if -x
is not used.
$ gmt grdlandmask -R0/10/0/10 -Gabc.nc -I1/1 -Vi
grdlandmask [INFORMATION]: Enable all available threads (up to xxx)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe set
cores=1
and see what the flame graph diff shows?
The test_grdlandmask_no_outgrid
benchmark speedup went from 3x to 9.6x 🤣 It might be that there is some overhead with using threading (2 cores) vs no threading (1 core). Here's the flame graph diff from https://codspeed.io/GenericMappingTools/pygmt/branches/limit-cores again:
The extra gomp_team_barrier_wait_*
calls are still there, but also some other stuff (in blue).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This reminds me of the warnings message I saw when running the tests in random order (see https://github.com/GenericMappingTools/pygmt/actions/runs/7396374248/job/20121388367?pr=2936):
pygmt/tests/test_session_management.py::test_session_multiprocessing
pygmt/tests/test_session_management.py::test_session_multiprocessing
/home/runner/micromamba/envs/pygmt/lib/python3.12/multiprocessing/popen_fork.py:66: DeprecationWarning: This process (pid=2433) is multi-threaded, use of fork() may lead to deadlocks in the child.
self.pid = os.fork()
There are some related discussions at https://discuss.python.org/t/concerns-regarding-deprecation-of-fork-with-alive-threads/33555 and related PR at python/cpython#100228.
os.fork() in a multi-threaded application is a likely source of deadlocks on many platforms. We should raise a warning when people call os.fork() from a process that we know has other threads running.
Not sure if it's related to the issue here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pygmt/tests/test_session_management.py::test_session_multiprocessing pygmt/tests/test_session_management.py::test_session_multiprocessing /home/runner/micromamba/envs/pygmt/lib/python3.12/multiprocessing/popen_fork.py:66: DeprecationWarning: This process (pid=2433) is multi-threaded, use of fork() may lead to deadlocks in the child. self.pid = os.fork()
That test_session_multiprocessing
function is only running basemap
and savefig
right? It shouldn't have any multi-threading going on, so not sure what's happening.
I tried running some benchmarks using cores from -1 (all cores) to 0 (no-multithreading) to 16 (all cores on my laptop). This was running grdlandmask
with a spacing of 0.05 arc degrees. The results do vary between runs, but in general, more cores means less time:
Code to reproduce:
import time
import pandas as pd
import pygmt
import tqdm
timings = {}
for cores in tqdm.tqdm(range(-1, 16)):
tic = time.perf_counter()
pygmt.grdlandmask(spacing="0.05d", region=[0, 180, 0, 90], cores=cores)
toc = time.perf_counter()
timings[cores] = toc - tic
df = pd.DataFrame.from_dict(data=timings, orient="index", columns=["time"])
# %%
fig = pygmt.Figure()
region = pygmt.info(data=df.reset_index(), per_column=True, spacing=0.05)
fig.plot(
x=df.index,
y=df.time,
region=region,
pen="thick",
frame=[
"WSne+tBenchmark grdlandmask",
"xaf+lCores",
"yaf+lTime (s), lower is better",
],
)
fig.savefig("benchmark_grdlandmask.png")
fig.show()
If I change the spacing from 0.05 arc degrees to 1 arc degree, more cores actually take more time:
For that test_grdlandmask_no_outgrid
unit test which is using spacing=1
(i.e. 1 arc degree), I think we should just use cores=2
, since cores=3
might not be much faster on such a low resolution. As long as we are benchmarking with a fixed number of core counts, it should be ok?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sounds ok
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, back to using cores=2
in 97637d4.
CodSpeed Performance ReportMerging #2944 will improve performances by ×3Comparing Summary
Benchmarks breakdown
|
The benchmark report is quite confusing. It seems that |
Yeah, it seems to confirm the hypothesis in #2942 (comment) though. Does the setting affect |
pygmt/tests/test_grdlandmask.py
Outdated
@@ -49,7 +49,7 @@ def test_grdlandmask_no_outgrid(expected_grid): | |||
""" | |||
Test grdlandmask with no set outgrid. | |||
""" | |||
result = grdlandmask(spacing=1, region=[125, 130, 30, 35]) | |||
result = grdlandmask(spacing=1, region=[125, 130, 30, 35], cores=1, verbose="i") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, the debug info at https://github.com/GenericMappingTools/pygmt/actions/runs/7403072686/job/20142175894?pr=2944#step:8:111 is showing that there are 4 threads on GitHub Actions??
grdlandmask [INFORMATION]: Enable 1 threads of 4 available
grdlandmask [INFORMATION]: Selected ice front line as Antarctica coastline
grdlandmask [INFORMATION]: GSHHG version 2.3.7
grdlandmask [INFORMATION]: Derived from World Vector Shoreline, CIA WDB-II, and Atlas of the Cryosphere
grdlandmask [INFORMATION]: Processed by Paul Wessel and Walter H. F. Smith, 1994-2017
grdlandmask [INFORMATION]: Nodes in the oceans will be set to
0
grdlandmask [INFORMATION]: Nodes on land will be set to
1
grdlandmask [INFORMATION]: Nodes in lakes will be set to
0
grdlandmask [INFORMATION]: Nodes in islands will be set to
1
grdlandmask [INFORMATION]: Nodes in ponds will be set to
0
grdlandmask [INFORMATION]: Apply linear scale to avoid wrap-around: -Jx100id; this temporarily changes the domain
grdlandmask [INFORMATION]: Writing grid to file /tmp/pygmt-0c0skk20.nc
grdlandmask [INFORMATION]: Level 0 set for 33 nodes
grdlandmask [INFORMATION]: Level 1 set for 3 nodes
grdlandmask [INFORMATION]: Enable 1 threads of 4 available
grdlandmask [INFORMATION]: Selected ice front line as Antarctica coastline
grdlandmask [INFORMATION]: GSHHG version 2.3.7
grdlandmask [INFORMATION]: Derived from World Vector Shoreline, CIA WDB-II, and Atlas of the Cryosphere
grdlandmask [INFORMATION]: Processed by Paul Wessel and Walter H. F. Smith, 1994-2017
grdlandmask [INFORMATION]: Nodes in the oceans will be set to
0
grdlandmask [INFORMATION]: Nodes on land will be set to
1
grdlandmask [INFORMATION]: Nodes in lakes will be set to
0
grdlandmask [INFORMATION]: Nodes in islands will be set to
1
grdlandmask [INFORMATION]: Nodes in ponds will be set to
0
grdlandmask [INFORMATION]: Apply linear scale to avoid wrap-around: -Jx100id; this temporarily changes the domain
grdlandmask [INFORMATION]: Writing grid to file /tmp/pygmt-i3n6zkcg.nc
grdlandmask [INFORMATION]: Level 0 set for 33 nodes
grdlandmask [INFORMATION]: Level 1 set for 3 nodes
So the 2 CPUs mentioned at https://docs.github.com/en/actions/using-github-hosted-runners/about-github-hosted-runners/about-github-hosted-runners#supported-runners-and-hardware-resources might have hyperthreading, and we could potentially set cores=4
, though see #2944 (comment) on how more cores might not actually improve runtime.
Description of proposed changes
Add alias cores (-x) for
grdlandmask
, and set limit ofcores=2
fortest_grdlandmask_no_outgrid
to make benchmark execution deviate less.Preview at https://pygmt-dev--2944.org.readthedocs.build/en/2944/api/generated/pygmt.grdlandmask.html
Extends #1423, may address #2942 partially.
Reminders
make format
andmake check
to make sure the code follows the style guide.doc/api/index.rst
.Slash Commands
You can write slash commands (
/command
) in the first line of a comment to performspecific operations. Supported slash commands are:
/format
: automatically format and lint the code/test-gmt-dev
: run full tests on the latest GMT development version