
Remove the Dask Compute #131

Closed · rosepearson opened this issue Dec 19, 2022 · 8 comments · Fixed by #183

Labels: enhancement (New feature or request), performance

Comments

@rosepearson (Owner)

Remove the explicit Dask compute call (a sketch of the lazy-write alternative is below).
[image]

  1. Deal with this mask:
     [image]
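
As a rough illustration of the intent (the variable and file names below are placeholders, not the actual GeoFabrics code): instead of materialising the whole chunked DEM in memory with an explicit .compute() before writing, xarray can write the dask-backed dataset chunk by chunk, or return a delayed write that is triggered later.

import xarray

# hypothetical dask-backed DEM, chunked on open
dem = xarray.open_dataset("raw_dem_8m_input.nc", chunks={"x": 100, "y": 100})

# pattern to remove: forces every chunk into memory at once before writing
# dem = dem.compute()
# dem.to_netcdf("raw_dem_8m.nc")

# lazy alternative: to_netcdf computes and writes the chunks as it goes
dem.to_netcdf("raw_dem_8m.nc")

# or defer the write entirely and trigger it later (e.g. once a distributed client exists)
delayed_write = dem.to_netcdf("raw_dem_8m_deferred.nc", compute=False)
delayed_write.compute()
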
@rosepearson changed the title from "Remote the Dask Compute" to "Remove the Dask Compute" on Dec 19, 2022
@jennan (Collaborator) commented Dec 19, 2022

some lines that could cause issues

@rosepearson added the enhancement (New feature or request) and performance labels on Jan 8, 2023
@rosepearson (Owner, Author) commented May 11, 2023

Not quite the same subject, but when picking this up again we could look at using Dask with pandas to remove the horrendous bottleneck that exists when calculating the 'open waterway elevations'. See https://github.com/rosepearson/GeoFabrics/blob/main/src/geofabrics/processor.py#L2386 (move to issue #130).
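
For illustration only, a minimal sketch of the dask-with-pandas idea, assuming the waterway calculation can be expressed per partition; the file name, column names, and estimate_elevations helper are hypothetical, not the code in processor.py:

import dask.dataframe as dd
import pandas as pd

def estimate_elevations(waterways: pd.DataFrame) -> pd.DataFrame:
    # hypothetical per-waterway calculation standing in for the real logic
    waterways = waterways.copy()
    waterways["elevation"] = waterways["bed_level"] - waterways["depth"]
    return waterways

waterways = pd.read_csv("open_waterways.csv")  # hypothetical input table

# split the pandas frame into partitions and run the calculation in parallel
ddf = dd.from_pandas(waterways, npartitions=20)
result = ddf.map_partitions(estimate_elevations).compute()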

@rosepearson (Owner, Author)

Instructions for setup:

# load the NeSI environment modules and set the Cylc version and project code
module purge
module load NeSI
module load Miniconda3
export CYLC_VERSION=8.0.3
export PROJECT=niwa03440

# activate the shared conda environment (set +u avoids unbound-variable errors from conda's scripts)
set +u
module load Anaconda3
source $(conda info --base)/etc/profile.d/conda.sh
conda activate /nesi/project/niwa03440/conda/envs/htop
set -u

Running the code:

geofabrics_from_file --instruction /nesi/project/niwa03440/geofabrics/instructions/performance/issue131/small.json

@jennan (Collaborator) commented Jun 26, 2023

I have started some changes in the compute_to_disk branch. The part that saves results from the computation works, but the code crashes before the end (likely due to me commenting out saving the extents 😅).

@rosepearson (Owner, Author) commented Jun 26, 2023

Thanks for the catch-up. It's looking promising. Looking forward to hearing how the large example goes.

At this stage:

  1. I am expecting to only consider a more advanced Dask scheduler in a later issue.
  2. I am expecting to address various TODOs in the code, including moving, removing, or updating the 'get extents' code.

See issue #153 for notes around planned changes to how, or if, the extents are calculated.

@rosepearson linked a pull request on Jul 3, 2023 that will close this issue
@jennan (Collaborator) commented Jul 7, 2023

I have run the large example overnight, using a Māui ancil node. It took 6 hours and a maximum of 40 GB of memory.

A few additional notes (a minimal sketch of this worker/output configuration follows these notes):

  • 20 workers and a memory limit of 5 GB each (20 cores and 100 GB requested via Slurm)
  • chunk_size=100:
    • this lowered the memory pressure on workers but created more tasks (than chunk_size=200)
    • with chunk_size=200, workers kept being restarted due to too-high memory usage (and I wanted to avoid allocating too much memory per worker), and restarting a worker implies recomputing data
  • switched to the .zarr output format:
    • with netcdf, workers have to wait for a lock to write to disk
    • with chunk_size=100 + netcdf, the workers wait too long, leading to poor utilisation (low % of CPU usage) per worker
    • none of this happened with zarr

Ideally I should add the timing of converting from zarr to netcdf.
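
For reference, a minimal sketch of the worker/output configuration described in these notes; the cluster parameters mirror the notes, while the file names (and threads_per_worker=1) are assumptions rather than the actual GeoFabrics code:

import xarray
from dask.distributed import Client, LocalCluster

# 20 workers with a 5 GB memory limit each, matching the 20 cores / 100 GB Slurm request
cluster = LocalCluster(n_workers=20, threads_per_worker=1, memory_limit="5GB")
client = Client(cluster)

# open the (hypothetical) input lazily with 100 x 100 pixel chunks
dem = xarray.open_dataset("raw_dem_8m_input.nc", chunks={"x": 100, "y": 100})

# writing to zarr avoids the file lock that serialises netcdf writes across workers
dem.to_zarr("raw_dem_8m.zarr", mode="w")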

Efficiency of the job

 $ nn_seff 30023700
Cluster: maui_ancil
Job ID: 30023700
State: COMPLETED
Cores: 10
Tasks: 1
Nodes: 1
Job Wall-time:   59.2%  05:54:55 of 10:00:00 time limit
CPU Efficiency: 186.7%  4-14:26:59 of 2-11:09:10 core-walltime
Mem Efficiency:  37.0%  36.97 GB of 100.00 GB

And here is the Slurm job I used, for the record:

#!/usr/bin/env bash
#SBATCH --time=00-10:00:00
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=20
#SBATCH --mem=100GB
#SBATCH --output logs/%j-%x.out
#SBATCH --error logs/%j-%x.out

# exit on errors, undefined variables and errors in pipes
set -euo pipefail

# load required environment modules
module purge
module load NeSI
module load Miniconda3/22.11.1-1

# activate conda environment
set +u
source $(conda info --base)/etc/profile.d/conda.sh 
conda deactivate  # workaround for https://github.com/conda/conda/issues/9392
export PYTHONNOUSERSITE=1
conda activate ./venv_new
set -u

# disable spilling on disk and increase threshold to pause/kill dask worker
export DASK_DISTRIBUTED__WORKER__MEMORY__TARGET=False
export DASK_DISTRIBUTED__WORKER__MEMORY__SPILL=False
export DASK_DISTRIBUTED__WORKER__MEMORY__PAUSE=0.90
export DASK_DISTRIBUTED__WORKER__MEMORY__TERMINATE=0.95

# change output folder in the .json file
JSON_FILE="issue131/results/large_${SLURM_JOB_ID}.json"
sed "s/\"subfolder\": \"large\"/\"subfolder\": \"large_${SLURM_JOB_ID}\"/" \
    issue131/large.json > "${JSON_FILE}"

# run the large example
time geofabrics_from_file --instruction "${JSON_FILE}"

@jennan (Collaborator) commented Jul 9, 2023

Kia ora @rosepearson,

I have run more jobs to compare different scenarios (netcdf vs. zarr, smaller or larger chunks, with or without clipping). Here are my (unsorted 😅) notes and my conclusions/recommendations.

job 30020508

  • 20 cores, 100 GB
  • chunk_size=200, number_of_cores=20, memory_limit=5GiB
  • crashed after 20 min, when workers restarted due to a spike in memory usage

job 30021473

  • 20 cores, 100 GB
  • chunk_size=100, number_of_cores=20, memory_limit=5GB
  • tried export DASK_DISTRIBUTED__COMM__RETRY__COUNT=3
  • still saw timeout messages
  • manually stopped as CPU usage was quite low

job 30022496

  • 20 cores, 200 GB
  • chunk_size=400, number_of_cores=20, memory_limit=10GB
  • manually stopped as workers restarted due to a spike in memory usage

job 30022806

  • 20 cores, 100 GB
  • chunk_size=200, number_of_cores=20, memory_limit=5GB
  • save as .zarr
  • good cpu utilisation (> 90%)
  • manually stopped as workers restarted due to a spike in memory usage

Note: store-map tasks are rescheduled when a worker is restarted... but they could probably be cancelled once the data is written to disk.

job 30023700

  • 20 cores, 100 GB
  • chunk_size=100, number_of_cores=20, memory_limit=5GB
  • without rio.clip
  • save as .zarr
  • good cpu utilisation (> 90%)
  • lower worker memory usage (than chunk_size=200), seems < 2.5GiB / worker
 $ nn_seff 30023700
Cluster: maui_ancil
Job ID: 30023700
State: COMPLETED
Cores: 10
Tasks: 1
Nodes: 1
Job Wall-time:   59.2%  05:54:55 of 10:00:00 time limit
CPU Efficiency: 186.7%  4-14:26:59 of 2-11:09:10 core-walltime
Mem Efficiency:  37.0%  36.97 GB of 100.00 GB
  • output size
$ du -sh issue131/results/large_30023700/raw_dem_8m.zarr/
3.2G    issue131/results/large_30023700/raw_dem_8m.zarr/

job 30041464

  • 20 cores, 100 GB
  • chunk_size=100, number_of_cores=20, memory_limit=5GB
  • with rio.clip
  • save as .zarr
  • failed due to a chunking pattern not supported by zarr (see the rechunking sketch after the traceback below)
ValueError: Zarr requires uniform chunk sizes except for final chunk. Variable named 'data_source' has incompatible dask chunks: ((38, 100, 100, ..., 100), (100, 100, ..., 100)). Consider rechunking using `chunk()`.
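
A minimal, self-contained sketch of the fix this error points at (the array shape and names are made up to reproduce the undersized leading chunk): rechunk to uniform sizes before writing, so that only the final chunk in each dimension may be smaller.

import dask.array as da
import xarray

# toy stand-in for the clipped DEM: clipping left a leading chunk of 38 rows
data = da.zeros((1838, 1900), chunks=((38,) + (100,) * 18, (100,) * 19), dtype="float32")
dem = xarray.Dataset({"data_source": (("y", "x"), data)})

# dem.to_zarr("raw_dem_8m.zarr")          # raises the ValueError above
dem = dem.chunk({"y": 100, "x": 100})     # rechunk to uniform chunk sizes
dem.to_zarr("raw_dem_8m.zarr", mode="w")  # now accepted by zarr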

job 30059619

  • 20 cores, 400 GB
  • chunk_size=200, number_of_cores=20, memory_limit=20GB
  • with rio.clip
  • save as .nc
  • bad cpu utilisation (~30%)
  • bad ratio compute vs. transfer (almost 3-to-1)
  • workers seem to use less than 3GB in general (but go to ~5GB sometimes)
  • error messages at the end but it seems that it ran to completion
$ nn_seff 30059619
Cluster: maui_ancil
Job ID: 30059619
State: COMPLETED
Cores: 10
Tasks: 1
Nodes: 1
Job Wall-time:   40.1%  04:00:47 of 10:00:00 time limit
CPU Efficiency: 139.4%  2-07:55:47 of 1-16:07:50 core-walltime
Mem Efficiency:  15.1%  60.39 GB of 400.00 GB
  • output size
$ du -sh issue131/results/large_30059619/raw_dem_8m.zarr 
4.2G    issue131/results/large_30059619/raw_dem_8m.zarr

job 30064157

  • 20 cores, 100 GB
  • chunk_size=200, number_of_cores=20, memory_limit=10GB
  • without rio.clip
  • save as .zarr + Blosc compression at clevel=4 (see the encoding sketch after this job's output size)
  • good cpu utilisation (~95%)
  • workers seem to use less than 4GB in general (but go to ~5GB sometimes)
$ nn_seff 30064157
Cluster: maui_ancil
Job ID: 30064157
State: COMPLETED
Cores: 10
Tasks: 1
Nodes: 1
Job Wall-time:   34.0%  03:24:03 of 10:00:00 time limit
CPU Efficiency: 193.2%  2-17:42:40 of 1-10:00:30 core-walltime
Mem Efficiency:  56.9%  56.89 GB of 100.00 GB
  • lighter output
$ du -sh issue131/results/large_30064157/raw_dem_8m.zarr/
828M    issue131/results/large_30064157/raw_dem_8m.zarr/
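
For reference, a minimal sketch of writing compressed zarr output along these lines, assuming the zarr v2 format with a numcodecs Blosc compressor; only clevel=4 comes from the notes, while the cname, file names, and chunk sizes are assumptions:

import xarray
from numcodecs import Blosc

dem = xarray.open_dataset("raw_dem_8m_input.nc", chunks={"x": 200, "y": 200})  # hypothetical input

# compress every variable's zarr chunks with Blosc at compression level 4
compressor = Blosc(cname="lz4", clevel=4)
encoding = {name: {"compressor": compressor} for name in dem.data_vars}
dem.to_zarr("raw_dem_8m.zarr", mode="w", encoding=encoding)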

job 30069881

  • 20 cores, 100 GB
  • chunk_size=200, number_of_cores=20, memory_limit=10GB
  • without rio.clip
  • save as .nc
  • good cpu utilisation (~95%)
  • workers seem to use less than 4GB in general (but sometimes go a little above)
$ nn_seff 30069881
Cluster: maui_ancil
Job ID: 30069881
State: COMPLETED
Cores: 10
Tasks: 1
Nodes: 1
Job Wall-time:   33.2%  03:19:25 of 10:00:00 time limit
CPU Efficiency: 192.0%  2-15:48:21 of 1-09:14:10 core-walltime
Mem Efficiency:  54.3%  54.30 GB of 100.00 GB
  • output size (wooops, I forgot to change extension to .nc)
$ du -sh issue131/results/large_30069881/raw_dem_8m.zarr 
4.2G    issue131/results/large_30069881/raw_dem_8m.zarr

job 30073973

  • 20 cores, 200 GB
  • chunk_size=200, number_of_cores=20, memory_limit=15GB
  • without rio.clip
  • save as .nc + zlib compression at level=4 for "z" (see the encoding sketch after this job's output size)
  • good cpu utilisation (~95%)
  • workers seem to use less than 7GB in general
$ nn_seff 30073973
Cluster: maui_ancil
Job ID: 30073973
State: COMPLETED
Cores: 10
Tasks: 1
Nodes: 1
Job Wall-time:   27.5%  02:44:48 of 10:00:00 time limit
CPU Efficiency: 190.6%  2-04:21:23 of 1-03:28:00 core-walltime
Mem Efficiency:  40.9%  81.90 GB of 200.00 GB
  • output size, surprisingly larger than before
$ du -sh issue131/results/large_30073973/raw_dem_8m.nc 
9.0G    issue131/results/large_30073973/raw_dem_8m.nc
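
For reference, the netcdf-side equivalent of this compression would look roughly like the following, assuming the elevation variable is named "z" as in the notes and the netCDF4 backend is used; file names and chunk sizes are placeholders:

import xarray

dem = xarray.open_dataset("raw_dem_8m_input.nc", chunks={"x": 200, "y": 200})  # hypothetical input

# deflate the "z" variable with zlib at compression level 4 when writing netcdf
encoding = {"z": {"zlib": True, "complevel": 4}}
dem.to_netcdf("raw_dem_8m.nc", encoding=encoding)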

Conclusions

zarr vs. netcdf

  • zarr managed to produce lighter outputs (thanks to compressing chunks)
  • surprisingly, netcdf didn't compress the outputs well (probably me not using the compression properly)
  • in terms of timing, there is no real difference, so no need to replace netcdf with zarr
  • zarr didn't support clipping, i.e. it doesn't like chunks of different sizes (final chunks excepted, which is OK, but here the first row/column of chunks had different sizes)

clip vs no clip

  • clipping lowers the CPU utilisation of the dask workers (a sketch of the clipping step is below)
  • clipping has a negative impact on time, but not a large one (4h with clipping vs. 3h19m without, see job 30059619 vs. 30069881)
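
For context, the clipping being compared here is the rio.clip step; a minimal, hypothetical sketch of that operation (the file names and the EPSG:2193 CRS are assumptions, not the GeoFabrics code):

import geopandas
import rioxarray  # registers the .rio accessor on xarray objects
import xarray

dem = xarray.open_dataset("raw_dem_8m_input.nc", chunks={"x": 200, "y": 200})  # hypothetical input
dem = dem.rio.write_crs("EPSG:2193")                                           # assumed CRS (NZTM2000)
catchment = geopandas.read_file("catchment_boundary.geojson")                  # hypothetical boundary

# clip the DEM to the catchment polygons; drop=True discards pixels outside them,
# which is also what leaves the first row/column of chunks with a different size
clipped = dem.rio.clip(catchment.geometry, catchment.crs, drop=True)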

chunksize

  • a larger chunk size leads to faster computation
    • 2h44min with a chunk size of 300 (and no clip)
    • 3h19min with a chunk size of 200 (and no clip)
  • but this increases the memory used per worker
    • about 6-7 GB per worker for a chunk size of 300 (and no clip)
    • about 4 GB per worker for a chunk size of 200 (and no clip)

memory settings

  • worker memory can have temporary spikes, but otherwise usage is quite stable
  • workers that are restarted because they use too much memory will trigger re-computation of all the tasks they were hosting (even if the data has already been saved... not great), which means we should avoid worker restarts at all costs
  • to achieve this, set a high enough memory_limit per worker (e.g. I used 15GB with chunk size 300)
  • and also configure the thresholds for pausing/restarting a worker to a higher percentage (see https://distributed.dask.org/en/latest/worker-memory.html); the environment variables below do this, and an equivalent Python configuration sketch follows this list
# disable spilling on disk and increase threshold to pause/kill dask worker
export DASK_DISTRIBUTED__WORKER__MEMORY__TARGET=False
export DASK_DISTRIBUTED__WORKER__MEMORY__SPILL=False
export DASK_DISTRIBUTED__WORKER__MEMORY__PAUSE=0.90
export DASK_DISTRIBUTED__WORKER__MEMORY__TERMINATE=0.95
  • but workers will generally operate well below the limit, so the hosting Slurm job doesn't need to request memory_limit x number of workers and can request much less (for 20 workers and chunk size = 300, we could request 140GB, as workers stay below 7GB most of the time)
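
As an aside, the same thresholds can also be set from Python with dask.config.set before the workers are created; this is a minimal sketch mirroring the environment variables above (the cluster parameters are just the ones from these notes):

import dask
from dask.distributed import Client, LocalCluster

# same settings as the DASK_DISTRIBUTED__WORKER__MEMORY__* variables above
dask.config.set(
    {
        "distributed.worker.memory.target": False,    # no early spill-to-disk target
        "distributed.worker.memory.spill": False,     # never spill to disk
        "distributed.worker.memory.pause": 0.90,      # pause work at 90% of the limit
        "distributed.worker.memory.terminate": 0.95,  # restart the worker at 95%
    }
)

cluster = LocalCluster(n_workers=20, threads_per_worker=1, memory_limit="15GB")
client = Client(cluster)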

I hope you'll find these recommendations useful :).

@rosepearson (Owner, Author)

Great! Note that the above was all without the raw_extents calculations. Move away from doing this: consider either pulling in the geometries specified in the TileIndex files, or working them out before pulling in the coast DEM and buffering. Notes about the raw_extents issues to address as part of this are in #153.
