Bug ? Rasterio + WarpVRT bad performances for GDAL3.0 + Proj6 #10
Comments
@rouault FYI, I'm seeing a huge performance downgrade using latest commit from gdal in My use case as usual is using rasterio to do dynamic tiling (using warped VRT). I'll try to narrow everything down when I'm done fixing this |
Note: |
I can't think of how OSGeo/gdal@776e602 could impact performance. It has been backported to release/2.4 as well. It really just prevents an error from being emitted. |
Thanks @rouault, I'm still investigating. I'll continue digging deeper because I've got numerous projects that depend on this |
Alright, narrowing this down, it seems to be a rasterio/rio-tiler problem. FYI @sgillies, I'm seeing a huge performance decrease using GDAL 3.0 (PROJ 6.2.1) when using rio-tiler over remote files. I'm working on making a reproducible test for now; I'll let you know as soon as I have a better understanding |
Alright, still working on this, but I have a better sense of what is causing the performance decrease. I know PROJ 6 shipped a lot of changes and that Rasterio supports it, but there might be some optimization missing. |
What is expensive with PROJ 6 is creating a PROJ PJ* object (or instantiating an OGRCoordinateTransformation object, which is basically a wrapper over it). So you should try to minimize the number of those instantiations. Once a PJ / OGRCoordinateTransformation is created, the performance of using it with PROJ 6 on many points should be similar to PROJ < 6. Normally, for a warped VRT, GDAL just creates two such objects, so I don't think this would impact performance for a large enough dataset |
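The advice above about minimizing instantiations can be sketched as a small cache keyed on the CRS pair. This is only an illustration: `cached_factory` and `fake_transformer` are hypothetical names, and the stand-in constructor simply counts how often it is called. In a real application the wrapped callable might be something like `pyproj.Transformer.from_crs`, which is an assumption and not something used in this thread.

```python
from functools import lru_cache


def cached_factory(factory, maxsize=32):
    """Return a memoized version of an expensive constructor.

    `factory` is any callable taking hashable arguments (for example
    two CRS strings) and returning a reusable transformation object.
    """
    return lru_cache(maxsize=maxsize)(factory)


# Hypothetical stand-in for an expensive constructor; it only records calls.
calls = []


def fake_transformer(src_crs, dst_crs):
    calls.append((src_crs, dst_crs))
    return (src_crs, dst_crs)


get_transformer = cached_factory(fake_transformer)

# Three requests for the same CRS pair build the object only once.
for _ in range(3):
    get_transformer("EPSG:32614", "EPSG:3857")

print(len(calls))
```

Whether such a cache helps depends on where the objects are created. If they are built inside GDAL's warper rather than in Python code, a Python-level cache like this one would not reach them.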
@vincentsarago are you making a WarpedVRT (or several) for every request? Sounds like that might no longer scale with PROJ 6. |
Indeed I am but I don’t have a choice. |
@sgillies here are some numbers. There is a huge performance downgrade when using GDAL3+PROJ6, and as pointed out by Even, it might be related to The main performance changes are for
and
GDAL 2: Tile Read Time 0.761 seconds
GDAL 3: Tile Read Time 5.569 seconds
Same file, same mercator tile
Script
import pstats
import cProfile
import mercantile
import rasterio
from rio_tiler import utils


def profileit(func):
    """Profiling."""

    def wrapper(*args, **kwargs):
        # Run the wrapped function under cProfile and print the top 20 entries.
        prof = cProfile.Profile()
        retval = prof.runcall(func, *args, **kwargs)
        ps = pstats.Stats(prof)
        ps.strip_dirs().sort_stats('time', 'cumulative').print_stats(20)
        return retval

    return wrapper


@profileit
def _rio_tiler_read(src_path, bounds, tilesize=256):
    with rasterio.Env():
        utils.tile_read(
            src_path,
            bounds,
            tilesize,
            tile_edge_padding=0,
            warp_vrt_option=dict(SOURCE_EXTRA=1),
        )
    return True


src_path = "https://s3.amazonaws.com/opendata.remotepixel.ca/bench_tiler/LC08_L1TP_040013_20191014_20191029_01_T1_B4.tif"
z, x, y = 9, 115, 123
tile = mercantile.Tile(x=x, y=y, z=z)
tile_bounds = mercantile.xy_bounds(tile)
_rio_tiler_read(src_path, tile_bounds)
Dockerfile
ARG GDAL_VERSION
FROM remotepixel/amazonlinux:gdal${GDAL_VERSION}-py3.7
RUN pip3 install rio-tiler mercantile --no-binary rasterio
Makefile
docker build --build-arg GDAL_VERSION=3.0 --tag img:latest .
docker run \
--name bench \
--env GDAL_HTTP_MULTIPLEX=YES \
--env GDAL_HTTP_VERSION=2 \
--env GDAL_HTTP_MULTIRANGE=YES \
--env GDAL_HTTP_MERGE_CONSECUTIVE_RANGES=YES \
--env VSI_CACHE=TRUE \
--env VSI_CACHE_SIZE=1073741824 \
--env CPL_DEBUG=ON \
--rm -it img:latest /bin/bash |
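As a complement to the cProfile output, plain wall-clock timings over several runs make it easier to compare two builds side by side. The following is a minimal sketch using only the standard library; `time_calls` and `summarize` are hypothetical helper names, not part of rio-tiler or rasterio.

```python
import statistics
import time


def time_calls(func, *args, repeat=5, **kwargs):
    """Call func `repeat` times and return each call's wall-clock duration in seconds."""
    durations = []
    for _ in range(repeat):
        start = time.perf_counter()
        func(*args, **kwargs)
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Return (min, median, max) of a list of durations."""
    return min(durations), statistics.median(durations), max(durations)
```

With the script above, something like `summarize(time_calls(utils.tile_read, src_path, tile_bounds, 256))` could be run inside each image. Since `VSI_CACHE=TRUE` is set in the container, the first call may behave differently from later ones, so it is worth looking at the individual durations as well as the summary.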
More on projection: performance also depends on the input projection
EPSG:4326
EPSG:3347
|
Ouch 🤕 I'll try to reproduce soon. |
@sgillies I'm so sorry, this might not be on the rasterio side after all but on the rio-tiler side 🤦♂ I'll ping you as soon as I have a better sense of what's going on! |
@rouault @sgillies Thanks for dropping by! I've published some early numbers at https://github.com/vincentsarago/rio-tiler-bench showing the I don't see any difference between GDAL and Rasterio, so @sgillies I'm sorry that I pinged you ;-) |
@vincentsarago If you can isolate pure GDAL cases where you see dramatic performance differences, like the 0.14 vs 1.85 I can see, that would be worth opening a GDAL ticket to see if there's some low-hanging fruit. Coordinate instantiation is slower in PROJ 6 for sure compared to previous versions, but it is like we need 20 ms whereas we needed 1 ms before, so my expectation was that for most use cases that would be within the noise. |
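A back-of-the-envelope check of the figures in the comment above: taking the rough per-object costs quoted there (about 20 ms with PROJ 6 versus about 1 ms before) and the 0.14 s versus 1.85 s timings, one can ask how many instantiations it would take for that cost alone to account for the gap. `instantiations_needed` is a hypothetical helper, and the inputs are the approximate numbers from the comment, not measurements.

```python
def instantiations_needed(observed_gap_s, cost_new_s=0.020, cost_old_s=0.001):
    """Number of object creations needed for the per-object cost
    difference alone to account for an observed slowdown."""
    return observed_gap_s / (cost_new_s - cost_old_s)


gap = 1.85 - 0.14  # seconds, figures quoted in the comment above
print(round(instantiations_needed(gap)))
```

Under these assumptions the result is on the order of 90 instantiations, far more than the two objects GDAL normally creates per warped VRT, which suggests instantiation cost by itself would not explain a gap of that size.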
I see, thanks @rouault. I'll try to check all the logs and see if I can get a complete understanding. I'll also need to make sure that I'm not compiling GDAL and PROJ with the wrong options: amazonlinux/base/gdal3.0/Dockerfile Lines 74 to 103 in 68ca116
The command I'm using is pretty straightforward
I'll test with gdal provided docker images to see if I can see the same results. |
osgeo/gdal:alpine-ultrasmall-v2.4.1 (PROJ 5.2)
osgeo/gdal:alpine-ultrasmall-3.0.2 (PROJ 6.2.1)
Updated the number with |
remotepixel/amazonlinux:gdal2.4-py3.7
remotepixel/amazonlinux:gdal3.0-py3.7
Alright... it makes sense now 🤦♂, there is something wrong with my docker images. I should have checked that first, I'm sorry @rouault |