Reproject before or after converting to vector #5
Comments
Testing this theory with a 57 GB WorldView 0.5 m raster:
PYTHONPATH=/explore/nobackup/people/jacaraba/development python view/ccdc_cli.py --gee_config /explore/nobackup/projects/ilab/gee/gee_config.json --footprint_file /explore/nobackup/projects/3sl/data/VHR/CAS/M1BS_pansharp/WV02_20211229_M1BS_10300100CB91C300-toa-sharpened.tif --output_path /explore/nobackup/projects/ilab/scratch/vhr-toolkit/ccdc-test
The code would change to:
import rioxarray as rxr
from shapely.geometry import box
from shapely.ops import transform
from pyproj import Transformer, CRS

def _getCoords(file):
    """
    Extract the bounding box of a raster file and convert
    it to EPSG:4326 projection.
    Args:
        file (str): Path to the raster file.
    Returns:
        list: A list of coordinate pairs [longitude, latitude]
        defining the bounding box of the raster.
    Note: This function assumes the input file is a valid
    raster file readable by rioxarray.
    """
    # Open the raster file (lazily; only metadata is needed here)
    raster = rxr.open_rasterio(file)
    # Create a bounding box polygon from the raster's extent
    poly = box(*raster.rio.bounds())
    # Define the source and target CRS
    source_crs = CRS.from_epsg(raster.rio.crs.to_epsg())  # raster's native CRS
    target_crs = CRS.from_epsg(4326)  # WGS 84
    # Create a transformer (always_xy=True keeps x=longitude, y=latitude order)
    transformer = Transformer.from_crs(source_crs, target_crs, always_xy=True)
    # Reproject the polygon (not the raster) to EPSG:4326 (WGS 84)
    poly_reproj = transform(transformer.transform, poly)
    # Extract the coordinates of the bounding box
    # Convert each coordinate pair to [longitude, latitude] format
    coords = [[i[0], i[1]] for i in list(poly_reproj.exterior.coords)]
    return coords
This saves about an hour of compute time: from 1.1 hours down to 10 seconds.
After the comparison, the files are identical. Going ahead and making the change for the release.
Singularity> cmp -s /explore/nobackup/projects/ilab/scratch/vhr-toolkit/ccdc-test/WV02_20211229_M1BS_10300100CB91C300-toa-sharpened_ccdc.tiff /explore/nobackup/projects/ilab/scratch/vhr-toolkit/ccdc-test/WV02_20211229_M1BS_10300100CB91C300-toa-sharpened_ccdc-reproject-raster.tiff && echo "Files are identical" || echo "Files are different"
Files are identical
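For anyone who wants the same byte-for-byte check inside a Python test rather than from the shell, here is a minimal sketch using the standard library's filecmp module. The helper name files_identical and the temporary files are made up for illustration; they are not part of the toolkit.

```python
import filecmp
import os
import tempfile


def files_identical(path_a, path_b):
    """Return True when both files contain exactly the same bytes."""
    # shallow=False forces a content comparison instead of relying
    # only on os.stat() metadata (size, modification time).
    return filecmp.cmp(path_a, path_b, shallow=False)


# Small demonstration with throwaway files (hypothetical contents).
tmp_dir = tempfile.mkdtemp()
path_a = os.path.join(tmp_dir, "a.bin")
path_b = os.path.join(tmp_dir, "b.bin")
path_c = os.path.join(tmp_dir, "c.bin")
for path, payload in ((path_a, b"same"), (path_b, b"same"), (path_c, b"diff")):
    with open(path, "wb") as handle:
        handle.write(payload)

print("Files are identical" if files_identical(path_a, path_b) else "Files are different")
```

Like cmp -s, this compares raw bytes, so two GeoTIFFs with identical pixels but different metadata or compression would still be reported as different.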
Not totally clear on the goal here, so take these with a grain of salt:
1. What coords do we need to get here? The total bounds of a strip will give a different spatial footprint than the footprint of the actual data.
2. Whatever coords you want, do we need to reproject the full-res version of the data to get them? Could we coarsen first, then manipulate?
(Emailed reply to the comment Jordan Alexis Caraballo-Vega posted on Tue, Feb 4, 2025, 3:13 PM.)
Went ahead and did the hotfix. To answer Paul's questions: 1) we need the total bounds of the strip; 2) no need to coarsen for CCDC, we just need the full strip's extent converted from its native EPSG to EPSG:4326. The solution was to generate a polygon from the strip's bounds, then reproject the polygon. The coords are the same and the process takes seconds.
Closing this issue since it is fixed now.
The function to get coords is listed below. If we reproject the raster, that loads the raster into memory, which for large rasters can be big enough to run the node out of memory. However, we do have the option of reprojecting the vector instead: 1) read the raster, 2) get its bounds and generate a vector from them, 3) reproject that vector to EPSG:4326.
If these two reprojections give the same result, which I think they do, I would do it in that order.
What do you all think?
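The proposed order (bounds first, then vector, then reproject) can be tried without a raster on disk. The sketch below assumes shapely and pyproj are installed; the helper name bounds_to_wgs84 and the UTM extent are hypothetical, chosen only to illustrate the idea. Note that only the corner points are transformed, so the edges stay straight lines in EPSG:4326, which is one reason the result could differ slightly from the bounds of a fully reprojected raster.

```python
from pyproj import Transformer
from shapely.geometry import box
from shapely.ops import transform


def bounds_to_wgs84(bounds, source_epsg):
    """Reproject a (minx, miny, maxx, maxy) extent to EPSG:4326 corner coordinates."""
    # Build the bounding-box polygon in the source CRS.
    poly = box(*bounds)
    # always_xy=True keeps the output in (longitude, latitude) order.
    transformer = Transformer.from_crs(
        f"EPSG:{source_epsg}", "EPSG:4326", always_xy=True
    )
    # Only the polygon's vertices are transformed; no pixels are touched.
    poly_wgs84 = transform(transformer.transform, poly)
    return [[x, y] for x, y in poly_wgs84.exterior.coords]


# Hypothetical 10 km x 10 km extent in UTM zone 28N (EPSG:32628).
coords = bounds_to_wgs84((500000.0, 1600000.0, 510000.0, 1610000.0), 32628)
for lon, lat in coords:
    print(f"{lon:.5f}, {lat:.5f}")
```

The returned list is a closed ring (the first point is repeated at the end), matching what poly.exterior.coords yields for a box.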