Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support Geodetic TIFF grids #86

Open
Rennzie opened this issue Nov 28, 2023 · 22 comments
Open

Support Geodetic TIFF grids #86

Rennzie opened this issue Nov 28, 2023 · 22 comments
Labels
enhancement New feature or request

Comments

@Rennzie
Copy link
Contributor

Rennzie commented Nov 28, 2023

As a question. Should we consider supporting TIFF grids alongside Gravsoft and Ntv2?

@busstoptaktik
Copy link
Owner

busstoptaktik commented Nov 29, 2023

That would be awesome, @Rennzie - I just never imagined it realistical.

GTG is an utterly sensible design (although I think an adjustable and smaller block size than 256x256 would be meaningful). And being able to work seamlessly with PROJ's grid distribution would make so much sense.

But I seriously believe it will be a huge undertaking to bring it to life in full. Of course, ignoring the blocking and caching (which essentially builds a tiff-grid-specific virtual memory subsystem) and just read an entire file at a time, will make it much easier, but then the GTG virtues are gone.

I once had an idea of just plopping all grids into one continuous file, with metadata in a sidecar file, then read-only memory mapping the continuous file, and just reading along as if it was a huge array. That way, we make the OS handle the page swapping, which it can probably do better than a user space process. I did some implementation attempts, but other parts of RG were not sufficiently mature that I could imagine the most reasonable implementation, so after some false starts, I ripped it out again.

(I obviously realize that this approach would probably not work in a Wasm setting)

@Rennzie
Copy link
Contributor Author

Rennzie commented Dec 4, 2023

Apologies for the delay, I've been traveling. But It's given me some time to read the full RFC. It does sounds like a big piece of work but would be awesome to attempt.

Given your past attempts do you think it would be feasible to first build a full file reading version and then add the COG support afterwards? I'm imagining a TiffGrid (or GeoTiffGrid) struct that handles header reading and implements the Grid. We could separate header and grid reading in such a way that makes it easier to incrementally add tiles to the grid later down the line. There is a lot of surface area to cover, but if we broke it down bit by bit we could have something workable in the short term that we build on over time? WDYT?

@Rennzie
Copy link
Contributor Author

Rennzie commented Dec 4, 2023

The image-tiff repo seems to be the most active tiff library I can find.

@busstoptaktik
Copy link
Owner

WDYT?

I think the idea is awesome, but the undertaking massive - so perhaps let it live inside Geodesy (perhaps behind a feature flag), while it (and its API) matures, and then extracting it as a separately maintained library, because it's generally useful, down the road? (it would probably fit very well under the Georust umbrella).

Having it start up internally to Geodesy lets us avoid too many needless API considerations, and concentrate on functionality up front, and making (internal) API iterations costless.

@Rennzie
Copy link
Contributor Author

Rennzie commented Dec 7, 2023

Great, that sounds sensible. Shall we keep this issue as a way to discuss how we evolve the API?

In the mean time I'm going to try read in a simple file and grab its grid as a way to explore and learn about the format. Do you have any guidance on crates that could be of use? The one I linked above seems the most promising but you said you'd attempted this before so was curious to know what you'd tried.

@busstoptaktik
Copy link
Owner

Once you have a full grid, I think the GeoRust geotiff will do most of the work. The "download and cache a tile at a time" stuff is probably much harder...

@busstoptaktik
Copy link
Owner

but you said you'd attempted this before so was curious to know what you'd tried

I didn't try anything like this: My thought was that tile-splitting all grids and storing them in one gargantuan read-only-mmap'ed file would give the OS a chance to do what it's best at. The annoying thing about the tile-caching GTG is that essentially it introduces a user-space memory management system.

Fundamentally, the idea is right. But it would require less code if just behaving "as if" everything was in one huge continuous slab of memory, and let the OS handle the swapping disk-memory-level1/2/3-cache. That was why I was thinking of the resource file style jamming-everything-together-and-maintain-an-index-over-file-starts.

@pka
Copy link

pka commented Dec 8, 2023

I don't know how grids are stored in GeoTIFFs, but generally image-tiff does support far more tiff variants than georust/geotiff or other geotiff implementations. So I would suggest to implement geo-specific stuff on top of image-tiff, like I did in https://github.com/pka/georaster

@Rennzie
Copy link
Contributor Author

Rennzie commented Dec 11, 2023

Hit the first blocker. The GTG spec uses the GDAL_Metadata tag which isn't implemented by image-tiff by the looks of things. @pka did you need have any need for that tag in georaster?

@pka
Copy link

pka commented Dec 11, 2023

georaster doens't read this tag yet: https://github.com/pka/georaster/blob/main/src/geotiff.rs#L87-L94

PRs to image-tiff are welcome in my perception, releases are very infrequent, though.

@busstoptaktik
Copy link
Owner

busstoptaktik commented Dec 12, 2023

There are some fairly strict rules for GTG on the PROJ CDN:

  • Compression is DEFLATE
  • Predictor is 3
  • Interleave is BAND (which appears quite counterproductive to cache efficiency)

It does, however, support subdatasets, so your NTv2 infrastructure will be handy, @Rennzie

EDIT: I now see that Georaster supports subdatasets as well. So do we actually need anything but georaster? Perhaps better, @pka, to integrate @Rennzie's NTv2 support into georaster - it fits well under the "Rust library for accessing geospatial raster images" motto

Otherwise, for rudimentary support, I do not think we need to read anything but the Origin and Pixel Size elements

EDIT: And Size, obviously

@Rennzie
Copy link
Contributor Author

Rennzie commented Dec 23, 2023

Apologies for the lengthy silence. I'm travelling and on holiday until mid Jan so will only get a chance to get going again then.

@busstoptaktik busstoptaktik added the enhancement New feature or request label Dec 31, 2023
@busstoptaktik
Copy link
Owner

It seems @weiji14 is making huge progress on this over at cog3pio

@weiji14
Copy link

weiji14 commented Apr 4, 2024

Hi 👋! Yes I'm building another GeoTIFF reader on top of the image-tiff crate, with the array data parsed into ndarray. I'm at a stage where I'd like to upstream the core parts to georust/geotiff (see georust/geotiff#7), maybe combine efforts with @pka's georaster crate.

Otherwise, for rudimentary support, I do not think we need to read anything but the Origin and Pixel Size elements

EDIT: And Size, obviously

I've implemented this recently in weiji14/cog3pio#8 by storing the origin and pixel size in an AffineTransform struct. I know georaster has implemented this via some getter methods at https://github.com/pka/georaster/blob/191ebb6e89912688c7b0ef6264ad197d9dc45e4b/src/geotiff.rs#L140-L152

The GTG spec uses the GDAL_Metadata tag which isn't implemented by image-tiff by the looks of things.

It's relatively straightforward to add a tag to image-tiff, see e.g. image-rs/image-tiff@0a4b510 where the GdalNodata tag was added with a one-line change.

@pka
Copy link

pka commented Apr 4, 2024

👍

maybe combine efforts with @pka's georaster crate.

I'm lurking and open for any form of collaboration...

@busstoptaktik
Copy link
Owner

@weiji14 said:

Yes I'm building another GeoTIFF reader on top of the image-tiff

That sounds awesome! I took a look at image-tiff recently, and it didn't seem to support in any straightforward way, the 2 channel files which provide the bread-and-butter for geodetic datum shifts, so if you are getting that working, image-tiff/cog2pio will really be much more useful for geodesy (and Geodesy).

It would be ideal to be able to use PROJ's infrastructure for CDN delivery of geodetic tiff grids, using whichever combination of georaster and cog2pio you and @pka may be able to conjure up for public consumption :-)

The vista of referring all tiff access from Geodesy to an external dependency, letting Geodesy concentrate on the geodetic intricacies (which are "tuff enuff"), and leave the file format intricacies to a dedicated library seems like the better way, so I'm sure Geodesy will be an eager early adopter once you signal "ready for testing".

@busstoptaktik
Copy link
Owner

@Rennzie as you're the thread starter, I believe you'll get notifications even if not mentioned directly, but a @mention here just to be sure

@weiji14
Copy link

weiji14 commented Apr 5, 2024

took a look at image-tiff recently, and it didn't seem to support in any straightforward way, the 2 channel files which provide the bread-and-butter for geodetic datum shifts, so if you are getting that working, image-tiff/cog2pio will really be much more useful for geodesy (and Geodesy).

Actually, my PR to support reading multi-band GeoTIFFs in image-tiff got merged last month - image-rs/image-tiff#224 (though they haven't made a release yet). I've included this in weiji14/cog3pio#13, though note that cog3pio only works with float32 dtype right now, I still need to get my head around generic types (help is welcome though).

@busstoptaktik
Copy link
Owner

note that cog3pio only works with float32 dtype

That's totally fine: All classical geodetic correction grid formats are f32 anyway (giving corrections in seconds of arc (i.e. 30 m), an f32 resolution of 1e-7 is at the micrometer scale, i.e. much higher resolution than the correction estimation accuracy.

f64 may be necessary if one wants to implement projections directly in the grids: Here the corrections will reach megameter scale, hence only leaving decimeter resolution for the ulp in the f32 case

@Rennzie
Copy link
Contributor Author

Rennzie commented Apr 8, 2024

Thanks for the prompt @busstoptaktik. This all sounds really exciting. I'm unfortunately not able to spend much time on it at the moment (due to huge upheaval at work), but I'm keen to see it get going.

@pka and @weiji14 what are your feelings about combining your efforts into georaster vs georust/geotiff? Or would georaster depend on georust/geotiff in the future?

EDIT: I now see that Georaster supports subdatasets as well. So do we actually need anything but georaster? Perhaps better, @pka, to integrate @Rennzie's NTv2 support into georaster - it fits well under the "Rust library for accessing geospatial raster images" motto

I guess this depends if we want to combine everything into georust/geotiff or not. If we like that option best then perhaps the right thing to do is create a georust/ntv2 crate? Personally I like the idea of georust being the convergence point for our efforts.

Lastly, my long term aim is for geodesy-wasm to support reading GTGs so being able to download them from a URL (via COG [preferable] or in full) is a key implementation point.

@weiji14
Copy link

weiji14 commented Apr 27, 2024

@pka and @weiji14 what are your feelings about combining your efforts into georaster vs georust/geotiff? Or would georaster depend on georust/geotiff in the future?

EDIT: I now see that Georaster supports subdatasets as well. So do we actually need anything but georaster? Perhaps better, @pka, to integrate @Rennzie's NTv2 support into georaster - it fits well under the "Rust library for accessing geospatial raster images" motto

I guess this depends if we want to combine everything into georust/geotiff or not. If we like that option best then perhaps the right thing to do is create a georust/ntv2 crate? Personally I like the idea of georust being the convergence point for our efforts.

I'd prefer putting in effort into georust/geotiff and possibly create georust/ntv2, and then potentially we (or @pka) can decide if georaster will act as the 'high-level' library to wrap different image types, much like how image-rs/image wraps tiff, png and other formats now. Though this does sound like we are re-inventing GDAL somewhat 😆

@pka
Copy link

pka commented Apr 29, 2024

@pka and @weiji14 what are your feelings about combining your efforts into georaster vs georust/geotiff? Or would georaster depend on georust/geotiff in the future?

The reason for creating georaster was reading DTMs from Cloud Optimized GeoTIFFs but also PNGs. For the former, partial read access is essential, for other formats it's not so important, because they are usually served as tiles.
So it's difficult to answer the question. I have no immediate plans with georaster and I'm fine if its functionality goes into a new geotiff library. Since PNG support isn't implemented yet, there is not much lost.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

4 participants