Create a transparency layer (alpha channel) for WGS84 derivative GeoTIFFs if one does not already exist #570

Open
thatbudakguy opened this issue Jan 11, 2023 · 11 comments · Fixed by #735 · May be fixed by #799

@thatbudakguy
Member

thatbudakguy commented Jan 11, 2023

We have received feedback regarding the display of GeoTIFFs stored in our GeoServer instances.

3-band raster data (those with three channels: R, G, B) display with a black box where pixels should be transparent.

Here is an example:
https://earthworks.stanford.edu/catalog/stanford-bf278sn8784

Although the black border is not visible when data are downloaded and used in a GIS, it is visible when using the WMS/WFS feature.

The addition of an alpha channel (4th band) to the WGS84 derivative that is uploaded to GeoServer would make these pixels transparent.

Comment from Stace: This will likely require adding/altering a GDAL transformation in the robots? Should be pretty straightforward, but if we could use the Cloud Optimized GeoTIFF driver in GDAL for the output, that would give us the added bonus of at least beginning to ingest compliant COGs moving forward. https://gdal.org/drivers/raster/cog.html

This ticket has been reopened since we discovered the approach taken in #735 of using gdalwarp -dstalpha to add the alpha channel has several undesirable side effects:

Image

Is there a way to use gdal or some other tool to update the GeoTIFF such that it displays as desired in GeoServer? Or do we need to wait until we have moved to making Cloud Optimized GeoTIFFs for everything and access via GeoServer is deprecated?

@thatbudakguy thatbudakguy converted this from a draft issue Jan 11, 2023
@thatbudakguy
Member Author

While we're changing the robot behavior, worthwhile (as stace notes above) to start making COGs so that we can unblock sul-dlss/earthworks#867 and use those COGs later on.

@kimdurante
Contributor

kimdurante commented Jan 10, 2024

For some more context on this issue: this website is using our WMS/WFS services for georeferenced maps stored in SDR.

https://www.imaginedsanfrancisco.org/

The black borders were appearing, so they had to remediate the files to include an alpha channel.

@edsu edsu self-assigned this Jan 16, 2024
@edsu edsu moved this from Ready to In Progress in Geo Workcycles 2024 Jan 16, 2024
@edsu
Contributor

edsu commented Jan 23, 2024

It looks like the place to do this is in normalize_data since that's where other color manipulations are already happening? I'm assuming this new functionality doesn't warrant a new workflow step?

The GeoTIFF in bf278sn8784 referenced above is kinda huge (1 GB) so it would be nice to have a smaller one to test with.

@justinlittman
Contributor

Please note that I am in the midst of a significant refactor of Normalize Data robot.

@edsu
Contributor

edsu commented Jan 23, 2024

@justinlittman thanks, are you removing the color normalization?

@kimdurante
Contributor

@edsu in the gis_workflow_data/fixtures folder:
zx688yx4017 - is a GeoTIFF requiring an alpha channel (~181MB)
vn977hm3834 - is a GeoTIFF containing an alpha channel

Hope that helps! I can add more if you need.

@justinlittman
Contributor

@edsu No, but the overall structure of the code will be much different.

@kimdurante
Contributor

@edsu added to gis_workflow_data/fixtures folder:

jk521jx4901 - Single band raster

qr891rz3640 - 3 band raster (73.5 MB)

edsu added a commit that referenced this issue Jan 26, 2024
Note, an alpha layer is always added (to single and 3 band GeoTIFF
files) since gdalwarp doesn't add a new alpha band if one is already
present.

Fixes #570
@github-project-automation github-project-automation bot moved this from In Progress to Done in Geo Workcycles 2024 Jan 26, 2024
@edsu
Contributor

edsu commented Feb 20, 2024

I'm reopening this after discussion with @kimdurante, @jmartin-sul & @aaron-collier since the approach taken in #735 didn't work as desired. The initial description has been updated with information we learned.

@jmartin-sul
Member

jmartin-sul commented Mar 4, 2024

from team planning discussion mon 2024-03-04, and follow up meeting tue 2024-03-12 with @kimdurante and @edsu:

  • giant Cloud Optimized GeoTIFF fails to complete step with alpha channel addition in play, see: https://app.honeybadger.io/projects/52899/faults/105082106 (was on a958cca)
  • we suspect that one issue is that this GeoTIFF already had an alpha channel, and the alpha channel addition operation on this COG added a 5th band/2nd alpha channel. we may want to check for an alpha channel (using gdalinfo?) and then only attempt to add one if one is not already present.
  • this large (2.7 GB iirc) COG may also need to be run through the two step process, where the alpha channel is added to the original and the derivative is output to a virtual (.vrt) file; after which we pipe the virtual file output to gdal_translate to create a compressed GeoTIFF derivative. see e.g. 1a7998e
  • thus, we think the next thing to try might be a combination of:
    • rebasing on WIP for better error messages from system calls (calls to system utilities should report the contents of stderr when the call fails #820), to see what exactly is failing with the giant COG
    • using the -dstnodata 0 flag on the gdalwarp command (as in this commit)
      • but only after examining the input file to see if an alpha channel is already present. no code written yet to do this... but gdalinfo -nomd -json called on the file will list band info. only add an alpha channel if the input file has exactly 3 bands, since 1 indicates grayscale and 4 indicates it already has alpha. can maybe leave a comment in code indicating that the band info will specifically say whether a band is alpha (see footnote 1), and maybe the 3 band rule won't apply forever, but it works for content we care about now.
      • because maybe the 5th band (2nd/extraneous alpha channel) was causing trouble with earthworks display?
      • example: this doesn't display correctly in leaflet: https://edsu.github.io/leaflet-geoserver-example/ ... or earthworks? but displays with proper transparency (instead of black border) in QGIS (desktop app).
    • the two step process where gdalwarp generates a virtual file, and gdal_translate turns that virtual file into a compressed GeoTIFF derivative with an added (but only if it was missing) alpha channel. see e.g. this commit.
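The two step process above might look roughly like the following sketch. It only builds the command lines and wraps the system call so that stderr is reported on failure (in the spirit of #820). The function names, the EPSG:4326 target, and LZW compression are assumptions for illustration; the flags actually used in the referenced commits are not shown in this thread.

```python
import subprocess
from typing import List, Optional


def warp_to_vrt_cmd(src: str, vrt: str, add_alpha: bool,
                    dst_nodata: Optional[int] = None) -> List[str]:
    """Build a gdalwarp call that writes a small virtual (.vrt) file.

    Assumption: the derivative is reprojected to WGS84 (EPSG:4326).
    """
    cmd = ["gdalwarp", "-of", "VRT", "-t_srs", "EPSG:4326"]
    if add_alpha:
        # Only request a new alpha band when the input lacks one.
        cmd.append("-dstalpha")
    if dst_nodata is not None:
        cmd += ["-dstnodata", str(dst_nodata)]
    return cmd + [src, vrt]


def vrt_to_geotiff_cmd(vrt: str, dst: str, compress: str = "LZW") -> List[str]:
    """Build a gdal_translate call that turns the VRT into a compressed GeoTIFF.

    Assumption: LZW is a placeholder; the thread does not name a codec.
    """
    return ["gdal_translate", "-of", "GTiff",
            "-co", f"COMPRESS={compress}", vrt, dst]


def run(cmd: List[str]) -> None:
    """Run a system utility and surface stderr if it fails."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        raise RuntimeError(
            f"{cmd[0]} failed ({result.returncode}): {result.stderr.strip()}"
        )
```

Calling run(warp_to_vrt_cmd("in.tif", "tmp.vrt", add_alpha=True)) followed by run(vrt_to_geotiff_cmd("tmp.vrt", "out.tif")) would perform both steps, with the file names here being placeholders.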

Footnotes

  1. json_output['bands'][N]['colorInterpretation'] == 'Alpha' for any of the N bands in the JSON output, or Band \d Block=.* Type=.*, ColorInterp=Alpha if regexing the less structured output. e.g. { "bands": [{ "band": 1, ... }, ..., { "band":4, ..., "colorInterpretation":"Alpha" }] } or Band 4 Block=2673x128 Type=Byte, ColorInterp=Alpha
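A minimal sketch of the band check described in the list and footnote above, assuming the JSON shape shown in the footnote (a "bands" array whose entries may carry "colorInterpretation": "Alpha"). The helper names are hypothetical, and the "exactly 3 bands" rule is the working heuristic from the discussion, not a general one.

```python
import json
import subprocess


def has_alpha_band(info: dict) -> bool:
    """True if any band in gdalinfo's JSON output is interpreted as alpha."""
    return any(
        band.get("colorInterpretation") == "Alpha"
        for band in info.get("bands", [])
    )


def needs_alpha(info: dict) -> bool:
    """Heuristic from the discussion: add alpha only to 3-band inputs.

    1 band suggests grayscale and 4 bands suggests alpha is already there.
    """
    return len(info.get("bands", [])) == 3 and not has_alpha_band(info)


def gdalinfo_json(path: str) -> dict:
    """Run `gdalinfo -nomd -json` on a file and parse its output."""
    result = subprocess.run(
        ["gdalinfo", "-nomd", "-json", path],
        capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)
```

With these helpers, needs_alpha(gdalinfo_json(path)) could gate whether the alpha-adding flag is passed at all, which would avoid creating a 5th band on files that already have one.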

@jmartin-sul
Member

example of a large COG (cloud optimized GeoTIFF) that's run into trouble with various iterations of this work: https://argo-stage.stanford.edu/view/kg552qb0295
