Add Cloud Optimized GeoTIFF Support #211
Conversation
Hi Josh,
Thanks for contributing your work.
I started a first quick review of your commits to provide an initial round of feedback.
We may come back with more comments after further passes.
- [ ] Please remove all committed .DS_Store files (I see you added a rule to .gitignore, but some files are still there).
- [ ] It would be better if all the commits were squashed into a single one (messages can be preserved). Not a big issue; we may just click squash-and-merge during the merge.
I'll have to check it a second time to understand how it works when using JAI ImageRead deferred reads.
```java
}

public void addTileRange(int tileIndex, long offset, long byteLength) {
    //if ((offset < firstTileOffset && offset > 0) || tileIndex == 0) {
```
Please remove commented-out lines
```java
    firstTileByteLength = byteLength;
}
if (offset < headerSize && offset > 0) {
    //headerSize = (int)offset - 1;
```
Please remove commented-out lines.
```java
package it.geosolutions.imageioimpl.plugins.cog;

/**
 * This utility class examines will look for either a system property or environment variable. When calling
```
I will probably remove the word "examines"
```java
if (ranges[i][0] < headerLength - 1) {
    // this range starts inside of what we already read for the header
    modified = true;
    if (ranges[i][1] < headerLength - 1 || ranges[i][1] == headerLength - 1) {
```
I think that we could simplify this check with a "<=".
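The suggested simplification is a pure refactor: `a < h - 1 || a == h - 1` is exactly `a <= h - 1`. A minimal sketch (method names are hypothetical, just for the comparison) checking the two forms agree:

```java
public class RangeChecks {

    // The original check from the PR: two comparisons against headerLength - 1
    static boolean original(long rangeEnd, long headerLength) {
        return rangeEnd < headerLength - 1 || rangeEnd == headerLength - 1;
    }

    // The reviewer's suggestion: a single "<=" comparison
    static boolean simplified(long rangeEnd, long headerLength) {
        return rangeEnd <= headerLength - 1;
    }

    public static void main(String[] args) {
        // The two predicates agree for every input
        for (long end = 0; end < 40; end++) {
            assert original(end, 20) == simplified(end, 20);
        }
        System.out.println("equivalent");
    }
}
```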
```java
    completed.add(key);
} catch (Exception e) {
    e.printStackTrace();
```
Printing a stack trace isn't good practice; better to use the LOGGER instead.
Question: what can we do when an exception occurs? Will the data be corrupted? Should we fail the whole read?
If this fails, one or more tiles may become corrupt, which could lead to an error decoding those tiles. Because TIFFImageReader reads one tile at a time, I wasn't 100% certain whether it would fail on a single bad tile or whether only that tile would not render. Being optimistic, I didn't fail hard -- I wanted to give the other ranges/tiles a chance to be read and cached in case they can still be rendered.
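A small sketch of the logging pattern being suggested, using `java.util.logging`'s `Logger.log(Level, String, Throwable)` overload so the stack trace is preserved in the configured log output rather than dumped to stderr (the `readTile` helper here is hypothetical, standing in for the tile-caching logic):

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class TileReadLogging {

    private static final Logger LOGGER = Logger.getLogger(TileReadLogging.class.getName());

    // Instead of e.printStackTrace(), log at SEVERE and attach the exception
    // as the Throwable argument so its stack trace reaches the log handlers.
    static String readTile(int tileIndex) {
        try {
            if (tileIndex < 0) {
                throw new IllegalArgumentException("negative tile index");
            }
            return "tile-" + tileIndex;
        } catch (Exception e) {
            LOGGER.log(Level.SEVERE, "Failed to read tile " + tileIndex
                    + "; skipping it so remaining tiles can still be cached", e);
            return null; // optimistic: let the other tiles proceed
        }
    }

    public static void main(String[] args) {
        System.out.println(readTile(3));
        System.out.println(readTile(-1));
    }
}
```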
```java
try {
    buffer.put(bytes);
} catch (Exception e) {
    LOGGER.severe("Error writing bytes to ByteBuffer for source " + uri);
```
Wondering if we need to also report the causing exception in the logged error
```java
    writeValue((int)key, future.get().asByteArray());
    completed.add(key);
} catch (Exception e) {
    e.printStackTrace();
```
See my comments above on the "XYZRangeReader" about printing stack traces.
```java
try {
    streamPos = delegate.getStreamPosition();
} catch (IOException e) {
    e.printStackTrace();
```
Same comment as above about printing stacktrace :)
```java
try {
    delegate.seek(streamPos);
} catch (IOException e) {
    e.printStackTrace();
```
Here too :)
```java
    Arrays.copyOfRange(b, (int) tileRange.getStart(), (int) (tileRange.getEnd() + 1));
    CacheManagement.DEFAULT.cacheTile(key, tileBytes);
} catch (Exception e) {
    e.printStackTrace();
```
Same as above on printing stack trace
I believe I have addressed all of the issues pointed out. Please let me know if there are more findings and I will address those as well.
Had a look; I like the design. I found a bunch of minor issues, but also a very serious one (memory usage).
One thing that's left out is control over the async requests. I guess they are parallelized; how does one control how many concurrent requests will hit the server (HTTP, S3, Azure) in the worst case?
```diff
@@ -125,6 +125,10 @@ public TIFFImageReaderSpi() {
    );
}

public static void setReaderClassName(String readerClassName) {
```
This does not look right. I see you changing the value in CogImageReaderSpi, but the method should be non-static and protected (or even better, turned into a protected constructor taking the reader names), and the two fields should also be non-static; otherwise both SPIs might end up exposing the same reader and writer names (maybe not in practice, but it still seems problematic to me).
Updated to make this non-static.
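The reviewer's "protected constructor taking the reader names" idea can be sketched in miniature. The class names below are simplified stand-ins for `TIFFImageReaderSpi`/`CogImageReaderSpi`, just to show why instance fields keep the two SPIs independent:

```java
public class SpiNames {

    // Base SPI with an instance field set through a protected constructor,
    // so each subclass carries its own reader name.
    static class BaseSpi {
        protected final String readerClassName;

        protected BaseSpi(String readerClassName) {
            this.readerClassName = readerClassName;
        }
    }

    static class TiffSpi extends BaseSpi {
        TiffSpi() { super("TIFFImageReader"); }
    }

    static class CogSpi extends BaseSpi {
        CogSpi() { super("CogImageReader"); }
    }

    public static void main(String[] args) {
        // With a static field, constructing CogSpi would have overwritten the
        // name TiffSpi exposes; instance fields keep the two independent.
        assert new TiffSpi().readerClassName.equals("TIFFImageReader");
        assert new CogSpi().readerClassName.equals("CogImageReader");
        System.out.println("names independent");
    }
}
```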
```java
 */
public interface CogImageInputStream {

    void readRanges();
```
Some javadocs on the methods would not hurt (or else, would help review :-D).
Same goes for the other interfaces.
All of the interfaces in the commons module are now thoroughly commented at the method level.
```java
    this.headerSize = headerSize;
}

public void addTileRange(int tileIndex, long offset, long byteLength) {
```
So, based on the offset, one call has two different side effects? This is probably best explained in the javadoc of the method, if dedicated methods for each side effect are not desirable.
Could you provide a bit more detail about your concern here? Is it just that the two separate if statements are confusing? I will definitely take another pass at documenting methods better.
It's one method call that does two entirely different things based on input parameters, at the very least they should be explained. Or just have two separate methods with clear names instead (javadocs are still a plus in this case too).
Removed tracking the start tile index, added documentation about the remaining logic to check header size.
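The "two separate methods with clear names" alternative discussed above can be sketched as follows. This is a simplified stand-in (method and field names are assumptions, not the PR's final API): the tile-range bookkeeping and the header-shrinking side effect each get an explicit entry point.

```java
import java.util.Map;
import java.util.TreeMap;

public class TileRangeTracker {

    private long headerSize = 16384; // default initial header read
    private final Map<Integer, long[]> tileRanges = new TreeMap<>();

    // Does exactly one thing: record the byte range for a tile.
    void addTileRange(int tileIndex, long offset, long byteLength) {
        tileRanges.put(tileIndex, new long[] {offset, byteLength});
    }

    // The second effect, as its own clearly named method: if the first tile
    // starts before the assumed header end, the header must be shorter.
    void shrinkHeaderToFirstTile(long firstTileOffset) {
        if (firstTileOffset > 0 && firstTileOffset < headerSize) {
            headerSize = firstTileOffset;
        }
    }

    long getHeaderSize() { return headerSize; }

    int getTileCount() { return tileRanges.size(); }

    public static void main(String[] args) {
        TileRangeTracker t = new TileRangeTracker();
        t.addTileRange(0, 400, 1000);
        t.shrinkHeaderToFirstTile(400);
        assert t.getHeaderSize() == 400;
        assert t.getTileCount() == 1;
        System.out.println("header shrunk to " + t.getHeaderSize());
    }
}
```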
```xml
    <version>${azure.version}</version>
</dependency>
<dependency>
    <groupId>com.squareup.okhttp3</groupId>
```
Interesting, haven't heard of this library before. In the GeoServer ecosystem we use commons httpclient (mostly version 3, a bit of 4). Seems a moderately big new dependency for the ecosystem (700k). Would moving to commons-httpclient impair functionality significantly, or cause performance regressions?
OkHttp seems to be a pretty popular client, but I'm not tied to it. This was actually a rewrite as I originally used the Java 11 HttpClient, but Torben pointed out that this needs to be compatible with Java 8, so I moved to OkHttp. I have no issue moving to commons-httpclient as it seems to support async connections. My question now is, would you be against introducing v5 of the client? It supports http/2 multiplexing, which eliminates the concerns of how many connections get created. This seems like the best modern approach to take, but still introduces a new dependency. Or potentially I could provide an http1 and http2 client?
If it's not aligned with current GeoTools usage, it would be one more dependency nevertheless; at that point, it's probably better to keep it as is and concentrate on the other issues.
Left it alone for now. I'm fine with rewriting this in the future or refactoring this to be the http2 module and providing another http module based on commons-httpclient.
Thanks!
```java
    return headerLength;
}

public abstract void readAsync(long[]... ranges);
```
Found the name confusing... the method is actually synchronous; it's just that internally it makes calls that are parallelized and run asynchronously, but it waits for all of them to complete before returning. I'd just call it "read"? How it works internally is an implementation detail of the subclasses... or not?
Also, since it returns nothing, it would be nice to explain what needs to be done next (I'm guessing, calling getBytes()).
All correct. Looks like I'll be refactoring all of this to avoid the use of ByteBuffer. I had async requests stuck in my brain when I created that method, and originally I also had another read method that made a single, purely synchronous request. I'll just rename the method, as it is in fact synchronous, and have it return the result after I refactor.
Updated the name and modified the method to return the data.
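The shape of the renamed API can be sketched like this: a blocking `read(...)` that returns the fetched ranges directly instead of a `void readAsync(...)` that parks bytes in a shared buffer. The in-memory `remote` array is a stand-in for an HTTP/S3/Azure fetch, and the method name mirrors the discussion above rather than the project's final signature:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class RangeReaderSketch {

    // Blocking read: fetch each [start, end] range (inclusive) and return the
    // results keyed by range start, so the caller gets the data back directly.
    static Map<Long, byte[]> read(byte[] remote, long[]... ranges) {
        Map<Long, byte[]> result = new LinkedHashMap<>();
        for (long[] range : ranges) {
            int start = (int) range[0];
            int end = (int) range[1];
            byte[] chunk = new byte[end - start + 1];
            System.arraycopy(remote, start, chunk, 0, chunk.length);
            result.put(range[0], chunk);
        }
        return result;
    }

    public static void main(String[] args) {
        byte[] remote = new byte[] {10, 20, 30, 40, 50, 60};
        Map<Long, byte[]> data = read(remote, new long[] {1, 3}, new long[] {4, 5});
        assert data.get(1L).length == 3;  // bytes 20, 30, 40
        assert data.get(4L)[1] == 60;
        System.out.println("read " + data.size() + " ranges");
    }
}
```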
```java
 * @author joshfix
 * Created on 2019-08-23
 */
public class DefaultCogImageInputStream implements ImageInputStream, CogImageInputStream {
```
So this class works in two steps: initially it has no content and won't work; then one inits the header, and it can provide the header only; and after calling readRanges, it can offer the tiles too (only the requested ones). So it's not a generic image input stream but a special-purpose one that can only be used in concert with the COG reader. This should be explicit, imho.
Also all correct. The central concept of my design is to "warm" the streams with the data of (only) the requested tiles up front, before we actually start to decode tiles. The only way that can be accomplished is by reading the header to determine the tile offsets and lengths. Basically the input streams are completely non-functional without being initialized with the header data.
I guess I assumed from the name and the interface it would be apparent that it was purpose-built for COG, but I will add more documentation stating and explaining its purpose and limitations.
ImageInputStreams are an extension point that can be looked up and used by any other format, so making one that's format-specific departs from the intended usage. I understand the reason behind the current implementation, and hopefully it's not going to cause side effects for other concurrent sources in the same server (e.g., other GeoTIFFs, ECWs and the like), but it's best to be quite clear when going off design.
Added explicit documentation regarding purpose.
```java
}

public void setFilesize(int filesize) {
    buffer = ByteBuffer.allocate(filesize);
```
Wow, danger zone! The byte buffer here is actually allocated in memory, so if one has to deal with a BigTIFF that's, say, 100GB in the cloud, the code will allocate 100GB of local memory, even if the current read only aims at reading one tile?
See: https://docs.oracle.com/javase/7/docs/api/java/nio/ByteBuffer.html#allocate(int)
If I understand correctly, the current implementation works for (very) small images under light load, but it's all too easy to make it go OOM. It would need some sort of sparse byte buffer, or, instead of delegating to a MemoryCacheImageInputStream, its own implementation that only keeps in memory the tiles needed for the current read... This should be doable by keeping an "index" of offsets pointing at the actual bytes and doing some offset math whenever a read is called.
Thanks for pointing this out. In fact, I did not take into account very large files or consider the ramifications of blindly allocating the full file size in the byte buffer. I'm going to spend a bit of time contemplating the best way to handle this. My initial design was to keep the RangeReader implementations' responsibilities as narrow as possible -- read the final ranges required for the requested tiles, but know nothing about tiles, then fill in the byte buffer at the appropriate places. I'm thinking maybe this should be modified so that:
- The `CogTileInfo` object is passed to the range reader so that it has full knowledge of each requested tile index, offset, length, etc.
- The range reader uses the `RangeBuilder` to build the contiguous ranges to be read.
- The range reader then decomposes the results back into per-tile byte arrays (possibly storing them in the corresponding `TileRange` objects in `CogTileInfo`).
- The `DefaultCogImageInputStream` is updated accordingly with offset math, etc.
What do you think about this solution?
I think the RangeReader classes can be kept compact and focused if, instead of returning a `byte[]`, they end up returning a `Map<long[], byte[]>` with just the byte ranges read. The interesting and slightly complicated bit will be updating the `CogImageInputStream` implementations so that they keep track of the current reading position and do the offset-math dance. I don't see `TileInfo` needing to be updated, but that's more of an implementation preference.
Updated to return `Map<Long, byte[]>`, where the Long value is the start of the range. The end of the range, if needed, can be calculated using the `byte[]` length.
Oh, I suppose it's important to mention that I completely removed a) any reference or attempt to understand the actual file size and b) usage of the class-level ByteBuffer to store all range data :)
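The offset-math approach agreed on above can be sketched with a sparse, map-backed stream: only the fetched ranges are held in memory, keyed by their absolute start offset, so a huge BigTIFF never triggers a full-file allocation. This is a hypothetical simplification (class and method names are not the project's), just to show the "index of offsets plus offset math" idea:

```java
import java.util.Map;
import java.util.TreeMap;

public class SparseStreamSketch {

    // Fetched ranges only, keyed by absolute start offset in the remote file.
    private final TreeMap<Long, byte[]> data = new TreeMap<>();
    private long streamPos = 0;

    void addRange(long start, byte[] bytes) {
        data.put(start, bytes);
    }

    void seek(long pos) {
        streamPos = pos;
    }

    // Map the absolute stream position into the chunk that contains it.
    int read() {
        Map.Entry<Long, byte[]> e = data.floorEntry(streamPos);
        if (e == null || streamPos >= e.getKey() + e.getValue().length) {
            // the position was never fetched -- fail loudly rather than return junk
            throw new IllegalStateException("position " + streamPos + " not in any fetched range");
        }
        int b = e.getValue()[(int) (streamPos - e.getKey())] & 0xFF;
        streamPos++;
        return b;
    }

    public static void main(String[] args) {
        SparseStreamSketch s = new SparseStreamSketch();
        s.addRange(0, new byte[] {1, 2});       // header bytes
        s.addRange(100, new byte[] {7, 8, 9});  // one tile, far into the file
        s.seek(101);
        assert s.read() == 8; // absolute offset 101 -> second byte of the tile
        System.out.println("sparse read ok");
    }
}
```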
```java
String length = contentRange.split("/")[1];
try {
    filesize = Integer.parseInt(length);
    buffer = ByteBuffer.allocate(filesize);
```
OOM issue, another problematic point.
ByteBuffer has been removed.
```java
try {
    HeadObjectResponse headResponse = client.headObject(headObjectRequest).get();
    filesize = headResponse.contentLength().intValue();
    buffer = ByteBuffer.allocate(filesize);
```
OOM issue, another problematic point.
ByteBuffer has been removed.
```java
connector = new AzureConnector(AzureUrlParser.getAccountName(uri));
client = connector.getAzureClient(containerName, filename);
filesize = (int) client.getProperties().block().blobSize();
buffer = ByteBuffer.allocate(filesize);
```
OOM issue, another problematic point.
ByteBuffer has been removed.
Committed changes that should address all concerns @aaime. In addition, I made a few other noteworthy changes:
Looks good, thanks for all the changes! Added a few comments, hopefully quick to fix.
Now... I'd like to test this in GeoServer. Is it just a matter of putting it on the classpath and using an http URI, or is anything else needed?
```java
/**
 * ImageReader implementation extending from TIFFImageReader. If this class encounters an ImageInputStream that does
 * not implement `CogImageInputStream`, it will simply pas the request on to TIFFImageReader. Otherwise, it will
```
pas -> pass
fixed
```java
protected int headerLength = DEFAULT_HEADER_LENGTH;
protected Map<Integer, TileRange> tileRanges = new TreeMap<>();
public static final int HEADER_TILE_INDEX = -100;
public static final int DEFAULT_HEADER_LENGTH = 16384;
```
Could it be that the header is longer than 16k (e.g., a BigTIFF with a very large tile directory)? Would that badly affect functionality?
I personally have not seen this and decided on 16K because it's the default for GDAL COG reads as well. I would assume that if a TIFF had enough overviews and enough tiles you could eventually break past 16K, but it seems unlikely to me? I just realized that I made an effort to make the initial headerLength configurable, but provided no real mechanism to allow an external user to set it. I'm thinking I could just add it as a field to CogImageReadParam.... thoughts?
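The proposed configuration point is small enough to sketch. The nested class below is a hypothetical stand-in for the suggested `CogImageReadParam` field (accessor names are assumptions, not the final API):

```java
public class HeaderLengthParamSketch {

    // Stand-in for adding a header-length field to the read param.
    static class CogReadParamSketch {
        // 16384 matches GDAL's default initial read for COGs
        private int headerLength = 16384;

        public void setHeaderLength(int headerLength) {
            this.headerLength = headerLength;
        }

        public int getHeaderLength() {
            return headerLength;
        }
    }

    public static void main(String[] args) {
        CogReadParamSketch param = new CogReadParamSketch();
        assert param.getHeaderLength() == 16384;
        // e.g. a BigTIFF with a very large tile directory
        param.setHeaderLength(65536);
        assert param.getHeaderLength() == 65536;
        System.out.println("header length configurable");
    }
}
```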
```diff
@@ -0,0 +1,54 @@
package it.geosolutions.imageioimpl.plugins.cog;
```
Used to have copyright header, now gone.
re-added
```java
@Override
public int read() throws IOException {
    streamPos++;
    return 0;
```
Seems a bit harsh, I'm guessing the TIFF reader never does a single byte read call?
Both are true: it is a bit harsh, and it never gets called :) Either way, I updated it to read a single byte via the other read method.
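The fix described above -- implementing the single-byte `read()` in terms of the bulk `read(byte[], int, int)` instead of returning a dummy value -- looks roughly like this. This is a self-contained sketch over an in-memory array, not the project's stream class:

```java
public class SingleByteReadSketch {

    private final byte[] source;
    private int pos = 0;

    SingleByteReadSketch(byte[] source) {
        this.source = source;
    }

    // Bulk read, as in the underlying stream implementation.
    int read(byte[] b, int off, int len) {
        if (pos >= source.length) return -1;
        int n = Math.min(len, source.length - pos);
        System.arraycopy(source, pos, b, off, n);
        pos += n;
        return n;
    }

    // Single-byte read delegating to the bulk read, mirroring the fix:
    // no more "streamPos++; return 0;".
    int read() {
        byte[] one = new byte[1];
        int n = read(one, 0, 1);
        return n == -1 ? -1 : one[0] & 0xFF; // unsigned, per InputStream contract
    }

    public static void main(String[] args) {
        SingleByteReadSketch s = new SingleByteReadSketch(new byte[] {(byte) 0xFF, 5});
        assert s.read() == 255;
        assert s.read() == 5;
        assert s.read() == -1;
        System.out.println("single-byte read delegates correctly");
    }
}
```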
```java
// this should never happen -- we should have read all bytes from all tiles in the request envelope
if (contiguousRange == null || rangeStart == -1L) {
    streamPos += len;
```
If it can really never happen, maybe best to throw an exception here.
updated to throw an exception
```java
int relativeStreamPos = (int)(streamPos - rangeStart) + off;
// copy the bytes from the fetched tile into the destination byte array
for (int i = 0; i < len; i++) {
```
Eeek, pretty inefficient! Can you use System.arraycopy instead? It accepts offsets into both arrays :-D
updated this and the caching implementation to use System.arraycopy.
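For reference, the per-byte loop and `System.arraycopy` produce identical results; the latter is a JVM-intrinsic bulk copy that avoids per-element call overhead. A minimal side-by-side (helper names are just for this sketch):

```java
import java.util.Arrays;

public class CopySketch {

    // The original pattern: copy one byte at a time in a loop.
    static byte[] copyLoop(byte[] src, int srcOff, int len) {
        byte[] dst = new byte[len];
        for (int i = 0; i < len; i++) {
            dst[i] = src[srcOff + i];
        }
        return dst;
    }

    // The suggested fix: a single bulk copy, with offsets into both arrays.
    static byte[] copyFast(byte[] src, int srcOff, int len) {
        byte[] dst = new byte[len];
        System.arraycopy(src, srcOff, dst, 0, len);
        return dst;
    }

    public static void main(String[] args) {
        byte[] src = {0, 1, 2, 3, 4, 5};
        assert Arrays.equals(copyLoop(src, 2, 3), copyFast(src, 2, 3));
        System.out.println("copies agree");
    }
}
```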
```xml
    <version>${azure.version}</version>
</dependency>
<dependency>
    <groupId>com.squareup.okhttp3</groupId>
```
Thanks!
Re: how to test in GeoServer... There is a blocker :) I think we would really just want the ability to use the GeoTools GeoTiffReader. The problem is that the GeoTiffFormat class doesn't allow you to modify the image reader SPI. To circumvent the issue, I created my own format class that is nearly an exact duplicate of GeoTiffFormat (note this is utilizing an old version of the COG project). I have tested this in the past using a custom mosaic store that uses a STAC catalog as a feature source and uses the aforementioned format class as the suggested format (forgive the ugly code, I just pushed the dev branch, which was full of in-progress tests). It worked fine for me and I was able to dynamically render Landsat 8 mosaics. Not sure what the best/recommended approach would be here, but I think we're going to need to update some GeoTools code to get this working in GeoServer.
Doh, sorry I missed your earlier comment. How about modifying the GeoTIFF format to use a list of potential image readers instead, and skip the subclass?
No worries! I think you may have missed my other comment/question as well about setting the header length. I realized that even though most of my classes accept the header length as a variable, there is no way outside of the CogImageReader to set it. Previously, the default value came from
RE: modifying GeoTiffFormat... Given that it's a GeoTools class, does that mean this imageio-ext code is good to go? Should we move this conversation to GeoTools? I am definitely open to suggestions here on how to handle the GeoTools code. It seems we would need to deal with this in GeoTiffFormat:
GeoTiffReader has the same private static final field. In addition, there may be some situations we need to handle. Sorry for the long-winded comment. Let me know how you'd like to proceed from here!
Hi @joshfix, so I wanted to go ahead and merge, and then move attention to GT/GS indeed.
If there are online tests that need specific setups, or even just internet access, it's best if they are enabled only with a profile or a config file that the dev has to provide.
This is very interesting. I can't reproduce locally, and the test runs against a publicly available Landsat 8 image on S3, so no AWS key is required to read the image.
Hum... a test that requires an internet connection should still be activated only by profiles.
Ring any bells? Could it be that you're not seeing the error because you have the "right" credentials, or at least a set of valid ones?
I renamed my AWS config files and was able to reproduce. It was a simple fix -- when building the S3 client, if no credentials were provided, I build an anonymous provider. I also added an online profile for all tests that require internet access.
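The shape of that fix -- falling back to an anonymous provider when no credentials are configured, so public buckets like the Landsat 8 archive stay readable -- can be sketched as follows. The provider interface here is a self-contained stand-in; in the AWS SDK v2 the equivalent calls would be `AnonymousCredentialsProvider.create()` and `StaticCredentialsProvider`:

```java
public class CredentialsFallbackSketch {

    // Stand-in for the SDK's credentials-provider abstraction.
    interface CredentialsProvider {
        String name();
    }

    // If either key is missing, fall back to anonymous access rather than
    // failing the client build; otherwise use the supplied static credentials.
    static CredentialsProvider chooseProvider(String accessKey, String secretKey) {
        if (accessKey == null || secretKey == null) {
            return () -> "anonymous"; // AnonymousCredentialsProvider.create() in SDK v2
        }
        return () -> "static";        // StaticCredentialsProvider in SDK v2
    }

    public static void main(String[] args) {
        assert chooseProvider(null, null).name().equals("anonymous");
        assert chooseProvider("AKIA-example", "secret").name().equals("static");
        System.out.println("anonymous fallback in place");
    }
}
```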
Merged finally! Going to have a look at the GeoTools/GeoServer side.
Ok, had a look. You're right, having to initialize the streams with the range reader is a source of trouble indeed... At this point I don't see much alternative to making a custom subclass of the GeoTiff reader. There you can possibly dodge the SPI mechanism, manually set up the stream, and init it using the desired area to read and target resolution (which are available in the read method of the reader). If no area is required, then the whole TIFF at max resolution is read; to just read the metadata, make it read only the header, I believe? I would support making changes to the base gt-geotiff classes, enough to make it possible to set up the streams correctly. Does this sound like a plan?
Thanks for getting this merged! Very excited to take the next step and get this working in GeoTools. I'm a bit buried at the moment, so it may be a while before I can circle back to this. Totally open to any suggestions, however, and if you see a clear path forward I'll be happy to implement it.
This PR introduces a new plugin that seeks to take advantage of the COG structure by asynchronously reading byte ranges for contiguous TIFF tiles. It includes caching and non-caching ImageInputStream implementations, as well as clients (RangeReaders) capable of reading remote COGs using HTTP, Azure, and AWS S3 client libraries.