Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Object store #1194

Merged
merged 565 commits into from
Feb 3, 2023
Merged

Object store #1194

merged 565 commits into from
Feb 3, 2023

Conversation

deepsidhu85
Copy link
Contributor

@deepsidhu85 deepsidhu85 commented Feb 8, 2022

Description of changes

What did you change in this pull request? Provide a description of files changed, user interactions changed, etc. Include how to test your changes.

Added support for object storage (cloud based storage). Currently Microsoft Azure and Amazon AWS S3 are supported.

A few notes about what is stored in the cloud and how the data is used by IRIDA and Galaxy:

  1. Sequence, reference, assemblies, and analysis output files are the only files stored in the cloud.
  2. When running a pipeline, a local copy of the sequence files is downloaded from the cloud, uploaded to galaxy, and once the pipeline has completed the analysis output files are written to the cloud while the local copy of the data that was downloaded to the IRIDA server is removed.
  3. IRIDA Linker and Galaxy Importer both use end points in IRIDA to get the data that is requested so these tools don't need to know where the data is stored
  4. IRIDA on startup checks if the configuration is valid for Azure and AWS S3. If not valid the server does not start up with a clear indication in the logs of why is failed to start (For Azure the container does not exist and/or container does not have proper permissions. Same applies to AWS S3 bucket). This is done by attempting to write and read a small file to the container/bucket.

Also note that you can emulate Microsoft Azure Blob storage locally using Azurite for testing on your local machine. AWS does not provide a storage emulator so there is no way currently to test out on your local machine. Possible solution is being tested and may be released in a future release.

Related issue

Link to the GitHub issue this pull request addresses using the #issuenum format. If it completes an issue, use Fixes #issuenum to automatically close the issue.

Fixes #228

Checklist

Things for the developer to confirm they've done before the PR should be accepted:

  • CHANGELOG.md (and UPGRADING.md if necessary) updated with information for new change.
  • Tests added (or description of how to test) for any new features.
  • User documentation updated for UI or technical changes.

…l or not and added logic to check if the local filesystem is writeable. If these fail then we exit and log the exception
…dated to use enum comparison. Moved checking if basedirectory exists on the local file system and if it is writable into the iridafilestorageutility
…ridafilestorageutility throws if a connection is not made to a bucket/container
…heck is required. Updated azure file storage utility class to check if we can list blobs in a container instead of getting properties as the token is scoped
…ore when running an analysis submission with cloud stored files
@deepsidhu85 deepsidhu85 changed the title WIP: Object store Object store Jan 28, 2023
@ericenns ericenns requested a review from ksierks January 30, 2023 18:44
@joshsadam
Copy link
Contributor

I see there are some JS/JSX files in here.  I am sure that you do not want to update them to TypeScript as UI really is not the focus of this PR, but we will need to make sure that they get updated correctly.

* `aws.access.key=ACCESS_KEY`
* `aws.secret.key=SECRET_KEY`

See [AWS S3 Bucket Storage Setup](https://docs.aws.amazon.com/AmazonS3/latest/userguide/GetStartedWithS3.html) for instructions on how to setup S3 storage.
Copy link
Contributor

@ksierks ksierks Jan 31, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did one quick Azure test using Azurite and it worked. 😄
Then I wanted to try testing AWS, but noticed you only give a link to the actual S3 documentation. I know you can sign up for a free trial, but I wasn't really happy that they asked for a credit card. Could we use a AWS mock service, like Azurite? I found LocalStack, but then realised your settings don't take a url for AWS like they do for Azure. Could we add a url so developers can use a mock service of their choosing for development?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome glad it worked! Yeah I wouldn't be either! When I started working on the object store stuff for AWS we had an account since we were evaluating it along with Azure. Once we no longer had access to an AWS account we focused most of the work on Azure hence why we instructions for how to use it with Azurite to test and none for AWS. I had seen localstack before but unfortunately the maven repository we are using (com.amazonaws:aws-java-sdk-s3) doesn't allow us to make a connection to the AWS S3 bucket using a URL like we do with Azurite for Azure. I will see if we can possibly use another maven repository for testing which will allow us to make a connection using a URL. I will make a note to take a look but I don't think it will be out in this release

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I take that back. I see the version of the maven package we are using has a setEndpoint method. This will need to be tested out so either way I don't think we will get it into this release. Thanks again @ksierks

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you have to create an EndpointConfiguration. If not, I don't know how else to test AWS.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I've been trying to get it to work. It'd be really helpful for testing it out. I'll keep working away at it and hopefully get it working! Thanks!

…oved erroralertprops and updated to use AlertProps for ErrorAlert component, simplified logic, added cloud configuration values (commented out) to filesystem.properties
Copy link
Contributor

@joshsadam joshsadam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gets my blessing, great job Deep.

@deepsidhu85
Copy link
Contributor Author

Gets my blessing, great job Deep.

Thanks @joshsadam. Appreciate all the feedback and help!

@ksierks ksierks self-requested a review February 2, 2023 22:39
Copy link
Contributor

@ksierks ksierks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a really big PR. I tried testing some things out like uploading and downloading sequence files. I also tried sequence file concatenation. I didn't run it with the uploader, but I did try the REST API. I even tried that analysis submission file endpoint where you pass the file view in the accept header. Great job!

@deepsidhu85
Copy link
Contributor Author

This is a really big PR. I tried testing some things out like uploading and downloading sequence files. I also tried sequence file concatenation. I didn't run it with the uploader, but I did try the REST API. I even tried that analysis submission file endpoint where you pass the file view in the accept header. Great job!

Thanks for reviewing and testing it out! @ksierks Yeah it became quite a large PR. Lots of changes were required to get it to work unfortunately but we got there!

Copy link
Member

@ericenns ericenns left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me 👍

@ericenns ericenns merged commit 36f67dd into development Feb 3, 2023
@ericenns ericenns deleted the object-store branch February 3, 2023 17:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
back-end Issues relating to the service layer, repository, model, configuration, or database feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Object store support
5 participants