-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
S3AttachmentStore #3779
S3AttachmentStore #3779
Conversation
Codecov Report
@@ Coverage Diff @@
## master #3779 +/- ##
=========================================
- Coverage 75.78% 71.1% -4.69%
=========================================
Files 145 146 +1
Lines 6897 6953 +56
Branches 418 418
=========================================
- Hits 5227 4944 -283
- Misses 1670 2009 +339
Continue to review full report at Codecov.
|
Would need some suggestion on where to document steps for configuring S3AttachmentStore |
1b815d5
to
f7aec89
Compare
This PR is now ready for review |
418dcf3
to
1792a88
Compare
Needs a PG |
187e6a7
to
aef88cf
Compare
PG of commit aef88cf is 🔵 |
aef88cf
to
304d0be
Compare
aws sdk pulls in quite a few dependencies which are not used with akka s3 based usage as it uses its own http client instead of commons http client. So those can be safely excluded
304d0be
to
5966452
Compare
failure => s"[ATT_DELETE] '$prefix' internal error, doc: '$docId', failure: '${failure.getMessage}'") | ||
} | ||
|
||
override def shutdown(): Unit = {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nothing to shutdown?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It uses Akka cached Http connection pool which is bound to ActorSystem lifecycle so need not be explicitly closed. It may be possible to close the pool but comments here indicates that it may pose problem so not attempting them
This PR introduces a S3AttachmentStore which is an AttachmentStore implementation for storing attachments in S3 API compatible object storages.
This PR introduces a
S3AttachmentStore
which is anAttachmentStore
implementation for storing attachments in S3 API compatible object storagesIt should be possible to use this with IBM Cloud Object Storage as it supports S3 API.
Description
S3AttachmentStore
uses Alpakka S3 Connector for making calls to S3. The object key is constructed in following formWhere
namespace
- Entity type in lowercase. For now onlywhiskentity
is used with attachmentsIt supports following operations
readAttachment
- Read is done in a streaming way using GET Objectattach
- Stream is uploaded via multipart upload API where upload is done in chunks of 5 MB and then finally stitched together. This avoids buffering the whole stream on disk. So any upload would invoke 2+ remote calls to complete uploaddeleteAttachments
- This is done via List Objects v2 API to list object under prefix<namespace>/<docId>
and then remove them via delete object apideleteAttachment
- This is done via single delete object apiUsage
To enable use of
S3AttachmentStore
following env variables need to be setCONFIG_whisk_spi_AttachmentStoreProvider
=whisk.core.database.s3.S3AttachmentStoreProvider
CONFIG_whisk_s3_bucket
- Bucket nameAWS_ACCESS_KEY_ID
- AWS Access Key ID. See Access Keys for details on key id and access key. Also see AWS Java SDK for details on how access keys are looked upAWS_SECRET_ACCESS_KEY
AWS_REGION
- . AWS Region name. See AWS Region Selection for detailsSee Alpakka S3 for more configuration details. In OpenWhisk setup the config are under
whisk.s3.alpakka
namespace (see s3-reference.conf for details)Testing
All attachment related test are consolidated in
S3AttachmentStoreBehaviorBase
and then there are suites to run it against various optionsS3AttachmentStoreMinioTests
- Performs testing against a Minio S3 Proxy. These test only need docker to be present.S3AttachmentStoreAwsTests
- Performs testing against S3. It requires following env variables to be configured. These are configured in travis settingsnfor master repo and in my fork. So are executed there.a.
AWS_ACCESS_KEY_ID
b.
AWS_SECRET_ACCESS_KEY
c.
AWS_REGION
We can also use S3Mock which is built on top of akka stack but then it may later pose problem if our dependency version diverge. So using Minio for now
Dependencies
S3 support pulls in following dependencies
Add s3 support increases the size of controller.tar from 88 MB (92 jars) to 91 MB (97 jars). Following are version change and new jars being pulled in
Below was the original dependency list which was trimmed down to previous one based on the assumption alpakka s3 client uses akk http stack. So commons http and jackson can be excluded. All tests pass with these dependencies removed
TODO
s3s
->s3
(see Uri.from fails for scheme ending in digit akka/akka-http#2080)Related issue and scope
My changes affect the following components
Types of changes
Checklist: