S3 FS implementation #90
Conversation
Could this get some love? I am very interested in using this in my project. Depending on a fork is never ideal.
```go
if os.IsNotExist(err) {
	return fs.OpenFile(name, os.O_CREATE, 0777)
}
return (*S3File)(nil), err
```
Any reason you're returning a valid interface to a nil pointer? While everyone should be checking the error, CacheOnReadFs checks the interface for nil, and this breaks in hard-to-troubleshoot ways. While I don't think anyone should be ignoring the error, returning a valid interface isn't doing anyone any favors either.
```go
if strings.Contains(err.Error(), "404") {
	statDir, err := fs.statDirectory(name)
	return statDir, err
}
```
Rather than checking the string, I think it would be better to check the error code:

```go
if err, ok := err.(awserr.Error); ok {
	if err.Code() == "NotFound" {
		return fs.statDirectory(name)
	}
}
```
Here's a runnable example to play around with (just manipulate the bucket name):
```go
package main

import (
	"fmt"
	"os"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/awserr"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/s3"
)

func main() {
	sess, err := session.NewSession()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	if _, err := s3.New(sess).HeadObject(&s3.HeadObjectInput{
		Bucket: aws.String("bucketName"),
		Key:    aws.String("/missing/key/"),
	}); err != nil {
		if err, ok := err.(awserr.Error); ok {
			fmt.Println("CODE:", err.Code())
		}
	}
}
```
```go
var s3bucket string

func init() {
```
This should be done as an integration test. Alternatively, you could monkey-patch the s3 methods (a plain mock can't really be done here):
```go
// my_test.go
...
monkey.PatchInstanceMethod(reflect.TypeOf(s3Session), "GetObject", s3_mocks.GetObjectMockFn())
...
```

```go
// mocks_s3.go
type ClosingBuffer struct {
	*bytes.Buffer
}

func (cb *ClosingBuffer) Close() error {
	return nil
}

func GetObjectMockFn() func(_ *s3.S3, _ *s3.GetObjectInput) (*s3.GetObjectOutput, error) {
	return func(_ *s3.S3, _ *s3.GetObjectInput) (*s3.GetObjectOutput, error) {
		cb := &ClosingBuffer{bytes.NewBufferString(`This is the content of a file`)}
		var rc io.ReadCloser = cb
		return &s3.GetObjectOutput{
			Body: rc,
		}, nil
	}
}
```
Maybe Minio as a semi-functional mock?
Which would mean that your unit tests would depend on Minio, which I'm against.
Your unit tests should not depend on external services to work. Minio would be a good candidate for integration testing, though then I'd rather connect my tests to a test bucket in S3.
I've had a chat with a few people who run their unit tests as integration tests to avoid having to rewrite parts of their tests, which sounds wrong to me.
Here, we're using the AWS SDK, which we assume is stable and has been tested, so what you want to do is just test your app logic and mock what the functions return. That also lets you test how you handle errors you'd never get when AWS behaves fine, and gives you more flexibility in your unit tests to catch all the edge cases you can think of. I do agree that writing mocks can often add complexity to your tests, but it's worth it.
Yeah, I can see where you're coming from. Using S3 itself as an integration test sounds potentially slow (and $) too, but I guess that depends on a lot of factors.
With mock vs integration... neither way seems to be perfect, both being needed. 😉
Arguably, storage is really cheap these days: $0.0245 per GB for standard storage.
Some concerns about unit tests
I know this has been sitting around for a while, but it would be nice to see if one of the S3 backends could be merged in.
Any news on this? I would also love to see this merged. @rgarcia Would you rebase your PR?
Someone should probably move this PR to absfs/afero |
@@ -24,7 +24,7 @@ import (
// readDirNames reads the directory named by dirname and returns |
Can you also update the function name in the comment?
```go
	panic("read after close")
}
if f.offset != 0 {
	panic("TODO: non-offset == 0 read")
```
Maybe open a follow-up issue to track this.
On a second thought, this should probably be fixed from the get-go.
```go
case 1:
	f.offset += int(offset)
case 2:
	// can probably do this if we had GetObjectOutput (ContentLength)
```
Seems to be a thing now: https://godoc.org/github.com/aws/aws-sdk-go/service/s3#GetObjectOutput
```go
	panic("write after close")
}
if f.offset != 0 {
	panic("TODO: non-offset == 0 write")
```
This seems like it should be fixed; otherwise Write on existing files isn't useful.
```go
Bucket:               aws.String(f.bucket),
Key:                  aws.String(f.name),
Body:                 readSeeker,
ServerSideEncryption: aws.String("AES256"),
```
For my own info: how is the key managed server-side? Is it fully managed by Amazon?
```go
	Key: aws.String(name),
	})
}
return file, err
```
nit:

```go
return file, nil
```
```go
	Key: aws.String(nameClean),
})
if err != nil {
	if strings.Contains(err.Error(), "404") {
```
awserr is a thing, so please try to cast to typed errors.
@rgarcia if you have time to address the outstanding comments, I can help with merging.
Thanks for the review @Kargakis and hi everyone. I made this PR almost 4 years ago and since then my team's priorities have shifted and we no longer use afero. So unfortunately it's very unlikely that I will find the time to finish the work here.
Thanks @rgarcia, that's why we have come up with our own spin on afero, https://github.com/bsm/bfs — it has adapters for in-memory, file system, S3, GCS, FTP, and soon Azure Blob.
Hi guys, I reworked a fork of this implementation here (https://github.com/fclairamb/afero-s3) and I'm interested in your feedback on it. It's based on @wreulicke's fork of @aviau's fork of @rgarcia's code and basically fixes some limitations it had. The biggest limitation was that it didn't handle streaming (for upload or download) of files; I used a pipe and a goroutine to handle that. In my humble opinion:

- Each Fs implementation should be in its own repository so that each developer can choose the one he wants
- You should recommend some of them, or even create an afero-contrib GitHub organization where you share the core Fs implementations

I intend to build some other afero.Fs backends for a multi-backend FTP server (https://github.com/fclairamb/ftpserver) which was initially only a library (https://github.com/fclairamb/ftpserverlib).
Hi Florent,
if you want to submit your S3 backend as a PR to this repo, I'll happily help with reviewing and merging your work.
Is there still work being done on this?
This is a mostly-working/mostly-tested implementation of S3 as an FS.

There are some gaps in the implementation (for example, `Seek` with whence set to 2), but it is otherwise pretty functional. All of these things increase the complexity of the implementation significantly (some are questionably well-defined, e.g. append in an eventually-consistent FS like S3), so for now I've left them out.
In order to reduce clutter in the root package `github.com/spf13/afero`, I put the whole implementation + tests in a sub package, `github.com/spf13/afero/s3`. In order to leverage the generic `Fs` tests in `afero_test.go`, I copy-pasted them into a sub package, `github.com/spf13/afero/tests`. I also added a few tests to this file as I went along.

In order to run the tests you must pass a bucket: `go test -v github.com/spf13/afero/s3 -s3bucket <test bucket>`. You also need to set `AWS_REGION`, `AWS_ACCESS_KEY_ID`, and `AWS_SECRET_ACCESS_KEY`.

Let me know what you think 😄