-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(Bulk Loader + Live Loader): Supporting Loading files via s3/minio #7359
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left a few comments, but otherwise
Reviewed 24 of 24 files at r1.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @gja, @manishrjain, @pawanrawal, and @vvbalaji-dgraph)
chunker/chunk.go, line 370 at r1 (raw file):
file string
Maybe call this filename
so it matches the name mentioned in the comment.
This signature is also getting kinda long (past 100 characters).
filestore/filestore.go, line 23 at r1 (raw file):
url, err := url.Parse(path) x.Check(err)
Can you test if this doesn't return an err (and then panic from x.Check) with local filesystem paths? I'm wondering if there's an edge case here if, say, the local path has spaces or something that's different from URLs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @danielmai, @manishrjain, @pawanrawal, and @vvbalaji-dgraph)
chunker/chunk.go, line 370 at r1 (raw file):
Previously, danielmai (Daniel Mai) wrote…
file string
Maybe call this
filename
so it matches the name mentioned in the comment.This signature is also getting kinda long (past 100 characters).
DONE
filestore/filestore.go, line 23 at r1 (raw file):
Previously, danielmai (Daniel Mai) wrote…
url, err := url.Parse(path) x.Check(err)
Can you test if this doesn't return an err (and then panic from x.Check) with local filesystem paths? I'm wondering if there's an edge case here if, say, the local path has spaces or something that's different from URLs.
Tested. Even url.Parse(C:\foobar\file with spaces and a % works.rdf.gz
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 23 of 24 files at r1, 2 of 2 files at r2.
Reviewable status: all files reviewed, 11 unresolved discussions (waiting on @danielmai, @gja, @pawanrawal, and @vvbalaji-dgraph)
chunker/chunk.go, line 370 at r2 (raw file):
// StreamReader returns a bufio given a ReadCloser. The file is passed just to check for .gz files func StreamReader(file string, key x.SensitiveByteSlice, f io.ReadCloser) (rd *bufio.Reader, cleanup func()) {
100 chars.
dgraph/cmd/live/run.go, line 407 at r2 (raw file):
// processFile forwards a file to the RDF or JSON processor as appropriate func (l *loader) processFile(ctx context.Context, fs filestore.FileStore, filename string, key x.SensitiveByteSlice) error {
100 chars.
filestore/filestore.go, line 1 at r2 (raw file):
package filestore
license.
filestore/local_files.go, line 1 at r2 (raw file):
package filestore
license.
filestore/remote_files.go, line 1 at r2 (raw file):
package filestore
license
filestore/remote_files.go, line 21 at r2 (raw file):
func (rf *remoteFiles) Open(path string) (io.ReadCloser, error) { url, err := url.Parse(path) x.Check(err)
return error, don't check fail.
filestore/remote_files.go, line 51 at r2 (raw file):
bucket, prefix := rf.mc.ParseBucketAndPrefix(url.Path) for obj := range rf.mc.ListObjectsV2(bucket, prefix, true, context.TODO().Done()) {
What happens if you pass a nil channel?
systest/bulk_live/common/bulk_live_fixture.go, line 69 at r2 (raw file):
if opts.remote { schemaPath = "minio://" + testutil.ContainerAddr("minio1", 9001) + "/data/" + schemaPath + "?secure=false"
100 chars.
testutil/exec.go, line 130 at r2 (raw file):
func DgraphBinaryPath() string { // Useful for OSX, as GOPATH/bin/dgraph is likely set to the linux compiled version of dgraph for docker
100 chars.
#7359) It is now possible to pass an s3 / minio path to bulk and live loader. -f s3:///<bucket>/path-to-rdf
This change is