Fix bug when filtering s3 locations #531

Merged 2 commits on Dec 5, 2013

jamesls (Member) commented Dec 5, 2013

Previously, the pattern was evaluated against the entire S3 path,
bucket name included. This doesn't matter for suffix searches, but for
a prefix search you'd have to include the bucket name. This is
inconsistent with filtering local files. Now they're both consistent.
Given:

  --exclude 'foo*'

This will filter any full path that starts with foo. So locally:

  rootdir/
    foo.txt      # yes
    foo1.txt     # yes
    foo/bar.txt  # yes
    bar.txt      # no

And on s3, we now have the same results:

  bucket/
    foo.txt      # yes
    foo1.txt     # yes
    foo/bar.txt  # yes
    bar.txt      # no
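
To illustrate the matching rule described above, here is a minimal sketch using Python's standard `fnmatch` module. The helper `is_excluded` and the way the pattern is anchored at the root are assumptions made for illustration; this is not the actual code changed in this pull request.

```python
import fnmatch


def is_excluded(full_path, root, pattern):
    """Hypothetical helper: match `pattern` relative to `root`.

    `root` is either a local directory or a bucket name. Anchoring the
    pattern at the root is an assumption about how the two cases can be
    made consistent.
    """
    # Turn 'foo*' into '<root>/foo*' so the user never types the root.
    anchored = root.rstrip('/') + '/' + pattern
    # fnmatchcase: '*' matches any characters, including '/'.
    return fnmatch.fnmatchcase(full_path, anchored)


names = ['foo.txt', 'foo1.txt', 'foo/bar.txt', 'bar.txt']
local = {n: is_excluded('rootdir/' + n, 'rootdir', 'foo*') for n in names}
s3 = {n: is_excluded('bucket/' + n, 'bucket', 'foo*') for n in names}
```

With this anchoring, `local` and `s3` hold the same yes/no answers per file name, which is the consistency the description is after.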

I also added debug logs to help customers troubleshoot why their
filters aren't working the way they expect. For example:

s3.filters - DEBUG - /private/tmp/syncme/level-1/file-7 matched exclude filter: /private/tmp/syncme/*
s3.filters - DEBUG - /private/tmp/syncme/level-1/file-7 did not match include filter: /private/tmp/syncme/f*
s3.filters - DEBUG - /private/tmp/syncme/level-1/file-7 matched include filter: /private/tmp/syncme/level*
s3.filters - DEBUG - =/private/tmp/syncme/level-1/file-7 final filtered status, should_include: True
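
The log lines above are consistent with filters being applied in order, with the last matching filter deciding the outcome. A rough sketch of that behavior follows; the function `should_include`, the default of including everything, and the tuple format for filters are assumptions, not the real `s3.filters` implementation.

```python
import fnmatch
import logging

LOG = logging.getLogger('s3.filters')


def should_include(path, filters):
    """Hypothetical sketch of ordered include/exclude evaluation.

    `filters` is a list of ('include' | 'exclude', pattern) tuples in
    command-line order. Assumption: a later matching filter overrides
    an earlier one, and files are included by default.
    """
    include = True
    for kind, pattern in filters:
        if fnmatch.fnmatchcase(path, pattern):
            include = (kind == 'include')
            LOG.debug("%s matched %s filter: %s", path, kind, pattern)
        else:
            LOG.debug("%s did not match %s filter: %s", path, kind, pattern)
    LOG.debug("=%s final filtered status, should_include: %s", path, include)
    return include


filters = [
    ('exclude', '/private/tmp/syncme/*'),
    ('include', '/private/tmp/syncme/f*'),
    ('include', '/private/tmp/syncme/level*'),
]
result = should_include('/private/tmp/syncme/level-1/file-7', filters)
```

Here the exclude filter matches first, the `f*` include does not match, and the `level*` include matches last, so the file ends up included.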

This makes the unit tests more concise and easier to
follow.  Removes a lot of the duplication.  Also makes it
easier to write new filter tests.

garnaat (Contributor) commented Dec 5, 2013

LGTM
