Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement find command. #2201

Merged
merged 1 commit into from
Sep 5, 2017
Merged

Implement find command. #2201

merged 1 commit into from
Sep 5, 2017

Conversation

CarterMcClellan
Copy link
Contributor

@CarterMcClellan CarterMcClellan commented Jul 4, 2017

In response to Issue #2175, with the addition of the following commands: ignore, newer, older, larger, smaller, maxdepth, and the "or" operator. For further expansion on the command see: https://gist.github.com/CarterMcClellan/3744f1ad74ec757bf83163a1c5d25183.

Fixes #2175

cmd/find-main.go Outdated
Usage: "Prints the found files given args to STDOUT",
},
cli.StringFlag{
Name: "mcexec",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not just exec?

Copy link
Contributor

@badscooter23 badscooter23 Jul 4, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The rationale for calling it --mcexec is that the command has to be an mc command (vs any shell command which would be implied by --exec).

Why?

mc find myminio -name *.jpg

will return something like

myminio/somebucket/somefolder/x1.jpg
mymino/somebucket/somefolder/x2.jpg

the names returned (if the are names of objects in an object store) are not understandable to a standard unix/linux shell command, so

mc find myminio --name *.jpg --exec "cp {} ~/pictures"

Fails because the shell cp command does not work on objects in myminio...

cp myminio/somebucket/somefolder/x1.jpg ~/pictures

gives a "file not found" error, etc.

mc find myminio --name *.jpg --mcexec "mc cp {} ~/pictures"

"execs" and does a

mc cp myminio/somebucket/somefolder/x1.jpg ~/pictures

Which will succeed.

--mcexec makes it more obvious/clear that the command has to be an mc command.

Copy link
Member

@harshavardhana harshavardhana Jul 4, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@badscooter23 it doesn't need to be mc always.. now this begs a question we can't pass in the alias since its not meaningful always.

I think we should have a meaning like

mc find myminio --name *.jpg --exec "mc cp {alias} ~/pictures"
mc find myminio --name *.jpg --exec "wget {full} -O ~/pictures/{name}"

Now the question is what will be the default {} be is it full or alias . This we will have to discuss with @abperiasamy since this was not covered in the original proposal.

Copy link
Contributor Author

@CarterMcClellan CarterMcClellan Jul 4, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if we had two separate exec flags and the default for mcexec was {alias}, and the default for regular exec was {full}?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there is a lot of value in enhancing the semantics of substitution... So I will elaborate on the details so that we can more easily discuss the best answer.

Today the following is allowed

{} Full pathname (with alias substituted for URL in object store)
{base} Pathname after the last slash (essentially the file name or object name)
{dir} Pathname up to the last slash (full pathname without the file or object name)

If 'mc find' is executed with a target of an object store then the pathnames returned (today) will have the "alias" substituted for the "URL"... for example:

on my system, myminio is at 'http://169.254.194.86:9000'

mc find myminio -name *.jpg
myminio/copy-images/Blinds.jpg
myminio/copy-images/Bokeh_Tails.jpg
...

The find output is easier to read and more intuitive it it returns names with the alias. Those names are not valid outside of mc.

It makes sense that {} substitute the output from the find command (the full path). It might make sense to have a separate flag to control the output format (full URL or with alias).

Independently, I kind of like the idea of have --mcexec (which always takes an mc command) and --exec (which can take any command). The user will just have a syntax available to get full pathnames (as full URLS) and will have to be aware that only some shell commands will be able to use those URLs. (your example of wget makes sense, but lots of other commands will not work even with the "full" URL).

Copy link
Contributor Author

@CarterMcClellan CarterMcClellan Jul 4, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or, to avoid introducing more complexity to the usage of the command what if a user is trying to exec a command such as "wget {full} -O ~/pictures/{name}", then what if we peak inside of the exec string and see if the command which they are passing is or is not an valid mc command? If it is not then interpret {} as {full}, but if the command passed is a valid mc command then we can interpret {} as {alias}.

Thus allowing us to only need the mcexec flag, and not need to introduce more substitution args.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if we had two separate exec flags and the default for mcexec was {alias}, and the default for regular exec was {full}?

This might make things easier for writing code, but there is a lot of education on keeping it separate with all these variations - the price of that is immense. It might make the command itself unusable since we cannot easily convey what these mean.

My suggestion is something more simple here keep it always -exec and honor these styles.

-exec "mc cp {} /mnt/pictures"
-exec "cp {} /mnt/pictures"

If user provides just cp then with auto substitution we treat it as mc cp , We never allow unix tools to be used in the beginning but can be used as part of the subsequent commands for example.

-exec "mc share download --json {} | jq -r .share | xargs curl | convert - -resize "10%" - | mc pipe {}-thumb"
-exec "share download --json {} | jq -r .share | xargs curl | convert - -resize "10%" {}-thumb.png"

Here share just becomes mc share

Copy link
Contributor Author

@CarterMcClellan CarterMcClellan Jul 5, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My suggestion is something more simple here keep it always -exec and honor these styles.

Aren't there two issues with this?

  1. If the user has their own command in the file path which they are trying to call (like a shell script).

We never allow unix tools to be used in the beginning but can be used as part of the subsequent commands for example.

  1. To accomplish this notion of "subsequent" has to first be defined, and does this vary from os to os. Ex, on Windows () can be used to group or nest commands (https://ss64.com/nt/syntax-brackets.html), but on Unix the usage of () can allow the output of a command to replace itself (http://www.faqs.org/docs/bashman/bashref_30.html).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the user has their own command in the file path which they are trying to call (like a shell script).

User has it won't work since it doesn't understand the abstraction which mc provides between filesystem and object storage.

To accomplish this notion of "subsequent" has to first be defined, and does this vary from os to os. Ex, on Windows () can be used to group or nest commands (https://ss64.com/nt/syntax-brackets.html), but on Unix the usage of () can allow the output of a command to replace itself (http://www.faqs.org/docs/bashman/bashref_30.html).

No this is verbatim there is no OS level interpretation here, you are not even passing this to bash or windows command line for it to be interpreted. The interpretation only happens after you constructed the right command line.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also NOTE: this is not decided yet from the command point of view until @abperiasamy provides his views - It was my two cents to avoid the complexity of handling all kinds of use cases and defining multiple exec styles.

cmd/find-main.go Outdated
7. Find all files which contain Spock and contain the word "Vulcan" in the filepath name
$ {{.HelpName}} s3 -name "Spock" -path "*/Vulcan/*"

8. Find all files which contain Clingon, then continue watching the bucket "neutralZone". Once a new object is added which meets the given params preform the given exec.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pretty sure its Klingon.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:P haha

cmd/find.go Outdated
@@ -257,6 +257,24 @@ func subArgsHelper(args, filepath string, fileContent contentMessage) string {
}
}

for {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These two for loops should be combined.. and created in a concise manner.. Basically you need to have a parser code which can execute a function as well instead.

Also not sure why would you need an infinite for loop.

cmd/find.go Outdated
if strings.Contains(str, "{url}") {
str = getPurl(filepath)
for {
if strings.Contains(str, "{\""+"url"+"\"}") {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above.

Copy link
Member

@harshavardhana harshavardhana left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also @CarterMcClellan can you squash your changes?

cmd/find.go Outdated
@@ -0,0 +1,472 @@
/*
* Minio Client (C) 2015 Minio, Inc.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copyright should point to 2017 for newer files.

cmd/find.go Outdated
"github.com/minio/mc/pkg/console"
"github.com/minio/minio/pkg/wildcard"
)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add comment..

cmd/find.go Outdated
}

func (f findMSG) JSON() string {
return console.Colorize("Find", f.pathKey)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not a JSON representation check other commands like ls.go for more inputs.. You have to use json.Marshal

cmd/find.go Outdated
}

func nameFlag(filepath, pattern string) (matched bool, err error) {
base := convert.Base(filepath)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why convert? we should keep filepath and use path for the variable.

cmd/find.go Outdated
pathKey: printString,
})
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Each functions need comments..

cmd/find.go Outdated
fileContent := parseContent(content)
filePath := fileContent.Key

//traversing in a object store not a file path
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above..

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should read https://github.com/golang/go/wiki/CodeReviewComments this as well and apply it throughout this PR.

cmd/find.go Outdated

go func() {
defer wg.Done()

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even this needs a comment.. each large for { loop.

cmd/find.go Outdated

func timeHelper(pattern string) (int, error) {
conversion := map[string]int{
"dy": 1,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason for this custom conversion?

cmd/find.go Outdated
console.Fatalln()
}

shareDB := newShareDBV1()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't need to generate an updated the shareDB file IMO, since find executes and completes quickly as it processes. There is no reason to keep inside the DB. Just do client.ShareDownload()

cmd/find_test.go Outdated
@@ -0,0 +1,78 @@
/*
* Minio Client (C) 2014, 2015 Minio, Inc.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fix the licensing.. this is a new file keep it 2017 no older dates.

cmd/find.go Outdated
* limitations under the License.
*/

// All mc utilities are stored under the cmd directory

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

package comment should be of the form "Package cmd ..."

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cmd/find-main.go Outdated
func checkFindSyntax(ctx *cli.Context) {
args := ctx.Args()

//help message on [mc][find]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comments should have a //<space> check the Go comments in any standard lib.

cmd/find-main.go Outdated

//mainFind - handler for mc find commands
func mainFind(ctx *cli.Context) error {
// Additional command specific theme customization.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just like this comment change everywhere to add space character after //

cmd/find.go Outdated
)

// FindMSG holds JSON and string values for printing
type FindMSG struct {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FindMSG can be lowercase though. findMSG

cmd/find.go Outdated
}

// NameFlag pattern matches off of the base of the filepath
func NameFlag(path, pattern string) (bool, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function name should be more appropriate NameFlag means its a flag not a matcher function. We should name them appropriately such as

func nameMatch(...)
func pathMatch(...)
func regexMatch(...)

cmd/find.go Outdated
}

// PrintFlag prints the output in accordance with the supplied substitution arguments
func PrintFlag(path string, ctx *cli.Context, fileContent contentMessage) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Context should always be the first argument to the function.. also this function should be renamed as

func doFindPrint()
func doFindExec()
func doFindWatch()
func doFindNewer()
func doFindOlder()
func doFindLargerSize()
func doFindSmallerSize()

cmd/find.go Outdated
}

func isMatch(fileContent contentMessage, orOp bool, ctx *cli.Context) bool {
match := true
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No reason to set match := true since if (match && !orOp) is as good as if (!orOp)`

cmd/find.go Outdated
// verify that the newly added object matches all of the other specified params
if ctx.String("name") != "" {
temp1, err1 := NameFlag(fileContent.Key, ctx.String("name"))
match = temp1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't you think lets say name and ignore were provided if match is set to true by PathFlag() before would be overwritten by the NameFlag which can lead to further issues down the line.

cmd/find.go Outdated
"mh": 30,
"yr": 365,
}
if int1, err1 := strconv.Atoi(pattern); err1 == nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do not use err == nil pattern start with this style.

t, err := strconv.Atoi(pattern)
if err != nil {
         tu := = pattern[len(pattern)-2:] // time unit 
         t, err = strconv.Atoi(pattern[:len(pattern)-2]
         if err != nil {
                  return 0, err 
        }
        t = t * conversion(strings.ToLower(tu))
}
return t, nil

cmd/find.go Outdated
newClnt, err := newClientFromAlias(targetAlias, objectURL)
fatalIf(err, "Error with newClientFromAlias")

expiry := time.Duration(604800) * time.Second
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Specify what is the expiry time.. i.e added a proper comment.

defer close(objectsCh)
objectsCh <- content
}()

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Each for loop needs a comment..

@codecov-io
Copy link

codecov-io commented Jul 27, 2017

Codecov Report

Merging #2201 into master will decrease coverage by 0.29%.
The diff coverage is 1.62%.

Impacted file tree graph

@@           Coverage Diff            @@
##           master   #2201     +/-   ##
========================================
- Coverage     9.3%      9%   -0.3%     
========================================
  Files          98     100      +2     
  Lines        9177    9547    +370     
========================================
+ Hits          854     860      +6     
- Misses       8193    8557    +364     
  Partials      130     130
Impacted Files Coverage Δ
cmd/find-main.go 0% <0%> (ø)
cmd/main.go 0% <0%> (ø) ⬆️
cmd/find.go 1.8% <1.8%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update fe58346...57b6100. Read the comment docs.

@CarterMcClellan CarterMcClellan changed the title Find method W.I.P Find method Jul 31, 2017
Copy link
Member

@harshavardhana harshavardhana left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM..

@CarterMcClellan CarterMcClellan changed the title Find method Mc find command Jul 31, 2017
@harshavardhana harshavardhana changed the title Mc find command mc find command Jul 31, 2017
@harshavardhana harshavardhana changed the title mc find command Implement find command. Jul 31, 2017
Copy link
Member

@harshavardhana harshavardhana left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

diff --git a/cmd/find.go b/cmd/find.go
index ec280fb..9b36a6f 100644
--- a/cmd/find.go
+++ b/cmd/find.go
@@ -115,9 +115,9 @@ func doFindWatch(path string, ctx *cli.Context) {
        noAlias := (e == nil)
 
        clnt, content, err := url2Stat(path)
-       errorIf(err, "Unable to construct client")
-       fileContent := parseContent(content)
+       fatalIf(err, "Unable to construct client")
 
+       fileContent := parseContent(content)
        if noAlias {
                //pass hostname (alias) to to watchEvents so that it can remove from pathnames returned from watch
                watchEvents(ctx, clnt, params, alias, fileContent)
@@ -337,7 +337,13 @@ func watchEvents(ctx *cli.Context, clnt Client, params watchParams, alias string
                                if match && ctx.String("exec") != "" {
                                        doFindExec(ctx, msg.Key, fileContent)
                                }
-
+                               if match && ctx.String("print") != "" {
+                                       doFindPrint(msg.Key, ctx, msg)
+                               } else if match {
+                                       printMsg(findMSG{
+                                               Path: msg.Key,
+                                       })
+                               }
                        case err, ok := <-watchObj.Errors():
                                if !ok {
                                        return

}
for _, r := range pattern {
rpattern = append(rpattern, r)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Converting string to []rune is as simple as []rune(s) where s is a string. So the above for loops and make lines can be done away with.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a reason for this..

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whats the reason?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually i don't remember the reason, perhaps nothing that i can think of right now. you can send a fix for this..

}
for _, r := range pattern {
rpattern = append(rpattern, r)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Conversion is just: []rune(s)

cmd/find.go Outdated
msg.Time, _ = time.Parse(time.RFC822, event.Time)
msg.Size = event.Size

msg.Key = alias + msg.Key
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is more readably assigned like so:

msg := contentMessage{
    Key: strings.TrimPrefix(..),
    Time: ...,
    ....
}

cmd/find-main.go Outdated
},
cli.BoolFlag{
Name: "or",
Usage: "Treat given params as exclusive as opposed to being interdependent",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This flag needs to be clarified - I assumed that it applies to the conditions/expressions on either side of the --or flag on the command line (this is the standard meaning in the GNU/POSIX find command). From the implementation it is clear now that it is meant as a global flag that lets a file match any of the given conditions and be included in the output. It needs to be described in more simple terms.

cmd/find-main.go Outdated
},
cli.BoolFlag{
Name: "watch",
Usage: "Prefroms given args every time that an object is created",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prefroms -> Performs - there are multiple different misspellings of the same word in many places in this patch.

cmd/find-main.go Outdated
7. Find all files which contain emacs and contain the word "badEditor" in the filepath name
$ {{.HelpName}} s3 -name "emacs" -path "*/badEditor/*"

8. Find all files which contain foo, then continue watching the bucket "buck". Once a new object is added which meets the given params preform the given exec.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Misspelling.

}
case '*':
return deepMatchRune(str, pattern[1:], simple) ||
(len(str) > 0 && deepMatchRune(str[1:], pattern, simple))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Neat algo, but I wonder if it would be more efficient to split on * and substring search for the resulting split patterns. This looks a bit inefficient because each call leads to two branched recursive calls (I didnt analyze it in detail).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not related to find @donatello

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code is pretty much optimized , if this can be reduced further from the following benchmarks feel free to send a PR.

+func BenchmarkAscii(t *testing.B) {
+       for i := 0; i < t.N; i++ {
+               if !wildcard.Match("hello", "hello") {
+                       t.Fatal("fail")
+               }
+       }
+}
+
+func BenchmarkUnicode(t *testing.B) {
+       for i := 0; i < t.N; i++ {
+               if !wildcard.Match("h情llo", "h情llo") {
+                       t.Fatal("fail")
+               }
+       }
+}
$ go test -bench .
BenchmarkAscii-4     	30000000	        44.7 ns/op
BenchmarkUnicode-4   	20000000	        68.0 ns/op


filePath = s
fileContent.Key = s
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not clear how max-depth is limiting the depth of the search - for any depth looks like it is reassigning filePath and fileContent.Key. Shouldn't there be a line to skip (or continue statement in the loop) the file if it is too many levels deep?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was trying this out and there appears to be a bug:

$ mc find myminio/bucket1
myminio/bucket1/NOTICE
myminio/bucket1/a/NOTICE
myminio/bucket1/a/b/NOTICE
myminio/bucket1/a/b/c/NOTICE
$ mc find myminio/bucket1 --maxdepth 1
myminio/
myminio/
myminio/
myminio/
$ mc find myminio/bucket1 --maxdepth 2
myminio/bucket1/
myminio/bucket1/
myminio/bucket1/
myminio/bucket1/
$ mc find myminio/bucket1 --maxdepth 3
myminio/bucket1/NOTICE
myminio/bucket1/a/
myminio/bucket1/a/
myminio/bucket1/a/
$ mc find myminio/bucket1 --maxdepth 4
myminio/bucket1/NOTICE
myminio/bucket1/a/NOTICE
myminio/bucket1/a/b/
myminio/bucket1/a/b/
$ mc find myminio/bucket1 --maxdepth 5
myminio/bucket1/NOTICE
myminio/bucket1/a/NOTICE
myminio/bucket1/a/b/NOTICE
myminio/bucket1/a/b/c/
$ mc find myminio/bucket1 --maxdepth 6
myminio/bucket1/NOTICE
myminio/bucket1/a/NOTICE
myminio/bucket1/a/b/NOTICE
myminio/bucket1/a/b/c/NOTICE

mc find should not list directories (and certainly not repeatedly).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There appears to be another corner case bug:

With find (executed in the mc codebase top directory):

$ find vendor/ -maxdepth 1
vendor/
vendor/golang.org
vendor/vendor.json
vendor/github.com
vendor/gopkg.in
$ find vendor/ -maxdepth 2
vendor/
vendor/golang.org
vendor/golang.org/x
vendor/vendor.json
vendor/github.com
vendor/github.com/dustin
vendor/github.com/rjeczalik
vendor/github.com/fatih
vendor/github.com/hashicorp
vendor/github.com/minio
vendor/github.com/cheggaaa
vendor/github.com/go-ini
vendor/github.com/mattn
vendor/github.com/pkg
vendor/gopkg.in
vendor/gopkg.in/check.v1

whereas with mc find:

$ mc find vendor/ --maxdepth 1
$ mc find vendor/ --maxdepth 2
vendor/vendor.json

mc is not listing directories - that is fine. However, with a max-depth of 1 it is not traversing the given path at all. There is an off-by-one bug here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that this is taken care of now

@CarterMcClellan
Copy link
Contributor Author

Alright so I have added the following change to find.

Find watch will now have a syntax check in the beginning to verify that a bucket name is specified, so that it can gracefully exit.

A FORMAT section has been added to provide valid suffixes for usage of the time and size related commands.

Time conversions have been updated to d,w,m,y.

All of those ^, are AB's requested changes

cmd/find-main.go Outdated
},
cli.StringFlag{
Name: "path",
Usage: "Match diretcory names matching wildcard pattern",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Misspelling.

cmd/find-main.go Outdated
6. Find all jpgs, pngs, and gifs using regex
$ {{.HelpName}} s3/photos -regex "(?i)\.(jpg|png|gif)$"

7. Mirror all photos from s3 bucket *coninuously* from the s3 bucket to minio play test bucket.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Misspelling.

cmd/find-main.go Outdated
Smaller/Larger Flags: Will accept the suffixes: "b": Byte, "kib": KiByte, "kb": KByte, "mib": MiByte, "mb": MByte, "gib": GiByte, "gb": GByte, "tib": TiByte, "tb": TByte, "pib": PiByte, "pb": PByte, "eib": EiByte, "eb": EByte, // Without suffix "": Byte, "ki": KiByte, "k": KByte, "mi": MiByte, "m": MByte, "gi": GiByte, "g": GByte, "ti": TiByte, "t": TByte, "pi": PiByte, "p": PByte, "ei": EiByte, "e": EByte,

Older/Newer Flags: Will accept the suffixes d, w, m, y.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This section does not look complete. It should probably be renamed as UNITS:

UNITS:
   --smaller, --larger flags accept human-readable case-insensitive number suffixes such as "k", "m", "g" and "t" referring to the metric units KB, MB, GB and TB respectively. Adding an "i" to these prefixes, uses the IEC units, so that "gi" refers to "gibibyte" or "GiB". A "b" at the end is also accepted. Without suffixes the unit is bytes.

   --older, --newer flags accept the suffixes "d", "w", "m" and "y" to refer to units of days, weeks, months and years respectively. <details about 1year=365 days or 12 months, etc> 

cmd/find-main.go Outdated
{{end}}
EXAMPLES:
1. Find all files named foo from all buckets.
$ {{.HelpName}} s3 -name "file"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think examples should use the double-dash option format like "--name" to be consistent with examples from other mc commans.


filePath = s
fileContent.Key = s
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was trying this out and there appears to be a bug:

$ mc find myminio/bucket1
myminio/bucket1/NOTICE
myminio/bucket1/a/NOTICE
myminio/bucket1/a/b/NOTICE
myminio/bucket1/a/b/c/NOTICE
$ mc find myminio/bucket1 --maxdepth 1
myminio/
myminio/
myminio/
myminio/
$ mc find myminio/bucket1 --maxdepth 2
myminio/bucket1/
myminio/bucket1/
myminio/bucket1/
myminio/bucket1/
$ mc find myminio/bucket1 --maxdepth 3
myminio/bucket1/NOTICE
myminio/bucket1/a/
myminio/bucket1/a/
myminio/bucket1/a/
$ mc find myminio/bucket1 --maxdepth 4
myminio/bucket1/NOTICE
myminio/bucket1/a/NOTICE
myminio/bucket1/a/b/
myminio/bucket1/a/b/
$ mc find myminio/bucket1 --maxdepth 5
myminio/bucket1/NOTICE
myminio/bucket1/a/NOTICE
myminio/bucket1/a/b/NOTICE
myminio/bucket1/a/b/c/
$ mc find myminio/bucket1 --maxdepth 6
myminio/bucket1/NOTICE
myminio/bucket1/a/NOTICE
myminio/bucket1/a/b/NOTICE
myminio/bucket1/a/b/c/NOTICE

mc find should not list directories (and certainly not repeatedly).

cmd/find-main.go Outdated

--smaller, --larger flags accept human-readable case-insensitive number suffixes such as "k", "m", "g" and "t" referring to the metric units KB, MB, GB and TB respectively. Adding an "i" to these prefixes, uses the IEC units, so that "gi" refers to "gibibyte" or "GiB". A "b" at the end is also accepted. Without suffixes the unit is bytes.

--older, --newer flags accept the suffixes "d", "w", "m" and "y" to refer to units of days, weeks, months and years respectively. <details about 1year=365 days or 12 months, etc>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you fill in the exact details about 1y=365 days or 12 months or whatever the actual conversion is?

@vadmeste
Copy link
Member

@CarterMcClellan, can you fix your commit message?

$ {{.HelpName}} s3/photos --name "*.jpg" --exec "mc cp {} play/test"

5. Find all jpg images from any folder prefixed with album.
$ {{.HelpName}} s3/photos --name "*.jpg" --path "*/album*/*"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't we merge --name and --path in this case? I am not sure if there are some cases when --path is mandatory

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah this was done to demonstrate the AND logic which is used by default in the find method i/e I want to find something ending in *.jpg with album in the file path.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah this was done to demonstrate the AND logic which is used by default in the find method i/e I want to find something ending in *.jpg with album in the file path.

I still think having both --name and --path flags is useless, only one is sufficient, what do you think @CarterMcClellan ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reasoning was to be closer to find command and AB wanted this extensible nature. Part of the proposal gist ..

$ {{.HelpName}} s3/photos --name "*.jpg" --path "*/album*/*"

6. Find all jpgs, pngs, and gifs using regex
$ {{.HelpName}} s3/photos --regex "(?i)\.(jpg|png|gif)$"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't we merge --regex with --name ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No because we are evaluating for several different suffixes, I think that we could technically but this would be less effective, would have to use -or flag and then do -name "jpg" -name "png" .. etc. I think this was chosen to illustrate how regex can be more flexible/ terse in certain scenarios.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No because we are evaluating for several different suffixes, I think that we could technically but this would be less effective, would have to use -or flag and then do -name "jpg" -name "png" .. etc. I think this was chosen to illustrate how regex can be more flexible/ terse in certain scenarios.

👍

$ {{.HelpName}} s3/buck --name "*foo" --watch --exec "mc cp {} play/test"

8. Generate self expiring urls (7 days), for all objects between 64 MB, and 1 GB in size.
$ {{.HelpName}} s3 --larger 64MB --smaller 1GB --print {url}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shared_url is a better name I guess.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inside of the curly braces as a substitution argument?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inside of the curly braces as a substitution argument?

yes

Copy link
Contributor Author

@CarterMcClellan CarterMcClellan Aug 12, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AB chose this one, I think because the original ideas "purl" or "presigned url" were thought to be more confusing and that url was very concise and easy to understand.

cmd/find-main.go Outdated

--smaller, --larger flags accept human-readable case-insensitive number suffixes such as "k", "m", "g" and "t" referring to the metric units KB, MB, GB and TB respectively. Adding an "i" to these prefixes, uses the IEC units, so that "gi" refers to "gibibyte" or "GiB". A "b" at the end is also accepted. Without suffixes the unit is bytes.

--older, --newer flags accept the suffixes "d", "w", "m" and "y" to refer to units of days, weeks, months and years respectively. With the standard rate of conversion being 7 days being in 1 week, 30 days in 1 week, and 365 days in one year.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"30 days in 1 week" -> "30 days in 1 month"


filePath = s
fileContent.Key = s
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There appears to be another corner case bug:

With find (executed in the mc codebase top directory):

$ find vendor/ -maxdepth 1
vendor/
vendor/golang.org
vendor/vendor.json
vendor/github.com
vendor/gopkg.in
$ find vendor/ -maxdepth 2
vendor/
vendor/golang.org
vendor/golang.org/x
vendor/vendor.json
vendor/github.com
vendor/github.com/dustin
vendor/github.com/rjeczalik
vendor/github.com/fatih
vendor/github.com/hashicorp
vendor/github.com/minio
vendor/github.com/cheggaaa
vendor/github.com/go-ini
vendor/github.com/mattn
vendor/github.com/pkg
vendor/gopkg.in
vendor/gopkg.in/check.v1

whereas with mc find:

$ mc find vendor/ --maxdepth 1
$ mc find vendor/ --maxdepth 2
vendor/vendor.json

mc is not listing directories - that is fine. However, with a max-depth of 1 it is not traversing the given path at all. There is an off-by-one bug here.

cmd/find.go Outdated
noAlias := (e == nil)

clnt, content, err := url2Stat(path)
errorIf(err, "Unable to construct client")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you check #2201 (review) and make the changes?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @donatello we need to fix it.. Otherwise there is a crash in the current flow. I didn't add the context my mistake.

Copy link
Member

@donatello donatello left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@donatello
Copy link
Member

This looks ready for merge. Any more reviews?

@harshavardhana
Copy link
Member

@CarterMcClellan can you rebase your PR? its ready to be merged in.

@harshavardhana
Copy link
Member

harshavardhana commented Sep 1, 2017

@CarterMcClellan you need to rebase to the master branch here https://github.com/minio/mc . To do that you need to do following locally

git fetch <remote_pointing_to_minio_mc>
git rebase <remote_pointing_to_minio_mc>/master

NOTE: Make sure you are in the find branch before you do above commands.

@deekoder deekoder merged commit 49ef42d into minio:master Sep 5, 2017
@OlegAlexander
Copy link

OlegAlexander commented Jun 18, 2024

Dear mc team, could you please add more detailed documentation/usage examples of how to use mc rm to remove files that match a pattern? I did eventually figure it out but it took me a while:

mc find myminio/mybucket/myfolder --name "pattern*.jpg" --exec "mc rm {}"

Here's how I found the right command:

  1. Go here to find no info about wildcards.
  2. Google minio "mc rm" wildcard.
  3. Go here and then come to this page.
  4. Read the examples on how to use mc find with --exec in the discussion above.

EDIT: I see there is documentation about --exec usage here. But I think all the other commands that could use wildcards should link to the --exec docs.

Thank you!

@marktheunissen
Copy link
Contributor

Hi @OlegAlexander, could you please put your request on https://github.com/minio/docs, as that's where documentation for mc lives. Either make a PR with suggested changes, or file an issue there for discussion, rather than comment on old pull requests.

Pattern matching is documented here: https://min.io/docs/minio/linux/reference/minio-mc.html#pattern-matching

@minio minio locked as resolved and limited conversation to collaborators Jun 19, 2024
@harshavardhana
Copy link
Member

Locked of any off-topic conversations.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Proposal for new mc find command
10 participants