Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

put can read from stdin by specifying the '-' source file #98

Closed
wants to merge 10 commits into from

Conversation

Shastick
Copy link
Contributor

Here is a small proposal for put ing via STDIN, as requested in #44.

I'm relatively new to GO (and the github workflow as well...) and more than happy to read any comment!

No tests written in go for now, though :/
@colinmarc
Copy link
Owner

Hi @Shastick,

Thanks for the PR! This looks like a reasonable approach, but can you add some tests and fix the build, to start? Also, please run goimports -w to format your code.

@colinmarc
Copy link
Owner

(If you rebase, the protoc problem in travis should be fixed)

@Shastick
Copy link
Contributor Author

Thanks for the feedback. Will do this asap :)

colinmarc and others added 7 commits January 24, 2018 21:18
(this should fix travis issues, hopefully)
For some reason, make on travis recently started deciding that the
*.pb.go files were older than the *.proto files, ever though they
originate from the same `git checkout`. The easiest way to fix this
is simply to touch them right before we run make, so that they look
newer.
@Shastick
Copy link
Contributor Author

Hoping the test is enough: I did not test scenarios where it should fail (existing file/directory with the same name)

@Shastick Shastick force-pushed the put-stdin branch 3 times, most recently from 6c1a9f1 to 50c093e Compare January 29, 2018 10:50
@Shastick
Copy link
Contributor Author

Added test cases for expected failures

Copy link
Owner

@colinmarc colinmarc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking good! Just a few small nitpicks here and there, then I'll squash and merge.

cmd/hdfs/put.go Outdated
if existing.IsDir() {
fatal(fmt.Errorf("can't write from STDIN into a directory: %s", dest))
} else {
fatal(fmt.Errorf("won't write to an existing file: %s", dest))
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both branches here can be just fatal(&os.PathError{"put", dest, os.ErrExist})

cmd/hdfs/put.go Outdated
}
}

func putFromStdIn(client *hdfs.Client, dest string) {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick: this should be putFromStdin (without the I capitalized), since the standard library spells it that way.

assert_equal $SHA `shasum < $BATS_TMPDIR/mobydick_stdin_test.txt | awk '{ print $1 }'`
}

@test "put from stdin into existing file" {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does the normal hadoop -fs behave in this case, out of curiosity?

Copy link
Contributor Author

@Shastick Shastick Jan 29, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just had a try:

put: `/tmp/testme': File exists

cmd/hdfs/put.go Outdated
}

mode := 0755 | os.ModeDir
// make sure parent dir exists
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This and the next few comments are unnecessary. Also, for future reference, comments should be in complete sentences.

@Shastick
Copy link
Contributor Author

Shastick commented Jan 29, 2018

The failure seems unrelated (only happened on one of the 4 runs) for a rm test:

not ok 50 rm dir without -r, but with -f
# (from function `setup' in test file cmd/hdfs/test/rm.bats, line 8)
#   `$HDFS touch /_test_cmd/rm/b' failed
# no available namenodes: dial tcp 127.0.0.1:9000: i/o timeout

update: meh, did not see that these were marked "allowed failures".

@colinmarc colinmarc closed this in 47855e4 Jan 30, 2018
@colinmarc
Copy link
Owner

colinmarc commented Jan 30, 2018

Merged in 47855e4. Thanks so much for the contributions!

junjieqian pushed a commit to microsoft/colinmarc-hdfs that referenced this pull request Mar 16, 2018
junjieqian pushed a commit to microsoft/colinmarc-hdfs that referenced this pull request Mar 16, 2018
junjieqian added a commit to microsoft/colinmarc-hdfs that referenced this pull request Mar 17, 2018
* implement client options, support multiple namenodes with failover

* Fix an alignment bug causing us to send very small packets

The faulty math in block_write_stream was causing us to always send
512 byte packets.

* rename host to namenodeHost

* document standby retry behaviour

* include detail in namenode error

* deprecate WrapNamenodeConnection

* replace goto behaviour with for loop

* fallback to error from last failure

* travis: bump go version to 1.x

* Regenerate proto files

(this should fix travis issues, hopefully)

* travis: touch *.pb.go before running make

For some reason, make on travis recently started deciding that the
*.pb.go files were older than the *.proto files, ever though they
originate from the same `git checkout`. The easiest way to fix this
is simply to touch them right before we run make, so that they look
newer.

* travis: attempt to fix hadoop tarball caching

* Fix comment for CopyToRemote

Fixes colinmarc#89, colinmarc#90

* Deprecate NewForUser and NewForConnection

* Add the ability for 'put -' to read from stdin

Fixes colinmarc#44, closes colinmarc#98

* Convert from govendor to dep

* Update dependencies

* Tweak put tests

Add a very basic test, and put everything in the same directory so it
gets cleaned up properly.

* Improve conf loading and surrounding tests

- Fix a bug where configuration wasn't loaded from specified paths
- Only search one path at a time, rather than deduping from all
- Correctly handle fs.default.name
- Improve test coverage and documentation

* Fix the merge from upstream

* more fixes to merge master

* more fixes
@Shastick Shastick deleted the put-stdin branch August 20, 2019 12:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants