Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sort: implement -k and -t support #1996

Merged
merged 23 commits into from
Apr 10, 2021
Merged

sort: implement -k and -t support #1996

merged 23 commits into from
Apr 10, 2021

Conversation

miDeb
Copy link
Contributor

@miDeb miDeb commented Apr 1, 2021

For #1728.

This allows to specify keys after the -k flag and a custom field
separator using -t.

Not sure if I've missed something... For reference GNU coreutils documentation about sort , POSIX spec.

This allows to specify keys after the -k flag and a custom field
separator using -t.

Support for options for specific keys is still missing, and the -b flag
is not passed down correctly.
@miDeb
Copy link
Contributor Author

miDeb commented Apr 1, 2021

There's a test failure (of a test I added) saying thread 'test_sort::test_keys_invalid_field_zero' panicked at 'Broken pipe (os error 32)', tests/common/util.rs:733:37. Is this a problem with the test or something else?

@miDeb
Copy link
Contributor Author

miDeb commented Apr 1, 2021

I think I want to implement the missing functionality first, before requesting a review

@miDeb miDeb marked this pull request as draft April 1, 2021 20:00
src/uu/sort/src/sort.rs Outdated Show resolved Hide resolved
src/uu/sort/src/sort.rs Outdated Show resolved Hide resolved
src/uu/sort/src/sort.rs Outdated Show resolved Hide resolved
@kimono-koans
Copy link
Contributor

Just FYI, I've been working on sort too. See most recently #2008. Pleased to let you implement -k and -t, but be aware there is some functionality that I've added and I'd hate for us to needlessly conflict. If you need my help with anything, please let me know.

@miDeb
Copy link
Contributor Author

miDeb commented Apr 2, 2021

Thanks, I'll take a look at your PR, hopefully there aren't too many conflicts...
I think I figured out how to do it, but if you have any suggestions I'll be happy to take them.

@miDeb miDeb changed the title sort: implement basic -k and -t support sort: implement -k and -t support Apr 2, 2021
@sylvestre
Copy link
Sponsor Contributor

---- test_sort::test_keys_invalid_field_zero stdout ----

current_directory_resolved: 

run: /home/travis/build/uutils/coreutils/target/debug/coreutils sort -k 0.1

thread 'test_sort::test_keys_invalid_field_zero' panicked at 'Broken pipe (os error 32)', tests/common/util.rs:742:37

note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace.

Is failing twice in the CI

@miDeb
Copy link
Contributor Author

miDeb commented Apr 2, 2021

I haven't been able to reproduce this locally, and I have no idea why this is failing for this test specifically...

@miDeb
Copy link
Contributor Author

miDeb commented Apr 2, 2021

Well, that seems to have worked. There's one remaining failing test, saying
Error: Incompatible 'Cargo.lock' format; try cargo +1.40.0 update, which I guess isn't something I have caused because I haven't modified Cargo.lock at all.

@sylvestre
Copy link
Sponsor Contributor

Yeah, don't bother about minrustv :)

@sylvestre
Copy link
Sponsor Contributor

I will fix it here:
#2013

@sylvestre
Copy link
Sponsor Contributor

Please let me know when it is ready :)

@miDeb
Copy link
Contributor Author

miDeb commented Apr 8, 2021

This should be more or less ready. I don't know if I should wait for #2008 to be merged first so I can rebase on top of it, as there are going to be some conflicts

@miDeb miDeb marked this pull request as ready for review April 8, 2021 15:42
@sylvestre
Copy link
Sponsor Contributor

merged #2008 sorry for the conflicts :(
it has plenty of tests

@miDeb
Copy link
Contributor Author

miDeb commented Apr 8, 2021

Thanks, I already expected that :)

@miDeb
Copy link
Contributor Author

miDeb commented Apr 10, 2021

This should be ready now. Performance seems to have regressed slightly, but I don't think it's a big deal. Note that if you want to measure performance you should specify an outfile (-o), because I made writes to stdout buffered as a drive-by change, resulting in a clear perf win for redirecting stdout (sort > outfile).

@miDeb
Copy link
Contributor Author

miDeb commented Apr 10, 2021

Spoke too soon, some tests are failing. Will fix them :)

@sylvestre
Copy link
Sponsor Contributor

could you please create and document how you did the benchmarking?
For example, create a
BENCHMARKING.md
file
with where to get a data set (or generate it) and the commands to evaluate performances? (with hyperfine if possible)

thanks

@miDeb
Copy link
Contributor Author

miDeb commented Apr 10, 2021

I added a benchmarking.md file. Below is a comparison of performance before and after my change:

wordlist: (hyperfine "target/release/coreutils sort shuffled_wordlist.txt -o output.txt")
before:
Time (mean ± σ): 26.1 ms ± 3.6 ms [User: 64.3 ms, System: 9.6 ms]
after:
Time (mean ± σ): 27.2 ms ± 2.3 ms [User: 107.5 ms, System: 12.7 ms]

wordlist ignoring the case (hyperfine "target/release/coreutils sort shuffled_wordlist.txt -f -o output.txt")
before:
Time (mean ± σ): 253.4 ms ± 2.5 ms [User: 590.0 ms, System: 177.0 ms]
after:
Time (mean ± σ): 41.5 ms ± 4.0 ms [User: 142.8 ms, System: 13.0 ms]
This performance improvement is because we're uppercasing everything at the beginning once instead of at every comparison.

numbers: (hyperfine "target/release/coreutils sort shuffled_numbers.txt -n -o output.txt")
before:
Time (mean ± σ): 40.1 ms ± 3.8 ms [User: 338.7 ms, System: 9.7 ms]
after:
Time (mean ± σ): 44.6 ms ± 3.6 ms [User: 370.2 ms, System: 9.5 ms]

@sylvestre sylvestre merged commit 49c9d8c into uutils:master Apr 10, 2021
@miDeb miDeb mentioned this pull request Apr 10, 2021
@miDeb miDeb mentioned this pull request Apr 26, 2021
4 tasks
@miDeb miDeb deleted the sort-fields branch April 26, 2021 15:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants