Commit
for release
Vanessa McHale committed Jun 27, 2017
1 parent c1842ff commit f7272f9
Showing 10 changed files with 98 additions and 58 deletions.
4 changes: 3 additions & 1 deletion CONTRIBUTING.md
@@ -12,7 +12,9 @@ If you'd like binaries for any of the platforms listed
## Support for other languages

If you'd like me to add support for other languages, just open an issue and tell
me all file extensions associated with its build artifacts.
me all file extensions associated with its build artifacts, plus common names
for build directories, and names/extensions of any configuration files typically
used.

Alternately, if you'd like to fork it and open a PR yourself, the relevant regex is
[here](https://github.com/vmchale/file-sniffer/blob/master/src/walk_parallel/single_threaded.rs#L73).
35 changes: 30 additions & 5 deletions README.md
@@ -4,7 +4,7 @@

If you do a significant amount of programming, you'll probably end up with
build artifacts scattered about. `sn` is a tool to help you find those
artifacts.
artifacts.

`sn` is also a replacement for `du`. It has far nicer
output, saner commands and defaults, and it even runs faster on big directories
@@ -82,14 +82,14 @@ To turn off colorized output:
export CLICOLOR=0
```

### Comparison
### Comparison (or, 100 Things I Hate About du)

#### Reasons to use `du`

* Reads disk usage, not file sizes
* Optionally dereferences symlinks
* Slightly faster on small directories
* Well-supported
* Slightly faster on small directories
* Stable and well-supported

#### Reasons to use `sn`

@@ -105,6 +105,29 @@ export CLICOLOR=0
* Extensible in Rust
* Benefits from upstream improvements in Rust ecosystem

#### Benchmark results

| Directory | Tool | Command | Time |
| --------- | ---- | ------- | ---- |
| Source | sn | `sn p` | 60.74 ms |
| Source | sn | `sn a` | 99.92 ms |
| Source | du | `du -hacd2` | 88.28 ms |
| Build | sn | `sn p` | 185.2 ms |
| Build | sn | `sn a` | 271.9 ms |
| Build | du | `du -hacd2` | 195.5 ms |
| Project | sn | `sn p` | 36.68 ms |
| Project | sn | `sn a` | 42.90 ms |
| Project | du | `du -hacd2` | 35.53 ms |

These commands are all essentially equivalent in function, except that `sn p`
may use more threads than `sn a` or `du`.

Results were obtained using Gabriel Gonzalez's [bench](https://github.com/Gabriel439/bench)
tool. "Source" was my programming directory alone, comprising data, source code,
and version control; around 600MB total. "Project" was a single polyglot project,
plus artifacts; around 1GB total. "Build" was my programming directory, with
current projects built; around 4GB total.

#### Screenshots (alacritty + solarized dark)

##### The Tin Summer
@@ -119,7 +142,8 @@ export CLICOLOR=0

Currently, `sn` looks for files that either have an extension associated with
build artifacts, or executable files that are ignored by version control. It also looks for "build
directories", like `.stack-work`, `elm-stuff`, etc. and it considers *all* their
directories", like `.stack-work`, `elm-stuff`, etc. and if it finds a
configuration file like `tweet-hs.cabal`, it considers *all* their
contents to be build artifacts.
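
The build-directory check described above can be sketched roughly as follows. This is an illustration only, not the code in `single_threaded.rs`: the name `looks_like_build_dir` and the idea of passing the parent directory's file names in as a slice are assumptions made to keep the example self-contained, and it covers only a subset of the heuristics documented on `is_project_dir`.

```rust
/// Hypothetical helper (not part of `sn`): decide whether a directory named
/// `dir_name` should be treated as a build directory, given the names of the
/// files that sit next to it in the parent directory.
fn looks_like_build_dir(dir_name: &str, sibling_files: &[&str]) -> bool {
    // True if any sibling file name ends with the given suffix.
    let has_suffix = |suffix: &str| sibling_files.iter().any(|f| f.ends_with(suffix));
    // True if a sibling file has exactly this name.
    let has_file = |name: &str| sibling_files.iter().any(|f| *f == name);

    match dir_name {
        ".stack-work" => has_suffix(".cabal") || has_file("package.yaml"),
        "target" => has_file("Cargo.toml"),
        "elm-stuff" => has_file("elm-package.json"),
        "dist-newstyle" => has_suffix(".cabal") || has_file("cabal.project"),
        "nimcache" => has_suffix(".nim"),
        // Unknown directory names are never treated as build directories here.
        _ => false,
    }
}
```

Under these assumptions, a `.stack-work` directory next to `tweet-hs.cabal` would count as a build directory, while a `target` directory with no `Cargo.toml` beside it would not.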

#### Languages Supported
@@ -136,4 +160,5 @@ The *intent* is to support basically anything, so feel free to open a PR or star
- [x] Vimscript
- [x] Idris
- [x] FORTRAN
- [ ] Ruby
- [ ] C
14 changes: 13 additions & 1 deletion TODO.md
@@ -37,9 +37,15 @@
- [x] fix bugs w/ excludes & overzealous use of .gitignores
- [x] multiple included paths
- [x] let it run on a single file
- [x] don't call `is_project_dir()` three times
- [x] improve ergonomics (and speed) by guessing language of project
directory

# Bugs

- [ ] when using `--all` with `ar`, it should not recurse arbitrarily far.
- [ ] when `--all` is used with `sort`, it *should* recurse arbitrarily far.

# UI/Ergonomics

- [ ] silent flag to ignore warnings?
@@ -65,6 +71,12 @@
- [ ] elm
- [ ] python

# Code maintenance

- [ ] make `read_all()` take a struct.
- [ ] strip out machinery for `with_gitignore` and `artifact_regex`


# Performance

- [ ] parity with du without threading
@@ -97,4 +109,4 @@
- [ ] make an error type & use that to organize things
- [x] change french/german binary name
- [ ] upstream PR to clap-rs?
- [ ] fix build.rs
- [x] fix build.rs
14 changes: 0 additions & 14 deletions bash/build_all

This file was deleted.

2 changes: 0 additions & 2 deletions bash/windows_test

This file was deleted.

5 changes: 5 additions & 0 deletions build.rs
@@ -11,7 +11,12 @@ use clap::App;
fn main() {

// load configuration
#[cfg(feature = "english")]
let yaml = load_yaml!("src/cli/options-en.yml");
#[cfg(feature = "francais")]
let yaml = load_yaml!("src/cli/options-fr.yml");
#[cfg(feature = "deutsch")]
let yaml = load_yaml!("src/cli/options-de.yml");
let mut app = App::from_yaml(yaml).version(crate_version!());

// generate bash completions if desired
4 changes: 0 additions & 4 deletions src/cli/options-en.yml
@@ -191,10 +191,6 @@ subcommands:
short: o
long: sort
help: Sort results by size
- gitignore:
short: g
long: no-gitignore
help: Don't bother using .gitignore or darcs boring file information for faster traversal
- depth:
short: d
long: depth
2 changes: 1 addition & 1 deletion src/test.rs
@@ -20,7 +20,7 @@ fn bench_cli_options(b: &mut Bencher) {
b.iter(|| {
App::from_yaml(yaml)
.version(crate_version!())
.get_matches_from(vec!["sn", "ar", "-g", "."])
.get_matches_from(vec!["sn", "ar", "."])
})
}

26 changes: 13 additions & 13 deletions src/utils.rs
@@ -9,10 +9,20 @@ use self::num_cpus::get;

/// Gather the information from `.gitignore`, `.ignore`, and darcs `boring` files in a given
/// directory, and assemble a `RegexSet` from it.
pub fn mk_ignores(in_paths: &PathBuf, maybe_gitignore: &Option<RegexSet>) -> Option<RegexSet> {
pub fn mk_ignores(in_paths: &PathBuf, maybe_ignore: &Option<RegexSet>) -> Option<RegexSet> {

if let Some(ref gitignore) = *maybe_gitignore {
Some(gitignore.to_owned())
if let Some(ref ignore) = *maybe_ignore {
Some(ignore.to_owned())
} else if let (ignore_path, Ok(mut file)) =
{
let mut ignore_path = in_paths.clone();
ignore_path.push(".ignore");
(ignore_path.clone(), File::open(ignore_path.clone()))
} {
let mut contents = String::new();
file.read_to_string(&mut contents)
.expect("File read failed."); // ok because we check that the file exists
Some(file_contents_to_regex(&contents, &ignore_path))
} else if let (gitignore_path, Ok(mut file)) =
{
let mut gitignore_path = in_paths.clone();
@@ -33,16 +43,6 @@ pub fn mk_ignores(in_paths: &PathBuf, maybe_gitignore: &Option<RegexSet>) -> Opt
file.read_to_string(&mut contents)
.expect("File read failed."); // ok because we check that the file exists
Some(darcs_contents_to_regex(&contents, &darcs_path))
} else if let (ignore_path, Ok(mut file)) =
{
let mut ignore_path = in_paths.clone();
ignore_path.push(".ignore");
(ignore_path.clone(), File::open(ignore_path.clone()))
} {
let mut contents = String::new();
file.read_to_string(&mut contents)
.expect("File read failed."); // ok because we check that the file exists
Some(file_contents_to_regex(&contents, &ignore_path))
} else {
None
}
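
After this change, `mk_ignores` appears to try an explicitly supplied set first, then `.ignore`, then `.gitignore`, then the darcs boring file, and the first match wins. A minimal sketch of that "first existing file wins" order is below. It is not the function in this commit: `first_ignore_file` and its `candidates` parameter are hypothetical names introduced for illustration, and it returns the chosen path without building a `RegexSet`.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper (not part of `sn`): return the first candidate ignore
/// file that exists in `dir`, trying the candidates in order of precedence.
fn first_ignore_file(dir: &Path, candidates: &[&str]) -> Option<PathBuf> {
    candidates
        .iter()
        // Build the full path of each candidate inside `dir`.
        .map(|name| dir.join(name))
        // Stop at the first one that exists as a regular file.
        .find(|path| path.is_file())
}
```

Calling it with `&[".ignore", ".gitignore"]` would prefer `.ignore` when both files are present, which matches the reordering in this hunk. A flat candidate list like this is one way to avoid the chained `else if let` blocks, at the cost of handling every file format through the same code path.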
50 changes: 33 additions & 17 deletions src/walk_parallel/single_threaded.rs
@@ -19,9 +19,19 @@ fn glob_exists(s: &str) -> bool {
glob(s).unwrap().filter_map(Result::ok).count() != 0 // ok because panic on IO Errors is good?
}

/// Helper function to identify project directories. The heuristic is as follows:
/// Helper function to identify project directories. The heuristic is as follows:
///
/// 1. For `.stack-work`, look for a
/// 1. For `.stack-work`, look for a `.cabal` file or a `package.yaml` file in the parent
/// directory.
/// 2. For `target`, look for a `Cargo.toml` file in the parent directory.
/// 3. For `elm-stuff`, look for `elm-package.json` in the parent directory.
/// 4. For `build`, `dist`, look for a `.cabal`, `setup.py` or `cabal.project` file.
/// 5. For `dist-newstyle`, look for a `.cabal` or `cabal.project` file.
/// 6. For `nimcache`, look for a `.nim` file in the parent directory.
/// 7. Otherwise, if the directory name ends with `.egg-info` and `setup.py` is in the parent
/// directory, return true.
/// 8. In all other cases, return false, but still proceed into the directory to search files by
/// extension.
pub fn is_project_dir(p: &str, name: &str) -> bool {
// for project directories
lazy_static! {
@@ -41,6 +51,10 @@ pub fn is_project_dir(p: &str, name: &str) -> bool {
parent_string.push_str("/../*.cabal");
parent_path.exists() || hpack.exists() || glob_exists(&parent_string)
}
"nimcache" => {
parent_string.push_str("/../*.nim");
glob_exists(&parent_string)
}
"target" => {
parent_path.push("../Cargo.toml");
parent_path.exists()
@@ -250,7 +264,6 @@ pub fn read_size(
if path_type.is_file() {
// if this fails, it's probably because `path` is a broken symlink
if let Ok(metadata) = val.metadata() {
// faster on Windows
if !artifacts_only ||
{
is_artifact(
@@ -272,7 +285,6 @@ pub fn read_size(
let dir_size = if artifacts_only &&
is_project_dir(path_string, val.file_name().to_str().unwrap())
{
// REGEX_PROJECT_DIR.is_match(path_string) {
read_size(
&path,
depth + 1,
@@ -414,9 +426,21 @@ pub fn read_all(
// otherwise, go deeper
else if path_type.is_dir() {
if let Some(d) = max_depth {
if depth + 1 >= d ||
(artifacts_only &&
is_project_dir(path_string, val.file_name().to_str().unwrap()))
if depth + 1 >= d && (!with_gitignore || !artifacts_only) {
let dir_size = {
read_size(
&path,
depth + 1,
artifact_regex,
excludes,
&gitignore,
with_gitignore,
artifacts_only,
)
};
tree.push(path_string.to_string(), dir_size, None, depth + 1, true);
} else if artifacts_only &&
is_project_dir(path_string, val.file_name().to_str().unwrap())
{
let dir_size = {
read_size(
@@ -425,16 +449,8 @@ pub fn read_all(
artifact_regex,
excludes,
&gitignore,
with_gitignore &&
!is_project_dir(
path_string,
val.file_name().to_str().unwrap(),
),
artifacts_only &&
!is_project_dir(
path_string,
val.file_name().to_str().unwrap(),
), // FIXME only compute this once
false,
false,
)
};
tree.push(path_string.to_string(), dir_size, None, depth + 1, true);