Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test cleanup: Figure out a better story for string literals. #5745

Closed
alexcrichton opened this issue Jul 18, 2018 · 14 comments · Fixed by #5791
Closed

test cleanup: Figure out a better story for string literals. #5745

alexcrichton opened this issue Jul 18, 2018 · 14 comments · Fixed by #5791
Labels
A-testing-cargo-itself Area: cargo's tests

Comments

@alexcrichton
Copy link
Member

Right now the string literals in Cargo's test suite are a bit of a mess. They're intended to be pretty concise definitions of files all inline in the test itself. Unfortunately over time rustfmt has made them look pretty ugly and they're not really that ergonomic any more.

What use to be:

#[test]
fn foo() {
    project()
        .file("foo.rs", "
            fn foo() {}
        ")
}

is largely now reformatted automatically via rustfmt to

#[test]
fn foo() {
    project()
        .file(
            "foo.rs",
            "
            fn foo() {}
        "
        )
}

and "ideally" is instead written as:

#[test]
fn foo() {
    project()
        .file(
            "foo.rs",
            "
                fn foo() {}
            "
        )
}

Unfortunately this means that the "ideal" way we're writing tests is with 4 levels of indentation, quite a lot!

I'm not really sure of a great solution here, but we should find a strategy that works well with rustfmt's default settings and looks pretty good and easy to write at the same time.

@dwijnand
Copy link
Member

Is there any prior art in the Rust ecosystem we can look at here? I'm thinking things like the CLI WP or the testing setups in existing CLI projects, like ripgrep.

@alexcrichton
Copy link
Member Author

cc @killercup, you may have thoughts on this?

@dwijnand
Copy link
Member

dwijnand commented Jul 23, 2018

For one-line files, how do you feel about one-line definitions?

#[test]
fn foo() {
    project()
        .file("foo.rs", "fn foo() {}")
}

@alexcrichton
Copy link
Member Author

Indeed yeah! If it fits on one line that looks fine to me

@matklad
Copy link
Member

matklad commented Jul 25, 2018

What we do in IntelliJ is that we use an external DSL of sorts to define a directory structure using only a single string literal. In rustfmted Rust this might look like this:

fn main() {
    let p = build_project(
        r#"
//- Cargo.toml
[project]
name = "foo"
version = "0.1.0"
authors = []
[workspace]
members = ["bar"]

//- src/main.rs
fn main() {}

//- bar/Cargo.toml
[project]
name = "bar"
version = "0.1.0"
authors = []
workspace = ".."

//- bar/src/main.rs
"fn main() {}"
"#,
    );
}

internally, this'll split on //- \w* separators and desugar into the builder calls.

There's also an observation that Cargo.toml is spelled with minor variations in every test, and that it might seem like a good to try to deduplicate that using something like basic_bin_manifest. I personally would be slightly against such deduplication: in my experience, it's usually better to spell the test data fully explicitly: it never changes, so deduplication does not save maintenance effort, and it's slightly easier to read the test if you don't need to execute that basic_bin_manifest in your head.

EDIT: version with format doesn't look too ugly:

fn main() {
    let p = build_project(&format!(
        r#"
//- Cargo.toml
[project]
name = "foo"
version = "{version}"
authors = []
[workspace]
members = ["bar"]

//- src/main.rs
fn main() {}

//- bar/Cargo.toml
[project]
name = "bar"
version = "{version}"
authors = []
workspace = ".."

//- bar/src/main.rs
"fn main() {}"
"#,
        version = "0.0.1"
    ));
}

@dwijnand
Copy link
Member

Thanks @matklad! It's great to have examples from other projects.

There's also an observation that Cargo.toml is spelled with minor variations in every test, and that it might seem like a good to try to deduplicate that using something like basic_bin_manifest. I personally would be slightly against this duplication: in my experience, it's usually better to spell the test data fully explicitly: it never changes, so deduplication does not save maintenance effort, and it's slightly easier to read the test if you don't need to execute that basic_bin_manifest in your head.

Could you expand more from your experience why everything spelt out is usually better?

While I agree that some restraint should be applied in not over-generalising, I also think it's important for tests to be readable and reviewable.

By the way some degree of "spelt out fully explicitly" has been removed with #5782 landing yesterday. I'd be happy to revert (or forward fix) that change if it is preferred.

Open to ideas and suggestions on striking a good balance.

@matklad
Copy link
Member

matklad commented Jul 25, 2018

Sure! First of all I'd like to clarify that I don't have a too strong opinion about this, either approach is good! Here are some rough thoughts about why duplicating test data (as opposed to test code) is, surprisingly, not bad:

  • we have strict backwards-compatibility requirements, and this guarantees that we will not have to modify the data in the future, so abstraction does not save maintenance cost here

  • because new tests are written by copy-pasting and modifying old test, abstraction does not save writing effort either

  • on the readability front, there's a tension between "reading less lines" and "not interpreting function calls to understand what test data actually looks like". Usually, you read a single test in isolation, when you debug it, so ease of understanding might be preferred over the raw number of lines

The main problem with abstracting test data repetition is that in some tests there are slight variations in the data, and so you'll have to either add a bunch of knobs to the abstraction (which in the limit equals to just writing out everything explicitly) or to use very different style for tests which don't fit the abstraction exactly. EDIT: an example of such abstraction creep is the main_file function.

FWIW, the "test data as a string" approach allows for some deduplication as well:

r#"
//- Cargo.toml
[DEFAULT_CARGO_TOML]

//- src/main.rs
fn main() {}
#"

@matklad
Copy link
Member

matklad commented Jul 25, 2018

OTOH, #5782 deletes quite a bunch of code! :)

So, what about this?

  • We change fn project signature to fn project(project_descr: &str) -> Project, where project_descr is the //- path DSL. It internally uses ProjectBuilder, which can also be used directly for a cases where you want to dynamically generate a project (if we have such tests).

  • the DSL also supports //- default_project maker, which creates a self-contained project with Cacrgo.toml and empty src/lib.rs

  • Complex tests look like this:

    #[test]
    fn test_foo() {
        let p = project(r#"
    //- Cargo.toml
    [project]
    name = "foo"
    version = "0.1.0"
    authors = []
    [workspace]
    members = ["bar"]
    
    //- src/main.rs
    fn main() {}
    
    //- bar/Cargo.toml
    [project]
    name = "bar"
    version = "0.1.0"
    authors = []
    workspace = ".."
    
    //- bar/src/main.rs
    "fn main() {}"
    "#,
        );
    }

    the //- bits definitely are magical, but presumably it is easy to figure out what's going on by just looking at the string literal?

  • Simple tests look like this:

    #[test]
    fn test_bar() {
        let p = build_project(
            r#"
    //- default_project
    //- .cargo/config
    [target]
    nonexistent-target = "foo"
    "#,
        );
    }

    The //- default_project is definitely not self-documenting, but it should save us quite a bit of typing, and presumably its easy enough to figure out what it does?

  • Trivial CLI tests are one-liners: let p = build_project("//- default_project");

@dwijnand
Copy link
Member

dwijnand commented Jul 25, 2018

Note https://github.com/rust-lang/cargo/pull/5787/files is a proposal for deleting twice as much code as well, by using a very simple 2 param basic_manifest function. I think the ROI is worth it.

@matklad
Copy link
Member

matklad commented Jul 25, 2018

That one feels like it's kind of on the borderline for me :) I think we can save about as many lines by removing authors = [] which isn't required and by writing code such that rustfmt does not add so many empty lines (the only way to do that I know though is that project DLS trick). What's left are [package], name and version words, which more or less correspond to the name of the function, first positional argument and positional second argument. I.e, we encode exactly the same amount of information, only in a slightly roundabout way.

That is, I wouldn't aim at reducing the net amount of code, and rather focus on reducing the number of distinct things you need to specify to write a test. Removing a default Cargo.toml completely (together with empty lib.rs preferably) seems like a win. Providing a more ergonomic way to write Cargo.toml is less clear: TOML is pretty ergonomic as is.

@dwijnand
Copy link
Member

I think there's value in reducing vertical verbosity. The proposal in #5787 is 1 line per basic manifest. If my counting is right in the DSL you describe it would be ~5 lines per manifest (//- Cargo.toml, [package], name = "..", version = "..", and a blank line separator). That's certainly better than the 9(!) lines it is right now. 😄

Another idea I wanted to run by you (and the rest of the cargo team), how would you feel instead about using and being able to manipulate a TOML data type in the tests instead of a string literal? Dedicated tests could be created in the suite to be confident about the string to TOML parsing and consumption, but for the rest of the test suite instead of a bare string a nicely structured and typed piece of data could be used. WDYT?

@alexcrichton
Copy link
Member Author

The DSL @matklad is mentioning sounds like a great idea to me! I hadn't thought of that before and it sounds like a great way (even if not the exact syntax) to cut down on the wordiness without losing too much information.

I agree with @matklad that we don't want to make things too magical, but I agree with @dwijnand that the changes so far I think have been pretty good in that Cargo.toml stuff is largely just noise compared to the rest of the project

bors added a commit that referenced this issue Jul 25, 2018
Declare one-line files on one line, in test projects

Builds on #5787
Fixes #5745 unless @alexcrichton feels more can be done there.
@dwijnand
Copy link
Member

@matklad This issue will close with #5791 landing.

But I'm going to have a try at implementing your idea. I'm just going to look at #5742 first, as I've been thinking about it.

@matklad
Copy link
Member

matklad commented Jul 25, 2018

Excellent @dwijnand! Thank you so much for looking into this!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-testing-cargo-itself Area: cargo's tests
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants