
hover tests test_tooltip #1151

Closed
alexheretic opened this issue Nov 28, 2018 · 21 comments

@alexheretic
Member

alexheretic commented Nov 28, 2018

What's the deal with this multi-test? It's been failing a bunch lately, but generally for pretty good reasons as far as I can see. However, we don't run this test upstream, which means it doesn't prevent any regressions.

  • Why don't we run this upstream? Is there any point in having non-upstream tests? What do they guarantee when rls can be released with them failing?
  • Should we refactor it into individual unit tests per case, maybe with a macro?
  • RUSTFLAGS=--cfg=enable_tooltip_tests is horrible; changing that env var causes recompilation of all dependencies. A feature would have been better (a sketch of the difference follows this list)... but see my first point.
  • Why doesn't this test work when you run it on its own? I get
    running 1 test
    error: Unrecognized option: 'crate-name'
    test actions::hover::test::test_tooltip ... test actions::hover::test::test_tooltip has been running for over 60 seconds
    
    followed by hanging.
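
To illustrate the RUSTFLAGS point, here's a minimal sketch of the current cfg gate versus a Cargo feature gate (the feature name is made up for illustration):

// Today: gated on a compiler cfg, driven by
//   RUSTFLAGS="--cfg=enable_tooltip_tests" cargo test tooltip
// Changing RUSTFLAGS invalidates and recompiles every dependency.
#[cfg(enable_tooltip_tests)]
#[test]
fn test_tooltip() { /* ... */ }

// Alternative: a Cargo feature (e.g. `tooltip-tests = []` under [features]),
// enabled via `cargo test --features tooltip-tests`, which only rebuilds
// this crate rather than the whole dependency graph.
#[cfg(feature = "tooltip-tests")]
#[test]
fn test_tooltip_feature_gated() { /* ... */ }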

@nrc @Xanewok @aloucks

@aloucks
Contributor

aloucks commented Nov 29, 2018

I believe @alexcrichton added the RUSTFLAGS option because the std lib save-analysis data isn't available (or up to date?) when run upstream, and it's required for these tests to pass. Although I think it's only required for the individual sub-tests that have std in the file name (hovers over Vec and String, for example). The std tests are a small subset of the tooltip tests.

The recent failing tests were legitimate and mostly due to the save-analysis generation being broken for a month or so. I think one failing test was also due to a change in the docs (and a trait link now correctly being returned).

I'm not certain about the test hanging when run in isolation. I know they didn't use to do that (I ran them in isolation many times in the past). I suspect it's Racer, but that still doesn't explain why it would only happen when run in isolation.

@aloucks
Contributor

aloucks commented Nov 29, 2018

I think there's some hover functionality coverage in the std tests that isn't tested elsewhere (e.g. hovering over traits). I might have some time this weekend to have a look at improving the situation and maybe removing the std tests altogether. With those gone, we shouldn't need the RUSTFLAGS conditional.

@alexheretic
Member Author

@aloucks that sounds good, thanks!

@aloucks
Contributor

aloucks commented Nov 30, 2018

I remember now why it was set up as a single test with multiple checks -- it would have had to rebuild the test project for every single test case, which would likely be even slower. I considered some kind of global/run-once mechanism for a single build, but that seemed even less ideal.
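
For reference, the global/run-once shape I had in mind was roughly this (hypothetical helper, not actual harness code):

use std::sync::Once;

static BUILD_PROJECT: Once = Once::new();

// Every per-case test would call this first, so the expensive hover test
// project build happens exactly once per test process.
fn ensure_test_project_built() {
    BUILD_PROJECT.call_once(|| {
        // build the hover test project here
    });
}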

I found where it's hanging (it's waiting for the build to complete), but I can't explain why it only happens when running the test in isolation. I see there were some recent changes to the way that InitActionContext is initialized, but nothing stands out as being the culprit.

self.ctx.block_on_build();

@aloucks
Contributor

aloucks commented Nov 30, 2018

I found the issue causing the test to hang when run alone -- it's unable to build the hover test project.

The debug output showed that it was trying to use the rls test binary in place of rustc, which seemed odd, and then I found this in the other test harness code:

rls/src/test/harness.rs (lines 45 to 47 in f75e68b):

if env::var("RUSTC").is_err() {
    env::set_var("RUSTC", "rustc");
}

I must have been running cargo test hover instead of cargo test tooltip, which ran a few additional tests, including one that makes use of the other test harness and sets the RUSTC environment variable.
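
For anyone who hits this later: Rust runs tests as threads inside a single process, so a process-wide env::set_var in one test leaks into every test scheduled after it. A minimal illustration of the leak (not the actual harness code):

use std::env;

#[test]
fn sets_rustc() {
    // Mirrors the harness snippet above; this mutates the whole test process.
    env::set_var("RUSTC", "rustc");
}

#[test]
fn reads_rustc() {
    // Depending on scheduling, this may observe the value set by the other
    // test -- exactly the cross-test leak that confused `cargo test hover`.
    println!("RUSTC = {:?}", env::var("RUSTC"));
}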

@aloucks
Contributor

aloucks commented Nov 30, 2018

I want to add some more tests covering the cases that the current std tests cover, but I'm a little hesitant to remove the std tests altogether. They're effectively testing save-analysis generation right now, which I believe is valuable.

@alexheretic
Member Author

Great, thanks @aloucks.

I can run this test on its own now with cargo test -- --ignored. Can we ditch enable_tooltip_tests and just use that in CI?

I think it'll help to have as many of these tests as possible running in rust-lang/rust and when regular devs run cargo test. I'll raise a PR splitting the tests into std and non-std and removing the --cfg thing, as it's another barrier to running the std tests.

@aloucks
Contributor

aloucks commented Dec 8, 2018

I think that sounds good. I wasn't able to spend as much time on it as I had hoped to last weekend.

A few things:

  1. This test has references to items in both an external crate (fnv) and std (but doesn't have std in the filename). I'd probably just remove the references to std::sync::Arc and be done with it.
    https://github.com/rust-lang/rls/blob/master/test_data/hover/src/test_tooltip_mod_use_external.rs
  2. I believe the std test is the only place where trait method tooltips are tested. I was going to create a new test for trait methods (a sketch follows this list) and then move or remove all of the std tests.
    https://github.com/rust-lang/rls/blob/master/test_data/hover/src/test_tooltip_std.rs#L22
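
For 2, the new trait-method test could be driven by a fixture along these lines (purely illustrative, not the real test file):

trait Greet {
    /// Returns a greeting.
    fn greet(&self) -> String;
}

struct Person;

impl Greet for Person {
    fn greet(&self) -> String {
        String::from("hello")
    }
}

fn main() {
    let p = Person;
    let _g = p.greet(); // hover target: a trait method, resolved through the impl
}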

@alexheretic
Member Author

OK, I've raised #1175 now, splitting the tooltip harness into two. I've also been trying to get to the bottom of the Windows errors and have made a little progress.

@aloucks for 1, I've just moved the latter parts of test_tooltip_mod_use_external.rs into test_tooltip_std. We can rework it into the non-ignored test in a later PR.

I think Rust changes have broken clippy now, but otherwise does the PR look OK to you?

@aloucks
Contributor

aloucks commented Dec 8, 2018

@alexheretic Yeah, I got hung up on clippy too. The PR looks good 👍

@alexheretic
Member Author

I wonder if we should split out the tests that require the Racer fallback to work, too; this might speed up debugging.

@Xanewok
Member

Xanewok commented Dec 17, 2018

@alexheretic I think that'd be valuable for sure!

I'm quite puzzled as to why the macOS tests fail intermittently: https://travis-ci.org/rust-lang/rls/jobs/468945232. Sometimes it's 4 tests, sometimes 6 (or even a different count?). The thing is, the analysis should be ready and the data should be accurate. Is the formatter the culprit? Do we race? Glancing through the code, I'd say

jobs.wait_for_all();

should be enough? Any ideas? 😢

@alexheretic
Member Author

The Drop impl's wait_for_all() call is there just to prevent

panic!("orphaned concurrent job");

which is otherwise possible as a race condition.
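
In other words, the guard is shaped roughly like this (a simplified sketch, not the actual rls implementation):

struct Jobs {
    // tracks in-flight concurrent jobs
}

impl Jobs {
    fn wait_for_all(&mut self) {
        // Block until every outstanding job has signalled completion.
    }
}

impl Drop for Jobs {
    fn drop(&mut self) {
        // Without this, a job still racing at teardown would trigger the
        // "orphaned concurrent job" panic.
        self.wait_for_all();
    }
}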

I don't understand the cause of the mac test failures and I can't test manually on that platform. I think this shows we need to improve the test so that it gives as clear an indication as possible of what the root cause is.

Separating the tests that rely on analysis from those that rely on the Racer fallback may help: if a test unexpectedly fails analysis and falls back to Racer, the root cause is an analysis issue and the diff won't help.

Is using the field name foo: Foo an indication that we're falling back to racer here?

        expect_data: Ok(
            Ok(
                [
                    LanguageString(
                        LanguageString {
                            language: "rust",
                            value: "test_tooltip_01::Foo"
                        }
                    )
                ]
            )
        ),
        actual_data: Ok(
            Ok(
                [
                    LanguageString(
                        LanguageString {
                            language: "rust",
                            value: "foo: Foo"
                        }
                    )
                ]
            )
        )

@Xanewok
Member

Xanewok commented Dec 17, 2018

Is using the field name foo: Foo an indication that we're falling back to racer here?

I think so, yeah - Racer shows the span from the definition site (foo: Foo), while rls-data tries to generate qualnames (the test_tooltip_01::Foo path) for type info.
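
Concretely, for a hover over Foo in a fixture like this (hypothetical snippet mirroring the test file):

struct Foo;

struct Bar {
    foo: Foo, // hovering `Foo` here:
              //   analysis path  -> qualname "test_tooltip_01::Foo"
              //   Racer fallback -> definition-site span "foo: Foo"
}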

@Xanewok
Member

Xanewok commented Dec 17, 2018

@alexheretic could we explicitly disable any Racer fallback for these tests or split them, as you're saying?

@alexheretic
Member Author

Yes, we can, and I think we should. I did start experimenting with this, but nightly/clippy issues dampened my efforts at the time.

If everything is compiling and the tests are passing, disabling racer_completion in the harness and seeing what breaks should point out the Racer-dependent tests (see the sketch below).
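
The experiment itself is a one-line change in the harness setup, roughly (assuming the harness builds its Config the same way the hover tests do, and that the field name mirrors the setting referenced above):

let mut config = config::Config::default();
config.racer_completion = false; // turn off the Racer fallback for this run
// Any tooltip case that fails now was silently leaning on Racer.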

@Xanewok
Member

Xanewok commented Dec 17, 2018

Disabled racer_completion in https://travis-ci.org/Xanewok/rls/builds/469023342:
Failures (file, line, col):

test_tooltip_mod_use_external.rs 14 12
test_tooltip_mod_use_external.rs 15 12

Full report:

Failures (expected / actual): [
    TestFailure {
        test: Test {
            file: "test_tooltip_mod_use_external.rs",
            line: 14,
            col: 12
        },
        expect_file: "/home/travis/build/Xanewok/rls/test_data/hover/save_data/test_tooltip_mod_use_external.rs.0014_012.json",
        actual_file: "/home/travis/build/Xanewok/rls/target/tests/hover/save_data/test_tooltip_mod_use_external.rs.0014_012.json",
        expect_data: Ok(
            Ok(
                [
                    LanguageString(
                        LanguageString {
                            language: "rust",
                            value: "libstd/sync/mod.rs"
                        }
                    ),
                    String(
                        "Useful synchronization primitives.\n\n## The need for synchronization\n\nConceptually, a Rust program is a series of operations which will\nbe executed on a computer. The timeline of events happening in the\nprogram is consistent with the order of the operations in the code."
                    )
                ]
            )
        ),
        actual_data: Ok(
            Ok(
                []
            )
        )
    },
    diff: Ok(
        Ok(
            [
                LanguageString(
                    LanguageString {
                        language: "rust",
                        value: "libstd/sync/mod.rs"
                    }
                ),
                String(
                    "Useful synchronization primitives.\n\n## The need for synchronization\n\nConceptually, a Rust program is a series of operations which will\nbe executed on a computer. The timeline of events happening in the\nprogram is consistent with the order of the operations in the code."
                )
            ]
        )
    ),
    TestFailure {
        test: Test {
            file: "test_tooltip_mod_use_external.rs",
            line: 15,
            col: 12
        },
        expect_file: "/home/travis/build/Xanewok/rls/test_data/hover/save_data/test_tooltip_mod_use_external.rs.0015_012.json",
        actual_file: "/home/travis/build/Xanewok/rls/target/tests/hover/save_data/test_tooltip_mod_use_external.rs.0015_012.json",
        expect_data: Ok(
            Ok(
                [
                    LanguageString(
                        LanguageString {
                            language: "rust",
                            value: "libstd/sync/mod.rs"
                        }
                    ),
                    String(
                        "Useful synchronization primitives.\n\n## The need for synchronization\n\nConceptually, a Rust program is a series of operations which will\nbe executed on a computer. The timeline of events happening in the\nprogram is consistent with the order of the operations in the code."
                    )
                ]
            )
        ),
        actual_data: Ok(
            Ok(
                []
            )
        )
    },
    diff: Ok(
        Ok(
            [
                LanguageString(
                    LanguageString {
                        language: "rust",
                        value: "libstd/sync/mod.rs"
                    }
                ),
                String(
                    "Useful synchronization primitives.\n\n## The need for synchronization\n\nConceptually, a Rust program is a series of operations which will\nbe executed on a computer. The timeline of events happening in the\nprogram is consistent with the order of the operations in the code."
                )
            ]
        )
    )
]

@alexheretic
Member Author

Those 2 are from the "std" test run; 6 from the main run also failed:

  • test_tooltip_01.rs.0080_011
  • test_tooltip_01.rs.0093_018
  • test_tooltip_mod_use_external.rs.0011_007
  • test_tooltip_mod_use_external.rs.0011_007
  • test_tooltip_mod_use_external.rs.0012_007
  • test_tooltip_mod_use_external.rs.0012_012

@ehuss
Contributor

ehuss commented Dec 18, 2018

I can try to help you dig into this if you'd like. I'm not super familiar with RLS, but I am familiar with Cargo and mostly use macOS.

Here are traces of a successful run and a failed run: success and failure

There are a bunch of differences between them, and I noticed the failed case is using the Racer fallback, but I have no idea how it decides to do that.

One thing I noticed is that it is compiling the hover target twice concurrently. On a hunch, I made this change:

diff --git a/src/actions/hover.rs b/src/actions/hover.rs
index a7df910..ef2a6b9 100644
--- a/src/actions/hover.rs
+++ b/src/actions/hover.rs
@@ -1191,6 +1191,7 @@ pub mod test {
                 related_information_support: true,
             };
             let mut config = config::Config::default();
+            config.all_targets = false;

             let temp_dir = tempfile::tempdir().unwrap().into_path();
             config.target_dir = config::Inferrable::Specified(Some(temp_dir.clone()));

and it made the problem go away. That may not be directly helpful, but maybe it is a clue? Since it is compiling hover twice, how does it know which one to use for analysis/tooltips?

@alexheretic
Member Author

Thanks @ehuss, that's interesting. Actually, it seems fine to disable all_targets here anyway; we shouldn't need that feature for the hover tests, and they may run a little faster too.

The Racer fallback triggers when the analysis method fails, so if another build has started concurrently with the hover call, that could make sense.
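
Roughly, the hover path has this shape (a sketch of the control flow described above; function names are hypothetical):

// Try the save-analysis lookup first; only consult Racer when it fails.
let contents = match analysis_tooltip(&ctx, &position) {
    Ok(tooltip) => tooltip,
    Err(_) => {
        // Reached when the analysis data is missing or stale -- e.g. while
        // a concurrent build has invalidated it, as in the flaky macOS runs.
        racer_tooltip(&ctx, &position)
    }
};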

@alexheretic
Member Author

I think this issue is pretty much done. The only improvement left is to move as much as possible out of the ignored tests, leaving those for std testing only.

Thanks, guys!
