improve folder name for persistent doc tests #69458

Luro02 · 2020-02-25T12:39:25Z

This fixes #69411, by using the entire path as folder name and storing already visited paths in a HashMap + appending a number to the file name for duplicates.

Luro02 · 2020-02-26T17:53:32Z

One solution is to pass around a HashSet, that contains already written doc tests.

A folder left over from an earlier run would be overwritten (as long as it is not already in the HashSet); otherwise the next free number would be appended to the file path, e.g. module_1_file_rs_1.

This would prevent cluttering of the folder (old tests would be overwritten), and it does not involve difficult parsing to know whether or not the test comes from a proc-macro. It would also prevent future name clashes.
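The counting idea can be sketched roughly as follows. This is only an illustration, not the code from the PR: the helper name `unique_folder_name` is hypothetical, and the sketch always appends a number (the variant the discussion later settles on) instead of leaving the first occurrence unnumbered.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

/// Hypothetical helper: returns a folder name that is unique within one run.
/// `visited` counts how often each base name has been handed out so far.
fn unique_folder_name(visited: &Mutex<HashMap<String, usize>>, base: &str) -> String {
    let mut map = visited.lock().unwrap();
    // The first occurrence gets 0, every later duplicate gets the next number.
    let n = *map
        .entry(base.to_string())
        .and_modify(|v| *v += 1)
        .or_insert(0);
    format!("{}_{}", base, n)
}
```

Because the map starts empty on every run, a name from a previous run is produced again and its folder is simply overwritten, while duplicates within the same run get distinct suffixes.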

GuillaumeGomez · 2020-02-26T22:49:23Z

Just a thought about this: instead of basing the id on the file line, wouldn't it be better to base it on the test number? For example, the third test of a file would become file_3.rs or equivalent. What do you think of this?

Also, cc @rust-lang/rustdoc

Luro02 · 2020-02-28T12:23:21Z

@GuillaumeGomez this would work, but where would you get the test number from? Or should this iterate through all folders? (That would require one more syscall for each test, which could hurt runtime performance.)

rust-highfive · 2020-02-29T13:09:55Z

The job mingw-check of your PR failed (pretty log, raw log). Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

2020-02-29T11:54:41.0297103Z ========================== Starting Command Output ===========================
2020-02-29T11:54:41.0299556Z [command]/bin/bash --noprofile --norc /home/vsts/work/_temp/d9b6d399-9205-4dba-a479-e6fe5730d964.sh
2020-02-29T11:54:41.0299814Z 
2020-02-29T11:54:41.0303314Z ##[section]Finishing: Disable git automatic line ending conversion
2020-02-29T11:54:41.0322179Z ##[section]Starting: Checkout rust-lang/rust@refs/pull/69458/merge to s
2020-02-29T11:54:41.0326065Z Task         : Get sources
2020-02-29T11:54:41.0326334Z Description  : Get sources from a repository. Supports Git, TfsVC, and SVN repositories.
2020-02-29T11:54:41.0326593Z Version      : 1.0.0
2020-02-29T11:54:41.0326768Z Author       : Microsoft
---
2020-02-29T11:54:44.3080044Z ##[command]git remote add origin https://github.com/rust-lang/rust
2020-02-29T11:54:44.3089590Z ##[command]git config gc.auto 0
2020-02-29T11:54:44.3097458Z ##[command]git config --get-all http.https://github.com/rust-lang/rust.extraheader
2020-02-29T11:54:44.3104142Z ##[command]git config --get-all http.proxy
2020-02-29T11:54:44.3112552Z ##[command]git -c http.extraheader="AUTHORIZATION: basic ***" fetch --force --tags --prune --progress --no-recurse-submodules --depth=2 origin +refs/heads/*:refs/remotes/origin/* +refs/pull/69458/merge:refs/remotes/pull/69458/merge
---
2020-02-29T12:05:19.6399420Z     Checking rustdoc v0.0.0 (/checkout/src/librustdoc)
2020-02-29T12:05:20.1646437Z error[E0412]: cannot find type `Path` in this scope
2020-02-29T12:05:20.1647876Z    --> src/librustdoc/test.rs:210:35
2020-02-29T12:05:20.1648799Z     |
2020-02-29T12:05:20.1649848Z 210 |     visited_tests: &Mutex<HashMap<Path, usize>>,
2020-02-29T12:05:20.1652410Z     |
2020-02-29T12:05:20.1653353Z help: possible candidates are found in other modules, you can import them into scope
2020-02-29T12:05:20.1654272Z     |
2020-02-29T12:05:20.1655490Z 1   | use crate::clean::types::Path;
---
2020-02-29T12:05:20.1660897Z 1   | use syntax::ast::Path;
2020-02-29T12:05:20.1661790Z     |
2020-02-29T12:05:20.1662622Z help: you might be missing a type parameter
2020-02-29T12:05:20.1663423Z     |
2020-02-29T12:05:20.1665154Z 194 | fn run_test<Path>(
2020-02-29T12:05:20.1666626Z 
2020-02-29T12:05:22.5324872Z error[E0282]: type annotations needed for `std::sync::Mutex<std::collections::HashMap<K, V>>`
2020-02-29T12:05:22.5326443Z     |
2020-02-29T12:05:22.5327217Z 723 |         let test_number = Mutex::new(HashMap::new());
2020-02-29T12:05:22.5328456Z     |             -----------              ^^^^^^^^^^^^ cannot infer type for type parameter `K`
2020-02-29T12:05:22.5329395Z     |             |
2020-02-29T12:05:22.5330889Z     |             consider giving `test_number` the explicit type `std::sync::Mutex<std::collections::HashMap<K, V>>`, where the type parameter `K` is specified
2020-02-29T12:05:22.6088034Z error: aborting due to 2 previous errors
2020-02-29T12:05:22.6093551Z 
2020-02-29T12:05:22.6106545Z Some errors have detailed explanations: E0282, E0412.
2020-02-29T12:05:22.6107273Z For more information about an error, try `rustc --explain E0282`.
---
2020-02-29T12:05:22.6313097Z   local time: Sat Feb 29 12:05:22 UTC 2020
2020-02-29T12:05:22.7872908Z   network time: Sat, 29 Feb 2020 12:05:22 GMT
2020-02-29T12:05:22.7875187Z == end clock drift check ==
2020-02-29T12:05:23.3938226Z 
2020-02-29T12:05:23.4009270Z ##[error]Bash exited with code '1'.
2020-02-29T12:05:23.4021183Z ##[section]Finishing: Run build
2020-02-29T12:05:23.4070333Z ##[section]Starting: Checkout rust-lang/rust@refs/pull/69458/merge to s
2020-02-29T12:05:23.4074801Z Task         : Get sources
2020-02-29T12:05:23.4075101Z Description  : Get sources from a repository. Supports Git, TfsVC, and SVN repositories.
2020-02-29T12:05:23.4075535Z Version      : 1.0.0
2020-02-29T12:05:23.4075752Z Author       : Microsoft
2020-02-29T12:05:23.4076079Z Help         : [More Information](https://go.microsoft.com/fwlink/?LinkId=798199)
2020-02-29T12:05:23.4076433Z ==============================================================================
2020-02-29T12:05:23.7663477Z Cleaning any cached credential from repository: rust-lang/rust (GitHub)
2020-02-29T12:05:23.7714449Z ##[section]Finishing: Checkout rust-lang/rust@refs/pull/69458/merge to s
2020-02-29T12:05:23.7814591Z Cleaning up task key
2020-02-29T12:05:23.7816166Z Start cleaning up orphan processes.
2020-02-29T12:05:23.8019496Z Terminate orphan process: pid (3901) (python)
2020-02-29T12:05:23.8354898Z ##[section]Finishing: Finalize Job

I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact @TimNN. (Feature Requests)

Luro02 · 2020-02-29T18:23:17Z

Simply reading the folder names and putting the test in the next free folder would not be a viable solution, because old tests would then never be overwritten. A list of all folders created during the current run therefore has to be kept somewhere.

I decided to use a Mutex because I do not know of an alternative that also works in a multi-threaded environment.
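A minimal sketch of sharing such a map between threads, under a few assumptions: the function `register_concurrently` and the path `src/lib.rs` are made up for illustration, `Arc` is used here only so the sketch can spawn its own threads, and `PathBuf` stands in as the key because `Path` is unsized and cannot be stored by value in a `HashMap`.

```rust
use std::collections::HashMap;
use std::path::PathBuf;
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical demo: spawns `threads` workers that all register the same
/// path and returns the numbers they were given, sorted.
fn register_concurrently(threads: usize) -> Vec<usize> {
    // Spelling out the key and value types avoids relying on inference
    // (compare the E0282 error in the CI log above).
    let visited: Arc<Mutex<HashMap<PathBuf, usize>>> = Arc::new(Mutex::new(HashMap::new()));

    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let visited = Arc::clone(&visited);
            thread::spawn(move || {
                // The lock makes "look up, increment, insert" one atomic step.
                let mut map = visited.lock().unwrap();
                let n = *map
                    .entry(PathBuf::from("src/lib.rs"))
                    .and_modify(|v| *v += 1)
                    .or_insert(0);
                n
            })
        })
        .collect();

    let mut numbers: Vec<usize> = handles.into_iter().map(|h| h.join().unwrap()).collect();
    numbers.sort();
    numbers
}
```

Whatever order the threads run in, each one receives a different number, which is the property needed to keep two tests from writing to the same folder.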

src/librustdoc/test.rs

bors · 2020-03-01T04:55:55Z

☔ The latest upstream changes (presumably #69592) made this pull request unmergeable. Please resolve the merge conflicts.

Luro02 · 2020-03-01T09:37:54Z

src/librustdoc/test.rs

@@ -205,6 +207,7 @@ fn run_test(
    mut error_codes: Vec<String>,
    opts: &TestOptions,
    edition: Edition,
+    visited_tests: &Mutex<HashMap<String, usize>>,


Should this be changed to visited_tests: &Mutex<HashMap<u64, usize>>?

It is not required to keep the entire folder_name in memory; a hash would suffice.

Luro02 · 2020-03-01T09:49:44Z

src/librustdoc/test.rs

+                visited_tests
+                    .lock()
+                    .unwrap()
+                    .entry(folder_name.clone())
+                    .and_modify(|v| *v += 1)
+                    .or_insert(0)


This part would then be changed to something like this:

let mut hasher = std::collections::hash_map::DefaultHasher::new();
folder_name.hash(&mut hasher);
visited_tests
    .lock()
    .unwrap()
    .entry(hasher.finish())
    .and_modify(|v| *v += 1)
    .or_insert(0)

which would most likely increase performance, because only 8 bytes have to be saved for each test in the HashMap, instead of an arbitrary number of bytes, which would be most likely larger than 8 bytes. I think cloning an entire String should be slower than hashing it.

I am not sure if this change is desirable, because it might make the code harder to read?
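For comparison, a self-contained version of the hashed variant might look like the sketch below. The function name `next_number_hashed` is hypothetical and the snippet is not taken from the PR; it only wraps the lines quoted above so they compile on their own.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};
use std::sync::Mutex;

/// Hypothetical helper: counts occurrences of `folder_name`, but stores only
/// its 64-bit hash as the map key instead of the whole string.
fn next_number_hashed(visited: &Mutex<HashMap<u64, usize>>, folder_name: &str) -> usize {
    let mut hasher = DefaultHasher::new();
    folder_name.hash(&mut hasher);
    let key = hasher.finish();

    let mut map = visited.lock().unwrap();
    // 0 for the first occurrence, then 1, 2, ...
    let n = *map.entry(key).and_modify(|v| *v += 1).or_insert(0);
    n
}
```

One trade-off worth keeping in mind: if two different names ever hashed to the same `u64`, they would share one counter. The resulting folder names would still differ, since the names themselves differ, but the numbers would no longer start at 0 for each name.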

Luro02 · 2020-03-01T09:57:56Z

r? @QuietMisdreavus

kinnison · 2020-03-08T09:58:10Z

The naming after a location in the source has always bugged me -- I wonder if there's any hope we could actually name the test after the path to the item being tested, and then perhaps a monotonic test number for that item.

GuillaumeGomez · 2020-03-14T21:59:45Z

Sorry for not responding earlier, completely missed the notification...

That's why I suggested using the position of the test in the file (not its line number). That still doesn't seem to be enough, though, and I'm not sure what I'm missing here...

An idea maybe @ollie27 ?

Luro02 · 2020-03-14T22:45:18Z

What is wrong with the current implementation?

GuillaumeGomez · 2020-03-15T13:49:11Z

I can't put my finger on it. But maybe your implementation is perfectly correct and I'm just imagining things. That's why I asked for other opinions. If they don't find anything, then we can just merge it. Don't worry, we'll move forward, and sorry it is taking so much time!

ollie27

Overall I think what this PR is proposing is a nice improvement but there will still be chances for filename collisions.

I don't like removing the line number from the path as that will make it difficult to figure out which test corresponds to which path. Maybe the line number can be included as well as the incremented number like "{name}_{line}_{number}". Possibly only appending the number if it's greater than 0 so in most cases it won't be noticed.
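The suggested "{name}_{line}_{number}" scheme, with the number shown only when it is greater than 0, could be sketched like this. The function `folder_name_with_line` and its parameters are assumptions for illustration and do not mirror the actual rustdoc code.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

/// Hypothetical helper: builds "{name}_{line}" and appends "_{number}" only
/// for the second and later tests that share the same name and line.
fn folder_name_with_line(
    visited: &Mutex<HashMap<String, usize>>,
    name: &str,
    line: usize,
) -> String {
    let base = format!("{}_{}", name, line);
    let mut map = visited.lock().unwrap();
    let n = *map.entry(base.clone()).and_modify(|v| *v += 1).or_insert(0);
    if n > 0 {
        format!("{}_{}", base, n)
    } else {
        base
    }
}
```

With this variant most tests keep a name that maps directly to a source line, and the extra suffix only shows up in the duplicate case, for example when a proc-macro forwards one doc test to several generated items.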

src/librustdoc/test.rs

Luro02 · 2020-03-17T10:09:03Z

> Possibly only appending the number if it's greater than 0 so in most cases it won't be noticed.

I did not implement this, because I think it is better to have a unified naming scheme. It would also increase the code complexity unnecessarily.

JohnCSimon · 2020-03-28T21:54:48Z

Ping from triage:
@Luro02 - can you please post an update to this PR and address ollie27's change requests?

Luro02 · 2020-03-28T22:18:59Z

@JohnCSimon

I already addressed the changes requested by @ollie27.
I am just waiting for somebody to finally approve this PR.

ollie27

Sorry about the delay.

With the line number included in the hash this should be good to merge. We could do with some regression tests for --persist-doctests, though I can't see an easy way to write them, so we can leave that as a follow-up.

src/librustdoc/test.rs

GuillaumeGomez

Looks good to me now. Just waiting for @ollie27's confirmation and then it's good to go! Thanks a lot!

ollie27 · 2020-03-30T17:00:23Z

Yeah, looks good to me too.

@bors r=GuillaumeGomez,ollie27

bors · 2020-03-30T17:00:25Z

📌 Commit bc00b16 has been approved by GuillaumeGomez,ollie27

…ie27

improve folder name for persistent doc tests

This partially fixes rust-lang#69411 by using the entire path as folder name, but I do not know how to deal with the proc-macro problem, where a doc test is forwarded to multiple generated functions, which have the same line for the doc test (origin).

For example

```rust
#[derive(ShortHand)]
pub struct ExtXMedia {
    /// The [`MediaType`] associated with this tag.
    ///
    /// # Example
    ///
 -> /// ``` <- this line is given to `run_test`
    /// # use hls_m3u8::tags::ExtXMedia;
    /// use hls_m3u8::types::MediaType;
    ///
    /// let mut media = ExtXMedia::new(MediaType::Audio, "ag1", "english audio channel");
    ///
    /// media.set_media_type(MediaType::Video);
    ///
    /// assert_eq!(media.media_type(), MediaType::Video);
    /// ```
    ///
    /// # Note
    ///
    /// This attribute is required.
    #[shorthand(enable(copy))]
    media_type: MediaType,
    // the rest of the fields are omitted
}
```

and my proc macro generates

```rust
#[allow(dead_code)]
impl ExtXMedia {
    /// The [`MediaType`] associated with this tag.
    ///
    /// # Example
    ///
    /// ```
    /// # use hls_m3u8::tags::ExtXMedia;
    /// use hls_m3u8::types::MediaType;
    ///
    /// let mut media = ExtXMedia::new(MediaType::Audio, "ag1", "english audio channel");
    ///
    /// media.set_media_type(MediaType::Video);
    ///
    /// assert_eq!(media.media_type(), MediaType::Video);
    /// ```
    ///
    /// # Note
    ///
    /// This attribute is required.
    #[inline(always)]
    #[must_use]
    pub fn media_type(&self) -> MediaType {
        struct _AssertCopy
        where
            MediaType: ::std::marker::Copy;
        self.media_type
    }

    /// The [`MediaType`] associated with this tag.
    ///
    /// # Example
    ///
    /// ```
    /// # use hls_m3u8::tags::ExtXMedia;
    /// use hls_m3u8::types::MediaType;
    ///
    /// let mut media = ExtXMedia::new(MediaType::Audio, "ag1", "english audio channel");
    ///
    /// media.set_media_type(MediaType::Video);
    ///
    /// assert_eq!(media.media_type(), MediaType::Video);
    /// ```
    ///
    /// # Note
    ///
    /// This attribute is required.
    #[inline(always)]
    pub fn set_media_type<VALUE: ::std::convert::Into<MediaType>>(
        &mut self,
        value: VALUE,
    ) -> &mut Self {
        self.media_type = value.into();
        self
    }
}
```

rustdoc then executes both tests with the same line (the line from the example above the field -> 2 different tests have the same name).

We need a way to differentiate between the two tests generated by the proc-macro, so that they do not cause threading issues.

RalfJung · 2020-03-30T21:10:46Z

> partially fixes #69411

When this PR lands it will close that issue (because "fixes" is a magic keyword for GitHub). Is that deliberate, given that it is just a partial fix? If no, please edit the PR message to no longer say "fixes" (or "closes").

Luro02 · 2020-03-30T21:20:25Z

@RalfJung
It is no longer a partial fix. The issue is resolved by appending a number to all filenames and incrementing it if there is a conflict, so no files should be overwritten.

I updated the post, so it should be fine if the issue is closed with this PR.

bors · 2020-03-30T23:06:58Z

⌛ Testing commit bc00b16 with merge a46c7116049fcbc2e8f7c72db429d36e14310020...

bors · 2020-03-30T23:10:36Z

💔 Test failed - checks-azure

Luro02 · 2020-03-31T11:50:11Z

I updated the fork (git rebase upstream/master)

GuillaumeGomez · 2020-03-31T11:58:31Z

Once the CI is ok, let's approve again then.

GuillaumeGomez · 2020-03-31T13:54:49Z

@bors r=GuillaumeGomez,ollie27

bors · 2020-03-31T13:54:50Z

📌 Commit 2e40ac7 has been approved by GuillaumeGomez,ollie27

bors · 2020-03-31T13:56:50Z

⌛ Testing commit 2e40ac7 with merge 02bf2b4659cb49caa3f0281f354ab048d3652e88...

Centril · 2020-03-31T14:00:50Z

@bors retry yielding

@ghost

Rollup of 7 pull requests

Successful merges:

- rust-lang#69425 (add fn make_contiguous to VecDeque)
- rust-lang#69458 (improve folder name for persistent doc tests)
- rust-lang#70268 (Document ThreadSanitizer in unstable-book)
- rust-lang#70600 (Ensure there are versions of test code for aarch64 windows)
- rust-lang#70606 (Clean up E0466 explanation)
- rust-lang#70614 (remove unnecessary relocation check in const_prop)
- rust-lang#70623 (Fix broken link in README)

Failed merges:

r? @ghost

rust-highfive assigned steveklabnik Feb 25, 2020

bors added the S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. label Mar 1, 2020

rust-highfive assigned QuietMisdreavus and unassigned steveklabnik Mar 1, 2020

kinnison assigned GuillaumeGomez and unassigned QuietMisdreavus Mar 8, 2020

ollie27 suggested changes Mar 17, 2020

Luro02 requested a review from ollie27 March 29, 2020 18:35

ollie27 approved these changes Mar 29, 2020

Luro02 requested a review from ollie27 March 30, 2020 13:59

GuillaumeGomez approved these changes Mar 30, 2020

ollie27 approved these changes Mar 30, 2020

bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Mar 30, 2020

Centril mentioned this pull request Mar 30, 2020

Rollup of 4 pull requests #70581

Closed

bors added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Mar 30, 2020

improve folder name for persistent doc tests

2e40ac7

bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Mar 31, 2020

Dylan-DPC-zz mentioned this pull request Mar 31, 2020

Rollup of 7 pull requests #70625

Merged

bors merged commit 3e31006 into rust-lang:master Mar 31, 2020
