Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Filenames with plus signs #11

Open
koivunej opened this issue Oct 26, 2020 · 2 comments
Open

Filenames with plus signs #11

koivunej opened this issue Oct 26, 2020 · 2 comments

Comments

@koivunej
Copy link

Expected: example filenames with + to work.

Actual: test_generator::test_resources panicked with message: "some_fn_name" is not a valid identifier

Looks like the issue is at

// Form canonical name without any punctuation/delimiter or special character
fn canonical_fn_name(s: &str) -> String {
// remove delimiters and special characters
s.replace(
&['"', ' ', '.', ':', '-', '*', '/', '\\', '\n', '\t', '\r'][..],
"_",
)
}
but I am not really sure what would be the exhaustive list of characters needing to be translated to _. Might be that there are so many caveats that #5 is the way to go when defaults fail to work? Though in this case I'd just re-create the name mangling with this additional character, so not sure how good overall solution that'd be.

@koivunej
Copy link
Author

koivunej commented Oct 26, 2020

Another idea: use raw identifiers for the test names.

EDIT: Sadly that wouldn't work. I am thinking a inverted regex matching the opposite of https://doc.rust-lang.org/reference/identifiers.html character classes might be the exhaustive solution.

@koivunej
Copy link
Author

This is simple to implement:

 // Form canonical name without any punctuation/delimiter or special character
 fn canonical_fn_name(s: &str) -> String {
     // remove delimiters and special characters
-    s.replace(
-        &['"', ' ', '.', ':', '-', '*', '/', '\\', '\n', '\t', '\r'][..],
-        "_",
-    )
+    // essentially `s/[^A-Za-z0-9]/_//`, see https://doc.rust-lang.org/reference/identifiers.html
+    // we probably cannot dedup multiple subsequent '_' as there is a chance of collision here as
+    // well.
+
+    s.chars()
+        .map(|ch| if ch.is_ascii_alphanumeric() { ch } else { '_' })
+        .collect()
 }

Trying it out will however manifest another issue: many files are mapped to same fn names.

The current code needs to perform PathBuf -> String conversion in order to do the path to fn name mangling, and I am not sure if this is cross platform:

let path_as_str = path
.expect("No such file or directory")
.into_os_string()
.into_string()
.expect("bad encoding");

If it was or could be made as such, creating a mapping from canonicalized_path_as_identifier -> Vec<PathBuf> would be sufficient to make unique function name per filename even if they end up having very much similar fn names with a numerical suffix for example.

I did not yet look but from the subjects I'd assume that this work would conflict with the open PRs by @svenfoo, so perhaps I'll revisit this later.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant