This issue was labelled with: A-libs, A-unicode, E-easy, E-mentor in the Rust repository
glob::glob() does not have any support right now for matching non-utf8 filenames. Not only are its patterns restricted to strings, but it also explicitly skips any non-utf8 filenames it encounters (which should at least be able to match a * pattern).
Tasks that need to be done:
glob() needs to accept both strings and byte vectors. It can do this using std::path::BytesContainer.
glob() needs to process its pattern as a byte vector instead of a string, which will allow it to process filenames as byte vectors. This includes matching non-utf8 filenames against * and ? tokens (for the latter, matching a single byte is appropriate; ideally, it would match however many bytes would be consumed to produce one U+FFFD REPLACEMENT CHARACTER as per the Unicode standard).
This is a sub-task of #9639.
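To make the second task concrete, here is a rough sketch of byte-wise matching for * and ?. The function name match_bytes is hypothetical and not part of the glob crate. It treats ? as exactly one byte (the simpler option mentioned above, not the U+FFFD-aware one), and it ignores character classes and path separators entirely.

```rust
/// Hypothetical helper, not part of the glob crate: matches `name` against
/// `pattern` byte by byte. `*` matches any run of bytes (including none),
/// `?` matches exactly one byte, everything else must match literally.
fn match_bytes(pattern: &[u8], name: &[u8]) -> bool {
    let (mut p, mut n) = (0, 0);
    // Where to resume if a later comparison fails after a `*`:
    // (index of the `*` in the pattern, index in the name it started at).
    let mut backtrack: Option<(usize, usize)> = None;

    while n < name.len() {
        if p < pattern.len() && pattern[p] == b'*' {
            // Tentatively let `*` match nothing and remember this spot.
            backtrack = Some((p, n));
            p += 1;
        } else if p < pattern.len() && (pattern[p] == b'?' || pattern[p] == name[n]) {
            p += 1;
            n += 1;
        } else if let Some((bp, bn)) = backtrack {
            // Let the most recent `*` absorb one more byte and retry.
            backtrack = Some((bp, bn + 1));
            p = bp + 1;
            n = bn + 1;
        } else {
            return false;
        }
    }
    // The name is exhausted; only trailing `*` tokens may remain.
    pattern[p..].iter().all(|&b| b == b'*')
}
```

With this, a filename such as b"caf\xff.txt" (not valid UTF-8) still matches the pattern b"*.txt", and b"a\xffc" matches b"a?c" because ? consumes the single stray byte.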
I've run into this. I wanted to emulate globs on Windows, but Windows allows unpaired surrogates in filenames, so filename patterns also need to support unpaired surrogates.
Looks like the original issue refers to an old Rust trait. I suppose post Rust-1.0 it's going to be glob<P: AsRef<OsStr>>(pattern: P).
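A minimal sketch of what that generic bound would accept, assuming a hypothetical helper called pattern_bytes (the name and the Vec<u8> return type are made up for illustration; this is not the glob crate's API). It relies on OsStr::as_encoded_bytes, which needs Rust 1.74 or later and is documented only as a platform-specific superset of UTF-8, so the exact bytes for non-UTF-8 input should not be assumed to be portable.

```rust
use std::ffi::OsStr;

/// Hypothetical helper (illustrative name and return type): accepts anything
/// that can be viewed as an `OsStr` (&str, String, OsString, &Path, ...)
/// and returns the pattern as raw bytes for byte-wise matching.
fn pattern_bytes<P: AsRef<OsStr>>(pattern: P) -> Vec<u8> {
    // `as_encoded_bytes` (Rust 1.74+) exposes the OS string in a
    // platform-specific encoding that is a superset of UTF-8, so valid
    // UTF-8 patterns come back unchanged.
    pattern.as_ref().as_encoded_bytes().to_vec()
}
```

Callers could then pass "*.txt", a String, an OsString, or a &Path without converting first, and the matcher would only ever see bytes.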
Issue by kballard
Wednesday Jan 29, 2014 at 23:12 GMT
For earlier discussion, see rust-lang/rust#11916