Cargo does not handle non-UTF8 gitignores gracefully #11311

Closed
MattWindsor91 opened this issue Oct 28, 2022 · 2 comments · Fixed by #11321
Labels
A-diagnostics Area: Error and warning messages generated by Cargo itself. C-bug Category: bug Command-init E-easy Experience: Easy

MattWindsor91 commented Oct 28, 2022

Problem

When attempting to run cargo init --bin . on an existing Git repository with cargo 1.64, cargo panicked with the somewhat cryptic message:

thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Error { kind: InvalidData, message: "stream did not contain valid UTF-8" }', src/tools/cargo/src/cargo/ops/cargo_new.rs:601:78
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

In that version of cargo, that unwrap corresponds to this line, which in main is currently line 600.

The cause appears to be that when a .gitignore is present, Cargo assumes via unwrap that it is valid UTF-8, and panics when that assumption does not hold. In this case, the existing Git repository had a .gitignore that a Windows user had accidentally checked in as UTF-16LE with a BOM and CRLF line terminators! While such a .gitignore is definitely a user problem (git doesn't seem to work with such files), the error handling on cargo's side should probably be more graceful.

Steps

  1. Create a new empty Git repository
  2. Create a .gitignore that is not valid UTF-8, for instance printf '\xFF\xFE' > .gitignore
  3. cargo init
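
The same failure can be observed outside Cargo. The following is a minimal sketch using only the standard library; `read_ignore` and `bom_only_read_error` are hypothetical names for illustration, not Cargo's actual code. It writes the two bytes from step 2 to a scratch file and reads them back as UTF-8 text, which produces the `InvalidData` error kind seen in the panic message above.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical helper: reads an ignore file as UTF-8 text,
/// returning the I/O error instead of unwrapping it.
fn read_ignore(path: &Path) -> io::Result<String> {
    fs::read_to_string(path)
}

/// Writes a lone UTF-16LE BOM (0xFF 0xFE) to a scratch file in the
/// system temp directory and returns the error kind from reading it back.
fn bom_only_read_error() -> Option<io::ErrorKind> {
    let path = std::env::temp_dir().join(format!("gitignore-repro-{}", std::process::id()));
    fs::write(&path, [0xFF_u8, 0xFE]).expect("scratch file should be writable");
    let result = read_ignore(&path);
    // Best-effort cleanup; a failure to delete does not affect the result.
    let _ = fs::remove_file(&path);
    result.err().map(|e| e.kind())
}
```

Calling `.unwrap()` on the result of `read_ignore` for such a file is what turns this error into a panic.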

Possible Solution(s)

As previously mentioned, the bug is an unwrap that should be a user-facing error, so cargo should IMO be returning an error to the user here to state that their existing ignore file is badly formed. cargo could try to rectify the problem but this feels like scope creep.

Notes

No response

Version

cargo 1.64.0 (387270bc7 2022-09-16)
release: 1.64.0
commit-hash: 387270bc7f446d17869c7f208207c73231d6a252
commit-date: 2022-09-16
host: aarch64-apple-darwin
libgit2: 1.4.2 (sys:0.14.2 vendored)
libcurl: 7.84.0 (sys:0.4.55+curl-7.83.1 system ssl:(SecureTransport) LibreSSL/3.3.6)
os: Mac OS 13.0.0 [64-bit]
MattWindsor91 added the C-bug (Category: bug) label Oct 28, 2022
@weihanglo
Member

I did some quick research.

  • Git seems to ignore a .gitignore in a non-UTF8 encoding. [1]
  • Mercurial respects the local encoding. [2]
  • I have not yet dug deep enough to find encoding information for Fossil and Pijul.

For now, I would recommend enriching the error message by calling out the path of the non-UTF8 file and telling users that Cargo does not currently support non-UTF8 ignore files.
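
As a rough illustration of that recommendation, here is a sketch using only the standard library. The `IgnoreFileError` type, the `read_ignore_file` function, and the message wording are all assumptions made for this example; Cargo's real fix would go through its own error-handling machinery and may look quite different.

```rust
use std::fmt;
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical error type that carries the offending path.
#[derive(Debug)]
enum IgnoreFileError {
    /// The file exists but its bytes are not valid UTF-8.
    NotUtf8 { path: PathBuf },
    /// Any other I/O failure (missing file, permissions, ...).
    Io { path: PathBuf, source: io::Error },
}

impl fmt::Display for IgnoreFileError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            IgnoreFileError::NotUtf8 { path } => write!(
                f,
                "ignore file `{}` is not valid UTF-8; non-UTF-8 ignore files are not currently supported",
                path.display()
            ),
            IgnoreFileError::Io { path, source } => {
                write!(f, "failed to read `{}`: {}", path.display(), source)
            }
        }
    }
}

impl std::error::Error for IgnoreFileError {
    fn source(&self) -> Option<&(dyn std::error::Error + 'static)> {
        match self {
            IgnoreFileError::Io { source, .. } => Some(source),
            IgnoreFileError::NotUtf8 { .. } => None,
        }
    }
}

/// Reads an ignore file, mapping the UTF-8 failure to an error that
/// names the file instead of panicking.
fn read_ignore_file(path: &Path) -> Result<String, IgnoreFileError> {
    fs::read_to_string(path).map_err(|source| match source.kind() {
        // `read_to_string` reports invalid UTF-8 as `InvalidData`.
        io::ErrorKind::InvalidData => IgnoreFileError::NotUtf8 { path: path.to_path_buf() },
        _ => IgnoreFileError::Io { path: path.to_path_buf(), source },
    })
}
```

A caller would propagate the result with `?` rather than `unwrap`, so the user sees the file path in a regular error message.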

In the future, we could try to respect the local encoding and newline, though it is not a high priority IMO.

Footnotes

  1. https://stackoverflow.com/questions/52472435/why-doesnt-git-natively-support-utf-16

  2. https://www.mercurial-scm.org/wiki/FileFormats

weihanglo added the A-diagnostics (Area: Error and warning messages generated by Cargo itself), Command-init, E-easy (Experience: Easy), and E-help-wanted labels Oct 29, 2022
@FrankYang0529
Contributor

@rustbot claim
