Strip BOM from output in interactive mode #1938

Merged
merged 9 commits into from
Sep 6, 2022
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -10,6 +10,7 @@
- Prevent fork nightmare with `PAGER=batcat`. See #2235 (@johnmatthiggins)
- Make `--no-paging`/`-P` override `--paging=...` if passed as a later arg, see #2201 (@themkat)
- `--map-syntax` and `--ignored-suffix` now works together, see #2093 (@czzrr)
+- Strips byte order mark from output when in non-loop-through mode. See #1922 (@dag-h)

## Other

11 changes: 10 additions & 1 deletion src/printer.rs
@@ -419,7 +419,7 @@ impl<'a> Printer for InteractivePrinter<'a> {
let line = if self.config.show_nonprintable {
replace_nonprintable(line_buffer, self.config.tab_width)
} else {
-match self.content_type {
+let line = match self.content_type {
Some(ContentType::BINARY) | None => {
return Ok(());
}
@@ -430,6 +430,15 @@ impl<'a> Printer for InteractivePrinter<'a> {
.decode(line_buffer, DecoderTrap::Replace)
.map_err(|_| "Invalid UTF-16BE")?,
_ => String::from_utf8_lossy(line_buffer).to_string(),
+};
+// Remove byte order mark from the first line if it exists
+if line_number == 1 {
+match line.strip_prefix('\u{feff}') {
+Some(stripped) => stripped.to_string(),

Review comment (Owner):

I wonder whether there is an alternative implementation that avoids allocating a new string, by operating on `&str` instead of `String`.

But this is only relevant for files starting with a BOM, so maybe it's not the most urgent question.

+None => line,
+}
+} else {
+line
}
};
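
On the review question above about avoiding the extra allocation: a minimal sketch, not taken from this PR, of two ways to drop a leading U+FEFF without building a second `String`. The helper names `strip_bom` and `strip_bom_in_place` are hypothetical and do not exist in bat; only standard-library calls are used.

```rust
/// Hypothetical helper (not part of bat): borrows the input and returns the
/// slice after a leading U+FEFF, or the input unchanged. No allocation.
fn strip_bom(line: &str) -> &str {
    line.strip_prefix('\u{feff}').unwrap_or(line)
}

/// Hypothetical helper (not part of bat): takes ownership of an already
/// decoded line and removes a leading U+FEFF from the existing buffer.
fn strip_bom_in_place(mut line: String) -> String {
    if line.starts_with('\u{feff}') {
        // U+FEFF is three bytes in UTF-8; replace_range shifts the rest of
        // the buffer left instead of allocating a new String.
        line.replace_range(..'\u{feff}'.len_utf8(), "");
    }
    line
}
```

In the hunk above every match arm already yields an owned `String`, so the in-place variant would probably be the closer fit for the `line_number == 1` branch. The borrowing variant only helps if the surrounding code can keep working with a `&str`.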

1 change: 1 addition & 0 deletions tests/examples/test_BOM.txt
@@ -0,0 +1 @@
hello world
56 changes: 54 additions & 2 deletions tests/integration_tests.rs
@@ -758,14 +758,66 @@ fn config_read_arguments_from_file() {

#[test]
fn utf16() {
-// The output will be converted to UTF-8 with a leading UTF-8 BOM
+// The output will be converted to UTF-8 with the leading UTF-16
+// BOM removed. This behavior is wanted in interactive mode as
+// some terminals seem to display the BOM character as a space,
+// and it also breaks syntax highlighting.
bat()
.arg("--plain")
.arg("--decorations=always")
.arg("test_UTF-16LE.txt")
.assert()
.success()
-.stdout(std::str::from_utf8(b"\xEF\xBB\xBFhello world\n").unwrap());
+.stdout("hello world\n");
}

+// Regression test for https://github.com/sharkdp/bat/issues/1922
+#[test]
+fn bom_not_stripped_in_loop_through_mode() {
+bat()
+.arg("--plain")
+.arg("--decorations=never")
+.arg("--color=never")
+.arg("test_BOM.txt")
+.assert()
+.success()
+.stdout("\u{feff}hello world\n");
+}

+// Regression test for https://github.com/sharkdp/bat/issues/1922
+#[test]
+fn bom_stripped_when_colored_output() {
+bat()
+.arg("--color=always")
+.arg("--decorations=never")
+.arg("test_BOM.txt")
+.assert()
+.success()
+.stdout(
+predicate::str::is_match("\u{1b}\\[38;5;[0-9]{3}mhello world\u{1b}\\[0m\n").unwrap(),
+);
+}

+// Regression test for https://github.com/sharkdp/bat/issues/1922
+#[test]
+fn bom_stripped_when_no_color_and_not_loop_through() {
+bat()
+.arg("--color=never")
+.arg("--decorations=always")
+.arg("--style=numbers,grid,header")
+.arg("--terminal-width=80")
+.arg("test_BOM.txt")
+.assert()
+.success()
+.stdout(
+"\
+─────┬──────────────────────────────────────────────────────────────────────────
+     │ File: test_BOM.txt
+─────┼──────────────────────────────────────────────────────────────────────────
+   1 │ hello world
+─────┴──────────────────────────────────────────────────────────────────────────
+",
+);
+}

#[test]
2 changes: 1 addition & 1 deletion tests/syntax-tests/highlighted/PowerShell/test.ps1
@@ -1,4 +1,4 @@
# PowerShell script for testing syntax highlighting
# PowerShell script for testing syntax highlighting

function Get-FutureTime {
 param (