Give credit to syntax definition and theme authors with new --acknowledgements option #1971

Merged: 9 commits, Dec 11, 2021
3 changes: 2 additions & 1 deletion CHANGELOG.md
@@ -6,6 +6,7 @@
- Support for `--ignored-suffix` argument. See #1892 (@bojan88)
- `$BAT_CONFIG_DIR` is now a recognized environment variable. It has precedence over `$XDG_CONFIG_HOME`, see #1727 (@billrisher)
- Support for `x:+delta` syntax in line ranges (e.g. `20:+10`). See #1810 (@bojan88)
- Add new `--acknowledgements` argument that gives credit to theme and syntax definition authors. See #1971 (@Enselic)

## Bugfixes

@@ -43,7 +44,7 @@
## `bat` as a library

- Deprecate `HighlightingAssets::syntaxes()` and `HighlightingAssets::syntax_for_file_name()`. Use `HighlightingAssets::get_syntaxes()` and `HighlightingAssets::get_syntax_for_path()` instead. They return a `Result` which is needed for upcoming lazy-loading work to improve startup performance. They also return which `SyntaxSet` the returned `SyntaxReference` belongs to. See #1747, #1755, #1776, #1862 (@Enselic)
- Remove `HighlightingAssets::from_files` and `HighlightingAssets::save_to_cache`. Instead of calling the former and then the latter you now make a single call to `bat::assets::build`. See #1802 (@Enselic)
- Remove `HighlightingAssets::from_files` and `HighlightingAssets::save_to_cache`. Instead of calling the former and then the latter you now make a single call to `bat::assets::build`. See #1802, #1971 (@Enselic)
- Replace the `error::Error(error::ErrorKind, _)` struct and enum with an `error::Error` enum. `Error(ErrorKind::UnknownSyntax, _)` becomes `Error::UnknownSyntax`, etc. Also remove the `error::ResultExt` trait. These changes stem from replacing `error-chain` with `thiserror`. See #1820 (@Enselic)
- Add new `MappingTarget` enum variant `MapExtensionToUnknown`. Refer to its documentation for more information. Clients are advised to treat `MapExtensionToUnknown` the same as `MapToUnknown` in exhaustive matches. See #1703 (@cbolgiano)

2 changes: 2 additions & 0 deletions Cargo.lock

Some generated files are not rendered by default.

4 changes: 3 additions & 1 deletion Cargo.toml
@@ -34,7 +34,7 @@ minimal-application = [
git = ["git2"] # Support indicating git modifications
paging = ["shell-words", "grep-cli"] # Support applying a pager on the output
# Add "syntect/plist-load" when https://github.com/trishume/syntect/pull/345 reaches us
build-assets = ["syntect/yaml-load", "syntect/dump-create"]
build-assets = ["syntect/yaml-load", "syntect/dump-create", "regex", "walkdir"]

# You need to use one of these if you depend on bat as a library:
regex-onig = ["syntect/regex-onig"] # Use the "oniguruma" regex engine
@@ -63,6 +63,8 @@ clircle = "0.3"
bugreport = { version = "0.4", optional = true }
dirs-next = { version = "2.0.0", optional = true }
grep-cli = { version = "0.1.6", optional = true }
regex = { version = "1.0", optional = true }
walkdir = { version = "2.0", optional = true }

[dependencies.git2]
version = "0.13"
Binary file added assets/acknowledgements.bin
Binary file not shown.
2 changes: 1 addition & 1 deletion assets/create.sh
@@ -53,7 +53,7 @@ bat cache --clear
done
)

bat cache --build --blank --source="$ASSET_DIR" --target="$ASSET_DIR"
bat cache --build --blank --acknowledgements --source="$ASSET_DIR" --target="$ASSET_DIR"

(
cd "$ASSET_DIR"
10 changes: 10 additions & 0 deletions src/assets.rs
@@ -46,6 +46,9 @@ pub(crate) const COMPRESS_SYNTAXES: bool = true;
/// Compress for size of ~20 kB instead of ~200 kB at the cost of ~30% longer deserialization time
pub(crate) const COMPRESS_THEMES: bool = true;

/// Compress for size of ~10 kB instead of ~120 kB
pub(crate) const COMPRESS_ACKNOWLEDGEMENTS: bool = true;

impl HighlightingAssets {
fn new(serialized_syntax_set: SerializedSyntaxSet, theme_set: ThemeSet) -> Self {
HighlightingAssets {
@@ -295,6 +298,13 @@ pub(crate) fn get_integrated_themeset() -> ThemeSet {
from_binary(include_bytes!("../assets/themes.bin"), COMPRESS_THEMES)
}

pub fn get_acknowledgements() -> String {
from_binary(
include_bytes!("../assets/acknowledgements.bin"),
COMPRESS_ACKNOWLEDGEMENTS,
)
}

pub(crate) fn from_binary<T: serde::de::DeserializeOwned>(v: &[u8], compressed: bool) -> T {
asset_from_contents(v, "n/a", compressed)
.expect("data integrated in binary is never faulty, but make sure `compressed` is in sync!")
24 changes: 23 additions & 1 deletion src/assets/build_assets.rs
@@ -3,10 +3,14 @@ use syntect::highlighting::ThemeSet;
use syntect::parsing::{SyntaxSet, SyntaxSetBuilder};

use crate::assets::*;
use acknowledgements::build_acknowledgements;

mod acknowledgements;

pub fn build(
source_dir: &Path,
include_integrated_assets: bool,
acknowledgements: bool,
target_dir: &Path,
current_version: &str,
) -> Result<()> {
@@ -16,9 +20,17 @@ pub fn build(

let syntax_set = syntax_set_builder.build();

let acknowledgements = build_acknowledgements(source_dir, acknowledgements)?;

print_unlinked_contexts(&syntax_set);

write_assets(&theme_set, &syntax_set, target_dir, current_version)
write_assets(
&theme_set,
&syntax_set,
&acknowledgements,
target_dir,
current_version,
)
}

fn build_theme_set(source_dir: &Path, include_integrated_assets: bool) -> ThemeSet {
@@ -87,6 +99,7 @@ fn print_unlinked_contexts(syntax_set: &SyntaxSet) {
fn write_assets(
theme_set: &ThemeSet,
syntax_set: &SyntaxSet,
acknowledgements: &Option<String>,
target_dir: &Path,
current_version: &str,
) -> Result<()> {
@@ -104,6 +117,15 @@ fn write_assets(
COMPRESS_SYNTAXES,
)?;

if let Some(acknowledgements) = acknowledgements {
asset_to_cache(
acknowledgements,
&target_dir.join("acknowledgements.bin"),
"acknowledgements",
COMPRESS_ACKNOWLEDGEMENTS,
)?;
}

print!(
"Writing metadata to folder {} ... ",
target_dir.to_string_lossy()
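
The `Option<String>` passed from `build_acknowledgements` into `write_assets` means that `acknowledgements.bin` is only written when the flag is given. A minimal std-only sketch of that control flow follows. The simplified signatures, the placeholder text, and the `Vec` standing in for the target directory are assumptions made for illustration, not the PR's actual API.

```rust
/// Returns `None` when the flag is off, mirroring the early return in the PR.
/// Simplified: the real function also takes `source_dir` and returns a `Result`.
fn build_acknowledgements(enabled: bool) -> Option<String> {
    if !enabled {
        return None;
    }
    // Placeholder text; the real function collects license and notice files.
    Some(String::from("preamble\n"))
}

/// Records which asset files would be written. The `Vec` is a hypothetical
/// stand-in for the target directory so the sketch needs no file system.
fn write_assets(acknowledgements: &Option<String>) -> Vec<&'static str> {
    let mut written = vec!["themes.bin", "syntaxes.bin"];
    // Only emit the extra asset when acknowledgements were actually built.
    if acknowledgements.is_some() {
        written.push("acknowledgements.bin");
    }
    written
}

fn main() {
    println!("{:?}", write_assets(&build_acknowledgements(false)));
    println!("{:?}", write_assets(&build_acknowledgements(true)));
}
```

Using `Option` rather than an empty string keeps "not requested" distinct from "requested but empty", so a custom cache built without the flag never gains an `acknowledgements.bin`.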
199 changes: 199 additions & 0 deletions src/assets/build_assets/acknowledgements.rs
@@ -0,0 +1,199 @@
use std::fs::read_to_string;
use std::path::Path;

use crate::error::*;

// Sourced from the License section in the README.md
const PREAMBLE: &str = "Copyright (c) 2018-2021 bat-developers (https://github.com/sharkdp/bat).

bat is made available under the terms of either the MIT License or the Apache
License 2.0, at your option.

See the LICENSE-APACHE and LICENSE-MIT files for license details.
";

/// Looks for LICENSE and NOTICE files in `source_dir`, does some rudimentary
/// analysis, and compiles them together in a single string that is meant to be
/// used in the output to `--acknowledgements`
pub fn build_acknowledgements(source_dir: &Path, acknowledgements: bool) -> Result<Option<String>> {
if !acknowledgements {
return Ok(None);
}

let mut acknowledgements = String::new();
acknowledgements.push_str(PREAMBLE);

// Sort entries so the order is stable over time
let entries = walkdir::WalkDir::new(source_dir).sort_by(|a, b| a.path().cmp(b.path()));
for entry in entries {
let entry = match entry {
Ok(entry) => entry,
Err(_) => continue,
};

let path = entry.path();
let stem = match path.file_stem().and_then(|s| s.to_str()) {
Some(stem) => stem,
None => continue,
};

handle_file(&mut acknowledgements, path, stem)?
}

Ok(Some(acknowledgements))
}

fn handle_file(acknowledgements: &mut String, path: &Path, stem: &str) -> Result<()> {
if stem == "NOTICE" {
handle_notice(acknowledgements, path)?;
} else if stem.to_ascii_uppercase() == "LICENSE" {
handle_license(acknowledgements, path)?;
}

Ok(())
}

fn handle_notice(acknowledgements: &mut String, path: &Path) -> Result<()> {
// Assume NOTICE as defined by Apache License 2.0. These must be part of acknowledgements.
let license_text = read_to_string(path)?;
append_to_acknowledgements(acknowledgements, &license_text)
}

fn handle_license(acknowledgements: &mut String, path: &Path) -> Result<()> {
let license_text = read_to_string(path)?;

if include_license_in_acknowledgments(&license_text) {
append_to_acknowledgements(acknowledgements, &license_text)
} else if license_not_needed_in_acknowledgements(&license_text) {
Ok(())
} else {
Err(format!("ERROR: License is of unknown type: {:?}", path).into())
}
}

fn include_license_in_acknowledgments(license_text: &str) -> bool {
let markers = vec![
// MIT
"The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.",

// BSD
"Redistributions in binary form must reproduce the above copyright notice,",

// Apache 2.0
"Apache License Version 2.0, January 2004 http://www.apache.org/licenses/",
"Licensed under the Apache License, Version 2.0 (the \"License\");",
];

license_contains_marker(license_text, &markers)
}

fn license_not_needed_in_acknowledgements(license_text: &str) -> bool {
let markers = vec![
// Public domain
"This is free and unencumbered software released into the public domain.",

// Special license of assets/syntaxes/01_Packages/LICENSE
"Permission to copy, use, modify, sell and distribute this software is granted. This software is provided \"as is\" without express or implied warranty, and with no claim as to its suitability for any purpose."
];

license_contains_marker(license_text, &markers)
}

fn license_contains_marker(license_text: &str, markers: &[&str]) -> bool {
let normalized_license_text = normalize_license_text(license_text);
for marker in markers {
if normalized_license_text.contains(marker) {
return true;
}
}
false
}

fn append_to_acknowledgements(acknowledgements: &mut String, license_text: &str) -> Result<()> {
// Most license texts wrap at 80 chars so our horizontal divider is 80 chars
acknowledgements.push_str(
"――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――\n",
);

// Now add the license text itself
acknowledgements.push_str(license_text);

// Make sure the last char is a newline to not mess up formatting later
if acknowledgements
.chars()
.last()
.expect("acknowledgements is not the empty string")
!= '\n'
{
acknowledgements.push('\n');
}

Ok(())
}

/// Replaces newlines with a space character, and replaces multiple spaces with one space.
/// This makes the text easier to analyze.
fn normalize_license_text(license_text: &str) -> String {
use regex::Regex;

let whitespace_and_newlines = Regex::new(r"\s").unwrap();
let as_single_line = whitespace_and_newlines.replace_all(license_text, " ");

let many_spaces = Regex::new(" +").unwrap();
many_spaces.replace_all(&as_single_line, " ").to_string()
}

#[cfg(test)]
mod tests {
#[cfg(test)]
use super::*;

#[test]
fn test_normalize_license_text() {
let license_text = "This is a license text with these terms:
* Complicated multi-line
term with indentation";

assert_eq!(
"This is a license text with these terms: * Complicated multi-line term with indentation".to_owned(),
normalize_license_text(license_text),
);
}

#[test]
fn test_normalize_license_text_with_windows_line_endings() {
let license_text = "This license text includes windows line endings\r
and we need to handle that.";

assert_eq!(
"This license text includes windows line endings and we need to handle that."
.to_owned(),
normalize_license_text(license_text),
);
}

#[test]
fn test_append_to_acknowledgements_adds_newline_if_missing() {
let mut acknowledgements = "preamble\n".to_owned();

append_to_acknowledgements(&mut acknowledgements, "line without newline").unwrap();
assert_eq!(
"preamble
――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――
line without newline
",
acknowledgements
);

append_to_acknowledgements(&mut acknowledgements, "line with newline\n").unwrap();
assert_eq!(
"preamble
――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――
line without newline
――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――
line with newline
",
acknowledgements
);
}
}
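
The marker check in `acknowledgements.rs` can be tried out in isolation. The sketch below is a simplified, std-only approximation rather than the PR's code. It uses `split_whitespace` instead of the `regex` crate, which additionally trims leading and trailing whitespace (the original does not). The `LicenseKind` enum and the `normalize` and `classify` functions are hypothetical names introduced here for illustration.

```rust
/// Outcome of inspecting one LICENSE file. Hypothetical type, not in the PR,
/// which uses two boolean helpers and an error instead.
#[derive(Debug, PartialEq)]
enum LicenseKind {
    MustInclude,
    NotNeeded,
    Unknown,
}

/// Collapse every run of whitespace (including "\r\n") into a single space,
/// so that markers match regardless of how the license text is wrapped.
fn normalize(text: &str) -> String {
    text.split_whitespace().collect::<Vec<_>>().join(" ")
}

fn classify(license_text: &str) -> LicenseKind {
    // One marker per category, copied from the PR; the real lists are longer.
    let include = ["Redistributions in binary form must reproduce the above copyright notice,"];
    let skip = ["This is free and unencumbered software released into the public domain."];

    let normalized = normalize(license_text);
    if include.iter().any(|m| normalized.contains(*m)) {
        LicenseKind::MustInclude
    } else if skip.iter().any(|m| normalized.contains(*m)) {
        LicenseKind::NotNeeded
    } else {
        LicenseKind::Unknown
    }
}

fn main() {
    // The marker is split across a Windows line ending plus indentation.
    let bsd = "Redistributions in binary form must reproduce\r\n   the above copyright notice, this list of conditions";
    println!("{:?}", classify(bsd));
    println!("{:?}", classify("Some text that matches no marker."));
}
```

In the PR the `Unknown` case is an error, so an asset build fails when a new syntax or theme ships a license that nobody has classified yet.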
12 changes: 12 additions & 0 deletions src/bin/bat/clap_app.rs
@@ -508,6 +508,12 @@ pub fn build_app(interactive_output: bool) -> ClapApp<'static, 'static> {
.hidden_short_help(true)
.help("Show diagnostic information for bug reports.")
)
.arg(
Arg::with_name("acknowledgements")
.long("acknowledgements")
.hidden_short_help(true)
.help("Show acknowledgements."),
)
.arg(
Arg::with_name("ignored-suffix")
.number_of_values(1)
@@ -578,6 +584,12 @@ pub fn build_app(interactive_output: bool) -> ClapApp<'static, 'static> {
"Create completely new syntax and theme sets \
(instead of appending to the default sets).",
),
)
.arg(
Arg::with_name("acknowledgements")
.long("acknowledgements")
.requires("build")
.help("Build acknowledgements.bin."),
),
)
}
13 changes: 10 additions & 3 deletions src/bin/bat/main.rs
@@ -47,9 +47,13 @@ fn build_assets(matches: &clap::ArgMatches) -> Result<()> {
.map(Path::new)
.unwrap_or_else(|| PROJECT_DIRS.cache_dir());

let blank = matches.is_present("blank");

bat::assets::build(source_dir, !blank, target_dir, clap::crate_version!())
bat::assets::build(
source_dir,
!matches.is_present("blank"),
matches.is_present("acknowledgements"),
target_dir,
clap::crate_version!(),
)
}

fn run_cache_subcommand(matches: &clap::ArgMatches) -> Result<()> {
@@ -324,6 +328,9 @@ fn run() -> Result<bool> {
} else if app.matches.is_present("cache-dir") {
writeln!(io::stdout(), "{}", cache_dir())?;
Ok(true)
} else if app.matches.is_present("acknowledgements") {
writeln!(io::stdout(), "{}", bat::assets::get_acknowledgements())?;
Ok(true)
} else {
run_controller(inputs, &config)
}