attributes matching #400

Merged
merged 14 commits into from
Apr 14, 2023
479 changes: 288 additions & 191 deletions Cargo.lock

Large diffs are not rendered by default.

2 changes: 2 additions & 0 deletions Cargo.toml
@@ -173,9 +173,11 @@ members = [
"gix-tempfile",
"gix-lock",
"gix-attributes",
"gix-ignore",
"gix-pathspec",
"gix-refspec",
"gix-path",
"gix-utils",
"gix",
"gitoxide-core",
"gix-hashtable",
2 changes: 2 additions & 0 deletions README.md
@@ -76,6 +76,7 @@ is usable to some extent.
* [gix-discover](https://github.com/Byron/gitoxide/blob/main/crate-status.md#gix-discover)
* [gix-path](https://github.com/Byron/gitoxide/blob/main/crate-status.md#gix-path)
* [gix-attributes](https://github.com/Byron/gitoxide/blob/main/crate-status.md#gix-attributes)
* [gix-ignore](https://github.com/Byron/gitoxide/blob/main/crate-status.md#gix-ignore)
* [gix-pathspec](https://github.com/Byron/gitoxide/blob/main/crate-status.md#gix-pathspec)
* [gix-index](https://github.com/Byron/gitoxide/blob/main/crate-status.md#gix-index)
* [gix-revision](https://github.com/Byron/gitoxide/blob/main/crate-status.md#gix-revision)
@@ -84,6 +85,7 @@ is usable to some extent.
* [gix-refspec](https://github.com/Byron/gitoxide/blob/main/crate-status.md#gix-refspec)
* `gitoxide-core`
* **very early** _(possibly without any documentation and many rough edges)_
* [gix-utils](https://github.com/Byron/gitoxide/blob/main/crate-status.md#gix-utils)
* [gix-worktree](https://github.com/Byron/gitoxide/blob/main/crate-status.md#gix-worktree)
* [gix-bitmap](https://github.com/Byron/gitoxide/blob/main/crate-status.md#gix-bitmap)
* [gix-date](https://github.com/Byron/gitoxide/blob/main/crate-status.md#gix-date)
33 changes: 23 additions & 10 deletions crate-status.md
@@ -101,10 +101,15 @@ and itself relies on all `git-*` crates. It's not meant for consumption, for app
* [x] write the table of contents

### gix-hashtable

* [x] hashmap
* [x] hashset

### gix-utils

* **filesystem**
* [x] probe capabilities
* [x] symlink creation and removal
* [x] file snapshots
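
The "probe capabilities" item can be pictured with a minimal sketch that uses only the standard library. It is not the `gix-utils` implementation: the `Capabilities` struct, its single `ignore_case` field, the `probe` function and the probe file name are all hypothetical and chosen for illustration.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical capability set; the single field is illustrative only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Capabilities {
    /// True if `Name` and `name` refer to the same file in the probed directory.
    ignore_case: bool,
}

/// Probe `dir` by writing a throwaway file and looking it up under a different case.
fn probe(dir: &Path) -> io::Result<Capabilities> {
    let probe_file = dir.join("Probe-Case.tmp");
    fs::write(&probe_file, b"")?;
    // On a case-insensitive filesystem the lowercased name resolves to the same file.
    let ignore_case = dir.join("probe-case.tmp").exists();
    fs::remove_file(&probe_file)?;
    Ok(Capabilities { ignore_case })
}
```

Calling `probe(std::path::Path::new("."))` would report how the current directory's filesystem behaves. The result depends on the platform, so no expected output is given.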

### gix-object
* *decode (zero-copy)* borrowed objects
@@ -323,11 +328,13 @@ Check out the [performance discussion][gix-traverse-performance] as well.
* [ ] Some examples

### gix-attributes
* [x] parse git-ignore files (aka gix-attributes without the attributes or negation)
* [x] parse gix-attributes files
* [ ] create an attributes stack, ideally one that includes 'ignored' status from .gitignore files.
* [ ] support for built-in `binary` macro for `-text -diff -merge`

* [x] parse `.gitattributes` files
* [ ] an attributes stack for matching paths to their attributes, with support for the built-in `binary` macro for `-text -diff -merge`

### gix-ignore
* [x] parse `.gitignore` files
* [x] an attributes stack for checking if paths are excluded

### gix-quote
* **ansi-c**
* [x] quote
@@ -440,7 +447,7 @@ Make it the best-performing implementation and the most convenient one.
- [ ] handle submodules
- [ ] handle sparse directories
- [ ] handle sparse index
- [ ] linear scaling with multi-threading up to IO saturation
- [x] linear scaling with multi-threading up to IO saturation
- supported attributes to affect working tree and index contents
- [ ] eol
- [ ] working-tree-encoding
@@ -450,8 +457,10 @@ Make it the best-performing implementation and the most convenient one.
- [ ] `ident`
- [ ] filter processes
- [ ] single-invocation clean/smudge filters
* [x] access to all .gitignore/exclude information
* [ ] access to all attributes information
* manage multiple worktrees
* access to per-path information, like `.gitignore` and `.gitattributes`, in a manner well suited for efficient lookups
* [x] _exclude_ information
* [ ] attributes

### gix-revision
* [x] `describe()` (similar to `git name-rev`)
@@ -602,6 +611,8 @@ See its [README.md](https://github.com/Byron/gitoxide/blob/main/gix-lock/README.
* [x] tree with other tree
* [ ] respect case-sensitivity of host filesystem.
* [x] a way to access various diff related settings or use them
* [ ] respect `diff.*.textconv`, `diff.*.cachetextconv` and external diff viewers with `diff.*.command`,
[along with support for reading `diff` gitattributes](https://github.com/git/git/blob/73876f4861cd3d187a4682290ab75c9dccadbc56/Documentation/gitattributes.txt#L699:L699).
* **rewrite tracking**
* **deviation** - git keeps up to four candidates whereas we use the first-found candidate that matches the similarity percentage.
This can lead to different sources being found. As such, we also don't consider the filename at all.
@@ -614,7 +625,7 @@ See its [README.md](https://github.com/Byron/gitoxide/blob/main/gix-lock/README.
* [x] renames
* [x] copies
* [x] 'find-copies-harder' - find copies with the source being the entire tree.
* [ ] tree with working tree
* [ ] tree or index with working tree
* [x] diffs between modified blobs with various algorithms
* [ ] tree with index
* [x] initialize
@@ -673,6 +684,8 @@ See its [README.md](https://github.com/Byron/gitoxide/blob/main/gix-lock/README.
* [ ] obtain 'prunable' information
* [x] proper handling of worktree related refs
* [ ] create, move, remove, and repair
* [x] access exclude information
* [ ] access attribute information
* [x] respect `core.worktree` configuration
- **deviation**
* The delicate interplay between `GIT_COMMON_DIR` and `GIT_WORK_TREE` isn't implemented.
2 changes: 1 addition & 1 deletion gitoxide-core/src/index/checkout.rs
@@ -56,7 +56,7 @@ pub fn checkout_exclusive(
}

let opts = gix::worktree::index::checkout::Options {
fs: gix::worktree::fs::Capabilities::probe(dest_directory),
fs: gix::utils::FilesystemCapabilities::probe(dest_directory),

destination_is_initially_empty: true,
overwrite_existing: false,
5 changes: 1 addition & 4 deletions gitoxide-core/src/repository/exclude.rs
@@ -35,10 +35,7 @@ pub fn query(
.worktree()
.with_context(|| "Cannot check excludes without a current worktree")?;
let index = worktree.index()?;
let mut cache = worktree.excludes(
&index,
Some(gix::attrs::MatchGroup::<gix::attrs::Ignore>::from_overrides(overrides)),
)?;
let mut cache = worktree.excludes(&index, Some(gix::ignore::Search::from_overrides(overrides)))?;

let prefix = repo.prefix().expect("worktree - we have an index by now")?;

9 changes: 6 additions & 3 deletions gix-attributes/Cargo.toml
@@ -14,23 +14,26 @@ doctest = false

[features]
## Data structures implement `serde::Serialize` and `serde::Deserialize`.
serde1 = ["serde", "bstr/serde", "gix-glob/serde1"]
serde1 = ["serde", "bstr/serde", "gix-glob/serde1", "kstring/serde"]

[dependencies]
gix-features = { version = "^0.28.0", path = "../gix-features" }
gix-path = { version = "^0.7.2", path = "../gix-path" }
gix-path = { version = "^0.7.3", path = "../gix-path" }
gix-quote = { version = "^0.4.3", path = "../gix-quote" }
gix-glob = { version = "^0.5.5", path = "../gix-glob" }

bstr = { version = "1.3.0", default-features = false, features = ["std", "unicode"]}
smallvec = "1.10.0"
kstring = "2.0.0"
unicode-bom = "2.0.2"
thiserror = "1.0.26"
serde = { version = "1.0.114", optional = true, default-features = false, features = ["derive"]}
log = "0.4.17"

document-features = { version = "0.2.1", optional = true }

[dev-dependencies]
gix-testtools = { path = "../tests/tools"}
gix-utils = { path = "../gix-utils" }

[package.metadata.docs.rs]
all-features = true
83 changes: 36 additions & 47 deletions gix-attributes/src/lib.rs
@@ -1,30 +1,31 @@
//! Parse `.gitattribute` and `.gitignore` files and provide utilities to match against them.
//! Parse `.gitattributes` files and provide utilities to match against them.
//!
//! ## Feature Flags
#![cfg_attr(
feature = "document-features",
cfg_attr(doc, doc = ::document_features::document_features!())
)]
#![cfg_attr(docsrs, feature(doc_cfg, doc_auto_cfg))]
#![deny(missing_docs, rust_2018_idioms)]
#![forbid(unsafe_code)]
#![deny(missing_docs, rust_2018_idioms, unsafe_code)]

use std::path::PathBuf;

use bstr::{BStr, BString};
pub use gix_glob as glob;
use kstring::{KString, KStringRef};

mod assignment;
///
pub mod name;
mod state;
///
pub mod state;

mod match_group;
pub use match_group::{Attributes, Ignore, Match, Pattern};
///
pub mod search;

///
pub mod parse;
/// Parse attribute assignments line by line from `bytes`.

/// Parse attribute assignments line by line from `bytes`, and fail the operation on error.
///
/// For leniency, ignore errors using `filter_map(Result::ok)` for example.
pub fn parse(bytes: &[u8]) -> parse::Lines<'_> {
parse::Lines::new(bytes)
}
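
The leniency advice in the doc comment above can be illustrated with a standalone sketch. It deliberately does not use `gix-attributes`: `ParseError`, `parse_line`, `parse_strict` and `parse_lenient` are hypothetical stand-ins for an iterator that yields one `Result` per line, and only the `filter_map(Result::ok)` technique is taken from the comment.

```rust
/// Hypothetical error type standing in for a real parse error.
#[derive(Debug, PartialEq)]
struct ParseError {
    line_number: usize,
}

/// Hypothetical line parser: expects `pattern attr...` and rejects lines
/// that have no pattern or no attributes.
fn parse_line(line_number: usize, line: &str) -> Result<(String, Vec<String>), ParseError> {
    let mut tokens = line.split_whitespace();
    let pattern = tokens.next().ok_or(ParseError { line_number })?;
    let attrs: Vec<String> = tokens.map(str::to_owned).collect();
    if attrs.is_empty() {
        return Err(ParseError { line_number });
    }
    Ok((pattern.to_owned(), attrs))
}

/// Strict: collecting into a `Result` stops at the first error and returns it.
fn parse_strict(input: &str) -> Result<Vec<(String, Vec<String>)>, ParseError> {
    input
        .lines()
        .enumerate()
        .map(|(idx, line)| parse_line(idx + 1, line))
        .collect()
}

/// Lenient: lines that fail to parse are dropped, all others are kept.
fn parse_lenient(input: &str) -> Vec<(String, Vec<String>)> {
    input
        .lines()
        .enumerate()
        .map(|(idx, line)| parse_line(idx + 1, line))
        .filter_map(Result::ok)
        .collect()
}
```

With the input `"*.rs text\nbroken\n*.png -text -diff"`, the strict variant fails on line 2 while the lenient variant keeps the two valid lines.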
@@ -42,7 +43,7 @@ pub enum StateRef<'a> {
/// The attribute is set to the given value, which followed the `=` sign.
/// Note that values can be empty.
#[cfg_attr(feature = "serde1", serde(borrow))]
Value(&'a BStr),
Value(state::ValueRef<'a>),
/// The attribute isn't mentioned with a given path or is explicitly set to `Unspecified` using the `!` sign.
Unspecified,
}
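
As a rough illustration of the states documented above, the following sketch maps a single assignment token to a state. The `State` enum here is a simplified stand-in and not the crate's own type, and `parse_assignment` is a hypothetical helper. The `=` and `!` handling follows the doc comments, while the `-` prefix for "unset" and the bare name for "set" follow git's documented attribute syntax.

```rust
/// Simplified stand-in for the attribute states (not the crate's own type).
#[derive(Debug, PartialEq)]
enum State<'a> {
    /// The attribute is listed by name only, e.g. `text`.
    Set,
    /// The attribute is prefixed with `-`, e.g. `-text`.
    Unset,
    /// The attribute has a value after `=`, which may be empty, e.g. `eol=lf`.
    Value(&'a str),
    /// The attribute is prefixed with `!`, e.g. `!text`.
    Unspecified,
}

/// Split one assignment token into the attribute name and its state.
fn parse_assignment(token: &str) -> (&str, State<'_>) {
    if let Some(name) = token.strip_prefix('-') {
        (name, State::Unset)
    } else if let Some(name) = token.strip_prefix('!') {
        (name, State::Unspecified)
    } else if let Some((name, value)) = token.split_once('=') {
        (name, State::Value(value))
    } else {
        (token, State::Set)
    }
}
```

For example, `parse_assignment("eol=lf")` yields `("eol", State::Value("lf"))`, and `parse_assignment("eol=")` yields an empty value rather than an error. Name validation is left out of this sketch.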
@@ -59,19 +60,19 @@ pub enum State {
Unset,
/// The attribute is set to the given value, which followed the `=` sign.
/// Note that values can be empty.
Value(BString), // TODO(performance): Is there a non-utf8 compact_str/KBString crate? See https://github.com/cobalt-org/kstring/issues/37#issuecomment-1446777265 .
Value(state::Value),
/// The attribute isn't mentioned with a given path or is explicitly set to `Unspecified` using the `!` sign.
Unspecified,
}

/// Represents a validated attribute name
#[derive(PartialEq, Eq, Debug, Hash, Ord, PartialOrd, Clone)]
#[cfg_attr(feature = "serde1", derive(serde::Serialize, serde::Deserialize))]
pub struct Name(pub(crate) String); // TODO(performance): See if `KBString` or `compact_string` could be meaningful here.
pub struct Name(pub(crate) KString);

/// Holds a validated attribute name as a reference
#[derive(PartialEq, Eq, Debug, Hash, Ord, PartialOrd)]
pub struct NameRef<'a>(&'a str);
#[derive(Copy, Clone, PartialEq, Eq, Debug, Hash, Ord, PartialOrd)]
pub struct NameRef<'a>(KStringRef<'a>);

/// Name an attribute and describe its assigned state.
#[derive(PartialEq, Eq, Debug, Hash, Ord, PartialOrd, Clone)]
@@ -84,54 +85,42 @@ pub struct Assignment {
}

/// Holds validated attribute data as a reference
#[derive(PartialEq, Eq, Debug, Hash, Ord, PartialOrd)]
#[derive(Copy, Clone, PartialEq, Eq, Debug, Hash, Ord, PartialOrd)]
pub struct AssignmentRef<'a> {
/// The name of the attribute.
pub name: NameRef<'a>,
/// The state of the attribute.
pub state: StateRef<'a>,
}

/// A grouping of lists of patterns while possibly keeping associated to their base path.
/// A grouping of lists of patterns, each possibly associated with its base path, used to find matches.
///
/// Pattern lists with base path are queryable relative to that base, otherwise they are relative to the repository root.
#[derive(PartialEq, Eq, Debug, Hash, Ord, PartialOrd, Clone, Default)]
pub struct MatchGroup<T: Pattern = Attributes> {
pub struct Search {
/// A list of pattern lists, each representing the patterns from a file or specified by hand, in the order they were
/// specified in.
///
/// During matching, this order is reversed.
pub patterns: Vec<PatternList<T>>,
/// When matching, this order is reversed.
patterns: Vec<gix_glob::search::pattern::List<search::Attributes>>,
}

/// A list of patterns which optionally know where they were loaded from and what their base is.
/// A list of known global sources for git attribute files in order of ascending precedence.
///
/// Knowing their base which is relative to a source directory, it will ignore all path to match against
/// that don't also start with said base.
#[derive(PartialEq, Eq, Debug, Hash, Ord, PartialOrd, Clone, Default)]
pub struct PatternList<T: Pattern> {
/// Patterns and their associated data in the order they were loaded in or specified,
/// the line number in its source file or its sequence number (_`(pattern, value, line_number)`_).
/// This means that values from the first variant will be returned first.
#[derive(Clone, Copy, Debug, Eq, PartialEq, Hash, Ord, PartialOrd)]
pub enum Source {
/// The attribute file that the installation itself ships with.
GitInstallation,
/// System-wide attributes file. This is typically defined as
/// `$(prefix)/etc/gitattributes` (where prefix is the git-installation directory).
System,
/// This is `<xdg-config-home>/git/attributes` and is git application configuration per user.
///
/// During matching, this order is reversed.
pub patterns: Vec<PatternMapping<T::Value>>,

/// The path from which the patterns were read, or `None` if the patterns
/// don't originate in a file on disk.
pub source: Option<PathBuf>,

/// The parent directory of source, or `None` if the patterns are _global_ to match against the repository root.
/// It's processed to contain slashes only and to end with a trailing slash, and is relative to the repository root.
pub base: Option<BString>,
/// Note that there is no `~/.gitattributes` file.
Git,
/// The configuration of the repository itself, located in `$GIT_DIR/info/attributes`.
Local,
}

/// An association of a pattern with its value, along with a sequence number providing a sort order in relation to its peers.
#[derive(PartialEq, Eq, Debug, Hash, Ord, PartialOrd, Clone)]
pub struct PatternMapping<T> {
/// The pattern itself, like `/target/*`
pub pattern: gix_glob::Pattern,
/// The value associated with the pattern.
pub value: T,
/// Typically the line number in the file the pattern was parsed from.
pub sequence_number: usize,
}
mod source;
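
The interplay of "ascending precedence" and "when matching, this order is reversed" can be sketched as follows. This is a simplified model under stated assumptions, not the crate's search code: the `Source` enum below only mirrors the variant names shown above, `PatternList` and `find_match` are hypothetical, and plain suffix comparison stands in for real glob matching.

```rust
/// Hypothetical mirror of the global sources, declared in ascending precedence.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
enum Source {
    GitInstallation,
    System,
    Git,
    Local,
}

/// One pattern list per source; each entry maps a path suffix to an assignment.
struct PatternList {
    source: Source,
    patterns: Vec<(&'static str, &'static str)>,
}

/// Lists are stored in the order they were added but searched in reverse,
/// so the last list, and the last line within a list, wins.
fn find_match<'a>(lists: &'a [PatternList], path: &str) -> Option<(Source, &'a str)> {
    lists.iter().rev().find_map(|list| {
        list.patterns
            .iter()
            .rev()
            .find(|(suffix, _)| path.ends_with(*suffix))
            .map(|(_, assignment)| (list.source, *assignment))
    })
}
```

If a `System` list and a `Local` list both contain an entry for `.txt`, a lookup for `a.txt` returns the `Local` entry because that list was added last and is therefore consulted first.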