Move all regex usage to separate module to add support for fancy-regex #270
Merged
16 commits
5901e1b Move all regex usage to separate module (robinst)
d05bde4 Bump fancy-regex to 0.3.0 (robinst)
df56e71 Restore optimization of reusing Regions (robinst)
f1af918 Change feature cfg so that regex-onig wins if both features are enabled (robinst)
f37b17b Add YAML parsing test (robinst)
0b78764 Compile regexes in multi-line mode for the "newlines" syntaxes (robinst)
5caa56a Replace POSIX character classes so that they match Unicode as well (robinst)
bece70d Replace ^ with \A in multi-line mode regexes (robinst)
d8eeff9 Fix code that skips a character to work with unicode (robinst)
62d79e7 Fix rewriting of "newlines" mode regexes (robinst)
fa92de0 Remove special treatment of look-behind (robinst)
a7045b1 Bump fancy-regex to 0.3.2 (robinst)
4f09143 Only load regex module for features that need it (robinst)
0a74d87 Test fancy-regex mode in CI (trishume)
da2d4b5 Make packs to fix html::tests::strings test for fancy (robinst)
9cbe524 Add section to Readme about new fancy-regex mode. (trishume)
@@ -8,11 +8,10 @@ use std::fs::File;
 use std::io::BufReader;
 use std::str::FromStr;

-use lazycell::AtomicLazyCell;
-use onig::{Regex, SearchOptions};
 use serde::{Deserialize, Deserializer, Serialize, Serializer};
 use serde_json;

+use super::regex::Regex;
 use super::scope::{MatchPower, Scope};
 use super::super::LoadingError;
 use super::super::highlighting::settings::*;
@@ -23,13 +22,6 @@ type Dict = serde_json::Map<String, Settings>;
 /// A String representation of a `ScopeSelectors` instance.
 type SelectorString = String;

-/// A simple regex pattern, used for checking indentation state.
-#[derive(Debug)]
-pub struct Pattern {
-    pub regex_str: String,
-    pub regex: AtomicLazyCell<Regex>,
-}
-
 /// A collection of all loaded metadata.
 #[derive(Debug, Default, Clone, Serialize, Deserialize)]
 pub struct Metadata {

Review comment on the removed `pub struct Pattern`: I like how this refactor gets rid of the duplicate implementation of regex laziness.
@@ -54,11 +46,11 @@ pub struct MetadataSet {
 #[derive(Debug, Clone, Serialize, Deserialize, PartialEq)]
 #[serde(rename_all = "camelCase")]
 pub struct MetadataItems {
-    pub increase_indent_pattern: Option<Pattern>,
-    pub decrease_indent_pattern: Option<Pattern>,
-    pub bracket_indent_next_line_pattern: Option<Pattern>,
-    pub disable_indent_next_line_pattern: Option<Pattern>,
-    pub unindented_line_pattern: Option<Pattern>,
+    pub increase_indent_pattern: Option<Regex>,
+    pub decrease_indent_pattern: Option<Regex>,
+    pub bracket_indent_next_line_pattern: Option<Regex>,
+    pub disable_indent_next_line_pattern: Option<Regex>,
+    pub unindented_line_pattern: Option<Regex>,
     pub indent_parens: Option<bool>,
     #[serde(default)]
     pub shell_variables: BTreeMap<String, String>,
@@ -377,56 +369,6 @@ impl RawMetadataEntry {
     }
 }

-impl Pattern {
-    pub fn is_match<S: AsRef<str>>(&self, string: S) -> bool {
-        self.regex()
-            .match_with_options(
-                string.as_ref(),
-                0,
-                SearchOptions::SEARCH_OPTION_NONE,
-                None)
-            .is_some()
-    }
-
-    pub fn regex(&self) -> &Regex {
-        if let Some(regex) = self.regex.borrow() {
-            regex
-        } else {
-            let regex = Regex::new(&self.regex_str)
-                .expect("regex string should be pre-tested");
-            self.regex.fill(regex).ok();
-            self.regex.borrow().unwrap()
-        }
-    }
-}
-
-impl Clone for Pattern {
-    fn clone(&self) -> Self {
-        Pattern { regex_str: self.regex_str.clone(), regex: AtomicLazyCell::new() }
-    }
-}
-
-impl PartialEq for Pattern {
-    fn eq(&self, other: &Pattern) -> bool {
-        self.regex_str == other.regex_str
-    }
-}
-
-impl Eq for Pattern {}
-
-impl Serialize for Pattern {
-    fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error> where S: Serializer {
-        serializer.serialize_str(&self.regex_str)
-    }
-}
-
-impl<'de> Deserialize<'de> for Pattern {
-    fn deserialize<D>(deserializer: D) -> Result<Self, D::Error> where D: Deserializer<'de> {
-        let regex_str = String::deserialize(deserializer)?;
-        Ok(Pattern { regex_str, regex: AtomicLazyCell::new() })
-    }
-}
-
 #[derive(Serialize, Deserialize)]
 struct MetaSetSerializable {
     selector_string: String,
@@ -525,14 +467,6 @@ mod tests {
         assert!(metadata.items.increase_indent_pattern.is_none());
     }

-    #[test]
-    fn serde_pattern() {
-        let pattern: Pattern = serde_json::from_str("\"just a string\"").unwrap();
-        assert_eq!(pattern.regex_str, "just a string");
-        let back_to_str = serde_json::to_string(&pattern).unwrap();
-        assert_eq!(back_to_str, "\"just a string\"");
-    }
-
     #[test]
     fn indent_rust() {
         let ps = SyntaxSet::load_from_folder("testdata/Packages/Rust").unwrap();
Not sure what the best way to structure the features is, any thoughts?
The key thing about features is that they're additive so that if multiple crates specify different features, the union of them works for both crates. The other nice thing to do is to preserve compatibility with existing crates that depend on us without default features, although I'm okay with breaking that if there's no good way otherwise.
I can't see a clean way of making the features backwards-compatible, so maybe just change the `cfg` statements in the `regex` module so that if both `regex-fancy` and `regex-onig` are set then `regex-fancy` takes precedence, although I could see the precedence going the other way too, as long as it works with both set.
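
One way to make two backend features coexist is to give each `cfg` condition an explicit precedence, so that exactly one implementation is compiled for every feature combination. The sketch below follows the order named in commit f1af918 (`regex-onig` wins when both are enabled); swapping the two conditions gives the opposite order suggested in the comment above. The module name `imp`, the `BACKEND` constant, the `backend()` function, and the "none" fallback are assumptions made for illustration and are not taken from the PR.

```rust
// Selected whenever `regex-onig` is enabled, alone or together with
// `regex-fancy`.
#[cfg(feature = "regex-onig")]
mod imp {
    pub const BACKEND: &str = "onig";
}

// Selected only when `regex-fancy` is enabled and `regex-onig` is not.
#[cfg(all(feature = "regex-fancy", not(feature = "regex-onig")))]
mod imp {
    pub const BACKEND: &str = "fancy";
}

// Hypothetical fallback so the sketch still compiles with neither feature.
#[cfg(not(any(feature = "regex-onig", feature = "regex-fancy")))]
mod imp {
    pub const BACKEND: &str = "none";
}

/// Name of the backend that was compiled in.
pub fn backend() -> &'static str {
    imp::BACKEND
}
```

Because the three conditions are mutually exclusive and cover every combination, enabling both features never produces two conflicting `imp` modules, which keeps the features additive in the sense described above.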