Line ending detection #224

Merged
merged 28 commits into from
Jun 22, 2021
Changes from 10 commits
3756c21
rebase on branch line_ending_detection
janhrastnik Jun 16, 2021
17f69a0
ran cargo clippy and cargo fmt
janhrastnik Jun 11, 2021
5eb6918
resolved conflict in rebase
janhrastnik Jun 16, 2021
9c419fe
added more changes from pr review for line_ending_detection
janhrastnik Jun 16, 2021
e4849f4
fix typo
janhrastnik Jun 13, 2021
a9a718c
added some tests and a line_ending helper function in document.rs
janhrastnik Jun 13, 2021
a4f5a01
trying out line ending helper functions in commands.rs
janhrastnik Jun 14, 2021
7cf0fa0
doc.line_ending() now returns &'static str
janhrastnik Jun 16, 2021
9c3eadb
fixed some problems from rebasing
janhrastnik Jun 16, 2021
8bccd6d
applied changes from pr review
janhrastnik Jun 17, 2021
ecb884d
added get_line_ending from pr comment
janhrastnik Jun 19, 2021
97323dc
ran cargo fmt
janhrastnik Jun 19, 2021
cdd9347
Merge remote-tracking branch 'origin/master' into line_ending_detection
janhrastnik Jun 19, 2021
1e80fbb
fix merge issue
janhrastnik Jun 19, 2021
701eb0d
changed some hardcoded newlines, removed a else if in line_ending.rs
janhrastnik Jun 19, 2021
8634e04
added the line_end helper function
janhrastnik Jun 20, 2021
5d22e3c
Misc fixes and clean up of line ending detect code.
cessen Jun 20, 2021
4efd671
Work on moving code over to LineEnding instead of assuming '\n'.
cessen Jun 20, 2021
e686c3e
Merge branch 'master' of github.com:helix-editor/helix into line_endi…
cessen Jun 20, 2021
3d3149e
Silence clippy warning.
cessen Jun 20, 2021
7140020
Don't need getters/setters for line_ending property.
cessen Jun 21, 2021
07e2880
Add function to get the line ending of a str slice.
cessen Jun 21, 2021
23d6188
Update `replace` command to use document line ending setting.
cessen Jun 21, 2021
e436c30
Make split_selection_on_newline command handle all line endings.
cessen Jun 21, 2021
d333556
Convert remaining commands to use the document's line ending setting.
cessen Jun 21, 2021
7c4fa18
Fix clippy warnings.
cessen Jun 21, 2021
a18d50b
Add command to set the document's default line ending.
cessen Jun 21, 2021
f2954fa
Flesh out the line ending utility unit tests.
cessen Jun 21, 2021
5 changes: 5 additions & 0 deletions helix-core/src/lib.rs
Original file line number Diff line number Diff line change
@@ -6,6 +6,7 @@ pub mod diagnostic;
pub mod graphemes;
pub mod history;
pub mod indent;
pub mod line_ending;
pub mod macros;
pub mod match_brackets;
pub mod movement;
@@ -100,6 +101,7 @@ pub use unicode_general_category::get_general_category;
#[doc(inline)]
pub use {regex, tree_sitter};

pub use graphemes::RopeGraphemes;
pub use position::{coords_at_pos, pos_at_coords, Position};
pub use selection::{Range, Selection};
pub use smallvec::SmallVec;
@@ -108,4 +110,7 @@ pub use syntax::Syntax;
pub use diagnostic::Diagnostic;
pub use state::State;

pub use line_ending::{
auto_detect_line_ending, rope_slice_to_line_ending, LineEnding, DEFAULT_LINE_ENDING,
};
pub use transaction::{Assoc, Change, ChangeSet, Operation, Transaction};
150 changes: 150 additions & 0 deletions helix-core/src/line_ending.rs
@@ -0,0 +1,150 @@
use crate::{Rope, RopeGraphemes, RopeSlice};

/// Represents one of the valid Unicode line endings.
#[derive(PartialEq, Copy, Clone, Debug)]
pub enum LineEnding {
Crlf, // CarriageReturn followed by LineFeed
LF, // U+000A -- LineFeed
CR, // U+000D -- CarriageReturn
Nel, // U+0085 -- NextLine
LS, // U+2028 -- Line Separator
VT, // U+000B -- VerticalTab
FF, // U+000C -- FormFeed
PS, // U+2029 -- ParagraphSeparator
}

impl LineEnding {
pub fn len(&self) -> usize {
match self {
Self::Crlf => 2,
_ => 1,
}
}

pub fn as_str(&self) -> &str {
match self {
Self::Crlf => "\u{000D}\u{000A}",
Self::LF => "\u{000A}",
Self::Nel => "\u{0085}",
Self::LS => "\u{2028}",
Self::CR => "\u{000D}",
_ => panic!(
"Unexpected line ending: {:?}, expected Crlf, LF, CR, Nel, or LS.",
self
),
}
}
}

pub fn rope_slice_to_line_ending(g: &RopeSlice) -> Option<LineEnding> {
if let Some(text) = g.as_str() {
str_to_line_ending(text)
} else if g == "\u{000D}\u{000A}" {
Some(LineEnding::Crlf)
} else {
// Not a line ending
None
}
}

pub fn str_to_line_ending(g: &str) -> Option<LineEnding> {
match g {
"\u{000D}\u{000A}" => Some(LineEnding::Crlf),
"\u{000A}" => Some(LineEnding::LF),
"\u{000D}" => Some(LineEnding::CR),
"\u{0085}" => Some(LineEnding::Nel),
"\u{2028}" => Some(LineEnding::LS),
"\u{000B}" => Some(LineEnding::VT),
"\u{000C}" => Some(LineEnding::FF),
"\u{2029}" => Some(LineEnding::PS),
// Not a line ending
_ => None,
}
}

pub fn auto_detect_line_ending(doc: &Rope) -> Option<LineEnding> {
// based on https://github.com/cessen/led/blob/27572c8838a1c664ee378a19358604063881cc1d/src/editor/mod.rs#L88-L162

let mut ending = None;
// return first matched line ending. Not all possible line endings are being matched, as they might be special-use only
for line in doc.lines().take(100) {
ending = match line.len_chars() {
1 => {
let g = RopeGraphemes::new(line.slice((line.len_chars() - 1)..))
.last()
.unwrap();
rope_slice_to_line_ending(&g)
}
n if n > 1 => {
let g = RopeGraphemes::new(line.slice((line.len_chars() - 2)..))
.last()
.unwrap();
rope_slice_to_line_ending(&g)
}
_ => None,
};
if ending.is_some() {
match ending {
Some(LineEnding::VT) | Some(LineEnding::FF) | Some(LineEnding::PS) => {}
_ => return ending,
}
}
}
ending
}

#[cfg(target_os = "windows")]
pub const DEFAULT_LINE_ENDING: LineEnding = LineEnding::Crlf;
#[cfg(not(target_os = "windows"))]
pub const DEFAULT_LINE_ENDING: LineEnding = LineEnding::LF;

#[cfg(test)]
mod line_ending_tests {
use super::*;

#[test]
fn test_autodetect() {
assert_eq!(
auto_detect_line_ending(&Rope::from_str("\n")),
Some(LineEnding::LF)
);
assert_eq!(
auto_detect_line_ending(&Rope::from_str("\r\n")),
Some(LineEnding::Crlf)
);
assert_eq!(auto_detect_line_ending(&Rope::from_str("hello")), None);
assert_eq!(auto_detect_line_ending(&Rope::from_str("")), None);
assert_eq!(
auto_detect_line_ending(&Rope::from_str("hello\nhelix\r\n")),
Some(LineEnding::LF)
);
assert_eq!(
auto_detect_line_ending(&Rope::from_str("a formfeed\u{000C}")),
None
);
assert_eq!(
auto_detect_line_ending(&Rope::from_str("\n\u{000A}\n \u{000A}")),
Some(LineEnding::LF)
);
assert_eq!(
auto_detect_line_ending(&Rope::from_str(
"a formfeed\u{000C} with a\u{000C} linefeed\u{000A}"
)),
Some(LineEnding::LF)
);
assert_eq!(auto_detect_line_ending(&Rope::from_str("a formfeed\u{000C} with a\u{000C} carriage return linefeed\u{000D}\u{000A} and a linefeed\u{000A}")), Some(LineEnding::Crlf));
}

#[test]
fn test_rope_slice_to_line_ending() {
let r = Rope::from_str("\r\n");
assert_eq!(
rope_slice_to_line_ending(&r.slice(1..2)),
Some(LineEnding::LF)
);
assert_eq!(
rope_slice_to_line_ending(&r.slice(0..2)),
Some(LineEnding::Crlf)
);
}
}
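
Review note: the detection rule in `auto_detect_line_ending` (return the first line ending found, but skip VT, FF and PS as special-use) can be sketched on a plain `&str` without ropey. This is only an illustration under assumptions: `Ending` and `detect_first_ending` are hypothetical names, the sketch scans the whole input instead of the first 100 lines, and it walks chars rather than inspecting the last grapheme of each rope line.

```rust
/// Line endings the sketch is willing to report. Assumption: mirrors the
/// diff, which skips VT, FF and PS during detection.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Ending {
    Crlf,
    Lf,
    Cr,
    Nel,
    Ls,
}

/// Return the first accepted line ending in `text`, or `None` if there is none.
fn detect_first_ending(text: &str) -> Option<Ending> {
    let mut chars = text.chars().peekable();
    while let Some(c) = chars.next() {
        let found = match c {
            // A carriage return directly followed by a line feed is one CRLF.
            '\r' if chars.peek() == Some(&'\n') => Ending::Crlf,
            '\r' => Ending::Cr,
            '\n' => Ending::Lf,
            '\u{0085}' => Ending::Nel,
            '\u{2028}' => Ending::Ls,
            // VT, FF, PS and ordinary text: keep scanning.
            _ => continue,
        };
        return Some(found);
    }
    None
}

fn main() {
    // Same inputs as some of the unit tests in the diff.
    assert_eq!(detect_first_ending("hello\nhelix\r\n"), Some(Ending::Lf));
    assert_eq!(detect_first_ending("a formfeed\u{000C}"), None);
    assert_eq!(detect_first_ending(""), None);
    println!("{:?}", detect_first_ending("one\r\ntwo\n"));
}
```

The first match wins, so a file that mixes endings is classified by whichever accepted ending appears earliest.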
24 changes: 15 additions & 9 deletions helix-term/src/commands.rs
@@ -5,8 +5,8 @@ use helix_core::{
object, pos_at_coords,
regex::{self, Regex},
register::{self, Register, Registers},
- search, selection, Change, ChangeSet, Position, Range, Rope, RopeSlice, Selection, SmallVec,
- Tendril, Transaction,
+ search, selection, Change, ChangeSet, LineEnding, Position, Range, Rope, RopeSlice, Selection,
+ SmallVec, Tendril, Transaction,
};

use helix_view::{
@@ -185,7 +185,9 @@ pub fn move_line_end(cx: &mut Context) {

// Line end is pos at the start of next line - 1
// subtract another 1 because the line ends with \n
- let pos = text.line_to_char(line + 1).saturating_sub(2);
+ let pos = text
+ .line_to_char(line + 1)
+ .saturating_sub(doc.line_ending().len() + 1);
Range::new(pos, pos)
});

@@ -607,7 +609,9 @@ pub fn extend_line_end(cx: &mut Context) {

// Line end is pos at the start of next line - 1
// subtract another 1 because the line ends with \n
- let pos = text.line_to_char(line + 1).saturating_sub(2);
+ let pos = text
+ .line_to_char(line + 1)
+ .saturating_sub(doc.line_ending().len() + 1);
Range::new(range.anchor, pos)
});

@@ -896,7 +900,7 @@ pub fn append_mode(cx: &mut Context) {
if selection.iter().any(|range| range.head == end) {
let transaction = Transaction::change(
doc.text(),
- std::array::IntoIter::new([(end, end, Some(Tendril::from_char('\n')))]),
+ std::array::IntoIter::new([(end, end, Some(doc.line_ending().as_str().into()))]),
);
doc.apply(&transaction, view.id);
}
@@ -1523,7 +1527,7 @@ fn open(cx: &mut Context, open: Open) {
);
let indent = doc.indent_unit().repeat(indent_level);
let mut text = String::with_capacity(1 + indent.len());
- text.push('\n');
+ text.push_str(doc.line_ending().as_str());
text.push_str(&indent);
let text = text.repeat(count);

@@ -2131,7 +2135,7 @@ pub mod insert {
);
let indent = doc.indent_unit().repeat(indent_level);
let mut text = String::with_capacity(1 + indent.len());
- text.push('\n');
+ text.push_str(doc.line_ending().as_str());
text.push_str(&indent);

let head = pos + offs + text.chars().count();
@@ -2152,7 +2156,7 @@ pub mod insert {
if helix_core::auto_pairs::PAIRS.contains(&(prev, curr)) {
// another newline, indent the end bracket one level less
let indent = doc.indent_unit().repeat(indent_level.saturating_sub(1));
- text.push('\n');
+ text.push_str(doc.line_ending().as_str());
text.push_str(&indent);
}

@@ -2269,7 +2273,9 @@ fn paste_impl(
);

// if any of values ends \n it's linewise paste
- let linewise = values.iter().any(|value| value.ends_with('\n'));
+ let linewise = values
+ .iter()
+ .any(|value| value.ends_with(doc.line_ending().as_str()));

let mut values = values.iter().cloned().map(Tendril::from).chain(repeat);

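
Review note: two of the changes above are easy to check with small numbers. The sketch below restates the `saturating_sub(len + 1)` arithmetic from `move_line_end` and the `ends_with` check from `paste_impl` on plain strings. `line_end_pos` and `is_linewise` are hypothetical helper names, and the char indices are worked out by hand rather than taken from a rope.

```rust
/// Char index of the last character before the line ending, given the char
/// index where the next line starts and the ending's length in chars.
fn line_end_pos(next_line_start: usize, ending_len: usize) -> usize {
    next_line_start.saturating_sub(ending_len + 1)
}

/// A paste counts as linewise if any value ends with the document's line ending.
fn is_linewise(values: &[&str], ending: &str) -> bool {
    values.iter().any(|value| value.ends_with(ending))
}

fn main() {
    // "ab\ncd": line 0 is "ab\n", so line 1 starts at char 3.
    let lf: Vec<char> = "ab\ncd".chars().collect();
    assert_eq!(lf[line_end_pos(3, 1)], 'b');

    // "ab\r\ncd": line 0 is "ab\r\n", so line 1 starts at char 4.
    let crlf: Vec<char> = "ab\r\ncd".chars().collect();
    assert_eq!(crlf[line_end_pos(4, 2)], 'b');
    // The previous hard-coded offset of 2 lands on '\r' instead.
    assert_eq!(crlf[4usize.saturating_sub(2)], '\r');

    // With a CRLF document, a value ending in a bare "\n" is not linewise.
    assert!(is_linewise(&["foo\r\n"], "\r\n"));
    assert!(!is_linewise(&["foo\n"], "\r\n"));
    println!("ok");
}
```

One consequence of the `ends_with` form: text yanked from an LF file and pasted into a CRLF document is no longer treated as linewise, while the reverse case still matches because "\r\n" ends with "\n".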
4 changes: 2 additions & 2 deletions helix-term/src/ui/editor.rs
@@ -7,7 +7,7 @@ use crate::{
};

use helix_core::{
- coords_at_pos,
+ coords_at_pos, rope_slice_to_line_ending,
syntax::{self, HighlightEvent},
Position, Range,
};
@@ -179,7 +179,7 @@ impl EditorView {

// iterate over range char by char
for grapheme in RopeGraphemes::new(text) {
- if grapheme == "\n" {
+ if rope_slice_to_line_ending(&grapheme).is_some() {
visual_x = 0;
line += 1;

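
Review note: the render loop now resets `visual_x` and bumps `line` on any recognized line ending instead of only on "\n". A rough sketch of that bookkeeping on a plain `&str` follows. It is an approximation under assumptions: `final_cursor` and `is_line_ending_char` are hypothetical names, columns count chars rather than grapheme widths, and CRLF is folded by hand because there is no grapheme iterator here.

```rust
/// True for every single-char line ending listed in the `LineEnding` enum.
fn is_line_ending_char(c: char) -> bool {
    matches!(
        c,
        '\n' | '\r' | '\u{000B}' | '\u{000C}' | '\u{0085}' | '\u{2028}' | '\u{2029}'
    )
}

/// Walk `text` and return the (line, visual_x) reached at the end.
fn final_cursor(text: &str) -> (usize, usize) {
    let mut line = 0usize;
    let mut visual_x = 0usize;
    let mut chars = text.chars().peekable();
    while let Some(c) = chars.next() {
        if is_line_ending_char(c) {
            // Treat "\r\n" as a single break, as a grapheme iterator would.
            if c == '\r' && chars.peek() == Some(&'\n') {
                chars.next();
            }
            visual_x = 0;
            line += 1;
        } else {
            visual_x += 1;
        }
    }
    (line, visual_x)
}

fn main() {
    assert_eq!(final_cursor("ab\r\ncd"), (1, 2));
    // A lone carriage return also starts a new row, unlike a check against "\n".
    assert_eq!(final_cursor("a\rb"), (1, 1));
    println!("{:?}", final_cursor("a\nb\u{2028}c"));
}
```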
18 changes: 17 additions & 1 deletion helix-view/src/document.rs
@@ -5,10 +5,12 @@ use std::path::{Component, Path, PathBuf};
use std::sync::Arc;

use helix_core::{
auto_detect_line_ending,
chars::{char_is_linebreak, char_is_whitespace},
history::History,
syntax::{LanguageConfiguration, LOADER},
- ChangeSet, Diagnostic, Rope, Selection, State, Syntax, Transaction,
+ ChangeSet, Diagnostic, LineEnding, Rope, Selection, State, Syntax, Transaction,
+ DEFAULT_LINE_ENDING,
};

use crate::{DocumentId, ViewId};
@@ -61,6 +63,7 @@ pub struct Document {

diagnostics: Vec<Diagnostic>,
language_server: Option<Arc<helix_lsp::Client>>,
line_ending: LineEnding,
}

use std::fmt;
@@ -171,6 +174,7 @@ impl Document {
history: Cell::new(History::default()),
last_saved_revision: 0,
language_server: None,
line_ending: DEFAULT_LINE_ENDING,
}
}

@@ -190,10 +194,14 @@ impl Document {
doc
};

// search for line endings
let line_ending = auto_detect_line_ending(&doc).unwrap_or(DEFAULT_LINE_ENDING);

let mut doc = Self::new(doc);
// set the path and try detecting the language
doc.set_path(&path)?;
doc.detect_indent_style();
doc.set_line_ending(line_ending);

Ok(doc)
}
@@ -452,6 +460,10 @@ impl Document {
self.selections.insert(view_id, selection);
}

pub fn set_line_ending(&mut self, line_ending: LineEnding) {
self.line_ending = line_ending;
}

fn _apply(&mut self, transaction: &Transaction, view_id: ViewId) -> bool {
let old_doc = self.text().clone();

@@ -715,6 +727,10 @@ impl Document {
pub fn set_diagnostics(&mut self, diagnostics: Vec<Diagnostic>) {
self.diagnostics = diagnostics;
}

pub fn line_ending(&self) -> LineEnding {
self.line_ending
}
}

#[cfg(test)]
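
Review note: the `Document` changes boil down to "detect on load, otherwise fall back to the platform default". A minimal sketch of that flow, assuming a toy `Doc` type and a simplified `detect` function (both hypothetical, and handling only LF and CRLF), might look like this:

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum LineEnding {
    Crlf,
    LF,
}

// Platform default, selected at compile time as in the diff.
#[cfg(target_os = "windows")]
const DEFAULT_LINE_ENDING: LineEnding = LineEnding::Crlf;
#[cfg(not(target_os = "windows"))]
const DEFAULT_LINE_ENDING: LineEnding = LineEnding::LF;

/// Simplified stand-in for detection: classify by the first '\n' only.
fn detect(text: &str) -> Option<LineEnding> {
    let i = text.find('\n')?;
    if i > 0 && text.as_bytes()[i - 1] == b'\r' {
        Some(LineEnding::Crlf)
    } else {
        Some(LineEnding::LF)
    }
}

struct Doc {
    text: String,
    line_ending: LineEnding,
}

impl Doc {
    /// A fresh document starts with the platform default.
    fn new(text: String) -> Self {
        Self { text, line_ending: DEFAULT_LINE_ENDING }
    }

    /// A loaded document keeps whatever was detected, else the default.
    fn load(text: String) -> Self {
        let line_ending = detect(&text).unwrap_or(DEFAULT_LINE_ENDING);
        let mut doc = Self::new(text);
        doc.line_ending = line_ending;
        doc
    }
}

fn main() {
    let doc = Doc::load("a\r\nb\r\n".to_string());
    assert_eq!(doc.line_ending, LineEnding::Crlf);

    let plain = Doc::load("no newline here".to_string());
    assert_eq!(plain.line_ending, DEFAULT_LINE_ENDING);

    println!("{} bytes, ending {:?}", doc.text.len(), doc.line_ending);
}
```

Because `LineEnding` is `Copy`, the getter in the diff can return it by value, and callers such as `doc.line_ending().as_str()` do not need to borrow the document for longer than the call.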