This repository has been archived by the owner on Jun 26, 2020. It is now read-only.

T/ckeditor5 typing/92: Add support for "word" unit in modifySelection() helper. #1287

Merged
merged 11 commits into from
Feb 16, 2018
60 changes: 54 additions & 6 deletions src/model/utils/modifyselection.js
@@ -13,6 +13,8 @@ import Range from '../range';
import { isInsideSurrogatePair, isInsideCombinedSymbol } from '@ckeditor/ckeditor5-utils/src/unicode';
import DocumentSelection from '../documentselection';

const wordBoundaryCharacters = ' ,.-():\'"';
Member

Is ' a word boundary? What about "it's"?

Contributor Author

It looks like it is, as far as I can test it in a textarea and in the GNOME text editor:

it's a trap - the caret will stop on spaces and on '.

Also check these docs: https://www.unicode.org/reports/tr29/#Default_Word_Boundaries.

We might think of expanding this list to full Unicode support, but I've used the most common ones for text.

Contributor Author

@jodator jodator Feb 14, 2018

Yes, it is. Check the behavior in a textarea or in a text editor:

It's a trap!

In the above, the cursor will stop on spaces and on '.

Also check these docs: https://www.unicode.org/reports/tr29/#Default_Word_Boundaries.

We might think of expanding this list to full Unicode support, but I've used the most common ones for text.

edit: to be precise: https://www.unicode.org/reports/tr29/#Single_Quote

But the examples there show that "can't" is one word...

Contributor Author

Hmm... we should expand the list of boundaryCharacters with at least: ?|; and probably some more common punctuation characters.

Member

@wwalc told me that we need to differentiate between "it's" (which is made of two words so the caret should stop after "s") and "Peter's" where it's a single word which should be removed at once.

Member

I'm joking of course. Since a single quote is popular in English and e.g. in macOS it's not treated as a word boundary, I'd not treat it as such.
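
To make the trade-off in this thread concrete, here is a minimal standalone sketch that works on a plain string rather than on the engine's tree walker. `wordStops` is a hypothetical helper written only for illustration, not part of CKEditor 5. It lists the offsets where repeated forward word jumps would stop for a given boundary set, assuming the same rule as the loop in this diff: step at least once, then continue until the character ahead is a boundary character or the text ends.

```javascript
// Hypothetical helper, not part of CKEditor 5: returns the offsets at which
// repeated forward "word" jumps would stop, starting from offset 0.
function wordStops( text, boundaryCharacters ) {
	const stops = [];
	let offset = 0;

	while ( offset < text.length ) {
		// Always move by at least one character...
		offset++;

		// ...then keep moving until the character ahead is a boundary or the text ends.
		while ( offset < text.length && !boundaryCharacters.includes( text.charAt( offset ) ) ) {
			offset++;
		}

		stops.push( offset );
	}

	return stops;
}

const withApostrophe = ' ,.-():\'"';       // the list currently in the diff
const withoutApostrophe = ' ,.?!:;"-()';   // a list that drops the single quote

console.log( wordStops( 'It\'s a trap!', withApostrophe ) );    // [ 2, 4, 6, 12 ]
console.log( wordStops( 'It\'s a trap!', withoutApostrophe ) ); // [ 4, 6, 11, 12 ]
```

With the apostrophe in the set the caret stops inside "It's" (offset 2); without it, "It's" is crossed in one jump.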


/**
* Modifies the selection. Currently, the supported modifications are:
*
@@ -31,6 +33,7 @@ import DocumentSelection from '../documentselection';
* For example `𨭎` is represented in `String` by `\uD862\uDF4E`. Both `\uD862` and `\uDF4E` do not have any meaning
* outside the pair (are rendered as ? when alone). Position between them would be incorrect. In this case, selection
* extension will include whole "surrogate pair".
* * `'word'` - moves selection by whole word.
Member

"a whole word"

Contributor Author

So I'd go with:

// current:                    ' ,.-():\'"';
const wordBoundaryCharacters = ' ,.?!:;"-()';

*
* **Note:** if you extend a forward selection in a backward direction you will in fact shrink it.
*
@@ -39,7 +42,7 @@ import DocumentSelection from '../documentselection';
* @param {module:engine/model/selection~Selection} selection The selection to modify.
* @param {Object} [options]
* @param {'forward'|'backward'} [options.direction='forward'] The direction in which the selection should be modified.
* @param {'character'|'codePoint'} [options.unit='character'] The unit by which selection should be modified.
* @param {'character'|'codePoint'|'word'} [options.unit='character'] The unit by which selection should be modified.
*/
export default function modifySelection( model, selection, options = {} ) {
const schema = model.schema;
@@ -79,11 +82,13 @@ export default function modifySelection( model, selection, options = {} ) {
}

// Checks whether the selection can be extended to the walker's next value (next position).
// @param {{ walker, unit, isForward, schema }} data
// @param {{ item, nextPosition, type}} value
Member

I guess that value is of the TreeWalkerValue type

function tryExtendingTo( data, value ) {
// If found text, we can certainly put the focus in it. Let's just find a correct position
// based on the unit.
if ( value.type == 'text' ) {
return getCorrectPosition( data.walker, data.unit );
return getCorrectPosition( data.walker, data.unit, data.isForward );
}

// Entering an element.
@@ -117,17 +122,48 @@ function tryExtendingTo( data, value ) {

// Finds a correct position by walking in a text node and checking whether selection can be extended to given position
// or should be extended further.
function getCorrectPosition( walker, unit ) {
const textNode = walker.position.textNode;
//
// @param {module:engine/model/treewalker~TreeWalker} walker
// @param {String} unit The unit by which selection should be modified.
// @param {Boolean} isForward Whether the selection should be modified in the forward direction.
function getCorrectPosition( walker, unit, isForward ) {
let textNode = walker.position.textNode;

if ( textNode ) {
const data = textNode.data;
let data = textNode.data;
let offset = walker.position.offset - textNode.startOffset;
let isAtNodeBoundary = offset === ( isForward ? textNode.endOffset : 0 );

while ( isInsideSurrogatePair( data, offset ) || ( unit == 'character' && isInsideCombinedSymbol( data, offset ) ) ) {
while (
isInsideSurrogatePair( data, offset ) ||
Member

This could be a separate function.
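
One possible shape for that extraction, as a sketch only. `shouldStepFurther` is a hypothetical name. The two Unicode helpers below are simplified local stand-ins so that the snippet runs on its own; the real ones are imported from `@ckeditor/ckeditor5-utils/src/unicode` and cover more cases.

```javascript
const wordBoundaryCharacters = ' ,.-():\'"';

// Simplified stand-in: true when `offset` falls between the two halves of a surrogate pair.
function isInsideSurrogatePair( data, offset ) {
	return /[\uD800-\uDBFF]/.test( data.charAt( offset - 1 ) ) && /[\uDC00-\uDFFF]/.test( data.charAt( offset ) );
}

// Simplified stand-in: only the basic combining diacritical marks block is checked.
function isInsideCombinedSymbol( data, offset ) {
	return /[\u0300-\u036F]/.test( data.charAt( offset ) );
}

function isAtWordBoundary( data, offset, isForward ) {
	const offsetToCheck = offset + ( isForward ? 0 : -1 );

	return wordBoundaryCharacters.includes( data.charAt( offsetToCheck ) );
}

// Hypothetical extraction of the `while` condition: returns true when the
// position at `offset` is not a valid stop for the given unit.
function shouldStepFurther( data, offset, unit, isForward, isAtNodeBoundary ) {
	if ( isInsideSurrogatePair( data, offset ) ) {
		return true;
	}

	if ( unit == 'character' ) {
		return isInsideCombinedSymbol( data, offset );
	}

	if ( unit == 'word' ) {
		return !isAtNodeBoundary && !isAtWordBoundary( data, offset, isForward );
	}

	return false;
}

console.log( shouldStepFurther( 'foo bar', 2, 'word', true, false ) ); // true
console.log( shouldStepFurther( 'foo bar', 3, 'word', true, false ) ); // false
```

The early returns are meant to be equivalent to the three `||` branches of the original condition, with one branch per unit.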

( unit == 'character' && isInsideCombinedSymbol( data, offset ) ) ||
( unit == 'word' && ( !( isAtNodeBoundary || isAtWordBoundary( textNode.data, offset, isForward ) ) ) )
) {
walker.next();

// Check for adjacent text nodes with different attributes (like BOLD).
// Example : 'foofoo []bar<$text bold="true">bar</$text> bazbaz'
// should expand to : 'foofoo [bar<$text bold="true">bar</$text>] bazbaz'.
if ( unit == 'word' && !isAtNodeBoundary ) {
const nextNode = isForward ? walker.position.nodeAfter : walker.position.nodeBefore;

if ( nextNode ) {
// Check boundary char of an adjacent text node.
const boundaryChar = nextNode.data.charAt( isForward ? 0 : nextNode.data.length - 1 );

// Go to the next node if the character at the boundary of that node belongs to the same word.
if ( !wordBoundaryCharacters.includes( boundaryChar ) ) {
// If adjacent text node belongs to the same word go to it & reset values.
walker.next();

textNode = walker.position.textNode;
data = textNode.data;
}
}
}

offset = walker.position.offset - textNode.startOffset;
isAtNodeBoundary = offset === ( isForward ? textNode.endOffset : 0 );
}
}

@@ -144,3 +180,15 @@ function getSearchRange( start, isForward ) {
return new Range( searchEnd, start );
}
}

// Checks if selection is on word boundary.
//
// @param {String} data The data of the text node to investigate.
// @param {Number} offset Position offset.
// @param {Boolean} isForward Whether the selection should be modified in the forward direction.
function isAtWordBoundary( data, offset, isForward ) {
// The offset to check depends on direction.
const offsetToCheck = offset + ( isForward ? 0 : -1 );

return wordBoundaryCharacters.includes( data.charAt( offsetToCheck ) );
}
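
The direction-dependent offset in `isAtWordBoundary()` can be tried out on a plain string. The sketch below copies that function from the diff and adds `findWordBreak`, a hypothetical helper that is not part of the PR. It stands in for the tree walker under the assumption that the caret moves by at least one character and then keeps going until a boundary character lies ahead of it (forward) or behind it (backward).

```javascript
const wordBoundaryCharacters = ' ,.-():\'"';

// Copied from the diff: forward checks the character after the offset,
// backward checks the character before it.
function isAtWordBoundary( data, offset, isForward ) {
	const offsetToCheck = offset + ( isForward ? 0 : -1 );

	return wordBoundaryCharacters.includes( data.charAt( offsetToCheck ) );
}

// Hypothetical helper: plain-string stand-in for walking with the tree walker.
function findWordBreak( data, offset, isForward ) {
	const step = isForward ? 1 : -1;
	const limit = isForward ? data.length : 0;

	// Move by at least one character, unless already at the end of the string.
	if ( offset !== limit ) {
		offset += step;
	}

	// Keep moving until a word boundary or the end of the string is reached.
	while ( offset !== limit && !isAtWordBoundary( data, offset, isForward ) ) {
		offset += step;
	}

	return offset;
}

console.log( findWordBreak( 'foo bar baz', 5, true ) );  // 7 (just after "bar")
console.log( findWordBreak( 'foo bar baz', 5, false ) ); // 4 (just before "bar")
```

Checking `offset - 1` when moving backward is what makes the caret stop right after the space instead of right before it.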