-
Notifications
You must be signed in to change notification settings - Fork 4.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RichText state structure for value manipulation #7890
Conversation
590cbc0
to
4359e51
Compare
Thanks so much for working on this. We really need it. It would allow us to style autocompletions like @-mentions based on a pattern match rather than modifying the actual content like #6577 does. This would be cleaner, more flexible, and less error-prone. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like a really good direction if some pattern transforms can be shared across all RichText
instances for things like @-mentions. In general, the changes felt like they were simplifying. I like how this is consolidating text-related logic in rich-text-structure
.
blocks/api/parser.js
Outdated
@@ -157,6 +159,11 @@ export function parseWithAttributeSchema( innerHTML, attributeSchema ) { | |||
*/ | |||
export function getBlockAttribute( attributeKey, attributeSchema, innerHTML, commentAttributes ) { | |||
let value; | |||
|
|||
if ( attributeSchema.source === 'rich-text' ) { | |||
attributeSchema.default = create(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be worth aliasing the create
import with a more specific name to clarify what is being created.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can rename all this sure, atm it's maybe a bit less important while it's in progress.
blocks/api/rich-text-structure.js
Outdated
} = settings; | ||
|
||
return Array.from( element.attributes ).reduce( ( acc, { name, value } ) => { | ||
if ( ! removeAttributeMatch( name ) ) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Real, non-rhetorical question:
Would there be any value in providing more context like the element's name to removeAttributeMatch
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, maybe. Atm the goal is just to be able to strip mce specific stuff.
blocks/api/rich-text-structure.js
Outdated
formats.splice( start, end - start + 1 ); | ||
record = { formats, text }; | ||
} else { | ||
// Delete from highest to lowest. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It may be good to say why we're deleting from highest to lowest.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry if there are any poor comments, it's a WIP, this particular one will actually be removed again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the nitpicks. :) The intent was to share thoughts as I had them, but it wasn't super helpful for a WIP.
blocks/api/rich-text-structure.js
Outdated
@@ -0,0 +1,271 @@ | |||
/** |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These tests were very helpful in understanding how the rich-text-structure
functions work. Thank you.
editor/components/rich-text/index.js
Outdated
this.savedContent = this.getContent(); | ||
const record = this.getContent(); | ||
|
||
this.savedContent = this.patterns.reduce( ( accu, transform ) => transform( accu ), record ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this mean that transforms due to matched patterns will be persisted in the post content? That seems clear here but is different than I've been mentally modeling this.
Here's my understanding of how we would apply this for @-mentions:
- Adding formatting: Text matching a pattern like
/@\w+/
would be formatted as an @-mention by wrapping it like<span class="user-mention">
. - Removing formatting: A
<span class="user-mention">
whose text content does not begin with an@
would have the formatting removed by replacing the span with its text content.
Is that roughly correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one currently is just for the patterns without the mentions in mind. They're a bit different. The patterns expect a piece of text to be transformed from one thing to another. The mentions merely expect some piece of text to be displayed differently (and maybe allow to have some meta info). I'll look at the mentions shortly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool. Thanks for clarifying.
f67f1c2
to
7e474eb
Compare
@brandonpayton I update the autocompleter to use the rich text structure. I think this simplifies it a lot, but it needs some more polishing for sure there. Also wondering if it should maybe be moved in the rich text folder... |
ce220f5
to
c30eeed
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I love these changes, and it's great to see the Autocomplete component simplified. Thank you.
Also wondering if it should maybe be moved in the rich text folder...
That seems like a good move to me. Autocomplete
already had things that made it more editor-related than generic. Even before this PR, its implementation assumed wrapping something like a contenteditable and had the onReplace
prop which implies use in a block context. IMHO, bringing it even closer to rich-text to work with a more consistent and reliable editor layer is a win.
inputEvent.initEvent( 'input', true /* bubbles */, false /* cancelable */ ); | ||
} | ||
range.commonAncestorContainer.closest( '[contenteditable=true]' ).dispatchEvent( inputEvent ); | ||
replacement = replacement.slice( 1 + this.state.query.length ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this to account for the prefix character that triggers the completer? If so, what do you think about multi-character triggers?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, it's to replace the prefix and the already typed characters to minimise updates. So a multi character trigger can still work, in fact, there's no longer such a thing, you can complete customise what to match and how to match it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, OK. That helps.
There are a couple of assumptions here that may not hold for all completers.
This line relies on completions always finishing what is being typed (e.g., +doto
being completed with +dotorg
), but completions may be anything and aren't constrained to what the user typed. For example, a fruit completer could complete "!apple" with '🍎'. This means we need to replace the entire substring corresponding to the completer query rather than simply adding the part that isn't there.
It's also possible for completers to return wp.element
declarations, so eventually this method needs to to account for that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a very good point, thank you!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The apple example seems to be something that closer to a pattern like `test`
... I wonder if we can somehow unify all of these little functionalities.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we can somehow unify all of these little functionalities.
I like the direction you're going with the token configuration here. I'm not sure if the current completer interface will ultimately have a role here and need to sleep on it to give my mind some time to adjust. :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's also possible for completers to return
wp.element
declarations, so eventually this method needs to to account for that.
Since we're allowing matched tokens to be either selected from a list or typed in full, maybe we should reconsider allowing completions to be wp.element
declarations. If we restrict to text completions, completions would be simpler, and there could be a single place to declare token formatting and a single execution path for applying that formatting.
if ( richTextStructure.isCollapsed( record ) ) { | ||
const text = richTextStructure.getTextContent( record ); | ||
const prevText = richTextStructure.getTextContent( prevRecord ); | ||
const textToSearch = text.slice( 0, record.selection.start ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I took some time to build and play with this PR and noticed typing "Hello, @-exampl" led to completer's test
and getQuery
functions being called with "Hello, @-exampl" instead of "@-exampl". This gives more power to completers, but I wonder whether it would be simpler for authors if we only passed "@-exampl" which is the text between the cursor and the nearest preceding space. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, for sure, that sounds good. Not sure about keeping the trigger, let's see when it all works properly. Atm the autocomplete also pops up for something like text @so|omeone
with an item for e.g. someone
. We'll need to avoid this somehow. Unsure how this worked before. This PR touches so many components that I'm wondering if you'd be able to help out a bit with the autocomplete component, mainly making sure all current functionality is present?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, I'm happy to help update the autocomplete component.
My family is moving in a little over a week, and I'll be taking a few days off then. Hopefully, that won't be an issue, but I'll try to keep a close eye on things and do what I can around that time.
@@ -58,7 +58,17 @@ export function createBlockCompleter( { | |||
return { | |||
name: 'blocks', | |||
className: 'editor-autocompleters__block', | |||
triggerPrefix: '/', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The power of the new test
and getQuery
functions is cool, but I'm wondering if it would be good to continue supporting a trigger
property for people getting started or for those who don't need the power.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, let's leave the API alone until we work on the formatting API in the next PR. We should just focus on replacing the content change detection and manipulation logic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it would be another breaking change. What we could instead as an enhancement to allow a function to be passed as triggerPrefix
- this would be quite similar to what Webpack does with its options.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
754bf06
to
91dbec9
Compare
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
2064dfe
to
ad5bae6
Compare
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
Fantastic @iseulde |
🎉 Thanks for the effort here @iseulde and thanks for "minding the mobile" and working closely with the mobile team! 🙇 |
"react-native": "src/index", | ||
"dependencies": { | ||
"@babel/runtime": "^7.0.0", | ||
"@wordpress/escape-html": "file:../escape-html", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This introduces uncommitted changes to the local branch after npm install
. Please review #10234
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean package lock? See #10304.
export { insert } from './insert'; | ||
export { slice } from './slice'; | ||
export { split } from './split'; | ||
export { apply, toDom as unstableToDom } from './to-dom'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
toDom
does not follow our naming conventions. It should be toDOM
.
Quick question about this rich text, previously if I had a template like this: <wp.editor.InnerBlocks template={[
['core/paragraph', {
content: ['Default content'],
}],
]} /> Now with array it doesn't work, string does: <wp.editor.InnerBlocks template={[
['core/paragraph', {
content: 'Default content',
}],
]} /> And adding a line break, for example, worked like this: <InnerBlocks template={[
['core/paragraph', {
content: [
'Hello',
{
type: 'br',
props: {
children: [],
},
},
'world',
],
}],
]} /> How would I do that now? I tried The docs don't really say much, |
@youknowriad Ok I see, thanks. Another thing that seems broken for me is when a block previously had an attribute
that is used in Do all the richtext attribute types need to be changed, and if so, to what type? And do they need a default value? I tried rendering Basically I shouldn't worry about current version of Gutenberg at all, it's supposed to be broken? I wanted to develop some blocks that I put aside waiting for the non-paragraph block inserter, now that it's merged I grabbed the latest version but I should have installed something earlier, this is completely WIP? |
Short answer, yes, we're working on it. The migration path is not fully decided. |
* | ||
* @return {string} Escaped HTML element value. | ||
*/ | ||
export function escapeHTML( value ) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The module is called escape-html
but all it handles is specifically an attribute value? Isn't that a bit misleading?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It also exports escapeHTML
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It also exports
escapeHTML
?
I think I misinterpreted this to only be handling attribute values.
|
||
append( tree, '' ); | ||
|
||
for ( let i = 0; i < formatsLength; i++ ) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since formats
is a sparse array, would it be worth considering using forEach
to skip over the unformatted segments?
forEach calls callbackfn [...] only for elements of the array which actually exist; it is not called for missing elements of the array.
https://www.ecma-international.org/ecma-262/5.1/#sec-15.4.4.18
When a format is encountered and at the end of the text, we could simply append the unformatted text in a single go. This would also be much more performant since append
is a non-trivial operation.
Benchmarks:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, maybe we can do this. I'd have to explore it, so far didn't have the time. Formats that stay equal could also be appended in one go. Also note that appendData
, not append
, which probably doesn't make it any faster.
} | ||
|
||
function createEmpty( type ) { | ||
const { body } = document.implementation.createHTMLDocument( '' ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This has frequently turned up in performance profiling as an expensive operation, which on reflection is understandable given that it's creating an entirely new document. I think we should hold on to a constant reference of a document body and reset it each time we call createEmpty
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I saw the other PR. We can do the same here.
const formatsLength = formats.length + 1; | ||
const tree = createEmpty( tag ); | ||
|
||
append( tree, '' ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we append an empty string?
@ellatrix I hope it's fine to ask you a question here. Can you point me to the right direction on adding custom markdown or patterns? Inline code markdown is registered on I've also figured out that |
@phpbits Currently there's no way to do this as a plugin as we do it in core. We still need an API for this. :) That's why the code is not in the format library yet. Feel free to create a separate issue. |
Thanks @ellatrix! I've finally figured it out after couple of days trying :) I've created my own component following what you already have on code formatting. Here's what I've got so far : https://twitter.com/phpbits/status/1132988500196773888 👍 |
Description
Read about the functions for manipulation.
Why a different
RichText
internal state structure?Currently
RichText
values is internally structured as tree of HTML nodes that can map to the displayed live DOM. When the value needs to be manipulated, we are either looking at the live DOM or the internal array structure.Current parsing and serialisation
Raw parsed blocks have a blob of HTML as their content, from which attributes are sourced according to the block’s attribute schema. This is done by parsing the HTML with the browser. For
RichText
values, the in memory DOM tree is then converted to an array structure of the same shape. In theRichText
editor, the values are applied by setting the HTML (element.innerHTML). On Input, the live DOM is converted as well to an array structure. Any manipulation that is done by us happens by manipulation the live DOM directly, or by manipulating the array structure, which is then set as HTML in the editor.Proposed parsing and serialisation: controlled component
Ideally, we’d like to avoid manipulating the live DOM directly, be it though TinyMCE APIs or not. It is both very fragile (prone to errors) and not so easy to do. Say you’d like to automatically format some text that matches
@user
, and complete it based on suggestion, you’d have to merge all text children of element nodes in the tree before the selection to be able to search, split elements and text nodes in order to format the matching text, and move selection using the Selection API. Formatting etc. is similarly complex within TinyMCE. See https://github.com/tinymce/tinymce/tree/master/src/core/main/ts/fmt. A tree shape is not so easy to manipulate, wether it is the live DOM or an array structure. A flat structure would be easier, where text and formats are separated. I’m not proposing an alternative format for saving or displaying the data. HTML remains perfect to store the value, and the DOM needs to be used, in the browser, to display the data in acontentEditable
container.The best structure I’ve found was one where the text and formats are separated. The text would be just a string, which becomes easy to search. The formats are stored in a sparse array of the same length as the text. Each index of the array maps to a character in the string, and formats are reference to a unique format variation.
RichText
values become very easy to manipulate like this, imagine for example splitting, merging, slicing, splicing, applying and removing formats. Empty checking is simply a matter of checking the string. Things like an empty HTML element and empty text nodes are not possible.Example:
New formats could be registered with their own view:
We could also support some meta date here if we need to save extra information e.g. attributes or a user ID.
Because there is a textual representation, RichText instances could be searched and formatted easily, e.g. think about auto-formatting
@mentions
and marking pieces of text for spell checking or comments.Another benefit is that there is only one right way to map this to HTML. Merging
will result in
<em>onetwo</em>
. Not<em>one</em><em>two</em>
.How has this been tested?
Gutenberg should load without errors. Test splitting and merging, formatting, inline tokens, text patterns, blocks transformations.
Screenshots
Types of changes
Checklist: