Schema improvements #3784
Comments
TBH, I think that we're not going to be able to work on schema in iteration 2. We don't know enough yet and we're not going to have that much time. But it should be fine to work on it later, because it doesn't affect the code thaaat much. Ofc, each feature and many tests define the schema but it's not critical for feature development, unless we face some issues. But this will be the exact moment when we should start thinking on the schema. PS. So far the schema isn't even used by the existing features for things other than registering elements. |
Multiple features/parts of code uses
BTW. Other features also probably should use |
It's not said that we're not going to refactor any code after iteration 2. Until 1.0 we're going to rewrite some pieces many times still :). Schema can be one of them. I agree that it's an API used in many places. That makes this a harder decision. But I'd postpone any bigger refactoring of it until we have more use cases in hand. We don't know much more now about what features will need to define than we knew a few months ago when we implemented it initially. So there's a high chance that we'd have to repeat the refactoring in a couple of months. Let's maintain the concept that we've got right now for the next couple of iterations and let's get back to it with a better picture of what we need. This means that if the link or image features miss something in the current implementation, we shouldn't immediately try to refactor the whole code. |
I agree with @Reinmar. We will not freeze the API after iteration 2. Also, I do not think we have a clear understanding what schema features we need yet. We can introduce them as soon as we will need them. |
Another case to keep in mind: https://github.com/ckeditor/ckeditor5-headings/issues/27#issuecomment-245559218. |
Two new things to remember:
|
Another thing – |
And I found one more thing. Perhaps it's my mistake but if I don't get it then there's something wrong with the API: this.schema.check( { name: 'paragraph', inside: [ 'paragraph' ] } ); // -> false
this.schema.check( { name: 'paragraph', inside: position } ); // -> true
position.parent.name; // -> 'paragraph' |
Another thing. I was wondering why list items are filtered out. It turned out that checking whether a specific element can be inside some context is tricky because this isn't enough: schema.check( { name: el.name, inside: ctx } ); You also need to pass this element's attributes to the |
Poor |
Especially, taken that it knows that it's here only temporary :P One more thing:
And I've got e.g. 10 tests failing.
And I've got 20 tests failing. Very curious :D |
I'm now working on https://github.com/ckeditor/ckeditor5-engine/issues/732 because it turned out to be a necessary improvement for the autoparagraphing feature. After strengthening the schema check algorithm I now need to fix dozens of tests which (usually mistakenly) relied on weak schema checks. But this made me think that many of these tests are not interested in testing the schema anyway. Many should, like the entire delete content, insert content, data loading, parsing, stringifying, etc. What about modify selection for instance? Should it configure schema or can we introduce "allow all" for it? First I thought that it doesn't matter in this case. But then, how reliable will the tests be if we mock the schema? Then the text will be allowed in every context, making even such algorithms like this vulnerable (modify selection may need to check whether where it wants to move the caret is a correct text position). So, after giving it a second thought I came to the conclusion that we should not use "allow all" in our tests. We'll need, however, to introduce some more useful syntax for configuring schema, because now you need to call |
Agree. We could have real unit tests which check only a single feature, but I think that more realistic scenarios are more useful (as long as the tests are not too complex and not too hard to debug). At the same time, note that we will need something like "allow all"; for instance, the collaboration server may need such a feature. |
I copy paste from https://github.com/ckeditor/ckeditor5-engine/pull/733/files#diff-f6cdfa67127a7d5095832bc5a9b5dec0R572 Okay it took me a while to analyze this. At first sight it seemed wrong to me that any path can be shorter. For OTOH, this will not be a problem if in Now, check will fail for both And now back to the original problem. If we suppose that So there are a few solutions:
|
I've been thinking about the message quote feature today and couple of related things – like retrieving selected blocks. This made me think about the schema. In order to identify block-like elements (paragraphs, list items, headings, etc.) we need to mark them in the schema. Currently, they all inherit from the So, we should separate the structure from element types. We can have default elements like I've been also thinking what kind of elements we need. I can list at least these (but I would keep this list open):
An element can be of many types at the same time. So, e.g., you'll have block limits (as in CKE4's Finally, I need the "get selected block elements" function and I wonder where to put it. It needs to accept a selection and know about the schema. It coooould be the schema's method perhaps or selection's or one of the controller's, but maybe you have better ideas? |
Special question to @scofalik – where's list feature's "get selected blocks" logic :D? And the same question to @szymonkups – where's the same piece of logic in the heading feature :D |
https://github.com/ckeditor/ckeditor5-list/blob/master/src/utils.js#L34 It checks schema, whether given element is a |
I agree with what you've written down. Structure "flags" will be also kept/connected with schema? I mean they will be specified at the same time and place as registering element name to schema? |
Yes. |
Okay, I had a doubt that we will describe one entity in three places (creating ModelElement, creating schema item and describing structure in other place). |
Schema attributes make sense for me (it's like meta elements attribute, what fit well to the tree structure of schema). My only doubt is the |
It will be this: https://github.com/ckeditor/ckeditor5-heading/blob/f0c19177afc4679349f0713a21bd9b8ba5c49378/src/headingcommand.js#L94-L95 And I created an issue to extract this logic: https://github.com/ckeditor/ckeditor5-engine/issues/811. For now, I'll base it on the |
It should go to the |
We could think also if it would be possible to store context for |
I'm thinking now how the message quote feature should get elements which can be wrapped with a Let's consider this content: <paragraph>x[x</paragraph>
<image></image>
<paragraph>y]y</paragraph> The heading and list commands will apply themselves to the two paragraphs but will ignore the image element. The message quote feature should, however, wrap the image too. This made me think that for the quote feature all the 3 elements are the same kind of content – wrappable blocks. We have a couple of options how to implement those two scenarios, but to keep this message short I'll only mention the one I like the most. In the schema, I'd mark the I'm going to continue implementing the quote feature with the above assumption. The image cannot be marked as a block now, so it won't be quotable for the time being. |
The const isMQAllowed = schema.check( {
name: 'messageQuote',
inside: Position.createBefore( firstBlock )
} );
const isBlockAllowed = schema.check( {
name: firstBlock.name,
inside: 'messageQuote'
} ); It's part of a logic checking if a block can be wrapped with a message quote. What's wrong here? That in the second check I'm not checking whether the real block I have can be wrapped with a message quote in the exact content in which it is now. I'm checking only its name and context-less message quote. Schema was meant to be context-aware and this information must be provided and used by schema checks. EDIT: Actually, the second check must also include attributes: const isBlockAllowed = schema.check( {
name: firstBlock.name,
attributes: Array.from( firstBlock.getAttributeKeys() ),
inside: 'messageQuote'
} ); |
Great news :) Another issue with the Schema: https://github.com/ckeditor/ckeditor5-list/issues/34 It's not possible to allow element attributes on a registered element (which inherits from another element). I'm going to work around this in the messagequote feature by allowing the list item inside it explicitly, but this is an ugly hack which we must remove ASAP. |
Schema is not flexible enough. ATM, there is a bug in our features: // Allow link attribute on all inline nodes.
editor.document.schema.allow( { name: '$inline', attributes: 'linkHref' } ); the author just wanted to inform schema that This means that correct solution is: But the problem is that this doesn't go well with other features that are expanding But then the |
Schema is not clear enough. |
This will be fun. We gathered a huge list of things to consider. Most of them didn't even cross our minds before. Designing the new schema will be really challenging :) |
Things got seriously bad ;< First, we had a problem that we had attrs defined like: editor.document.schema.allow( { name: '$inline', attributes: 'linkHref' } ); This is incorrect because this allows editor.document.schema.allow( { name: '$inline', attributes: 'linkHref', inside: '$block' } ); Then, it turned out that this disallows links/bold/italic in image caption because Which... is totally wrong ;| It means that So... we knew that 💩 was deep. Now it also got serious. I don't see a simple way out of this ATM. I can patch block quote so it doesn't apply to captions because |
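The two failure modes described above can be reproduced with a toy rule store. This is only a sketch: ToyRules and its matching logic are assumptions made for illustration (a rule without `inside` matches any context, and only the direct parent is compared), not the engine's actual Schema implementation.

```javascript
// Toy rule store (hypothetical, for illustration only).
class ToyRules {
	constructor() {
		this.rules = [];
	}

	// Stores a rule; a missing `inside` means "any context".
	allow( rule ) {
		this.rules.push( { inside: null, attributes: [], ...rule } );
	}

	// Returns true if any single rule matches the name, the direct parent
	// and covers all requested attributes.
	check( { name, attributes = [], inside = [] } ) {
		const parent = inside[ inside.length - 1 ];

		return this.rules.some( rule =>
			rule.name === name &&
			( rule.inside === null || rule.inside === parent ) &&
			attributes.every( key => [].concat( rule.attributes ).includes( key ) )
		);
	}
}

// Variant 1: the attribute rule has no context, so it also allows
// a bare $inline anywhere, including directly in $root.
const loose = new ToyRules();
loose.allow( { name: '$inline', inside: '$block' } );
loose.allow( { name: '$inline', attributes: 'linkHref' } );
const inlineInRoot = loose.check( { name: '$inline', inside: [ '$root' ] } );

// Variant 2: the attribute rule is bound to $block, so the attribute
// is lost in any other context where $inline itself is allowed.
const strict = new ToyRules();
strict.allow( { name: '$inline', inside: '$block' } );
strict.allow( { name: '$inline', inside: 'caption' } );
strict.allow( { name: '$inline', attributes: 'linkHref', inside: '$block' } );
const linkInCaption = strict.check( {
	name: '$inline',
	attributes: [ 'linkHref' ],
	inside: [ 'image', 'caption' ]
} );
```

In this toy model neither variant expresses "allow linkHref wherever $inline is already allowed", which is the gap the comment points at.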
I've been thinking a bit about the use case where you want to allow Currently, if you inherit from So, perhaps the way to go will be to split these aspects so you can define that This will also allow defining that Going a step further, we've been also thinking about basing schema API on an object format. From the top of my head: schema.register( 'image', {
allowedIn: schema.get( '$block' ).allowedIn,
allowedContent() {
return [ 'caption' ];
}
} ); I'm worried, though, how will this API allow runtime extensibility. This is, developers must be able to either redefine the default schema or make slight adjustments to it. E.g. someone may want to allow paragraphs and other blocks allowed in schema.get( 'caption' ).allowedContent = schema.get( '$root' ).allowedContent; However, you could not do "disallow links in captions" because links are allowed on So, when requesting whether schema.get( 'image' ).allowedAttributesInContext(
[ 'caption', '$inline' ],
schema.get( 'caption' ).allowedAttributesInContext(
[ '$inline' ],
schema.get( '$inline' ).allowedAttributes()
)
); Alternatively, context could be passed to |
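To make the object-based idea easier to try out, here is a minimal runnable sketch of a registry that supports the register() and get() calls used above. ToySchema, the default values and the canContain() helper are assumptions for illustration; they are not a proposal for the final API.

```javascript
// Toy object-format schema registry (hypothetical, for illustration only).
class ToySchema {
	constructor() {
		this.items = new Map();
	}

	// Registers an item; missing properties get permissive-free defaults.
	register( name, definition = {} ) {
		const item = { allowedIn: [], allowedContent: () => [], ...definition };
		this.items.set( name, item );

		return item;
	}

	get( name ) {
		const item = this.items.get( name );

		if ( !item ) {
			throw new Error( `Unknown schema item: ${ name }` );
		}

		return item;
	}

	// A child is accepted if either side of the relation declares it.
	canContain( parentName, childName ) {
		return this.get( parentName ).allowedContent().includes( childName ) ||
			this.get( childName ).allowedIn.includes( parentName );
	}
}

const schema = new ToySchema();
schema.register( '$root', { allowedContent: () => [ 'paragraph', 'image' ] } );
schema.register( '$block', { allowedIn: [ '$root' ] } );
schema.register( 'paragraph', { allowedIn: schema.get( '$block' ).allowedIn } );
schema.register( 'caption', { allowedContent: () => [ '$text' ] } );
schema.register( 'image', {
	allowedIn: schema.get( '$block' ).allowedIn,
	allowedContent() {
		return [ 'caption' ];
	}
} );

const before = schema.canContain( 'caption', 'paragraph' );

// Runtime adjustment, as in the example above: captions accept what roots accept.
schema.get( 'caption' ).allowedContent = schema.get( '$root' ).allowedContent;

const after = schema.canContain( 'caption', 'paragraph' );
```

Plain property reassignment covers the "allow more" direction; the sketch has no way to express "disallow links in captions", which matches the concern raised above.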
To make it clear – when checking if
I wonder if this won't be too demanding in terms of how many things you need to take care of to define a simple thing. If so, we could introduce small helpers which'd simplify the job. E.g. |
Another thought – perhaps we can go the simple way for the declarative part of the schema. So, you'd be only able to define that So, a schema item would have a simple, declarative We only need to figure out whether the most common scenarios can be done using the declarative API. |
For now,
Finally I understand your idea :P. This may be a way to go but makes things complicated. Slowly we have complexity creep where you have to define more and more stuff. A "schema defining object" would have four "properties":
This seems overcomplicated and I don't really understand why? Anyway, no matter what will be the final implementation of "schema building blocks", there will have to be some kind of controller that will use all "building blocks". This is why I think that this idea might be good too:
But are we gaining anything in terms of capabilities or is it just a gain in implementation simplicity? I am sorry that my answer is not really insightful but this is a hard topic and it needs more focus than 15 minutes for a response on GitHub. We really have to nail all the use cases we have, all the problems we encountered and how they would be resolved in solution A / B / C. This means that it will be really hard to come up with anything by discussing it on GitHub. Somebody has to gather all the data and start with the schema from scratch. |
One more case: allow specific marker in some content/disallow it. For instance, I want to disallow creating comments in empty paragraphs. I may add an attribute with the same name as the marker to the schema and then check if such fake attribute is allowed. It might be the way to go, but it sounds a little like a hack. In general markers & schema is a topic we did not touch yet. |
And one more case: #477. |
Schema stuff is complicated and we need to think about what tools we want to give to users and how they will use them. A common problem is that after some changes in the model, nodes that previously had attributes are now incorrect. For example, let's say that a heading cannot have bold text. If you have a heading and then a paragraph with bold text, merging the paragraph into the heading (using Backspace for example) should end up removing the bold. That was an easy case, though. But we enabled in Then, we not only have to check attributes but also child-parent chains. We not only have merging / moving but also So, in fact, if something changes in the model tree, there is an awful lot of things to change. Honestly, I'd remove the possibility to define "attribute groups" because they are seldom used, can be controlled (and fixed) directly by the feature and add unnecessary complication to schema-related algorithms. |
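The bold-in-heading case above can be sketched as a recursive cleanup over a plain object tree. Everything here is hypothetical: the node shape ({ name, attributes, children }), the removeDisallowedAttributes() helper and the isAttributeAllowed callback stand in for whatever the real schema check would be.

```javascript
// Recursively removes attributes rejected by the callback.
// `ancestors` is the list of ancestor names of `node` (root first).
// Returns a list of "name[attribute]" strings describing what was removed.
function removeDisallowedAttributes( node, ancestors, isAttributeAllowed ) {
	const removed = [];

	// Object.keys() returns a snapshot, so deleting while looping is safe.
	for ( const key of Object.keys( node.attributes ) ) {
		if ( !isAttributeAllowed( ancestors, node, key ) ) {
			delete node.attributes[ key ];
			removed.push( `${ node.name }[${ key }]` );
		}
	}

	for ( const child of node.children || [] ) {
		removed.push(
			...removeDisallowedAttributes( child, [ ...ancestors, node.name ], isAttributeAllowed )
		);
	}

	return removed;
}

// Example rule (an assumption): bold is not allowed on text inside heading1.
const noBoldInHeading = ( ancestors, node, key ) =>
	!( key === 'bold' && node.name === '$text' && ancestors.includes( 'heading1' ) );

// A heading after a bold paragraph has been merged into it.
const heading = {
	name: 'heading1',
	attributes: {},
	children: [
		{ name: '$text', attributes: {}, children: [] },
		{ name: '$text', attributes: { bold: true, linkHref: 'x' }, children: [] }
	]
};

const removedAttributes = removeDisallowedAttributes( heading, [ '$root' ], noBoldInHeading );
```

The callback receives the ancestor chain, so the same walk could also serve rules that depend on deeper context than the direct parent.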
Remember about attributes stored by |
Also, recently we've been discussing whether schema should support required attributes set, that is whether there should be possibility to tell the schema, that given node is "okay" only when it has all of given attributes (for example, This may make some of algorithms too complicated and we agreed that if that will be the case, we will skip this functionality. |
I had a similar issue when I defined schema.allow( { name: '$block', attribute: 'alignment', inside: '$root' } ); I did it this way probably due to a misunderstanding of how the schema works (to be honest I copied that from another plugin). Anyway, such a definition prevented adding The configuration that worked for me was: schema.allow( { name: '$block', attribute: 'alignment' } ); |
Another case - still I'm not 100% sure if valid or not. Given previous schema.check( {
name: 'listItem',
attributes: [ 'alignment' ]
} ); will fail, whereas: schema.check( {
name: 'listItem',
attributes: [ ...listItem.getAttributeKeys(), 'alignment' ]
} ); will not. It was counterintuitive for me to add current attributes to the check but maybe it makes sense to pass them so the schema can check all attributes (i.e. when some combination of attributes is not allowed). Probably some kind of helper method for checking an attribute on an existing item might be handy. |
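Such a helper could look roughly like the sketch below. The name checkAttributeOnItem() is made up; it only relies on schema.check() accepting { name, attributes } and on the item exposing name and getAttributeKeys(), as in the snippets above. The stub schema and stub item exist only to make the example runnable, and the stub's rule (a list item needs both indent and type) is an assumption.

```javascript
// Hypothetical helper: checks whether `attributeName` could be added to `item`,
// passing the item's current attribute keys along with the new one.
function checkAttributeOnItem( schema, item, attributeName ) {
	const attributes = new Set( item.getAttributeKeys() );
	attributes.add( attributeName );

	return schema.check( { name: item.name, attributes: Array.from( attributes ) } );
}

// Stub schema (assumption): listItem requires indent and type together
// and additionally tolerates alignment.
const stubSchema = {
	check( { name, attributes } ) {
		if ( name !== 'listItem' ) {
			return false;
		}

		const allowed = [ 'indent', 'type', 'alignment' ];

		return [ 'indent', 'type' ].every( key => attributes.includes( key ) ) &&
			attributes.every( key => allowed.includes( key ) );
	}
};

// Stub item mimicking the element API used in this thread.
const stubListItem = {
	name: 'listItem',
	getAttributeKeys() {
		return [ 'indent', 'type' ].values();
	}
};

const bareCheck = stubSchema.check( { name: 'listItem', attributes: [ 'alignment' ] } );
const helperCheck = checkAttributeOnItem( stubSchema, stubListItem, 'alignment' );
```

With the stub, the bare check fails while the helper-based check passes, mirroring the behaviour reported in this comment.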
I've read through the entire discussion and all related tickets and I edited the initial post with a compiled list of requirements, ideas and doubts. I'll be now working on figuring out how to deal with them. One of the most important questions I see there is what we can actually do and what needs to be rejected as too complicated. |
All |
Another bug due to the schema: https://github.com/ckeditor/ckeditor5-alignment/issues/12. It will be fixed as well. |
Other: Rewritten the Schema API. Closes #532.
For the future reference – these are notes I took when working on the new schema API. Let's hope that we'll never need to read through all this again :) While analysing all use cases I recognised three main types of information which we try to retrieve from the schema:
Operations
The first group of checks can be derived from the types of deltas. We also have some additional helpers (like Methods implementing these checks should be able to return
Can be inserted
I think that these checks can't care about the exact insertion positions because often we don't know where exactly the node will end up and what siblings it will have. We also don't know this node's children. E.g. during conversion, we need to make decisions about children nodes of some element before they are inserted into the tree and before their children are known. The information we can provide is: the ancestors list (with their attributes) and the node to insert (with its attributes, but without its children). However, it also doesn't seem to be feasible and reasonable to include the node's attributes in this check. Why? Because the So what would attribute validation look like? How to disallow bold in headings? I think that the process should be two steps:
Note that if you'd disallow bold in headings now, the current implementation of conversion would work fine because we do exactly what I described above – we first insert
Example API:
const element = doc.createElement( 'paragraph', { alignment: 'right' } );
schema.checkInsert( position, element );
schema.checkInsert( position, 'paragraph' );
schema.checkInsert( position, '$text' );
schema.checkInsert( position.parent, element );
// Sometimes you may need to check a virtual tree.
const ancestors = [
{ name: '$root' },
{ name: 'blockQuote', attributes: { some: 'one' } }
];
schema.checkInsert( position.getAncestors(), element );
I'm unsure still whether it should be
Can be wrapped with
Here we need two "can be inserted" checks and additional cleanup:
Both checks need to be performed before any changes are done to the model, so the second one proves that the "can be inserted" check must be able to work with virtual trees. We could mock how the tree would look after the changes (similarly to what the differ would do) and that would be powerful, but it's overkill. If we limit the "can be inserted" checks to an array of ancestors (and set a rule that schema's callbacks can't check their siblings) and the element to be inserted, then we don't need such magic. The problem here is that we also need to take care of attributes. If Another solution would be to limit attribute checks so they are not contextual (can't check the ancestors chain). But that would make it impossible to prohibit bold in a heading, so it'd be a serious limitation.
Can be renamed
Here, we need to do a couple of things:
The check is simple but it again proves that the "can be inserted" check needs to work with virtual elements. The attributes cleanup again opens the question of who should do it. Should it be automatic or not?
Can have an attribute
A simple check, but opens a couple of related questions:
No, because that would be against the rule that a check must concern a single trait and that we need to be able to make decisions based on those checks. Let's consider a case when on element
Image requires With list it seems to be a bit different. I thought that we can define that However, So, to sum this up – it doesn't make sense to have required attributes because in most cases those checks are very contextual.
This seems feasible. The value of an attribute (name) is its inherent pair, so both things can be checked together if we can always provide the value of the attribute in all checks that we need to do. It seems so, because I can't imagine a situation when you want to check whether an element may have an attribute but you don't know its value yet. The only situation where the value is not yet known is when the user hasn't entered it yet. But then, the feature's UI needs to validate that value anyway because we won't go so far as to enable configuring validation messages in the schema.
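As a rough illustration of checking the attribute name and value together, here is a sketch of such a check. The checkAttribute() signature, the rule table and the context format (an array of ancestor names ending with the element itself) are all assumptions; the level=1|2|3 rule is taken from the heading example mentioned in this ticket.

```javascript
// Hypothetical rule table: attribute name -> predicate( context, value ).
const attributeRules = new Map( [
	[ 'level', ( context, value ) =>
		context[ context.length - 1 ] === 'heading' && [ 1, 2, 3 ].includes( value )
	]
] );

// Checks the attribute name and its value in one call.
// Attributes without a registered rule are rejected.
function checkAttribute( context, name, value ) {
	const rule = attributeRules.get( name );

	return rule ? rule( context, value ) : false;
}

const validLevel = checkAttribute( [ '$root', 'heading' ], 'level', 2 );
const invalidLevel = checkAttribute( [ '$root', 'heading' ], 'level', 7 );
const wrongElement = checkAttribute( [ '$root', 'paragraph' ], 'level', 2 );
```

The check only answers yes or no; producing a user-facing validation message would stay with the feature's UI, as argued above.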
The use case is e.g. So, can we imagine cases where converters could not handle it? Or, is there any risk of attributes getting applied to one piece of text or one element by features, directly on the model? A separate question is whether we can handle exclusion at all. I think so – the "can have attribute" check will just return Finally, the rules for which attribute has priority over which may not be so simple. Hence, post-fixers seem a safer place again. This may introduce some coupling between features (one feature will need to predict that some other feature may apply some attribute) but those cases will be rather rare.
Removing disallowed attributes
This seems to be a common action which needs to be performed after most model structure changes. We agreed already that this should be done by a post-fixer registered by the model itself. The advantage of doing this in a post-fixer is that it will be done once for all changes, not after every single atomic change (as it would be if it was done by the writer). This will be more performant and secure (because intermediate model states may be incorrect on purpose). It's also important to remember that cleaning up attributes needs to be a recursive job – it needs to consider all descendants of changed nodes.
Selection position
Currently, there are two correct selections:
Additionally, there are some rules which we may want to introduce in the future:
Similarly to the structure/attrs checks, we need to be able to query whether selection position is correct but also to fix incorrect selection (which is currently done by The biggest problem I can see here is what we should check. Always the entire selection? That will make the result hard to analyse (the problem with what I think that we again need to be smart and implement TODO...
Defining schema rules
When designing this piece of API we need to take into consideration the following aspects:
While the first aspect is trivial, the second caused us a lot of pain in the past. Functional requirements:
Conclusions:
|
List of potential followups:
|
As we know, Schema is probably the weakest part of the engine model right now. We didn't have clear requirements or an idea of how it would be used when we created it.
Since iteration 2 is about bringing a stable API, I think we need to include refactoring/rewriting/improving Schema. I open this issue so we can share requirements and thoughts on how it could work.
Some conclusions that we've already drawn:
SchemaItems should be based on classes mechanism and extending.
linkHref or linkName but not both -- I remember there were cases like this).
EDIT (by @Reinmar): I've read through the entire topic and created an excerpt from it which I'm adding below.
The following list of requirements, ideas and doubts was compiled from the discussion that we've led in this ticket. It's mostly focused on stuff that works poorly now or which is not supported at all. The things which the current schema supports can be found in the code so the correctness of the new approach will be naturally validated when refactoring the code.
Attributes
indent on <listItem>, bold on $text.
<listItem>'s type. Source: V->M conversion of a bare <li> (on paste) needs to hardcode the default type now.
<heading>'s level=1|2|3. Source: https://github.com/ckeditor/ckeditor5-heading/issues/27#issuecomment-245559218.
$text which is in <heading1-6>.
x on all $texts regardless of their context (parent path). But that shouldn't allow $text everywhere (as happens now). Source: https://github.com/ckeditor/ckeditor5-engine/issues/532#issuecomment-292892032 and https://github.com/ckeditor/ckeditor5-engine/issues/532#issuecomment-299204504.
$text in $clipboardHolder but then all attributes which were allowed in $root > $block > $text are not allowed in $clipboardHolder > $text. Source: Inline styling (bold, italic, link) is lost on paste #477. It gets even more serious with Inline styling (bold, italic, link) is lost on paste #477 (comment) and terrifying with https://github.com/ckeditor/ckeditor5-engine/issues/532#issuecomment-340717901.
bold is allowed on $text".
linkHref and linkName, sub and sup (in some implementations).
indent and type. If you'd check them one by one (e.g. when trying to understand which element attrs need to be removed after some operations) you would get the false impression that neither listItem[indent] is correct nor listItem[type], hence, both should be removed. Source: https://github.com/ckeditor/ckeditor5-engine/issues/532#issuecomment-325328882. Source 2: https://github.com/ckeditor/ckeditor5-engine/issues/532#issuecomment-341501249.
Elements
<caption> being a limit only when in an <image>): https://github.com/ckeditor/ckeditor5-engine/issues/532#issuecomment-280599349. I think it would be overkill. One needs to ensure element name uniqueness.
h1 is only allowed in $root1, but not in $root2.
h1 is allowed in $root1 even as a deep node: $root > bQ > h1.
Other topics