diff --git a/docs/guides/designing-the-delta-format.md b/docs/guides/designing-the-delta-format.md
index c9aeceeb47..8b6531ee61 100644
--- a/docs/guides/designing-the-delta-format.md
+++ b/docs/guides/designing-the-delta-format.md
@@ -7,7 +7,7 @@ redirect_from:
   - /guides/working-with-deltas/
 ---

-Rich text editors lack a specification to express its own contents. Until recently, most rich text editors did not even know what was in their own edit areas. These editors just passes the user HTML, along with the burden of parsing and interpretting this. At any given time, this interpretation will differ from those of major browser vendors, leading to different editing experiences for users.
+Rich text editors lack a specification to express their own contents. Until recently, most rich text editors did not even know what was in their own edit areas. These editors just pass the user HTML, along with the burden of parsing and interpreting it. At any given time, this interpretation will differ from those of major browser vendors, leading to different editing experiences for users.

 Quill is the first rich text editor to actually understand its own contents. Key to this is Deltas, the specification describing rich text. Deltas are designed to be easy to understand and use. We will walk through some of the thinking behind Deltas, to shed light on *why* things are the way they are.

@@ -27,7 +27,7 @@ var content = [
 ];
 ```

-If we want to add italics, underline, and other formats, we can add this to the main object, but it is cleaner to separate `text` from all of this so we organize formatting under one field, which we will name `attributes`.
+We can add italics, underline, and other formats to the main object if we want to, but it is cleaner to separate `text` from all of this, so we organize formatting under one field, which we will name `attributes`.

 ```javascript
 var content = [
@@ -49,14 +49,14 @@ var content = [
 ];
 ```

-To solve this, we add the constraint that Deltas must be compact. With this compact constraint, the above representation is not a valid Delta, since it can be represented more compactly by the previous example, where "Hel" and "lo" were not separate. Similarly we cannot have `{ bold: false, italic: true, underline: null }`, because `{ italic: true }` is more compact.
+To solve this, we add the constraint that Deltas must be compact. With this constraint, the above representation is not a valid Delta, since it can be represented more compactly by the previous example, where "Hel" and "lo" were not separate. Similarly we cannot have `{ bold: false, italic: true, underline: null }`, because `{ italic: true }` is more compact.


 ### Canonical

 We have not assigned any meaning to `bold`, just that it describes some formatting for text. We could very well have used different names, such as `weighted` or `strong`, or used a different range of possible values, such as a numerical or descriptive range of weights. An example can be found in CSS, where most of these ambiguities are at play. If we saw bolded text on a page, we cannot predict if its rule set is `font-weight: bold` or `font-weight: 700`. This makes the task of parsing CSS to discern its meaning, much more complex.

-We do not define the set of possible attributes, nor their meanings, but we do add an additional contraint that Deltas must be canonical. If two Deltas are equal, the content they represent must be equal, and there cannot be two unequal Deltas that represent the same content. Programmically, this allows you to simply deep compare two Deltas to determine if the content they represent are equal.
+We do not define the set of possible attributes, nor their meanings, but we do add an additional constraint that Deltas must be canonical. If two Deltas are equal, the content they represent must be equal, and there cannot be two unequal Deltas that represent the same content. Programmatically, this allows you to simply deep compare two Deltas to determine if the content they represent is equal.

 So if we had the following, the only conclusion we can draw is `a` is different from `b`, but not what `a` or `b` means.

@@ -88,14 +88,14 @@ This canonicalization applies to both keys and values, `text` and `attributes`.
 - There is only one way to represent a newline which is with `\n`, not `\r` or `\r\n`
 - `text: "Hello  World"` unambiguously means there are precisely two spaces between "Hello" and "World"

-Some of these choices may be customized by the user, but the canonical contraint in Deltas dictate that the choice must be unique.
+Some of these choices may be customized by the user, but the canonical constraint in Deltas dictates that the choice must be unique.

-This unambiguous predictability makes Deltas easier to work with, both because you have fewer cases to handle, but also because there are no surprises in what a corresponding Delta well look like. Long term, this makes applications using Deltas easier to understand and maintain.
+This unambiguous predictability makes Deltas easier to work with, both because you have fewer cases to handle and because there are no surprises in what a corresponding Delta will look like. Long term, this makes applications using Deltas easier to understand and maintain.


 ## Line Formatting

-Line formats affect the contents of the entire line, so it present an interesting challenge for our compact and canonical constraint. A seemingly reasonable way to represent center aligned text is this:
+Line formats affect the contents of an entire line, so they present an interesting challenge for our compact and canonical constraints. A seemingly reasonable way to represent centered text would be the following:

 ```javascript
 var content = [
@@ -123,7 +123,7 @@ var content = [

 But if the answer is yes, then we violate the canonical constraint since any permutation of characters having an align attribute would represent the same content.

-So we cannot just naively get rid of the newline character. We have to also either get rid of line attributes, or expand line attributes to fill all characters on the line. But what if we deleted the newline from this:
+So we cannot just naively get rid of the newline character. We also have to either get rid of line attributes, or expand them to fill all characters on the line. But what if we removed the **newline** from this:

 ```javascript
 var content = [
@@ -135,7 +135,7 @@ var content = [

 It is not clear if our resulting line is aligned center or right. We could delete both or have some ordering rule to favor one over the other, but our Delta is becoming more complex and harder to work with on this path.

-This problem begs for atomicity, and we find this in the newline character itself. But we have an off by one problem in that if we have n lines, we only have n-1 newline characters.
+This problem begs for atomicity, and we find this in the *newline* character itself. But we have an off by one problem in that if we have **n** lines, we only have **n-1** newline characters.

 To solve this, Quill "adds" a newline to all documents and always ends Deltas with "\n".

@@ -152,7 +152,7 @@ var content = [

 ## Embedded Content

-We want to add embedded content like images or video. Strings were natural to use for text but we have a lot more options for embeds. Since there are different types of embeds, our choice just needs to include this type information and then the actual content. There are many reasonable options here but we will use an object whose only key is the embed type and the value is the content representation, which may be any type or value.
+We want to add embedded content like images or video. Strings were natural to use for text but we have a lot more options for embeds. Since there are different types of embeds, our choice just needs to include this type information, and then the actual content. There are many reasonable options here but we will use an object whose only key is the embed type and the value is the content representation, which may have any type or value.

 ```javascript
 var img = {
@@ -189,9 +189,9 @@ As the name Delta implies, our format can describe changes to documents, as well

 #### Delete

-To describe deleting text, we need to know where and how many characters to delete. To delete embeds, there need not be any special treatment, other than to understand the length of an embed. If it is anything other than one, we would need to specify what happens when only part of an embed is deleted. There is currently no such specification, so embeds are all of length one, regardless of how many pixels make up an image, how many minutes long a video is, or how many slides are in a deck.
+To describe deleting text, we need to know where and how many characters to delete. To delete embeds, there need not be any special treatment, other than to understand the length of an embed. If it is anything other than one, we would then need to specify what happens when only part of an embed is deleted. There is currently no such specification, so regardless of how many pixels make up an image, how many minutes long a video is, or how many slides are in a deck, embeds are all of length _one_.

-One reasonable way to describe deletion is to explictly store this deletion index and length.
+One reasonable way to describe a deletion is to explicitly store its index and length.

 ```javascript
 var delta = [{
@@ -215,7 +215,7 @@ Now that Deltas may be describing changes to a non-empty document, `{ insert: "H

 #### Format

-Similar to deletes, we need to specify the range of text to format, and format change. Formatting exists in the `attributes` object, so a simple solution is to provide another `attributes` object to merge with the existing `attributes` object. This merge is shallow to keep things simple. A use case that both requires a deep merge and is compelling enough to warrant the added complexity has not been found.
+Similar to deletes, we need to specify the range of text to format, along with the format change itself. Formatting exists in the `attributes` object, so a simple solution is to provide an additional `attributes` object to merge with the existing one. This merge is shallow to keep things simple. We have not found a use case that both requires a deep merge and is compelling enough to warrant the added complexity.

 ```javascript
 var delta = [{
@@ -229,9 +229,9 @@ var delta = [{
 }];
 ```

-The only special case is when we want to remove formatting. We will use `null` for this purpose, so `{ bold: null }` would mean remove the bold format. We could have specified any falsy value, but there may be legitimate use cases for an attribute value to be `0` or the empty string.
+The special case is when we want to remove formatting. We will use `null` for this purpose, so `{ bold: null }` would mean remove the bold format. We could have specified any falsy value, but there may be legitimate use cases for an attribute value to be `0` or the empty string.

-We now have to be careful with indexes at the application layer. As noted earlier, Deltas do not ascribe any inherent meaning to any the `attributes`'s key-value pairs, nor any embed types or values. Deltas do not know that images do not have durations, text does not have alternative texts, and videos cannot be bolded. The following is a legal Delta that might have been the result of applying other legal Deltas, by an application that was not careful of format ranges.
+**Note:** We now have to be careful with indexes at the application layer. As mentioned earlier, Deltas do not ascribe any inherent meaning to any of the `attributes`' key-value pairs, nor any embed types or values. Deltas do not know that an image does not have a duration, text does not have alternative text, and videos cannot be bolded. The following is a **legal** Delta that might have been the result of applying other **legal** Deltas, by an application that was not careful with format ranges.

 ```javascript
 var delta = [{
@@ -258,7 +258,7 @@ var delta = [{

 #### Pitfalls

-First, we should be clear that this index must refer to the index in the document **before** any Operations are applied. Otherwise, a later Operation may delete a previous insert, unformat a previous format, etc, which would violate compactness.
+First, we should be clear that an index must refer to its position in the document **before** any Operations are applied. Otherwise, a later Operation may delete a previous insert, unformat a previous format, etc., which would violate compactness.

 Operations must also be strictly ordered to satisfy our canonical constraint. Ordering by index, then length, and then type is one valid way this can be accomplished.

@@ -268,17 +268,17 @@ The number of reasons a Delta might be invalid is piling up. A better format wou

 #### Retain

-If we step back from our compactness formalities for a moment, we can describe a much simpler format to describe inserting, deleting, and formatting:
+If we step back from our compactness formalities for a moment, we can describe a much simpler format to express inserting, deleting, and formatting:

-- A Delta would have Operations that is at least as long as the document that it is modifying.
+- A Delta would have Operations that are at least as long as the document being modified.
 - Each Operation would describe what happens to the character at that index.
 - Optional insert Operations may make the Delta longer than the document it describes.

-This necessitates the creation of a new Operation, that simply means keep this character as is. We call this `retain`.
+This necessitates the creation of a new Operation that simply means "keep this character as is". We call this a `retain`.

 ```javascript
 // Starting with "HelloWorld",
-// bold "Hello", and insert a space
+// bold "Hello", and insert a space right after it
 var change = [
   { format: true, attributes: { bold: true } }, // H
   { format: true, attributes: { bold: true } }, // e
@@ -294,9 +294,9 @@ var change = [
 ]
 ```

-Since every character is described, explicit indexes and lengths are no longer necessary. This makes out of order indexes and overlapping ranges impossible to express.
+Since every character is described, explicit indexes and lengths are no longer necessary. This makes overlapping ranges and out-of-order indexes impossible to express.

-From this, we can make the easy optimization to merge adjacent equal Operations, re-introducing length. If the last Operation is a `retain`, we can also simply drop this, since it instructs us to "do nothing to the rest of the document".
+Therefore, we can make the easy optimization to merge adjacent equal Operations, re-introducing _length_. If the last Operation is a `retain`, we can also drop it, since it only instructs us to "do nothing to the rest of the document".

 ```javascript
 var change = [
@@ -305,7 +305,7 @@ var change = [
 ]
 ```

-You might notice that a `retain` is in some ways just special case of a `format`. For example, there is no practical difference between `{ format: 1, attributes: {} }` and `{ retain: 1 }`. Compacting would drop the empty `attributes` object leaving us with just `{ format: 1 }`, creating a canonicalization conflict. So we simply combine `format` and `retain`, and keep the name `retain`.
+Furthermore, you might notice that a `retain` is in some ways just a special case of `format`. For instance, there is no practical difference between `{ format: 1, attributes: {} }` and `{ retain: 1 }`. Compacting would drop the empty `attributes` object leaving us with just `{ format: 1 }`, creating a canonicalization conflict. Thus, in our example we will simply combine `format` and `retain`, and keep the name `retain`.

 ```javascript
 var change = [
@@ -314,13 +314,13 @@ var change = [
 ]
 ```

-We have now have a Delta format that is very close to the actual Delta format.
+We now have a Delta that is very close to the current standard format.

-#### ops
+#### Ops

-Right now we have an easy to use JSON Array that describes rich text. This is great at the storage and transport layers, but applications could benefit from more functionality. We can add this by implementing Deltas as a class, that can be easily initialized from or exported to JSON, and providing relevant methods.
+Right now we have an easy to use JSON Array that describes rich text. This is great at the storage and transport layers, but applications could benefit from more functionality. We can add this by implementing Deltas as a class that can be easily initialized from or exported to JSON, and then providing it with relevant methods.

-At the time of Delta's inception, it was not possible to sub-class an Array. So Deltas are Objects, with a single property `ops` that stores an array of Operations we have been discussing.
+At the time of Delta's inception, it was not possible to sub-class an Array. For this reason Deltas are expressed as Objects, with a single property `ops` that stores an array of Operations like the ones we have been discussing.

 ```javascript
 var delta = {