Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Try block annotation, starting with full block annotation #3628

Closed
wants to merge 1 commit into from

Conversation

atimmer
Copy link
Member

@atimmer atimmer commented Nov 23, 2017

Description

This is a PR that explores how to do annotation on blocks. A lot of work has gone into exploration instead of actual code changes. But this is the PoC I have running locally, so I thought it would be good to share it.

How did I get here

I first started thinking about different annotation data formats. I started with the least amount of data:

{
	blockId: "",
	range: [ 0, 0 ],
	fullBlock: true|false,
	comment: "Some text comment",
}

After discussing this with @omarreiss we iterated on this and ended up with the following data structure:

{
	start: {
		block: "",
		position: 0,
	},
	end: {
		block: "",
		position: 0,
	},
	blockAnnotation: true|false,
	comments: [
		{
			body: "This is a comment.",
			date: Date
		},
		{
			body: "A reaction to that other comment.",
			date: Date
		}
	],
	persist: true|false,
	id: ""
};

start and end: We figured that an annotation always has a start and an end position. Blocks can either be fully annotated or the text inside blocks can be annotated. So an annotation starts at a block and a position. And an annotation ends at a block and a position.

blockAnnotation: An annotation can either be a full block annotation or a default text annotation. For non-text blocks, there might not be a distinction, but I could see that you might want to comment on one image in a gallery block.

comments: Comments are objects on their own. They probably have more fields than body and date, but those are the bare minimum. type might be something we want to add.

persist and id: Some annotations should be persisted, some shouldn't. So the id is only relevant when an annotation needs to be persisted. Annotations that haven't been persisted to the server can be detected because they have no id yet.

In this data model, a comment is always bound to an annotation. This is a trade-off that forces a user to always be specific about what a comment is about.

Challenges

There are a ton of challenges to be solved. Persisting annotations is one that comes to mind. Which I have commented on here: #3026 (comment).

Another one is handling moving blocks. I think this can be solved by handling moving of blocks in the annotations reducer and changing the annotations to still be correct.

How Has This Been Tested?

By running it in Chrome & Firefox

Screenshots (jpeg or gifs if applicable):

n.a. (It doesn't look good yet)

Types of changes

New feature (non-breaking change which adds functionality)

Checklist:

  • My code is tested.
  • My code follows the WordPress code style.
  • My code follows has proper inline documentation.

TODO:

  • Add unit tests.
  • Implement a proper annotation style.

See #2893 and #3026.

@codecov
Copy link

codecov bot commented Nov 23, 2017

Codecov Report

Merging #3628 into master will decrease coverage by 0.04%.
The diff coverage is 16.66%.

Impacted file tree graph

@@            Coverage Diff            @@
##           master   #3628      +/-   ##
=========================================
- Coverage   36.95%   36.9%   -0.05%     
=========================================
  Files         268     268              
  Lines        6673    6690      +17     
  Branches     1202    1203       +1     
=========================================
+ Hits         2466    2469       +3     
- Misses       3555    3569      +14     
  Partials      652     652
Impacted Files Coverage Δ
editor/actions.js 46.29% <0%> (-0.88%) ⬇️
editor/components/block-settings-menu/index.js 0% <0%> (ø) ⬆️
editor/components/block-list/block.js 0.74% <0%> (-0.01%) ⬇️
editor/selectors.js 89.39% <11.11%> (-3.73%) ⬇️
editor/reducer.js 89.41% <50%> (-0.86%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 68bf34f...73c5eaf. Read the comment docs.

@mtias mtias added the [Feature] Block API API that allows to express the block paradigm. label Nov 23, 2017
@aduth
Copy link
Member

aduth commented Nov 27, 2017

Some initial thoughts coming into this mostly uninitiated, not yet having looked at the implementation:

  • Specifying block ID in both the start and end indicates to me that we can support annotations taking effect across multiple blocks? Is this true? If not, should the block identifier be moved out of these range offsets? If so, how do we apply partial annotations from one block to another, particularly if they're not of the same type? Can we always assume contiguous block selections (see Experiment: support for non-contiguous blocks in selection #3584)?
  • How is "position" determined in a paragraph block containing rich content, e.g. strong, em, etc?
  • For blockAnnotation, you mention the possibility of commenting on a single image in a gallery. How would you imagine this would look like? Would we reuse position? Can we always assume that annotations apply to ranges?
    • It's otherwise unclear to me what purpose blockAnnotation serves
  • Is it necessary to tie this behavior so strongly to the idea of comments? Are comments always temporal (needing date)? Can comments contain rich content, e.g. strong, em, etc?

@ephox-mogran
Copy link
Contributor

I would like to echo @aduth 's question about rich content:

How is "position" determined in a paragraph block containing rich content, e.g. strong, em, etc?

Using an index counter for position in a tree can cause a lot of problems. You will need to consistently convert between the tree and the flat data. HTML data is inherently tree-based because formatting elements can be nested. This kind of counter can work in situations where you are guaranteed to have only plain text.

Conversions can be simple if you are only allowing text and formatting elements. However, as soon as you have things which can't be represented as textContent (br, img etc), then this model starts presenting a lot of problems. Imagine you had some content, which was a word (hello), an image, a space, an image, and another word (world).

hello<img /> <img />world

Now, the textContent for this is just hello world. However, the tree for this data is a root node containing five children:

  1. "hello"
  2. image 1
  3. " "
  4. image 2
  5. "world"

If you get given position, 4 in this, that is obviously "hell|o". However, if you get given position 7, then that is probably "hello w|orld". But you can't just use offsets on the root container (or its only child) to work that out. You need to actually walk through the DOM, and identify which node that offset belongs to. As you walk, you need to accumulate how many index positions you've already walked, so you know when you reach your destination. This becomes much harder when you have levels of nesting with whitespace and non-breaking spaces and zero-width cursors. Then throw in all the random HTML content that can be added by plugins, hookins, metaboxes etc, and suddenly you need to ensure that you are supporting all of these elements in your HTML model.

It is certainly manageable, but don't necessarily think that it's going to be easier than finding a tree-based alternative.

@atimmer
Copy link
Member Author

atimmer commented Dec 8, 2017

@aduth Some thoughts after discussing this at WordCamp US.

Specifying block ID in both the start and end indicates to me that we can support annotations taking effect across multiple blocks? Is this true? If not, should the block identifier be moved out of these range offsets? If so, how do we apply partial annotations from one block to another, particularly if they're not of the same type? Can we always assume contiguous block selections (see #3584)?

After thinking about this more I think it makes more sense to have an array of blocks.

For blockAnnotation, you mention the possibility of commenting on a single image in a gallery. How would you imagine this would look like? Would we reuse position? Can we always assume that annotations apply to ranges?

There should be a way to specify if an annotation is on a block or in the block. I don't want to make this implicit, so that is where blockAnnotation comes in. We could reverse it to annotateWithinBlock and make it a property inside the array of blocks:

blocks: [
 {
    id: "blockId1",
    annotateWithinBlock: true,
    withinBlock: { // Paragraph block
      start: 50,
      end: 100
    } 
 },
 {
    id: "blockId2",
    annotateWithinBlock: true,
    withinBlock: {
      images: [ 0, 1, 3 ] // Gallery for example. Could also do start/end.
    }
 }
]

Given that, I will focus this PR completely on creating a block annotation API. Annotating inside blocks will be an iteration in the future. The blockAnnotation property will give us future compatibility.

Is it necessary to tie this behavior so strongly to the idea of comments?

I brainstormed with @omarreiss on this and we came to the conclusion that as a user you always want to know where an annotation comes from. So that is why we connect these two concepts so clearly.

Are comments always temporal (needing date)?

No, the date is probably the date of saving. So this field could be empty.

Can comments contain rich content, e.g. strong, em, etc?

Probably later. I would say we start without rich content. We can always add this later.

Position based annotation

@ephox-mogran The reason I chose positions is that those are always very easy to reason about. I don't think we allow img tags inside paragraphs blocks. The same goes for lists. Within galleries, the positions could represent the images indexes. <br/> could be represented as a single character, being \n. As long as it is unambiguous it is fine.

This also relates to #771, cc @iseulde. If all formatting on a paragraph block would be applied base on its position implementing annotations would be trivial. It is just another form of formatting.

@ephox-mogran Can you tell me how a node based alternative would look like? My intuition is that that would require a lot of keeping track of changes to nodes.

@jaswrks
Copy link
Contributor

jaswrks commented Dec 9, 2017

@ephox-mogran writes...

Conversions can be simple if you are only allowing text and formatting elements. However, as soon as you have things which can't be represented as textContent (br, img etc), then this model starts presenting a lot of problems. Imagine you had some content, which was a word (hello), an image, a space, an image, and another word (world).

I am in agreement with you on this. I'm currently researching the way others have approached this in other platforms. A tree-based approach seems more robust to me at the moment.

Having said that, there's another hairy thing to deal with. Something I've not had prior experience with is holding a selection to an editable chunk of content. Imagine that a user selects some text and annotates that selection, which is then highlighted. Later, another user comes along and fixes a typo, thereby altering the selection start/end offsets.

2017-12-08_15-15-01

On the surface, this looks tricky to deal with. Obviously it can be done though, because Pages on macOS supports it nicely, as do other platforms I've tested recently. It's a nice touch.

It seems to me that it requires:

  • Tracking changes to anything [[inside the current selection]] and then growing or shrinking the selection range appropriately, based on the revisions taking place. So long as the entire selection is not deleted, the highlight can remain and just change in size.

  • It also requires tracking changes outside of ]]the selection[[ too, so the offset locations can then be adjusted, based on their surroundings.

@aduth
Copy link
Member

aduth commented Sep 13, 2018

Can this be closed in favor of #7718 ?

@aduth aduth added the [Status] Stale Gives the original author opportunity to update before closing. Can be reopened as needed. label Sep 13, 2018
@atimmer
Copy link
Member Author

atimmer commented Sep 14, 2018

I think so

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
[Feature] Annotations Adding annotation functionality [Feature] Block API API that allows to express the block paradigm. [Status] Stale Gives the original author opportunity to update before closing. Can be reopened as needed.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants