Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replant the tree! 🌲 #23

Merged
merged 3 commits into from
Mar 2, 2019
Merged

Replant the tree! 🌲 #23

merged 3 commits into from
Mar 2, 2019

Conversation

linabutler
Copy link
Contributor

This started out as a fix for corruption corner cases, grew into
simplifying how we build the remote tree, and turned into a rewrite
(again >.>). Instead of managing two different sets of structure at
insert and merge time, we now store items and parent-child
relationships in a tree builder, then build a consistent tree that
flags divergent items.

The old tree was complicated because it had to maintain a valid
structure before and after every insert, but with references to
potentially unknown parents and children. Since we couldn't insert
an item without knowing its parent, we had to pick a canonical
structure (parents by children), insert children in level
order, and note divergent structure (by parentid) separately.

This meant Desktop's Store::fetch_remote_tree() had to left-join
mirror.items to mirror.structure, store the items in a
pseudo-tree, then recursively walk children to inflate the complete
tree. This was complicated enough with a valid tree, let alone
orphans and multiple parents.

With the new approach, we build the tree in three steps:

  1. First, we add all items, without structure.
  2. Next, we set parent-child relationships. Parents by children
    require the parent to exist, but not the child; this handles the
    case where a folder mentions nonexistent or deleted GUIDs in
    its children. Parents by parentid require the item to exist,
    but not its parent; this handles orphans that reference missing
    or non-folder parents.
  3. Finally, once we've added all entries to the tree, we have a
    complete view of the structure, so we can resolve missing, multiple,
    and conflicting parents.

For cases where we know the structure is valid and in level order,
like Desktop's Store::fetch_local_tree(), we handle steps 1 and 2
at the same time: builder.item(item)?.by_structure(&parent_guid)?.

For the remote tree, we insert all items and their parentids
first (builder.item(item)?.by_parent_guid(&parent_guid)?, where
parent_guid might not be in the tree), then add structure from
children later:
builder.parent_for(&child_guid)?.by_children(&parent_guid)?,
where child_guid might not be in the tree.

Closes #22.

@linabutler linabutler requested review from pjenvey and thomcc February 13, 2019 01:42
src/lib.rs Outdated
@@ -12,6 +12,8 @@
// See the License for the specific language governing permissions and
// limitations under the License.

extern crate smallbitvec;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You shouldn't need extern crate in rust 2018.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't enable Rust 2018 yet because it prints a ton of variable does not need to be mutable warnings for the nodes! macro:

warning: variable does not need to be mutable
   --> src/tests.rs:60:13
    |
60  |           let mut item = Item::new(Guid::from($guid), Kind::$kind);
    |               ----^^^^
    |               |
    |               help: remove this `mut`
...
95  |       let remote_tree = nodes!({
    |  _______________________-
96  | |         ("unfiled_____", Folder[needs_merge = true], {
97  | |             ("folderBBBBBB", Folder[needs_merge = true], {
98  | |                 ("bookmarkDDDD", Bookmark[needs_merge = true]),
...   |
108 | |         })
109 | |     }).into_tree().unwrap();
    | |______- in this macro invocation

...When item definitely needs to be mutable, because we modify its fields!

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In some cases it doesn't need to be mutable: e.g. when [ $( $name:ident = $value:expr ),* ] matches 0 tokens (also when there's 0 children for the macro's other case).

I suspect avoiding the muts in those case will either be difficult and or verbose. Since it's just for tests I'd say add #[allow(unused_mut)] to both muts in the macro and enjoy the 2018 edition

@codecov-io
Copy link

codecov-io commented Feb 13, 2019

Codecov Report

Merging #23 into master will decrease coverage by 1.48%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master      #23      +/-   ##
==========================================
- Coverage   92.46%   90.97%   -1.49%     
==========================================
  Files           7        7              
  Lines        3144     2638     -506     
==========================================
- Hits         2907     2400     -507     
- Misses        237      238       +1
Impacted Files Coverage Δ
src/lib.rs 0% <0%> (-100%) ⬇️
src/error.rs 37.5% <0%> (-9.87%) ⬇️
src/guid.rs 83.67% <0%> (-3.25%) ⬇️
src/merge.rs 92.78% <0%> (-2.31%) ⬇️
src/tree.rs 80.42% <0%> (-0.06%) ⬇️
src/tests.rs 99.89% <0%> (-0.02%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ef2479b...7f5f931. Read the comment docs.

@linabutler
Copy link
Contributor Author

Yikes, that diff isn't very helpful. The interesting parts are Builder::resolve, Builder::into_tree, and BuilderEntry::parents_by.

@linabutler
Copy link
Contributor Author

...And here's how it works on Desktop.

src/tests.rs Outdated
.children(&parent_guid)
.item(&parent_guid),
node.item.into())?;
match b.item(node.item) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: map + or_else is maybe nicer? #negligble

Suggested change
match b.item(node.item) {
b.item(node.item)
.map(|_| ())
.or_else(|err| match err.kind() {
ErrorKind::DuplicateItem(_) => Ok(()),
_ => Err(err),
})?;

src/tests.rs Outdated
.and_then(|b| b.parent_for(&"folderBBBBBB".into()).by_children(&"folderAAAAAA".into()))
.expect("Should insert B");

match b.into_tree() {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: unwrap_err/expect_err's nice for these (not 100% equivalent, printing the Kind vs the wrapping err):

Suggested change
match b.into_tree() {
match b.into_tree().expect_err("Should not build tree with cycles").kind() {
ErrorKind::Cycle(guid) => assert_eq!(guid, &Guid::from("folderAAAAAA")),
err @ _ => assert!(false, "Wrong error kind for cycle: {:?}", err),
}

src/lib.rs Outdated
@@ -12,6 +12,8 @@
// See the License for the specific language governing permissions and
// limitations under the License.

extern crate smallbitvec;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In some cases it doesn't need to be mutable: e.g. when [ $( $name:ident = $value:expr ),* ] matches 0 tokens (also when there's 0 children for the macro's other case).

I suspect avoiding the muts in those case will either be difficult and or verbose. Since it's just for tests I'd say add #[allow(unused_mut)] to both muts in the macro and enjoy the 2018 edition

src/tree.rs Outdated
/// # Resolving divergences
///
/// Walking the tree, using `Tree::node_for_guid`, `Node::parent`, and
/// `Node::children`, resolves divergences using these rules:
/// Building a tree using `TreeBuilder::into_tree` resolves divergences using
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// Building a tree using `TreeBuilder::into_tree` resolves divergences using
/// Building a tree using `Builder::into_tree` resolves divergences using

Copy link
Member

@mhammond mhammond left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great

This started out as a fix for corruption corner cases, grew into
simplifying how we build the remote tree, and turned into a rewrite
(again >.>). Instead of managing two different sets of structure at
insert and merge time, we now store items and parent-child
relationships in a tree builder, then build a consistent tree that
flags divergent items.

The old tree was complicated because it had to maintain a valid
structure before and after every insert, but with references to
potentially unknown parents and children. Since we couldn't insert
an item without knowing its parent, we had to pick a canonical
structure (parents by `children`), insert children in level
order, and note divergent structure (by `parentid`) separately.

This meant Desktop's `Store::fetch_remote_tree()` had to left-join
`mirror.items` to `mirror.structure`, store the items in a
pseudo-tree, then recursively walk children to inflate the complete
tree. This was complicated enough with a valid tree, let alone
orphans and multiple parents.

With the new approach, we build the tree in three steps:

1. First, we add all items, without structure.
2. Next, we set parent-child relationships. Parents by `children`
   require the parent to exist, but not the child; this handles the
   case where a folder mentions nonexistent or deleted GUIDs in
   its `children`. Parents by `parentid` require the item to exist,
   but not its parent; this handles orphans that reference missing
   or non-folder parents.
3. Finally, once we've added all entries to the tree, we have a
   complete view of the structure, so we can resolve missing, multiple,
   and conflicting parents.

For cases where we know the structure is valid and in level order,
like Desktop's `Store::fetch_local_tree()`, we handle steps 1 and 2
at the same time: `builder.item(item)?.by_structure(&parent_guid)?`.

For the remote tree, we insert all items and their `parentid`s
first (`builder.item(item)?.by_parent_guid(&parent_guid)?`, where
`parent_guid` might not be in the tree), then add structure from
children later:
`builder.parent_for(&child_guid)?.by_children(&parent_guid)?`,
where `child_guid` might not be in the tree.

Closes #22.
@linabutler linabutler merged commit 7f5f931 into master Mar 2, 2019
@linabutler linabutler deleted the tree-builder branch March 2, 2019 09:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Handle more kinds of structure corruption
5 participants