Refactor Graph de/serialization #2612

yann-lty · 2024-12-02T10:32:26Z

Description

Refactoring graph de/serialization for code-base simplification, fixing issues and better testability.

This PR introduces a major refactoring of the graph serialization/deserialization logic (aka graph IO).
With those changes, we address some of the known issues in this area, most notably when copy/pasting nodes.

Overall, the main concept is that any operation that needs to operate on serialized Graph data uses deserialization first, to work with Graph API, instead of manipulating a raw dictionary of serialized data.
It also creates a clear distinction in the API between the loading of a Meshroom file, the initialization of a Graph from a template and importing the content of another Graph.

Features

Fix copy pasting issues that was failling to re-connect nodes correctly on paste.
Deal with advanced copy/pasting scenarios, eg. handling compatibility nodes.
Improved graph loading performance.

Implementation details

The goal of this PR is to move away from a monolothic Graph class that mixes a lot of responsibilities, to make the code more manageable and testable.

NodeFactory

The nodeFactory function has been refactored to rely on an underlying class that manages the compatibility checks. While preserving the existing behavior, this class helps understanding the different steps that occurs during that that process.

Graph

Deserialization

The current implementation exposes a unique entry point, as a single load function with several booleans to manage the different cases, to deal with:

loading a Meshroom file
initializing a Graph instance from a template file
importing another graph's content

classDiagram
class Graph {
  load(filepath, setupProjectFile=True, importProject=False, publishOutputs=False)
}

This makes it complex to manage the possible combinations of options, as all of them are not necessarily meaningful.

In the new approach, those are decoupled into as many entry points as required to handle the distinct logic those operations have:

classDiagram

class Graph {
 + load(filepath: PathLike)
 + initFromTemplate(filepath: PathLike, publishOutputs: bool)
 + importGraphContentFromFile(filepath: PathLike)
 + importGraphContent(graph: Graph)
}

UID conflict check

The current implementation creates all nodes twice to deal with UID compatibility issues, which increases load times and may keep false-positives depending on the order in which nodes are handled.

In the new approach, we only consider nodes that actually have a UID conflict, and solve them by depth ordering.
Indeed, UID conflicts are contagious: an upstream node can impact the downstream ones, as UID flow through connections.
By replacing upstream nodes first with CompatibilityNodes, we restore their serialized UID, which may solve the "false-positive" conflicts downstream, that were only due to the contagion by this node.

Importing another graph content

This area benefits the most from the refactoring, which solves errors we may currently encounter in the copy-pasting of nodes.
Instead of manipulating raw dictionaries to solve conflicting node names w.r.t the current content of the Graph, it now relies on:

the Graph API to make changes to node names in a separate Graph instance
node serialization, to get the correctly updated link expressions (serialized Edges)
node deserialization, to import nodes in the target Graph instance

Serialization

Introducing new GraphSerializer classes, following a Template Method pattern.
This makes it easier to define several node serialization logic, while inheriting from the standard serialization structure.

Add a new test suite for graph template loading.

Rewrite `nodeFactory` to reduce cognitive complexity, while preserving the current behavior.

* API Instead of having a single `load` function that exposes in its API some elements only applicable to initializing a graph from a templates, split it into 2 distinct functions: `load` and `initFromTemplate`. Apply those changes to users of the API (UI, CLI), and simplify Graph wrapper classes to better align with those concepts. * Deserialization Reduce the cognitive complexity of the deserizalization process by splitting it into more atomic functions, while maintaining the current behavior.

Extract the logic of importing the content of a graph within a graph instance from the graph loading logic. Add `Graph.importGraphContent` and `Graph.importGraphContentFromFile` methods. Use the deserialization API to load the content in another temporary graph instance, to handle the renaming of nodes using the Graph API, rather than manipulating entries in a raw dictionnary.

…es for unknown File attributes When creating a compatibility description for an unknown attribute, do not consider a link expression as the default value for a File attribute. This is breaking how the link expression solving system works, as it's resetting the attribute to its default value after applying the link. If that expression is kept as the default value, it can be re-evaluated several times incorrectly. Added a test case that was failing before that change to illustrate the issue.

Move Graph.IO internal class to its own module, and rename it to `GraphIO`. This avoid nested classes within the core Graph class, and starts decoupling the management of graph's IO from the logic of the graph itself.

Move the serialization logic to dedicated serializer classes. Implement both `GraphSerializer` and `TemplateGraphSerializer` to cover for the existing serialization use-cases.

Add a new serializer class to manage partial graph serialization logic, ensuring to remove link expressions on attributes refering to nodes that are not in the subset of nodes to serialize.

Only perform uid check when we have both a serialized and a computed UID. If the node has not been serialized with a UID, it means that it does not expect to match a specific value on deserialization.

Re-implement node pasting by relying on the graph partial serializer, to serialize only the subset of selected nodes. On pasting, use standard graph deserialization and import the content of the serialized graph in the active graph instance. Simplify the positioning of pasted nodes to only consider mouse position or center of the graph, which works well for the major variety of use-cases. Compute the offset to apply to imported nodes by using the de-serialized graph content's bounding box.

Add a new method to create a copy of a graph instance, relying on chaining serialization and deserialization operations. Add test suite to validate its behavior, and the underlying serialization processes.

Factorize the logic of replacing a node with another one and re-creating output edges into `Graph.replaceNode` and `Graph._restoreOutEdges`.

At the end of the deserialization process, solve node uid conflicts iteratively by node depths, and only replace the conflicting nodes with a CompatibilityNode. Add new test suite for testing uid conflict handling.

… UidConflict Only set the expectedUid when undoing the upgrade of a uid conflicting node. Otherwise, let the other type of conflicts take precedence.

codecov · 2024-12-13T19:14:53Z

Codecov Report

Attention: Patch coverage is 96.40288% with 25 lines in your changes missing coverage. Please review.

Project coverage is 73.32%. Comparing base (c2d4159) to head (dbafe84).
Report is 6 commits behind head on develop.

Files with missing lines	Patch %	Lines
meshroom/core/graph.py	92.46%	11 Missing ⚠️
meshroom/core/graphIO.py	91.50%	9 Missing ⚠️
meshroom/core/nodeFactory.py	94.94%	5 Missing ⚠️

Additional details and impacted files

@@             Coverage Diff             @@
##           develop    #2612      +/-   ##
===========================================
+ Coverage    69.85%   73.32%   +3.46%     
===========================================
  Files          120      125       +5     
  Lines         7067     7449     +382     
===========================================
+ Hits          4937     5462     +525     
+ Misses        2130     1987     -143

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

yann-lty force-pushed the dev/graphIO branch 5 times, most recently from a5a016e to b3117de Compare December 10, 2024 11:53

yann-lty marked this pull request as ready for review December 12, 2024 15:38

yann-lty added 17 commits December 13, 2024 20:11

[core] Move nodeFactory to its own module

9dfd9cf

[tests] Add extra compatibility tests

d7f4034

Add a new test suite for graph template loading.

[core] Refactor nodeFactory function

5e8e900

Rewrite `nodeFactory` to reduce cognitive complexity, while preserving the current behavior.

[core] Introducing new graphIO module

7465823

Move Graph.IO internal class to its own module, and rename it to `GraphIO`. This avoid nested classes within the core Graph class, and starts decoupling the management of graph's IO from the logic of the graph itself.

[graphIO] Introduce graph serializer classes

f7ae76d

Move the serialization logic to dedicated serializer classes. Implement both `GraphSerializer` and `TemplateGraphSerializer` to cover for the existing serialization use-cases.

[core][graphIO] Introduce PartialGraphSerializer

4aa6eac

Add a new serializer class to manage partial graph serialization logic, ensuring to remove link expressions on attributes refering to nodes that are not in the subset of nodes to serialize.

[core] Graph: improve uid conflicts check on deserialization

44dc09b

Only perform uid check when we have both a serialized and a computed UID. If the node has not been serialized with a UID, it means that it does not expect to match a specific value on deserialization.

[core] Add Graph.copy method

85c0812

Add a new method to create a copy of a graph instance, relying on chaining serialization and deserialization operations. Add test suite to validate its behavior, and the underlying serialization processes.

[core] Graph: cleanup unused methods

c97d40f

[core][graphIO] Add "template" as an explicit key

5c556f9

[core] Graph: add replaceNode method

7ca0900

Factorize the logic of replacing a node with another one and re-creating output edges into `Graph.replaceNode` and `Graph._restoreOutEdges`.

[core] Graph: improved uid conflicts evaluation on deserialization

42550b6

At the end of the deserialization process, solve node uid conflicts iteratively by node depths, and only replace the conflicting nodes with a CompatibilityNode. Add new test suite for testing uid conflict handling.

[commands] UpgradeNode.undo: only set expected uid when "downgrading"…

357575d

… UidConflict Only set the expectedUid when undoing the upgrade of a uid conflicting node. Otherwise, let the other type of conflicts take precedence.

yann-lty force-pushed the dev/graphIO branch from 6477e00 to d33d0c5 Compare December 13, 2024 19:13

[test] Extra partial serialization tests

dbafe84

yann-lty force-pushed the dev/graphIO branch from d33d0c5 to dbafe84 Compare December 13, 2024 19:19

cbentejac added this to the Meshroom 2024.1.0 milestone Dec 16, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Refactor Graph de/serialization #2612

Refactor Graph de/serialization #2612

yann-lty commented Dec 2, 2024 •

edited

Loading

codecov bot commented Dec 13, 2024 •

edited

Loading

Refactor Graph de/serialization #2612

Are you sure you want to change the base?

Refactor Graph de/serialization #2612

Conversation

yann-lty commented Dec 2, 2024 • edited Loading

Description

Features

Implementation details

NodeFactory

Graph

Deserialization

UID conflict check

Importing another graph content

Serialization

codecov bot commented Dec 13, 2024 • edited Loading

Codecov Report

yann-lty commented Dec 2, 2024 •

edited

Loading

codecov bot commented Dec 13, 2024 •

edited

Loading