Refactor Integrate Asset #2898

BigRoy · 2022-03-16T21:17:04Z

Brief description

Draft attempt at refactoring/rewriting some logic of the Integrator. My first step was to map out the pieces it is currently built from, to understand all the logic that's there and clean up what was really redundant.

Note: The current state is not tested and I'm 100% sure it won't work as is. It's a WIP to initiate a first discussion, and I'd be happy to revamp a lot more. I hope to find some time tomorrow to improve the readability of the code further so it's easier to understand what we are actually altering. A lot of that might also be "commenting" specific areas of the code.

Notable changes

Wanted to mark these changes specifically since they were obvious changes that I felt could break things, so I'm naming them here to keep an eye out. But the refactor is so massive that there are likely a ton more.

Removed Slate exception:

            # exception for slate workflow
            if index_frame_start and "slate" in instance.data["families"]:
                index_frame_start -= 1

This "fix" should be moved to outside of the integrator itself.

Removed this logic for 'single file' processing for representations.
It will now always use name of the representation

            template_data["representation"] = repre['ext']

I believe this behavior actually was a bug because other areas of the code believed it was actually using name instead?

BigRoy · 2022-03-16T21:20:14Z

openpype/plugins/publish/integrate_new.py

+
+ def register_subset(self, instance):
+ # todo: rely less on self.prepare_anatomy to create this value
+ asset = instance.data.get("assetEntity") # <- from prepare_anatomy :(


I'm actually quite amazed that after all of this refactoring on this very early draft this is really the only nitpick the hound has this time. 🥇

BigRoy · 2022-03-17T11:46:56Z

openpype/plugins/publish/integrate_new.py

+ destination_indexes = list(src_collection.indexes)
+ destination_padding = len(get_first_frame_padded(src_collection))
+ if repre.get("frameStart") is not None:
+ index_frame_start = int(repre.get("frameStart"))

 # TODO use frame padding from right template group


@mkolar do you happen to know what the "right template group" would be? :) This is an old to do comment and I'm wondering if I could easily resolve this as I go.

At the moment it's always getting frame_padding from render template. However different families can use their own templates for the publishing, hence there is a chance for this to be wrong. In practice, we never saw it happen though. Usually, studios use the same padding across the board.

For completeness, this is where you can choose what template will be used for publishing in various situations project_settings/global/publish/IntegrateAssetNew/template_name_profiles

Hmm - I can see the entries there, and I also see the Templates in Anatomy. However, it's not clear to me how I'd override the frame padding value by using a custom template name. I'll leave the todo for now.

Hmm. I think we planned to allow defining custom padding in each template category, but eventually removed it to reduce complexity. Looks like it would be safe to remove this TODO

You can use anatomy.templates[template_name]["frame_padding"].

Not tested yet, but implemented with 3e095bc
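As a rough illustration of the lookup suggested above, here is a minimal sketch. It uses a plain dict as a stand-in for `anatomy.templates`; the template names, the padding values, the `get_frame_padding` helper and its fallback default are all assumptions for the example, not the code from 3e095bc.

```python
# Hypothetical stand-in for anatomy.templates; the real Anatomy object is
# richer. Only the lookup pattern from the thread is illustrated here.
templates = {
    "render": {"frame_padding": 4},
    "publish": {"frame_padding": 6},
}


def get_frame_padding(templates, template_name, default=4):
    """Return the frame padding of a template group.

    Falling back to `default` when the group or key is missing is an
    assumption of this sketch.
    """
    group = templates.get(template_name, {})
    return int(group.get("frame_padding", default))


def format_frame(frame, padding):
    """Zero-pad a frame number to the requested width."""
    return str(frame).zfill(padding)


padding = get_frame_padding(templates, "publish")
print(format_frame(1001, padding))
```

The point is only that the padding is read from the template group actually used for the publish instead of always from the render group.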

BigRoy · 2022-03-17T14:30:50Z

openpype/plugins/publish/integrate_new.py

+ "parent": version["_id"],
+ "name": repre['name'],
+ "data": data,
+ "dependencies": instance.data.get("dependencies", "").split(),


Can someone confirm for me that this dependencies is unused/deprecated?

I believe it originated from the RigLoader in Maya which stored 'dependencies' on load. Oddly enough it's storing just the representation id of the actual rig that was loaded (which seems odd) - the value also didn't get updated on updates of the rigs.

It seems extra weird that the value on the instance wasn't just a list to begin with but expecting a concatenated string?

Anyway, with 'input dependencies' the source rig representation for a published alembic would already be retrievable and thus also more reliable than this old unused logic. @mkolar Correct?

Yes, I believe this is not used anywhere anymore.
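For reference, a small sketch of what the quoted `dependencies` line does with the instance data. The `parse_dependencies` helper and the sample ids are made up for the example; only the `.get("dependencies", "").split()` expression comes from the diff.

```python
def parse_dependencies(instance_data):
    """Mirror the expression from the diff: a whitespace-separated string
    of ids is split into a list, a missing key gives an empty list."""
    return instance_data.get("dependencies", "").split()


print(parse_dependencies({}))
print(parse_dependencies({"dependencies": "id_a id_b"}))

# A real list would break the expression, since lists have no .split():
try:
    parse_dependencies({"dependencies": ["id_a", "id_b"]})
except AttributeError as exc:
    print("list value fails:", exc)
```

This is why the value had to be a concatenated string on the instance, which is the oddity noted above.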

BigRoy · 2022-03-17T15:35:03Z

The current refactored state as of 56bcd8c now can actually integrate e.g. a simple Maya Look (with a texture) and a Maya Pointcache. So it's somewhat functional as before.

Likely Hero Versions are broken. Likely Reviews as well. Likely site sync functionality isn't completely functional either.

Also haven't tested what happens if a publish actually fails and whether the file transaction rolls back correctly. But separate "tests" can be implemented to define our expected behavior explicitly and adapt the FileTransaction class to behave as we'd want.

Nonetheless would love input at this stage @mkolar @iLLiCiTiT @m-u-r-p-h-y as to what I should continue to do from here.
I've mostly focused on breaking the code down into more understandable 'parts'.

I've noticed, however, that if a representation fails, the Subset and Version are registered nonetheless. It would be extra nice if we could somehow merge those transactions together so that all of those writes sit even closer together in the code.
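To make the rollback expectation concrete, here is a minimal sketch of a file transaction that backs up overwritten files and can restore them. It is not the FileTransaction class from this PR; the class name, method names and the `.bak` suffix are assumptions, and it targets Python 3 only.

```python
import os
import shutil
import tempfile


class SimpleFileTransaction:
    """Hypothetical sketch: copy files, keep backups, roll back on demand."""

    def __init__(self):
        self._queued = []    # (src, dst) pairs still to copy
        self._created = []   # destinations that did not exist before
        self._backups = {}   # dst -> backup path of the overwritten file

    def add(self, src, dst):
        self._queued.append((src, dst))

    def process(self):
        for src, dst in self._queued:
            if os.path.exists(dst):
                # Move the existing file aside so it can be restored.
                backup = dst + ".bak"
                shutil.move(dst, backup)
                self._backups[dst] = backup
            else:
                self._created.append(dst)
            dst_dir = os.path.dirname(dst)
            if dst_dir:
                os.makedirs(dst_dir, exist_ok=True)
            shutil.copyfile(src, dst)
        self._queued = []

    def finalize(self):
        """Publish succeeded: drop the backups."""
        for backup in self._backups.values():
            os.remove(backup)
        self._backups.clear()
        self._created = []

    def rollback(self):
        """Publish failed: remove new files and restore overwritten ones."""
        for dst in self._created:
            if os.path.exists(dst):
                os.remove(dst)
        for dst, backup in self._backups.items():
            if os.path.exists(dst):
                os.remove(dst)
            shutil.move(backup, dst)
        self._backups.clear()
        self._created = []


def demo():
    """Overwrite one file, create another, then roll everything back."""
    root = tempfile.mkdtemp()
    src = os.path.join(root, "work", "model.abc")
    existing = os.path.join(root, "publish", "model.abc")
    fresh = os.path.join(root, "publish", "extra", "model.abc")
    for path, text in ((src, "new"), (existing, "old")):
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "w") as stream:
            stream.write(text)

    transaction = SimpleFileTransaction()
    transaction.add(src, existing)
    transaction.add(src, fresh)
    transaction.process()
    with open(existing) as stream:
        after_process = stream.read()

    transaction.rollback()
    with open(existing) as stream:
        after_rollback = stream.read()
    fresh_exists = os.path.exists(fresh)

    shutil.rmtree(root)
    return after_process, after_rollback, fresh_exists


print(demo())
```

A test along the lines of `demo()` is the kind of explicit "expected behavior" check mentioned above: after a rollback the overwritten file holds its old content again and newly created files are gone.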

BigRoy · 2022-03-23T23:52:34Z

With the current state being "barebones functional and highly untested" with "more than likely broken in major areas" I do feel it's a good moment to digest what the integrator currently does (and why - if I can) and note some of the changes made (and why).

What did the Integrator do that I have removed or have moved into its own

Separated registering "subset", "version", "representation" logic more into isolation where possible.
It previously prepared anatomy data which I mostly removed with this commit due to it being collected by CollectAnatomyContextData and CollectAnatomyInstanceData. It should not be the responsibility of the integrator.
- It still prepares some anatomy data - it only still adds intent value - which should be moved.
I've moved the "site sync" functionality in such a way that it's more encapsulated and we can discuss how to best clean that up. The logic now comes down to defining the sites once (because they got cached originally anyway)
- The available "sites" logic is here and only used by a single call to compute_resource_sync_sites here - other than that it's quite "contained".
- Then those sites are used to prepare the file info which is the data Site Sync uses.

What does the Integrator currently do?

It runs per instance and for each instance it does this:

Register the documents in the database
1. Register the subset
2. Register the version
3. Register the representation - still quite messy
Prepare data for template formatting to define publish file destinations
1. To be moved, see comment in code: Quite some template data preparation for subset group
2. Handle the template formatting to published files - still quite complex
Integrate the files safely

Why still the complexity?

The complexity comes mostly from the fact that:

It tries to "rollback" if it fails along the way but also needs to apply changes before it can do some other changes - making it non-trivial.
- The file transferring happens in-between registering subset/versions and the representations - making it still a bit hard to follow when what is happening. Especially because some data for the representation currently requires the files to be in destination location (like the file info/hash for site sync).
It can update data on an existing subset, like set a new subsetGroup and add new "families" into the subset
It can overwrite/update existing Versions.
- When it does, registering the version archives the existing representations.
- More complexity comes from the instance also being allowed to append, in which case the original representations are kept unless they are overwritten by new ones with the same name.
- Because existing versions can be "overwritten" or "updated", the integration also needs to safely handle potentially overwriting existing files, with rollback. This is now handled by the FileTransaction class.
Confusion between instance.data["representation"] and the actual representation documents for the database: they aren't a one-to-one match, yet it's hard to make the distinction exactly in the code, as noted in this todo
Quite a lot of lines of code related to computing the available "sites" (Site Sync functionality)
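The overwrite versus append behaviour for an existing version can be sketched as below. This is only my reading of the rule described above, not the integrator's code; the function name and the `append` parameter are made up for the example.

```python
def merge_representations(existing, new, append=False):
    """Return (resulting, replaced) representation lists.

    Without append, every existing representation is replaced.
    With append, existing ones are kept unless a new representation
    with the same name overwrites them.
    """
    new_names = {repre["name"] for repre in new}
    kept, replaced = [], []
    for repre in existing:
        if append and repre["name"] not in new_names:
            kept.append(repre)
        else:
            replaced.append(repre)
    return kept + list(new), replaced


existing = [{"name": "abc", "rev": 1}, {"name": "ma", "rev": 1}]
new = [{"name": "abc", "rev": 2}]

print(merge_representations(existing, new, append=True))
print(merge_representations(existing, new, append=False))
```

The `replaced` list is what would need archiving (or deleting), and its files are the ones a rollback has to be able to restore.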

Is the Hound failing to run or how come I'm not making mistakes? I'm a bit worried the dog is sleeping and bites me later. 🔥

BigRoy · 2022-03-24T15:38:04Z

Few questions!

Is it correct that the "archived_representation" type in the Integrator only got created 'during integration' and was then directly deleted? Originally it would archive representations and then, before registering the new representations, delete all archived ones (existing_repres defined here). So if instead I'd turn the registering into a single Bulk Write I wouldn't need the intermediate "archived_representation". Correct?
Can I just move the "intent" anatomy data to CollectAnatomyContextData? Intent is defined at the beginning, no? Since it's set directly in Pyblish Pype UI. Or would there be a reason not to?
Could we move _get_subset_group into a global Collector of its own to just store instance.data["subsetGroup"]?
Is it correct that some publishes might not have a rootless path stored due to instance.data["source"] being taken directly if it exists? Looking over the code base e.g. After Effects expects instance.data["source"] set in a validator and Maya VRayScene collector sets it and Maya Collect Render.
- This is likely due to some publishers setting "source" to "webpublisher" or "standalone_publisher"
- However there are other areas where another value is expected in source like expecting a valid path here in After Effects and here
Glancing over the original integrator code it looks to me as if site sync would not work for published Look textures, as those files, not being part of a representation, do not get their file info sorted for site sync. For representation files they were originally collected here, but I wasn't entirely sure how files directly in instance.data["transfers"] would be processed too. I'm pretty sure it's broken in the current refactored state but I'm also doubting whether it worked in the original. Can anyone confirm that published textures in a Maya look publish used to work as expected in Site Sync? Likely I'm misreading the original code somewhere due to how it overrides the global instance.data["transfers"]. E.g. maybe originally both representations would have the "resources" files added uniquely as site sync entries? If it does work I'd love to know whether the entries are indeed duplicated.
Shouldn't version data fps rely on instance.data["fps"] over context.data["fps"]? Originally context data would override instance data. Is that as expected?
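To make the fps question concrete, this is the precedence I'd expect, as a minimal sketch. The `resolve_fps` helper is hypothetical and plain dicts stand in for `instance.data` and `context.data`.

```python
def resolve_fps(instance_data, context_data, default=None):
    """Prefer the instance fps, fall back to the context fps."""
    for data in (instance_data, context_data):
        fps = data.get("fps")
        if fps is not None:
            return fps
    return default


print(resolve_fps({"fps": 24}, {"fps": 25}))  # instance wins
print(resolve_fps({}, {"fps": 25}))           # falls back to context
```

The original behaviour described above is the reverse order, with context data overriding instance data.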

BigRoy · 2022-03-24T16:31:02Z

openpype/plugins/publish/integrate_new.py

+ dst_collection.indexes.update(set(destination_indexes))
+ dst_collection.padding = destination_padding
+ assert len(src_collection.indexes) == \
+ len(dst_collection.indexes), "This is a bug"


YAY! The hound works - and I was expecting it to bark about this line! 🐕‍🦺 ❤️
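For context on why that assert should always hold, here is a sketch of remapping source frame indexes to a new start frame. It uses plain lists instead of a collection object, and the offset logic is an assumption about the intent, not a copy of the diff.

```python
def remap_indexes(source_indexes, frame_start=None):
    """Shift frame indexes so the first one lands on frame_start.

    Gaps between frames are preserved, so the number of destination
    indexes always equals the number of source indexes.
    """
    source_indexes = sorted(source_indexes)
    if frame_start is None or not source_indexes:
        return list(source_indexes)
    offset = int(frame_start) - source_indexes[0]
    return [index + offset for index in source_indexes]


source = [1, 2, 3, 5]
destination = remap_indexes(source, frame_start=1001)
print(destination)
assert len(source) == len(destination), "This is a bug"
```

Since every source index maps to exactly one destination index, a length mismatch can only come from a bug in the remapping itself.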

mkolar · 2022-04-28T17:58:36Z

Just noting that this will need a merge of develop, that is removing avalon-core and some magic to make it compliant with that PR. Unfortunately the last step interfered with this.

BigRoy · 2022-05-02T13:25:25Z

Just noting that this will need a merge of develop, that is removing avalon-core and some magic to make it compliant with that PR. Unfortunately the last step interfered with this.

@mkolar Should be fixed.

Please let me know what you need on this PR from my end to make this testable by OP team.

antirotor · 2022-05-02T13:27:29Z

Please let me know what you need on this PR from my end to make this testable by OP team.

I'll try to test it this week per battle plan. I'll let you know what happened :)

BigRoy · 2022-05-11T12:40:30Z

Looking forward to hearing about the test runs!

antirotor · 2022-05-13T17:09:49Z

BigRoy · 2022-05-16T07:21:00Z

So far so good!

Nice! We should actually add the following to testing too:

Preferably each should be tested for publishes with a single file, multiple files, sequences and publishes with resources like Maya lookdev

Hero versions
Site Sync
A failing publish (e.g. Failing Extractor - does the Integrator fail nicely?)
Farm Publishing (e.g. Maya Renders with Deadline)
A publish that writes into an existing version (e.g. Maya Render re-publish with different frame range -> customize frame range in Deadline)
A Python 2 host
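One way to keep track of that coverage is to enumerate the combinations explicitly. A small sketch; the labels are just shorthand for the items listed above, not existing test names.

```python
import itertools

# Shorthand labels for the publish variants and scenarios listed above.
publish_types = ["single file", "multiple files", "sequence", "with resources"]
scenarios = [
    "hero versions",
    "site sync",
    "failing publish",
    "farm publishing",
    "write into existing version",
    "python 2 host",
]

# Every scenario should be checked against every publish variant.
test_matrix = list(itertools.product(scenarios, publish_types))

for scenario, publish_type in test_matrix:
    print("{} / {}".format(scenario, publish_type))
```

That gives 24 combinations in total, which is a useful checklist even if not every pairing turns out to be meaningful.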

antirotor · 2022-06-06T16:09:23Z

Would you be so kind as to split it into a separate integrator so we can slowly add hosts there for better testing? Because I have a feeling that for Maya, this can be used right away (for example).

iLLiCiTiT · 2022-06-23T10:04:48Z

openpype/plugins/publish/collect_anatomy_context_data.py

+ # todo: some code actually expects the dict itself and others doesn't
+ # question: what should it be?
+ intent = context.data.get("intent")
+ if intent and isinstance(intent, dict):
+ intent = intent.get("value")
+ if intent:
+ context_data["intent"] = intent
+


Intent should be a dictionary with "value" and "label", to be able to tell whether you want to use the value or the label of the intent in templates.

Suggested change

# todo: some code actually expects the dict itself and others doesn't
# question: what should it be?
intent = context.data.get("intent")
if intent and isinstance(intent, dict):
    intent = intent.get("value")
if intent:
    context_data["intent"] = intent

Removed with recent commit. However, this logic with the if statement still feels like we should somehow correct other logic elsewhere so that we know it's always a dict to begin with.
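One way to guarantee "always a dict" would be to normalize the intent once, early on. A minimal sketch; the `normalize_intent` helper is hypothetical, and falling back to the value as the label for plain strings is an assumption.

```python
def normalize_intent(intent):
    """Return the intent as {"value": ..., "label": ...}, or None if unset."""
    if not intent:
        return None
    if isinstance(intent, dict):
        value = intent.get("value")
        if not value:
            return None
        # Assumption: a missing label falls back to the value.
        return {"value": value, "label": intent.get("label", value)}
    # A plain string is treated as both value and label.
    return {"value": intent, "label": intent}


print(normalize_intent("wip"))
print(normalize_intent({"value": "wip", "label": "Work in progress"}))
print(normalize_intent(None))
```

With that in place, downstream code can choose between the value and the label without any isinstance checks.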

mkolar · 2022-07-05T06:49:39Z

@BigRoy could you please separate the integrator into another file? That way we could actually merge it and start moving host by host to it. I've tested quite a bit in Maya and chances are that if it works there, it's good in other places. However, to be able to deploy it into production I think we should temporarily expose the family lists of both the existing and the reworked integrator in the settings. That way, if something goes wrong with the new one, we can easily reassign a family to the old one and unblock the production situation.
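A rough sketch of how such a temporary family split could behave. The settings shape, the plugin keys and the `pick_integrator` helper are all hypothetical; in practice each plugin would most likely just read its own families list from the settings.

```python
# Hypothetical settings: each integrator gets an explicit family list.
settings = {
    "legacy_integrator": {"families": ["render", "review", "image"]},
    "refactored_integrator": {"families": ["pointcache", "look", "model"]},
}


def pick_integrator(family, settings, fallback="legacy_integrator"):
    """Return the name of the integrator configured for a family.

    Unlisted families go to the fallback, so production can always be
    unblocked by moving a family back to the legacy list.
    """
    for plugin_name, plugin_settings in settings.items():
        if family in plugin_settings.get("families", []):
            return plugin_name
    return fallback


print(pick_integrator("look", settings))
print(pick_integrator("render", settings))
print(pick_integrator("unknown", settings))
```

Reassigning a family is then a pure settings change, with no code deploy needed.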

BigRoy · 2022-07-05T07:16:19Z

Sorry, I've been so busy with day to day production that this keeps slipping through. Beginning of new day now, let's start! :) On it.

BigRoy · 2022-07-05T08:35:21Z

@mkolar Could you check if I missed anything crucial?

mkolar · 2022-07-05T09:06:03Z

Thx. gonna try now

mkolar · 2022-07-13T13:23:54Z

@BigRoy This should probably still be addressed #2898 (comment)

That is, adding the hosts and families into the settings so we can safely switch between the integrators. To move it along, I'll merge this into an intermediate branch, where we can finish it off right away and get it into develop for some long-term testing.

First draft pass of refactoring the Integrator

d88ed91

BigRoy commented Mar 16, 2022

BigRoy added 2 commits March 17, 2022 11:49

More refactoring + draft (untested) implementation for separating Fil…

ae1a9ff

…e Transaction logic

Fix hound

9f6cc5d

BigRoy commented Mar 17, 2022

Continue refactor, restore functionality - now can correctly publish …

56bcd8c

…as before (rudimentary tested only)

iLLiCiTiT reviewed Mar 17, 2022

BigRoy added 8 commits March 23, 2022 17:52

Merge remote-tracking branch 'upstream/develop' into refactor_integrator

4c03092

# Conflicts: # openpype/plugins/publish/integrate_new.py

Reduce duplicated logic by implementing resolve_profile method

8996280

Remove prepare anatomy data logic that is already collected/generated…

177e244

… in CollectAnatomyContextData and CollectAnatomyInstanceData. This currently was duplicated logic and should not be handled in the Integrator

Move logic to clarify what should be removed/moved and bring logic cl…

3fd2d02

…oser to where it's used

Simplify profile filtering

8edfb3f

Re-use get families logic

79286ea

Remove todo since assetEntity already comes from Collectors + re-use …

d6c6827

…families variable

Add todo to move get subset group logic

47259f8

BigRoy added 4 commits March 24, 2022 14:21

Override stored repre context udim for backwards compatibility

b128e0a

Encapsulate version data completely into its own function

9997acb

Move logic closer to where it's used

5b1f6eb

Preparation to delay Version document write to database closer to rep…

3369c15

…resentation write

BigRoy added the type: refactor Structural changes not affecting functionality label Mar 24, 2022

Fix get_profile_filter_criteria anatomy data key for app name

42175ff

Fix sequence functionality

7713af5

BigRoy commented Mar 24, 2022

Reformat code

229626b

Merge remote-tracking branch 'upstream/develop' into refactor_integrator

398ccf9

# Conflicts: # openpype/plugins/publish/integrate_new.py

iLLiCiTiT reviewed Jun 23, 2022

maxpareschi mentioned this pull request Jun 26, 2022

rescan frames and collections after slate render #3412

Closed

BigRoy added 7 commits July 5, 2022 09:18

Merge remote-tracking branch 'upstream/develop' into refactor_integrator

b5e0b3b

# Conflicts: # openpype/plugins/publish/integrate_new.py

Move IntegrateAsset

3e058c6

Revert integrator to latest develop

fd2d07e

Remove duplicate source family

271a829

Update USD families with latest develop

148ac26

Set up old vs. new integrator per host

035c4d2

Remove 'intent' context data override

a375763

@iLLiCiTiT says: Intent should be a dictionary with "value" and "label", to be able tell if you want use value or label of the intent in templates.

BigRoy requested a review from mkolar July 5, 2022 08:35

Refactor integrator labels

b4697b6

mkolar changed the base branch from develop to feature/refactor_integrator July 13, 2022 14:22

mkolar merged commit f2ce000 into ynput:feature/refactor_integrator Jul 13, 2022

mkolar mentioned this pull request Jul 18, 2022

Refactor Integrate Asset #3530

Merged

BigRoy deleted the refactor_integrator branch March 20, 2024 15:21

This was referenced Jul 9, 2024

Representation context UDIM value is stored as list on integration instead of str ynput/ayon-core#765

Closed

Chore: Make sure udim in representation is not a list ynput/ayon-core#764

Merged
