Partial publish / rebase breaks change projection indefinitely in Beta14 #5327

mhsdesign · 2024-10-27T14:10:41Z

A partial publish results in events annotated for "live" workspace yet are not in the "live" content stream.

This is due to a fork of the live content stream being created with the partially published events in, which is then published to the actual live content stream.
This leaves behind duplicate events both containing "workspaceName: live" yet only one of them is in the live content stream. A catchup or replay will fail however due to duplicate database entries for the projections relying on anything with "workspaceName: live" being actually in the live content stream.

The pr #5293 provides a test which fails showing the behaviour.

This bug was introduced with the introduction of workspace names in events #5002, which happened some time ago but with the merge and release of #5149 we actually started relying on the new workspace name also which lead to this bug.

Properly solving it should also allow catchup hooks to not implement hacks as such to skip events:

neos-development-collection/Neos.Neos/Classes/AssetUsage/CatchUpHook/AssetUsageCatchUpHook.php

Lines 51 to 60 in b11026d

    
               // Safeguard for temporary content streams created during partial publish -> We want to skip these events, because their workspace doesn't match current content stream. 
        
               try { 
        
                   $contentGraph = $this->contentRepository->getContentGraph($eventInstance->getWorkspaceName()); 
        
               } catch (WorkspaceDoesNotExist) { 
        
                   return; 
        
               } 
        
               if (!$contentGraph->getContentStreamId()->equals($eventInstance->getContentStreamId())) { 
        
                   return; 
        
               } 
        
           }

solutions

We discussed a lot of options for solving the bug here and here as well as countless hours on the Dresden sprint and we investigated into a few directions:

1.) Dont lie by using real temporary workspaces

Create a new "real" workspace on the fly, to simulate virtual workspaces (hacky and defeats the purpose of workspaces a litte) -> #5293

2.) Lie and introduce some kind of virtual workspace names

Introduce virtual workspace names and use union everywhere WorkspaceName|VirtualWorkspaceName -> #5302
This was also discussed as a "light" approach as how it was once proposed by tracking the "virtual" state in the actual WorkspaceName directly and "asking" if its virtual: WorkspaceName::isVirtual() like #5167
Or by using a TemporaryWorkspaceName builder or something from the outside, which would conflict if users could use the same prefix for actual workspace names as we wouldnt be able to surely detect that (detecting it is required so catchups dont run rogue:)):

final readonly class TemporaryWorkspaceName
{
    private const PREFIX = 'tmp-';

    private function __construct(
        public string $value
    ) {
    }

    public static function fromContentStreamId(ContentStreamId $contentStreamId): self
    {
        return new self(WorkspaceName::transliterateFromString(self::PREFIX . md5($contentStreamId->value))->value);
    }

    // public static function isTemporaryWorkspaceName(WorkspaceName $workspaceName): bool
    // {
    //     return str_starts_with($workspaceName->value, self::PREFIX);
    // }

    public function getWorkspaceName(): WorkspaceName
    {
        return WorkspaceName::fromString($this->value);
    }
}

We were again really close to giving in the temptation to introduce VirtualWorkspaceName but as we could avoid that 2 or 3 times because the complexity is really high:

virtual workspace names must be allowed everywhere in the core, like the content graph and node read model
but carefully disabled in other places, e.g. the NodeAddress used for routing must NOT allow direct access on any stream
security is harder to implement - or impossible - if nodes can be access through virtual names
catchup hooks have to skip virtual nodes as they are still persisted in the events

3.) Dont lie by not persisting the events during the rebase attempt and only forward it to the main projection which will be rollbacked (simulated publishing)

while discussing options the idea came up here to just not emit events during the rebase to the catchups, at first the solution was just an in memory event store and doing an additional fork in the database and then clean it up again, but with a lot of thought we basically created a new way of publishing leveraging a transaction and rollback to even safe a lot of unnecessary expensive forking:

Publishing Version 3 -> #5301

The text was updated successfully, but these errors were encountered:

mhsdesign self-assigned this Oct 27, 2024

This was referenced Oct 27, 2024

!!! FEATURE: Publishing Version 3 #5301

Merged

BUGFIX: Partial publish breaks uri and change Projection #5293

Closed

mhsdesign closed this as completed in #5301 Oct 28, 2024

mhsdesign mentioned this issue Oct 30, 2024

FEATURE: Overhaul content stream pruner to work on event stream #5297

Merged

13 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Partial publish / rebase breaks change projection indefinitely in Beta14 #5327

Partial publish / rebase breaks change projection indefinitely in Beta14 #5327

mhsdesign commented Oct 27, 2024

Partial publish / rebase breaks change projection indefinitely in Beta14 #5327

Partial publish / rebase breaks change projection indefinitely in Beta14 #5327

Comments

mhsdesign commented Oct 27, 2024

solutions

1.) Dont lie by using real temporary workspaces

2.) Lie and introduce some kind of virtual workspace names

3.) Dont lie by not persisting the events during the rebase attempt and only forward it to the main projection which will be rollbacked (simulated publishing)