Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Partial publish / rebase breaks change projection indefinitely in Beta14 #5327

Closed
mhsdesign opened this issue Oct 27, 2024 · 0 comments · Fixed by #5301
Closed

Partial publish / rebase breaks change projection indefinitely in Beta14 #5327

mhsdesign opened this issue Oct 27, 2024 · 0 comments · Fixed by #5301
Assignees

Comments

@mhsdesign
Copy link
Member

A partial publish results in events annotated for "live" workspace yet are not in the "live" content stream.

This is due to a fork of the live content stream being created with the partially published events in, which is then published to the actual live content stream.
This leaves behind duplicate events both containing "workspaceName: live" yet only one of them is in the live content stream. A catchup or replay will fail however due to duplicate database entries for the projections relying on anything with "workspaceName: live" being actually in the live content stream.

The pr #5293 provides a test which fails showing the behaviour.

This bug was introduced with the introduction of workspace names in events #5002, which happened some time ago but with the merge and release of #5149 we actually started relying on the new workspace name also which lead to this bug.

Properly solving it should also allow catchup hooks to not implement hacks as such to skip events:

// Safeguard for temporary content streams created during partial publish -> We want to skip these events, because their workspace doesn't match current content stream.
try {
$contentGraph = $this->contentRepository->getContentGraph($eventInstance->getWorkspaceName());
} catch (WorkspaceDoesNotExist) {
return;
}
if (!$contentGraph->getContentStreamId()->equals($eventInstance->getContentStreamId())) {
return;
}
}


solutions

We discussed a lot of options for solving the bug here and here as well as countless hours on the Dresden sprint and we investigated into a few directions:

1.) Dont lie by using real temporary workspaces

Create a new "real" workspace on the fly, to simulate virtual workspaces (hacky and defeats the purpose of workspaces a litte) -> #5293

2.) Lie and introduce some kind of virtual workspace names

Introduce virtual workspace names and use union everywhere WorkspaceName|VirtualWorkspaceName -> #5302
This was also discussed as a "light" approach as how it was once proposed by tracking the "virtual" state in the actual WorkspaceName directly and "asking" if its virtual: WorkspaceName::isVirtual() like #5167
Or by using a TemporaryWorkspaceName builder or something from the outside, which would conflict if users could use the same prefix for actual workspace names as we wouldnt be able to surely detect that (detecting it is required so catchups dont run rogue:)):

final readonly class TemporaryWorkspaceName
{
    private const PREFIX = 'tmp-';

    private function __construct(
        public string $value
    ) {
    }

    public static function fromContentStreamId(ContentStreamId $contentStreamId): self
    {
        return new self(WorkspaceName::transliterateFromString(self::PREFIX . md5($contentStreamId->value))->value);
    }

    // public static function isTemporaryWorkspaceName(WorkspaceName $workspaceName): bool
    // {
    //     return str_starts_with($workspaceName->value, self::PREFIX);
    // }

    public function getWorkspaceName(): WorkspaceName
    {
        return WorkspaceName::fromString($this->value);
    }
}

We were again really close to giving in the temptation to introduce VirtualWorkspaceName but as we could avoid that 2 or 3 times because the complexity is really high:

  • virtual workspace names must be allowed everywhere in the core, like the content graph and node read model
  • but carefully disabled in other places, e.g. the NodeAddress used for routing must NOT allow direct access on any stream
  • security is harder to implement - or impossible - if nodes can be access through virtual names
  • catchup hooks have to skip virtual nodes as they are still persisted in the events

3.) Dont lie by not persisting the events during the rebase attempt and only forward it to the main projection which will be rollbacked (simulated publishing)

while discussing options the idea came up here to just not emit events during the rebase to the catchups, at first the solution was just an in memory event store and doing an additional fork in the database and then clean it up again, but with a lot of thought we basically created a new way of publishing leveraging a transaction and rollback to even safe a lot of unnecessary expensive forking:

Publishing Version 3 -> #5301

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant