
!!! FEATURE: Publishing Version 3 #5301

Conversation

@mhsdesign (Member) commented Oct 19, 2024

Resolves: #5327
Resolves: #5303

Replaces: #5300
Replaces: #5302

Introduces simulated publishing for rebasing commands and their constraint checks

Bugfixes

  • emit the $pointWorkspaceToNewContentStream event before applying the remaining events during partial publish, rebase and partial discard
  • do not emit events while simulating the rebase (e.g. when checking for conflicts)
  • previously only "publish all" preserved the initiating timestamp; now a rebase as well as publishing/discarding individual nodes does the same
  • proper exception strategy when the remainder of a publish of individual nodes cannot be applied (in that case nothing gets published!)

Publish workspace with automatic rebase for changes

We discussed whether a publish should only be allowed if the workspace is up to date, since technically a non-conflicting change could sneak in, e.g. the page's title was changed in live while the to-be-published text still references the old title.

But this is mostly a theoretical concern. It is more annoying to have to rebase whenever multiple people work on potentially completely different parts of the site (or even a different site altogether).
To mitigate the mentioned issue we could at some point introduce an indicator showing whether someone else is working on the same document (or whether that document has new changes in the target workspace).

Allowing both publish and publish individual nodes even if the workspace is outdated also improves performance: both now perform a rebase as part of the publish strategy, which makes an explicit rebase from the outside obsolete and thus speeds up the process a lot.

Publish/Discard individual vs all #5303

Publishing all changes via the individual publish (like the Neos UI does) will now lead to the event WorkspaceWasPublished being emitted, as this is more performant to handle in the projections.

Similarly, discarding all changes via the individual discard is treated like a full discard. A rough sketch of the idea follows below.
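An illustrative fragment of that decision (hypothetical: the helper names splitPendingChanges(), publishWholeWorkspace() and publishSelectedChanges() stand in for the actual workspace command handler internals; WorkspaceWasPublished is named in this PR, WorkspaceWasPartiallyPublished is assumed as its partial counterpart):

    // Illustrative sketch only, not the actual WorkspaceCommandHandler code.
    // If the selected node ids cover every pending change, the "partial" publish is
    // handled exactly like a full publish, so projections only have to process a
    // single WorkspaceWasPublished event instead of a partial one.
    private function handlePublishIndividualNodesFromWorkspace(
        PublishIndividualNodesFromWorkspace $command
    ): \Generator {
        // hypothetical helper: split pending changes into selected and remaining ones
        [$matchingCommands, $remainingCommands] = $this->splitPendingChanges($command);

        if ($remainingCommands->isEmpty()) {
            // nothing would remain in the workspace -> identical to "publish all"
            yield from $this->publishWholeWorkspace($command->workspaceName); // emits WorkspaceWasPublished
            return;
        }
        // otherwise only the selected changes are published
        yield from $this->publishSelectedChanges($command->workspaceName, $matchingCommands); // emits WorkspaceWasPartiallyPublished
    }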

Publish/Discard/Rebase as no-op if there are no changes

  • the individual publish/discard will return directly if no nodes are specified
  • if there are no changes, publish and publish individual nodes are a no-op and will just reopen the content stream
  • if there is nothing to be discarded, discard (and discard individual nodes) is a no-op

Performance old vs new

We also discussed whether the performance of leveraging the database's rollback is acceptable:

| Count of nodes | Time to create | Time for simulated publish | Time for previous publish    | Time for previous rebase & publish         |
| 100            | 0.5s           | 0.4s                       | -                            | -                                          |
| 1000           | 4.6s           | 3.7s                       | -                            | -                                          |
| 5000           | 24s            | 23s                        | -                            | -                                          |
| 10000          | 46s - 49s      | 83s                        | 24s (unlikely if up-to-date) | 192s (rebase) + 176s (publish all) = 368s  |

Tested with a 1G memory limit, but Behat seems to have used only around 150 MB per run, no matter how many nodes.

It seems that the publish is even slightly faster than the creation of the nodes until we exceed 5,000 newly created nodes. With 10,000 nodes, publishing takes about twice as long as creating them.

The previous publish implementation, which only works without changes in the target, merely copied the events and was therefore quite performant. But since this is the edge case (one has to be up to date AND publish all changes, across sites), this optimisation was removed in favour of more consistent behaviour that is also able to rebase if required.

The Neos UI doesn't use publish all but rather an explicit rebase coupled with a publish of individual nodes. This is super slow, and we can optimise it a lot by doing the rebase and publish in one step.

For the main use case the new implementation is therefore not worse but actually better than the old one, as the explicit rebase is extremely slow in combination with the publish afterwards.

Disclaimer: the measurements are to be taken with a grain of salt. Christian also mentioned that performance possibly gets worse at some point because all the events are kept in memory (his 100k publication crashed due to the memory limit).

Implementation of the CommandSimulator

The CommandSimulator is used during the publishing process, for partial publishing and workspace rebasing.

For this case, we want to apply commands including their constraint checks step by step, to see whether this
set of commands applies cleanly without errors, and which events would be created by them, but we do NOT
want to commit the updated projections or events.

Internally, we do the following:

  • Create a database transaction in the GraphProjection which we will roll back later on (to dry-run projection updates) via CommandSimulator::run().
  • Create an InMemoryEventStore which buffers the events created by the command handlers.
  • Execute all commands via CommandSimulator::handle():
  • -> this runs all constraint checks based on the projection in the open transaction (so it sees previously modified projection state which is not committed)
  • -> it runs the command handlers and buffers all emitted events in the InMemoryEventStore (note: to avoid full recursion, the workspace command handler is not included in the bus)
  • -> it updates the GraphProjection, but WITHOUT committing the transaction (ContentGraphProjectionInterface::inSimulation())

This is quite performant because we do not need to fork a new content stream. A condensed sketch of the CommandSimulator follows below.
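A condensed sketch of such a CommandSimulator, with simplified and partly assumed signatures (only CommandSimulator::run()/handle(), the InMemoryEventStore, ContentGraphProjectionInterface::inSimulation() and the CommandThatFailedDuringRebase collection are taken from this PR; property names, the exact apply() signature and the event-buffer API are illustrative):

    final class CommandSimulator
    {
        private CommandsThatFailedDuringRebase $commandsThatFailed;

        public function __construct(
            private readonly ContentGraphProjectionInterface $contentGraphProjection,
            private readonly InMemoryEventStore $inMemoryEventStore,
            // the bus is wired WITHOUT the workspace command handler to avoid full recursion
            private readonly CommandBus $commandBus,
        ) {
            $this->commandsThatFailed = new CommandsThatFailedDuringRebase();
        }

        /** Run $fn inside an uncommitted projection transaction; every change is rolled back afterwards. */
        public function run(\Closure $fn): void
        {
            $this->contentGraphProjection->inSimulation(fn () => $fn($this->handle(...)));
        }

        private function handle(RebaseableCommand $rebaseableCommand): void
        {
            try {
                // constraint checks run against the open transaction and therefore see the
                // (uncommitted) projection state of previously simulated commands
                $eventsToPublish = $this->commandBus->handle($rebaseableCommand->originalCommand);
            } catch (\Exception $exception) {
                // collect the failure instead of aborting, so that all conflicts can be reported at once
                $this->commandsThatFailed = $this->commandsThatFailed->withAppended(
                    new CommandThatFailedDuringRebase(
                        $rebaseableCommand->originalSequenceNumber,
                        $rebaseableCommand->originalCommand,
                        $exception
                    )
                );
                return;
            }
            // buffer the events in memory instead of committing them to the real event store ...
            $this->inMemoryEventStore->append($eventsToPublish->events);
            // ... and apply them to the graph projection inside the still-open transaction
            foreach ($eventsToPublish->events as $event) {
                $this->contentGraphProjection->apply($event);
            }
        }

        public function hasCommandsThatFailed(): bool
        {
            return !$this->commandsThatFailed->isEmpty();
        }

        public function getCommandsThatFailed(): CommandsThatFailedDuringRebase
        {
            return $this->commandsThatFailed;
        }
    }

The $handle callable passed into the run() closure is exactly what the usage example further down in this conversation invokes for each RebaseableCommand.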

Implementation of graph projection inSimulation($fn)

We introduce a new dedicated method inSimulation for simulated rebasing on the ContentGraphProjectionInterface (the only projection needed for soft constraint checks).

The implementation must ensure that the passed function is invoked and that any changes via ContentGraphProjectionInterface::apply() are executed "in simulation", i.e. NOT persisted after returning.

The projection state ContentGraphReadModelInterface must also reflect the current changes of the simulation during the execution of the function.

For the Doctrine content graph projection this is done by leveraging a transaction and a rollback.

We discussed whether relying on the global Connection (wired via Flow and the entity manager) is safe, as it must be ensured that NO one can hook into the process during the rebase and commit the simulated state, especially not our own code in the projection or subclasses. By leveraging $connection->setRollbackOnly() (which is unset again after the rollback) we can ensure exactly that. This is of course only a slim safety layer, but it gives enough insurance and does not come with the cost of maintaining additional dedicated connections.
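A minimal sketch of how such an inSimulation() implementation could look for the Doctrine adapter, assuming a plain DBAL Connection in $this->dbal (the real projection handles nesting and error cases in more detail):

    public function inSimulation(\Closure $fn): mixed
    {
        if ($this->dbal->isTransactionActive()) {
            throw new \RuntimeException('Simulation must not be started while a transaction is already active.');
        }
        $this->dbal->beginTransaction();
        // Guard against foreign code committing our dry-run: a connection marked
        // rollback-only throws on commit() until the rollback has happened.
        $this->dbal->setRollbackOnly();
        try {
            // While $fn runs, apply() writes into the open transaction and the
            // ContentGraphReadModelInterface sees the uncommitted changes.
            return $fn();
        } finally {
            // Never persist anything that happened during the simulation.
            $this->dbal->rollBack();
        }
    }

Because the rollback clears the rollback-only flag again, the shared connection behaves normally once the simulation has finished.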

Upgrade instructions

Migration will be possible via #5297

./flow migrateevents:backup
./flow contentStream:removeDangling
./flow contentStream:pruneRemovedFromEventStream
./flow cr:projectionReplayAll

Review instructions

Relation to other changes:

1.) Requires #5272, as the content graph projection is needed as a special projection for the simulation, providing the inSimulation method.

2.) Based on #5315, which introduces yield for command handlers

3.) As part of the migration we require the content streams to be pruned. This is currently unstable and will be fixed with the content stream pruner overhaul follow-up #5297. That change will also remove the left-over content stream commands that are no longer required by the workspace command handler.

Checklist

  • Code follows the PSR-2 coding style
  • Tests have been created, run and adjusted as needed
  • The PR is created against the lowest maintained branch
  • Reviewer - PR Title is brief but complete and starts with FEATURE|TASK|BUGFIX
  • Reviewer - The first section explains the change briefly for change-logs
  • Reviewer - Breaking Changes are marked with !!! and have upgrade-instructions

@github-actions github-actions bot added the 9.0 label Oct 19, 2024
…ace-command-handler' into task/schnappsidee-rebase-and-partial-publish-in-memory-event-stream
…ebase to use yield

Also the object wiring was cleaned up and simplified.
Full recursion is now no longer possible, but it was never needed anyway: e.g. rebasing a CreateWorkspace command or the like.
kitsunet and others added 10 commits October 23, 2024 17:55
A partial publish results in events annotated for the "live" workspace which are nevertheless not in the "live" content stream.

This is due to a fork of the live content stream being created with the partially published events in it, which is then published to the actual live content stream. This leaves behind duplicate events that both contain "workspaceName: live" while only one of them is in the live content stream. A catchup or replay will then fail due to duplicate database entries, because the projections rely on anything with "workspaceName: live" actually being in the live content stream.

The provided test fails showing the behavior.
by creating live changes we ensure that the rebase is never a noop
... and that the user content stream did not change and is still open!
…atchingPart

... as we don't do two forks anymore but try to handle the commands in simulation
@kitsunet (Member) commented:

Do we see any risks in the fact that the one connection used here is shared between the ORM, the event store and the CR? We must ensure that nothing commits the transaction we started for the simulation, so no hooks, signals or anything else that might lead to userland code being executed which could in some way start/commit a transaction.

@mhsdesign (Member, Author) left a comment:

I commented the bits and pieces where you might want to add your last bit of mustard :) (the changes of the last two days, done without Christian's help xd)

/**
 * @internal
 */
final readonly class InitiatingEventMetadata
@mhsdesign (Member, Author) commented:

New utility to encapsulate the access instead of passing magic strings around :)

see 4ee595e
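Roughly, such a utility could look like the following sketch (hypothetical: the actual class from 4ee595e may differ; the metadata key names and the EventMetadata API used below are assumptions):

    /**
     * @internal
     */
    final readonly class InitiatingEventMetadata
    {
        // assumed key names - previously passed around as magic strings
        public const INITIATING_USER_ID = 'initiatingUserId';
        public const INITIATING_TIMESTAMP = 'initiatingTimestamp';

        public static function extractInitiatingTimestamp(EventMetadata $metadata): ?\DateTimeImmutable
        {
            $raw = $metadata->get(self::INITIATING_TIMESTAMP);
            if (!is_string($raw)) {
                return null;
            }
            return \DateTimeImmutable::createFromFormat(\DateTimeInterface::ATOM, $raw) ?: null;
        }
    }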

@@ -41,6 +41,7 @@ Feature: If content streams are not in use anymore by the workspace, they can be
When the command RebaseWorkspace is executed with payload:
| Key | Value |
| workspaceName | "user-test" |
| rebaseErrorHandlingStrategy | "force" |
@mhsdesign (Member, Author) commented:

For these tests: as we don't make changes in live and a rebase is now a no-op if the workspace is not outdated, we made "force" reflect the old behaviour of always forking, even if there are no changes in the base.

Comment on lines 82 to 94
try {
    $eventsToPublish = $this->commandBus->handle($commandInWorkspace);
} catch (\Exception $exception) {
    $this->commandsThatFailedDuringRebase = $this->commandsThatFailedDuringRebase->withAppended(
        new CommandThatFailedDuringRebase(
            $rebaseableCommand->originalSequenceNumber,
            $rebaseableCommand->originalCommand,
            $exception
        )
    );

    return;
}
@mhsdesign (Member, Author) commented:

As this got really repetitive over the 5 usages, I added the logic to collect the failures to the command simulator.

$commandSimulator = $this->commandSimulatorFactory->createSimulator($baseWorkspace->workspaceName);

$commandSimulator->run(
    static function ($handle) use ($rebaseableCommands): void {
        foreach ($rebaseableCommands as $rebaseableCommand) {
            $handle($rebaseableCommand);
        }
    }
);

if ($commandSimulator->hasCommandsThatFailed()) {
    throw WorkspaceRebaseFailed::duringPublish($commandSimulator->getCommandsThatFailed());
}

I thought of returning it first, but that would mean that the return value of $handle has to be collected manually, or we would create a lot of logic in the run closure ... as we do the same with $commandSimulator->eventStream(), I think it's fine to do it this way :) The object has state already ^^

A member replied:

Does it have state? I thought by putting handle as a closure we got rid of all the state? But anyway, looks good.

@mhsdesign (Member, Author) replied:

Not itself ... it was readonly, but as the class contains and exposes the in-memory event store it is really stateful already ;)

Comment on lines +279 to 289
/**
 * @throws WorkspaceRebaseFailed is thrown if the workspace was outdated and an automatic rebase failed due to conflicts.
 *         No changes would be published for this case.
 */
private function publishNodes(
    ContentRepository $contentRepository,
    WorkspaceName $workspaceName,
    NodeIdsToPublishOrDiscard $nodeIdsToPublish
): void {
    /**
     * TODO: only rebase if necessary!
     * Also, isn't this already included in @see WorkspaceCommandHandler::handlePublishIndividualNodesFromWorkspace ?
     */
    $contentRepository->handle(
        RebaseWorkspace::create($workspaceName)
    );

    $contentRepository->handle(
        PublishIndividualNodesFromWorkspace::create(
@mhsdesign (Member, Author) commented:

We now throw WorkspaceRebaseFailed in case one of the commands could not be handled; previously we would just throw the raw exception from PublishIndividualNodesFromWorkspace instead.

Previously, the WorkspaceRebaseFailed was only thrown because of the now obsolete RebaseWorkspace call here; it is required for the UI to react accordingly: neos/neos-ui#3769

see 8ad0a3e

@kitsunet (Member) commented:

I am still happy with this, thanks for doing the extra mile.

@mhsdesign mhsdesign force-pushed the task/schnappsidee-rebase-and-partial-publish-in-memory-event-stream branch from c8d81e4 to 38835ca on October 28, 2024 12:18
mhsdesign added a commit to neos/neos-ui that referenced this pull request Oct 28, 2024
…at failed

instead we can just generate a uuid as the `sequenceNumber` was also unique for each command that failed

see neos/neos-development-collection#5301 (comment)
@bwaidelich (Member) left a comment:

I admire the amount of work you have put into this, thank you!

I don't feel qualified to do a proper review of the inner workings. But the parts that I understood make a lot of sense to me and the complexity is not that much higher – and you added quite a lot of tests, too.

So I'd say, let's get this merged asap so that we can test it further.

I just added some rather nitpicky comments, feel free to ignore those

mhsdesign and others added 2 commits October 28, 2024 16:47
Co-authored-by: Bastian Waidelich <b.waidelich@wwwision.de>
@mhsdesign mhsdesign merged commit 73137e3 into neos:9.0 Oct 28, 2024
8 checks passed
@mhsdesign mhsdesign deleted the task/schnappsidee-rebase-and-partial-publish-in-memory-event-stream branch October 28, 2024 15:59
neos-bot pushed a commit to neos/contentgraph-doctrinedbaladapter that referenced this pull request Oct 28, 2024
neos-bot pushed a commit to neos/contentgraph-postgresqladapter that referenced this pull request Oct 28, 2024
neos-bot pushed a commit to neos/contentrepository-core that referenced this pull request Oct 28, 2024
neos-bot pushed a commit to neos/contentrepository-testsuite that referenced this pull request Oct 28, 2024
mhsdesign added a commit to mhsdesign/neos-development-collection that referenced this pull request Nov 8, 2024
neos#5301 (comment)

> Alright, at least according to the tests this works now. I also went through all rebasable commands to check whether the events get enriched; the Dimension ones were the only ones missing. IMHO we should centralize this behavior. I opted against doing it here, though, as I think we need to consider whether there would have to be any more logic involved to decide what gets enriched with commands, in what way, when we override the command if there is already metadata, and finally what the causation ids are. I guess we could ignore all these questions and centralize it, but it warrants a closer look and is therefore out of scope of this change.