Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FEATURE: Overhaul content stream pruner to work on event stream #5297

Merged
merged 31 commits into from
Oct 31, 2024

Conversation

mhsdesign
Copy link
Member

@mhsdesign mhsdesign commented Oct 17, 2024

Bugfix: Harden content stream pruner and remove removed flag from content streams

Resolves: #4913
Resolves: #5220

With #5096 i introduced the ContentStream::removed flag and got rid of the sql query findUnusedAndRemovedContentStreamIds and calculated this information in php: 041f239
While i thought this was a good idea, it seems it was a right step in the right direction but we can improve on that:

  • the removed flag is a bad idea, as it will complicate checking if a content stream now really exists
    • $commandHandlingDependencies->contentStreamExists() is allowed to return false if a contentstream existed once, as we fork with ExpectedVersion::NO_STREAM() and thus would still check that the fork is REALLY the first event. So any "hacked" case like deleting a content stream and using the same again to create a workspace is safe.

Additionally as part of the migration for the repaired publishing #5301 we require the dangling content streams to be pruned. This is currently unstable as we rely on the projection to be up to date which is not the case for users of beta14 if they run into the publish bug.

For this reason we iterate over the events of the event store directly - filtering already the desired event types to limit the set - and build up the state of which content streams were removed, are no longer in use or in use by workspace.
That allows us to remove the same logic from the projection.

The change will also remove the left-over content stream commands that are not required by the workspace command handler anymore, by publishing events directly to replace their use in tests and in the pruner.

New command flow contentstream:status

Detects if dangling content streams exists and which content streams could be pruned from the event stream

Dangling content streams

Content streams that are not removed via the event ContentStreamWasRemoved and are not in use by a workspace
(not a current's workspace content stream).

Previously before Neos 9 beta 15 (#5301), dangling content streams were not removed during publishing, discard or rebase.

./flow contentStream:removeDangling

Pruneable content streams

Content streams that were removed ContentStreamWasRemoved e.g. after publishing, and are not required for a full
replay to reconstruct the current projections state. The ability to reconstitute a previous state will be lost.

./flow contentStream:pruneRemovedFromEventStream

Example output

Dangling content streams that are not removed (ContentStreamWasRemoved) and not in use by workspace:
  id: c94b2a06-bbb0-4892-a92c-405baa0dc03a no longer in use
  id: 079ff29d-6900-4108-9eba-83c127da00e0 no longer in use
  id: 77769dc0-7b11-4405-ba6d-da4ad45b9dc2 no longer in use
  id: 7f64b63f-46c0-4d2d-8e2b-fdf73118aeb2 no longer in use
  id: e542b7d2-a7c1-4bd8-b02a-3e8450829965 no longer in use
  id: 7c41e343-bdff-4719-9dc7-8763cbb9d3db no longer in use
  id: 81a50ad4-41bb-43b9-8e2a-fac6b9beb3ee temporary forked at 2024-04-19 19:49
  id: 0db932cb-13bc-4f38-a4de-d0c304c7305f temporary forked at 2024-04-19 19:50
  id: ce276d40-3cba-4295-ab5c-c2d63dc8207f no longer in use
....

Removed content streams that can be pruned from the event store
  id: d1ce5fe8-9360-4886-9824-1788f2c0a327 previous state: forked
  id: 5771cc51-fb30-4571-8121-1c5c1d80dee8 previous state: no longer in use
  id: 2da82adc-3ec5-40fb-b4e3-977a896ae512 previous state: no longer in use
  id: 58c3a876-c8cf-460f-8ace-5cb9554c076a previous state: no longer in use
  id: 6541e994-9bf8-4df6-a071-bd51f36355f2 previous state: forked
  id: ca8357c7-2b05-4361-aa67-2102ec1969bf previous state: no longer in use
  id: 7a973dc1-f12e-4182-80ac-760a4fd7ec91 previous state: forked
  id: a749762d-b743-4368-8600-20d5d7658bd5 previous state: no longer in use
  id: 419102db-b28d-4cde-85dd-446ae2a163a8 previous state: forked

Changes to flow contentStream:removeDangling (previously contentStream:prune)

The command was renamed to better distinct between pruning and removing.
We will now always include temporary content streams like FORKED or CREATED in the removal if they are older than 1 day. See --remove-temporary-before=-1day option. That way we can be sure not to remove temporary content streams of a users current action.

Upgrade instructions

Review instructions

  • remove legacy rebased failed state from documentation
  • legacy rebased failed event is correctly handled
  • content stream pruner works fully on event streams
  • content stream will be removed in projection instead of removed flag
  • content stream state/status was removed from the projection
  • content stream close status will now be represented in closed column
  • low level content stream commands were removed and so the content stream command handler

Checklist

  • Code follows the PSR-2 coding style
  • Tests have been created, run and adjusted as needed
  • The PR is created against the lowest maintained branch
  • Reviewer - PR Title is brief but complete and starts with FEATURE|TASK|BUGFIX
  • Reviewer - The first section explains the change briefly for change-logs
  • Reviewer - Breaking Changes are marked with !!! and have upgrade-instructions

Followup to

neos@041f239

By diffing `findAllContentStreamEventNames` with the `$transitiveUsedStreams` we can remove the `removed` flag from content streams, which would just introduce to much complexity.
Copy link
Member

@bwaidelich bwaidelich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!
I don't have the brains anymore to do a proper review right now, so just a nitpicky-comment for now. But it looks great at first sight already!

Via neos#4965 the status REBASE_ERROR is obsolete.
Instead of still using this status on replay we mimic the new logic:
> In case of a [rebase error}, reopen the old content stream and remove the newly created

Additionally, to not break `findContentStreams` when fetching all content streams we make sure that the streams with `REBASE_ERROR` are transformed to `NO_LONGER_IN_USE` so they can be cleaned up.

Related neos#5101
Copy link
Member

@kitsunet kitsunet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this works fine for me, I checked the status cases and they all make sense to me.

With the pruner now building up the state from scratch we dont need to project it any further.

The only exception is the content stream state that wasnt one to begin with: the closed state. It will get its own column.
Also as we dont expose the command anymore we can get rid of the constraint checks and:

> the command CloseContentStream is executed with payload and exceptions are caught
During rebase of the base workspace, its content stream will be now fully removed from the projection (no soft removal) thats why we have to adjust the calculation to not rely on the select being `null` ONLY for when it has no base.
Its also null if the source content stream was removed.

```
001 Scenario: Publishing to the root workspace render dependents outdated # Features/Workspaces/WorkspaceState.feature:71
      Then workspace user-ws-two has status OUTDATED
```
@mhsdesign
Copy link
Member Author

mhsdesign commented Oct 30, 2024

I tested out that one can use

./flow migrateevents:backup
./flow contentStream:removeDangling
./flow contentStream:pruneRemovedFromEventStream
./flow cr:projectionReplayAll

to recover from #5327 as the content streams with the fake events (with workspace=live) will just be pruned.

Does anyone else wants to have a final look? (And try it out yourself?)
cc @bwaidelich cc @kitsunet :D

Copy link
Member

@kitsunet kitsunet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great job, this should do well!

@mhsdesign mhsdesign merged commit a159fb4 into neos:9.0 Oct 31, 2024
9 checks passed
@mhsdesign mhsdesign deleted the bugfix/harden-content-stream-pruner branch October 31, 2024 10:13
@mhsdesign mhsdesign changed the title BUGFIX: Harden content stream pruner and remove removed flag from content streams FEATURE: Rewrite content stream pruner to work on event stream Nov 1, 2024
@mhsdesign mhsdesign changed the title FEATURE: Rewrite content stream pruner to work on event stream FEATURE: Overhaul content stream pruner to work on event stream Nov 1, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants