No clear way to handle sorting with versioned object #180
Comments
Great write-up! A few clarifications: This is assuming
Could you detail what the query looks like? Sorry, too lazy to go hunting for it. Only found some logic in
I don't understand why changing the sort order on a draft record without publishing anything would change sort orders on any published pages?
Isn't this just a matter of
You're right I didn't really consider
Sure. You can find all objects between the old and new positions of the moved object and either increment or decrement the sort order of each. Consider items numbered 1 to 5 (1 2 3 4 5). Move 4 to the first position: (4 1 2 3 5). Before you move it, find all objects in the 1st to 3rd positions (inclusive); because block 4 is moving ahead of them, each of those must be incremented (1 2 3 all get +1). Here's a snippet from a PR in elemental that shows it in code: https://github.com/dnadesign/silverstripe-elemental/blob/034e100826fd4f349bd30cc4c62b8534d7f73b25/src/Services/ReorderElements.php#L63-L84
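The shifting rule described above can be sketched roughly as follows. This is only an illustration in Python, not the elemental code linked above; the function names `move_item` and `ordered_ids` and the plain dictionary of ID to sort value are assumptions made for the example.

```python
def move_item(sort_by_id, moved_id, new_position):
    """Return a new {id: sort} mapping after moving one item.

    Hypothetical helper: items sitting between the old and new positions
    shift by one to make room for the moved item.
    """
    old_position = sort_by_id[moved_id]
    updated = dict(sort_by_id)
    for item_id, position in sort_by_id.items():
        if item_id == moved_id:
            continue
        if new_position <= position < old_position:
            # Moved item goes towards the front: these items get +1.
            updated[item_id] = position + 1
        elif old_position < position <= new_position:
            # Moved item goes towards the end: these items get -1.
            updated[item_id] = position - 1
    updated[moved_id] = new_position
    return updated


def ordered_ids(sort_by_id):
    """List the IDs in ascending sort order."""
    return [item_id for item_id, _ in sorted(sort_by_id.items(), key=lambda pair: pair[1])]


items = {1: 1, 2: 2, 3: 3, 4: 4, 5: 5}
after = move_item(items, moved_id=4, new_position=1)
print(ordered_ids(after))  # [4, 1, 2, 3, 5]
```

Items 1, 2 and 3 each receive +1 here, which matches the worked example in the comment.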
That example was the idea where we only create a version for the moved object, meaning that other objects that will have their sort orders affected will be updated on the live version. Referring back to the example:
It's actually incorrect: the sort on the published pages will be (1 3 4 4 5). It might make more sense if I list the live sort order on each of the objects at the end:
All the objects other than 4 (the one that was actually moved) have been updated to make way for the sort order of 4 that will be published, but it's not published yet!
If option 2 was chosen, but we don't update sort orders of live blocks until publish time then we'd have to calculate what the updated order of non-moved blocks would be. Another example: Initial list:
Working through this example I'm realising that we can't update the sort of everything else on publish, because there's a LOT of ambiguity when you have many sorts one after the other. Sorry!
I've unassigned @micmania1 because the work he did was captured in other issues & PRs. This issue really refers to the high-level question of how sorting is supposed to be managed with versioned objects. I think the issue got a bit carried away - perhaps the title should be more descriptive.
Actually, because it got a bit sidetracked, I'm going to re-raise the issue if that's okay with you @chillu? I'd really like the goal of this ticket to be a solution to this problem statement:
Currently all our instances of this already ( This came up again in community Slack related to elemental.
Okay in the end I've just hidden @chrispenny's comments. That was some stellar investigation and we've identified this as a pretty big issue in #195. I've updated the OP with a few examples and some possible solutions. I'm not sure about the actual impact of this because it's been like this since the dawn of versioning. I've tagged this as needing feedback from @silverstripe/core-team because hopefully we can at least get consensus on a potential solution...
I'd say that in the first instance we want to close off #195 as best we can without this, acknowledge any limitations, and merge it into Secondly, it occurs to me that what we might need to do is create "the ordering of a set of siblings" as a separate piece of data for the purposes of versioning. So you have, say, Then, when you need to get historic data of SiteTree, you join SiteTree_SortVersion and grab the Sort value from there, rather than using SiteTree_Versions.Sort. It's just a high-level sketch but this might be more robust? To allow for sufficient backward compatibility, I would probably retain the existence of the SiteTree_Versions.Sort field as-is, and maybe have a build task or dev/build hook to build out the initial version of the SiteTree_SortVersion dataset. This would probably require some special knowledge added to the system to say "oh this field is for sorting". The improved behaviour would only apply if this feature was activated. We could enable it for SiteTree and blocks, but others would need to add it manually. As long as the system worked with its current bugs if you didn't do that, that seems fine to introduce in a minor release.
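To make the "ordering of a set of siblings as its own versioned data" idea a little more concrete, here is a minimal in-memory sketch in Python. It is not a proposal for the actual schema: the class `SiblingOrderHistory`, its methods, and the use of plain integer timestamps are all assumptions made for illustration.

```python
import bisect


class SiblingOrderHistory:
    """Hypothetical store keeping one versioned ordering per parent."""

    def __init__(self):
        # parent_id -> list of (timestamp, [child ids]) sorted by timestamp
        self._versions = {}

    def record(self, parent_id, timestamp, ordered_child_ids):
        """Save a snapshot of the sibling order as of `timestamp`."""
        history = self._versions.setdefault(parent_id, [])
        history.append((timestamp, list(ordered_child_ids)))
        history.sort(key=lambda entry: entry[0])

    def order_at(self, parent_id, timestamp):
        """Return the latest ordering recorded at or before `timestamp`."""
        history = self._versions.get(parent_id, [])
        timestamps = [entry[0] for entry in history]
        index = bisect.bisect_right(timestamps, timestamp)
        if index == 0:
            return None  # nothing recorded yet at that point in time
        return history[index - 1][1]


history = SiblingOrderHistory()
history.record(parent_id=0, timestamp=100, ordered_child_ids=[1, 2, 3])
history.record(parent_id=0, timestamp=200, ordered_child_ids=[3, 1, 2])
print(history.order_at(0, 150))  # [1, 2, 3]
print(history.order_at(0, 250))  # [3, 1, 2]
```

The point of the sketch is that a reorder creates one new version of the whole sibling set, so viewing history never mixes sort values from different moments.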
Yes, to be clear, I consider this an entirely different issue. #195 is about tracking changes across a relationship graph. This is about sorting individual objects and only involves siblings when talking about how they need to be updated when moving an object around. Mention of "parents" is just as a potential solution. I don't think we should relate this to capturing changes in the "relationship graph" - #195.
Having Sort on a different table isn't a bad idea from my PoV but I think it still makes more sense to have the sort against a parent - if you think about moving a block around, you don't really expect to look at the history of that block to see its movement within a list; you'd look at the page (or "elemental area") to see how things were re-ordered. The problem with this is that there's nothing that's versioned that's a "parent" to
This kind of scares me. I think you're right so far as preventing a fatal on trying to update non-existent fields - I'm sure there's plenty of code in the wild that's I still am thinking of this like an extension that needs to be configured in some way:
And then you can:
So there are four kinds of sort (assuming versioned objects on both sides of the relationship):
Only Sort Variation 3 actually requires sorts to be stored on the record itself, because there's no relationship owner on the other side. We've somewhat abused I'm a bit worried about table explosion if we introduce How about making Aside: We could make this column a JSON type, but that would require MySQL 5.7 (see discussion). And you'd need MySQL 8.0 for proper JSON table functions to do the required virtual table joins, so it's a non-starter as well. We could just support sorting on Postgres? ;) I'm really tempted to just say "sort isn't versioned". We've spent an enormous amount of time on versioning edge cases in the last few years, and it's exponentially increased the complexity both core committers and the average dev have to deal with. We spend weeks across a dozen people just discussing edge cases before we're in a position to implement anything. Would unversioned sorts really be so bad for author experience? /cc @clarkepaul
So you're recommending an array field over a separate table? How confident are we that this will make faster rather than slower queries? I would expect a join to a separate table to be at least as fast, with appropriate indexes, and more consistently supported across database backends. Other than "separate table vs array field" we're largely recommending the same thing. The other thing I would say is that you can think of the sorting as being optionally "partitioned by" another field. So SiteTree is sorted by Sort, partitioned by ParentID. ParentID = 0 is a legal 'partition' in that regard.
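A small Python sketch of what "sorted by Sort, partitioned by ParentID" means in practice. The record layout (dictionaries with `ID`, `ParentID` and `Sort` keys) and the function name `siblings_in_order` are assumptions for the example, not an existing API.

```python
from collections import defaultdict


def siblings_in_order(records):
    """Group records by ParentID, then order each group by Sort."""
    partitions = defaultdict(list)
    for record in records:
        partitions[record["ParentID"]].append(record)
    return {
        parent_id: [r["ID"] for r in sorted(group, key=lambda r: r["Sort"])]
        for parent_id, group in partitions.items()
    }


pages = [
    {"ID": 1, "ParentID": 0, "Sort": 2},
    {"ID": 2, "ParentID": 0, "Sort": 1},
    {"ID": 3, "ParentID": 1, "Sort": 1},
    {"ID": 4, "ParentID": 1, "Sort": 2},
]
print(siblings_in_order(pages))  # {0: [2, 1], 1: [3, 4]}
```

Sort values only need to be unique within one partition; ParentID = 0 is simply the partition for top-level records.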
So we're saying:
Not ideal, but doesn't seem like a showstopper. I wouldn't close as wontfix but maybe not put it on the must-have list for 4.4?
I like framing this as optional partitioning, makes a lot of sense. "separate table vs. array field": My first concern is the number of tables. I've seen projects with 450 tables, and think SilverStripe is already pretty insane in terms of getting your head around the data model of your site as an average developer. It also increases dev/build times (more to inspect and adjust). Speed of array fields depends on database implementation - I'd wager it's just as fast as joins in Postgres, and could be slower in MySQL. It's a non-starter anyway, since we'd need to bump MySQL support to 8.0 for that. It's definitely slower than joins if we do this in-memory. Can we get away with one or two tables for sort versioning? To clarify, an "unversioned sort" implementation really means "synced sort between stages". So when you change the sort of an item, it directly writes every draft and every live record in the same partition without creating a new version. This introduces new issues with rollback though: You'd have to reset all items in the same partition to the same point in time (closest version). That might be just as time-intensive to implement as actually fixing sort versioning in the first place. Aaaaargh.
If ObjectClass was an enum included in the index that would probably be fine.
It's probably worth clarifying the actual downsides of it, beyond heebie-jeebies. I've tended to recognise that heebie-jeebies can be a sign that something is problematic, but equally that something is merely unusual.
I had a similar problem with fluent localisation. The issue was I needed the ability to whitelist which fields needed localisation (or in this case, versioning) and not localise others. Similarly, I had separate tables, except that the Perhaps what versioning needs is a way to whitelist which fields are versioned, conditionally joining the base table when loading / lazy-loading fields that are not? This solution would suit problems other than simply sorting (e.g. 'ViewCount' may not need versioning). In other words, We've essentially dropped the concept of I prefer joining existing tables (and trimming the complexity of one of them) much more than adding yet another table to the schema.
Flagging that we've spent literally hundreds of hours on versioning behaviour in the Open Sourcerers team in the last three months, so from my team's perspective I'm going to park this. It's a long-standing issue, which can be remedied from an author perspective by simply publishing everything to get it back into a consistent state.
Would it be worth noting the quirk in our user and/or dev docs?
Overview
Sorting versioned objects in has_many and many_many through relations is not predictable and probably hasn't been properly considered. There are workarounds for common use cases like sorting of blocks in elemental, but no core support.
Acceptance Criteria
Out of Scope
MyRelationVersion in addition to MyRelationID). Previewing of older versions and rollbacks are based on closest version for this timestamp, not following references in a relational database.
Notes
Context
Problem
Currently when you sort a versioned object you may do one of three things to update the Sort column on a versioned DataObject:
1) Always use ->write (Every object gets a new version)
Update the object you are sorting with ->write(), then update all objects in between the new and old positions (exclusive) by looping and also using ->write.
Observations:
2) Use ->write only on the re-ordered object (Only that object gets a version)
Update the object you are sorting with ->write(), update all other objects with an UPDATE query.
Note: Elemental does this (in version 4 of elemental) and CMS already does this with SiteTree (although it still loops the objects and runs individual updates - I can raise an issue for that)
Here's a pretty detailed example of this option:
Consider pages numbered 1 to 5 (1 2 3 4 5). Move page 4 to after 1 (1 4 2 3 5). The sort on the published pages will be (1 3 4 4 5). It might make more sense if I list the live sort order on each of the objects at the end:
All the objects other than 4 (the one that was actually moved) have been updated on Live to make way for the sort order of 4 that will be published, but it's not published yet!
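The divergence between stages can be reproduced with a small simulation. This is a hedged Python sketch of option 2 as described here, not SilverStripe code: `reorder_option_two` and the two dictionaries standing in for the draft and live tables are assumptions made for the example.

```python
def reorder_option_two(draft, live, moved_id, new_position):
    """Simulate option 2: only the moved object gets a new version.

    `draft` and `live` map object ID to its Sort value on each stage.
    The moved object is changed on draft only (pending publish); every
    other affected object is updated directly on BOTH stages.
    """
    old_position = draft[moved_id]
    for item_id, position in list(draft.items()):
        if item_id == moved_id:
            continue
        if new_position <= position < old_position:
            shift = 1
        elif old_position < position <= new_position:
            shift = -1
        else:
            continue
        draft[item_id] = position + shift
        live[item_id] = live[item_id] + shift
    # The live Sort of the moved object stays stale until it is published.
    draft[moved_id] = new_position


draft = {1: 1, 2: 2, 3: 3, 4: 4, 5: 5}
live = dict(draft)
reorder_option_two(draft, live, moved_id=4, new_position=2)
print([draft[i] for i in range(1, 6)])  # [1, 3, 4, 2, 5]
print([live[i] for i in range(1, 6)])   # [1, 3, 4, 4, 5]
```

The live stage ends up with two objects sharing Sort 4, so the published order of pages 3 and 4 is no longer well defined until page 4 is published.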
Another example:
Initial list:
A B C D E F G H I J
Move I to beginning:
I A B C D E F G H J
Move B to end:
I A C D E F G H J B
Move H to after A:
I A H C D E F G J B
Move I again:
A H C D E I F G J B
Final sort values (only moved objects have been updated):
Observations:
->write on the moved object and a multi-row update on the draft and live tables of the object)
3) Direct update everything (No versioning)
Update all objects with direct queries
Solution
I don't know.
Here are some ideas I can think of:
We advocate a solution
From simplest to hardest:
We write some code/extension that handles re-ordering collections of versioned objects
Some notes on this:
$page->moveAfter($pageId)?)