Support for grouped changesets of draft modifications #290

ormsbee · 2025-03-19T16:02:59Z

New Draft change-tracking models

This introduces DraftChangeLog and DraftChangeLogRecord, which are mostly draft equivalents of the PublishLog and PublishLogRecord. A DraftChangeLog entry is created for every group of changes (e.g. an import or a reset), and DraftChangeLogRecord has a record for every individual publishable entity that was changed.

The motivation for these models:

Batch changes into logical groupings, e.g. "discard changes in library" or "import a course's content into this library".
Accurate history reconstruction: We don't currently track reset-to-published operations anywhere, so we can't faithfully reconstruct historical draft state purely from the timestamps of when PublishableEntityVersions were created.

Side-effects

It also introduces a new model DraftSideEffect, which should also have an equivalent PublishSideEffect. This is to capture the idea that sometimes a change in one publishable entity will affect another one, even if we don't explicitly create a new version of the affected entity.

For instance, we define containers to have unpinned references to their children. When a Unit is defined this way, the Unit's version is only updated when the Unit's own metadata (e.g. its title) changes, or when it adds, removes, or reorders some of its children. The Unit's version does not increment when a child Component is updated with new edits. However, the Unit is still affected, and is still logically part of the change or publish.

So every time a child of a container is modified, its container will be represented in the corresponding DraftChangeLog or PublishLog. In the case where only the child has been edited, the container's DraftChangeLogRecord or PublishLogRecord will show the same version for its old_version and new_version fields. This brings our backend more in line with user expectations, e.g. the Unit will be "published" whenever one of its Components is "published", even if that publish doesn't change the metadata we store for the definition of the Unit itself.
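As a rough illustration of the behaviour described above, here is a minimal in-memory sketch. The Record class, the records_with_side_effects function, and the parents and current_versions mappings are hypothetical stand-ins rather than the models or API from this PR, and propagating through every ancestor container is an assumption.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """Hypothetical stand-in for a DraftChangeLogRecord row."""
    entity: str
    old_version: int | None
    new_version: int | None


def records_with_side_effects(
    changes: dict[str, tuple[int | None, int | None]],
    parents: dict[str, list[str]],
    current_versions: dict[str, int],
) -> dict[str, Record]:
    """Return one record per entity, adding containers affected as side effects."""
    # Direct changes come first and keep their real (old, new) versions.
    records = {
        entity: Record(entity, old, new) for entity, (old, new) in changes.items()
    }
    to_visit = list(changes)
    while to_visit:
        child = to_visit.pop()
        for parent in parents.get(child, []):
            if parent in records:
                continue  # already recorded, directly or as a side effect
            # The container's own version did not change, so old == new.
            version = current_versions[parent]
            records[parent] = Record(parent, version, version)
            to_visit.append(parent)  # assumption: ancestors are affected too
    return records


result = records_with_side_effects(
    changes={"component_a": (1, 2)},
    parents={"component_a": ["unit_1"], "unit_1": ["subsection_1"]},
    current_versions={"unit_1": 3, "subsection_1": 5},
)
```

In this sketch the edited component gets a (1, 2) record, while the unit and subsection get records whose old and new versions are equal.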

For now, the only planned side-effects are that changes in child elements affect their parent containers. However, the DraftSideEffect model could be used more broadly. For instance, if multiple LTIBlocks relied on some shared common configuration, we could make it so that changing that configuration caused side-effects to be written out that affect the appropriate LTI blocks.

That being said, we're going to want to be extremely thoughtful about when and where else we might apply this. For instance, we could use this to model inheritance, but (a) that would lead to an explosion in writes; and (b) that would not correlate to what users intuitively expect (e.g. you don't expect setting a due date on a subsection to count as an "update" of all the problems inside, because due dates are not something you define at the problem level--the fact that it's read by problems through inheritance is an implementation quirk).

ormsbee · 2025-03-31T15:04:38Z

openedx_learning/apps/authoring/publishing/api.py

+@set_draft_version.register(int)
+def _(
+    publishable_entity_id: int,
+    publishable_entity_version_pk: int | None,
+    /,
+    set_at: datetime | None = None,
+    set_by: int | None = None,  # User.id
+    create_transaction: bool = True,
+) -> None:
+    """
+    Alias for set_draft_version taking PublishableEntity.id instead of a Draft.
+    """


Just wanted to call out that this is the first time I'm introducing this pattern into openedx-learning (or really anywhere in Open edX code that I'm aware of), where we use functools.singledispatch to give multiple versions of the same function and switch based on the first parameter type. Wanted to get people's thoughts.

Very very cool. I like this a lot more than taking a big Union of types and then having to handle them with a bunch of if statements at the top of the function. It's also nice because it encourages (a) static typing and (b) having one canonical version of a function, and then one or more "aliases" which just lightly wrap the canonical version.

I like that it encourages static typing, but it seems to make the code longer and slightly harder to follow compared to just having different names for the different versions of each function.

Okay, a major knock against it is that it seems to deeply confuse Pylance, making it so that the auto-complete is useless. The more I read about singledispatch and its handling in various tools, the more it seems to fall in the category of "this is a weird thing that has to be special cased everywhere". I'm leaning towards actually making it "a big Union of types and then having to handle them with a bunch of if statements at the top of the function" at this point because of this. It's "ugly" but it's simple and tooling-friendly.

ormsbee · 2025-03-31T15:06:29Z

openedx_learning/apps/authoring/publishing/api.py

+    if set_at is None:
+        set_at = datetime.now(tz=timezone.utc)
+
+    tx_context = atomic() if create_transaction else nullcontext()


New convention: optionally creating a transaction, based on passed in parameters. This is to avoid unnecessarily opening transactions when we're being invoked from things that already open transactions.
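A minimal sketch of this convention, runnable without Django. The atomic() function here is a fake that only records events, standing in for django.db.transaction.atomic; set_version, set_many, and EVENTS are hypothetical names used for illustration.

```python
from contextlib import contextmanager, nullcontext

EVENTS: list[str] = []


@contextmanager
def atomic():
    """Fake transaction: records begin/commit/rollback instead of using a DB."""
    EVENTS.append("BEGIN")
    try:
        yield
    except Exception:
        EVENTS.append("ROLLBACK")
        raise
    else:
        EVENTS.append("COMMIT")


def set_version(entity_id: int, version: int, create_transaction: bool = True) -> None:
    """Open a transaction only when the caller has not already opened one."""
    tx_context = atomic() if create_transaction else nullcontext()
    with tx_context:
        EVENTS.append(f"write {entity_id}->{version}")


def set_many(versions: dict[int, int]) -> None:
    """Bulk caller: one outer transaction, inner calls skip opening their own."""
    with atomic():
        for entity_id, version in versions.items():
            set_version(entity_id, version, create_transaction=False)


set_many({1: 2, 3: 4})
bulk_events = list(EVENTS)

EVENTS.clear()
set_version(5, 6)
single_events = list(EVENTS)
```

With real Django, a nested atomic() block creates a savepoint, so skipping it inside an existing transaction avoids that extra work.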

ormsbee · 2025-03-31T15:21:24Z

openedx_learning/apps/authoring/publishing/contextmanagers.py

+        with bulk_draft_changes_for(learning_package.id):
+            for section in course:
+                update_section_drafts(learning_package_id, section)


Another new/different thing I'm doing in this PR: Using a context manager to allow parts of the publishing API to access the active DraftChangeLog.
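To make the idea concrete, here is a simplified, non-thread-safe sketch of a context manager that exposes an "active" change log to lower-level functions. ChangeLog, bulk_changes_for, get_active_change_log, and set_version are hypothetical names, and having nested calls join the outer log is an assumption, not a statement about the real implementation.

```python
from contextlib import contextmanager
from dataclasses import dataclass, field


@dataclass
class ChangeLog:
    """Toy change log: maps entity id -> (old_version, new_version)."""
    learning_package_id: int
    records: dict = field(default_factory=dict)


_ACTIVE: dict[int, ChangeLog] = {}  # active log per learning package
SAVED: list[ChangeLog] = []         # stands in for rows written to the DB


def get_active_change_log(learning_package_id):
    return _ACTIVE.get(learning_package_id)


@contextmanager
def bulk_changes_for(learning_package_id):
    existing = _ACTIVE.get(learning_package_id)
    if existing is not None:
        # Assumption: a nested call joins the outer change log.
        yield existing
        return
    log = ChangeLog(learning_package_id)
    _ACTIVE[learning_package_id] = log
    try:
        yield log
        SAVED.append(log)  # only reached if the block did not raise
    finally:
        del _ACTIVE[learning_package_id]


def set_version(learning_package_id, entity_id, old, new):
    """Record a change in the active log, or in a one-off log if none is active."""
    with bulk_changes_for(learning_package_id) as log:
        # Keep the first old_version seen so the record is cumulative.
        first_old = log.records.get(entity_id, (old, new))[0]
        log.records[entity_id] = (first_old, new)


with bulk_changes_for(1):
    set_version(1, 100, 1, 2)
    set_version(1, 100, 2, 3)
    set_version(1, 200, None, 1)

set_version(1, 300, 1, 2)  # no active context: wrapped in a one-off log
```

The two edits to entity 100 collapse into a single cumulative record, and the final call outside any context produces its own separate log.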

ormsbee · 2025-03-31T15:27:01Z

openedx_learning/apps/authoring/publishing/models/draft_published.py

@@ -1,95 +0,0 @@
-"""


I split the two models in this module into draft_log.py and publish_log.py.

ormsbee · 2025-03-31T15:28:07Z

openedx_learning/apps/authoring/publishing/api.py

+    active_change_log = DraftChangeLogContext.get_active_draft_change_log(learning_package_id)
+
+    # If there's an active DraftChangeLog, we're already in a transaction, so
+    # there's no need to open a new one.
+    if active_change_log:
+        tx_context = nullcontext()
+    else:
+        tx_context = bulk_draft_changes_for(
+            learning_package_id, changed_at=reset_at, changed_by=reset_by
+        )


A more extreme version of optional transaction creation. Before this, I was adding a lot of redundant logic in reset_drafts_to_published to avoid the overhead of calling set_draft_version a bunch of times.

ormsbee · 2025-03-31T15:40:48Z

openedx_learning/apps/authoring/publishing/models/publish_log.py

        verbose_name_plural = "Publish Log Records"
+
+
+class Published(models.Model):


I didn't make any changes to this model beyond moving it.

ormsbee · 2025-03-31T15:41:01Z

openedx_learning/apps/authoring/publishing/models/draft_log.py

+from .publishable_entity import PublishableEntity, PublishableEntityVersion
+
+
+class Draft(models.Model):


I didn't make any changes to this model beyond moving it.

ormsbee · 2025-04-01T16:46:56Z

I have a couple of tests that I still need to write around less common cases (nesting bulk_draft_changes_for calls and side-effect calculation when there are multiple layers of containers). But I'm not likely to get to that until tonight, and I'd like to get eyes on other parts of this PR sooner if possible.

There would be at least two follow-up PRs:

a small edx-platform one to send the user information to some calls (e.g. "who did this soft-delete")
one in openedx-learning that introduces a PublishSideEffect analog to DraftSideEffect.

kdmccormick

Looking really solid so far. Will continue reviewing tonight.

kdmccormick · 2025-04-02T13:22:31Z

openedx_learning/apps/authoring/publishing/models/draft_log.py

+    We have one unusual convention here, which is that if we have a
+    DraftChangeLogRecord where the old_version == new_version, it means that a
+    Draft's defined version hasn't changed, but the data associated with the
+    Draft has changed because some other entity has changed.


The current wording almost reads as if it's an accident or an edge case that old_version will sometimes equal new_version. I think that's unfair; it's really a solid model of what's happening and intuitively makes sense once you grok the whole system. Here's a suggested rewording:

Suggested change

-    We have one unusual convention here, which is that if we have a
-    DraftChangeLogRecord where the old_version == new_version, it means that a
-    Draft's defined version hasn't changed, but the data associated with the
-    Draft has changed because some other entity has changed.
+    Changes often take form of a direct change to the content of the entity, which
+    result in a bump of that entity's version, and thus new_version > old_version.
+    However, this will not always be the case: if the data associated with the Draft
+    has changed purely as a side effect of some other entity changing, then this will
+    be represented here as a change log record where new_version == old_version.

Clarifying question: There will be instances where new_version > old_version, and also there's a side-effect record, right? As an example, any time an import includes parent-child relationships, I expect that a child and its parent can both have content changes. In other words: new_version==old_version implies side-effect, but not the other way around.

Clarifying question:

OK, "Scenario 2" in your comment below confirms this very clearly. Cool!

kdmccormick · 2025-04-02T13:59:26Z

openedx_learning/apps/authoring/publishing/models/draft_log.py

+    draft_change_log = models.ForeignKey(
+        DraftChangeLog,
+        on_delete=models.CASCADE,
+        related_name="records",
+    )
+    entity = models.ForeignKey(PublishableEntity, on_delete=models.RESTRICT)
+    old_version = models.ForeignKey(
+        PublishableEntityVersion,
+        on_delete=models.RESTRICT,
+        null=True,
+        blank=True,
+        related_name="+",
+    )
+    new_version = models.ForeignKey(
+        PublishableEntityVersion, on_delete=models.RESTRICT, null=True, blank=True
+    )


If I'm reading the on_deletes correctly: deleting the changelog row will delete all its records, but deleting any PEs or PEVs which are actively referenced by a changelog record is disallowed. As an implication, it seems that pruning any given PEV becomes contingent upon first pruning the changelog records associated with it. Is that all as intended?
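The pruning implication can be shown with a small SQLite analogy. This is only a sketch: the table names are simplified stand-ins, and as far as I understand Django applies on_delete in the ORM rather than as a database constraint, so this mirrors the semantics, not the mechanism.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript(
    """
    CREATE TABLE change_log (id INTEGER PRIMARY KEY);
    CREATE TABLE entity_version (id INTEGER PRIMARY KEY);
    CREATE TABLE change_log_record (
        id INTEGER PRIMARY KEY,
        change_log_id INTEGER NOT NULL
            REFERENCES change_log(id) ON DELETE CASCADE,
        new_version_id INTEGER
            REFERENCES entity_version(id) ON DELETE RESTRICT
    );
    INSERT INTO change_log (id) VALUES (1);
    INSERT INTO entity_version (id) VALUES (10);
    INSERT INTO change_log_record (id, change_log_id, new_version_id)
        VALUES (1, 1, 10);
    """
)


def try_delete_version(version_id: int) -> bool:
    """Return True if the version row could be deleted, False if restricted."""
    try:
        conn.execute("DELETE FROM entity_version WHERE id = ?", (version_id,))
        return True
    except sqlite3.IntegrityError:
        return False


# 1. The version is referenced by a record, so deleting it is refused.
blocked = not try_delete_version(10)

# 2. Deleting the change log cascades to its records.
conn.execute("DELETE FROM change_log WHERE id = 1")
remaining = conn.execute("SELECT COUNT(*) FROM change_log_record").fetchone()[0]

# 3. With the record gone, the version can now be pruned.
allowed_after_prune = try_delete_version(10)
```

The order of operations matches the question above: the change log records have to go before the versions they reference can be removed.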

@kdmccormick and I talked about this a bit offline (notes here), but the upshot is that I'm going to keep this for now and re-evaluate how important pruning is going to be down the line. We had a couple of thoughts around hybrid pruning where the publishing models stay but other things are removed, as well as deeper history pruning that could remove old (and less interesting) things from the DraftChangeLog.

kdmccormick · 2025-04-02T15:15:25Z

openedx_learning/apps/authoring/publishing/models/draft_log.py

+    old_version = models.ForeignKey(
+        PublishableEntityVersion,
+        on_delete=models.RESTRICT,
+        null=True,
+        blank=True,
+        related_name="+",
+    )
+    new_version = models.ForeignKey(
+        PublishableEntityVersion, on_delete=models.RESTRICT, null=True, blank=True
+    )


Confirming my understanding--let me know if these are wrong. These might be useful as comments on this model.

It's valid for multiple DraftChangeLogRecords to exist with the same (entity, old_version, new_version), as long as draft_change_log is distinct for each one. For example, if a user is repeatedly editing a container within a unit U @ v1, we will have a series of DraftChangeLogRecords (U.v1 -> U.v1), (U.v1 -> U.v1), ... , (U.v1 -> U.v1).

We cannot assume new_version >= old_version, because discarding changes will be modelled as setting the Draft pointer to an older version--specifically, the last-published version.

Both of those statements are correct. I'll add comments for them.

I rewrote a lot of the docstring for DraftChangeLogRecord in order to illustrate these and other possible scenarios.
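The scenarios confirmed in this thread can be summarised in a small helper. This is a sketch only: describe_change is a hypothetical function, and it compares plain integer version numbers, whereas the real fields are foreign keys to PublishableEntityVersion.

```python
from typing import Optional


def describe_change(old: Optional[int], new: Optional[int]) -> str:
    """Label an (old_version, new_version) pair from a change log record."""
    if old is None and new is None:
        return "no draft before or after"
    if old is None:
        return "created"
    if new is None:
        return "deleted"
    if new == old:
        # Usually a side effect, e.g. a child of this container changed.
        return "version unchanged"
    if new < old:
        # E.g. discarding changes resets the draft to the published version.
        return "moved to an older version"
    return "moved to a newer version"


labels = [
    describe_change(None, 1),
    describe_change(1, 2),
    describe_change(1, 1),
    describe_change(3, 1),
    describe_change(2, None),
]
```

Because the same (entity, old_version, new_version) triple can appear in many change logs, the label says nothing about how often a change happened, only what kind it was.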

kdmccormick · 2025-04-02T15:25:20Z

openedx_learning/apps/authoring/publishing/api.py

+            _create_container_side_effects_for_draft_change(change)
+
+
+@set_draft_version.register(int)


Suggested change

-@set_draft_version.register(int)
+@set_draft_version.register

Looks like you can omit (int) here since the first argument is type-annotated.

I don't understand why, but it doesn't infer/dispatch properly if I don't include (int). A bunch of the tests break with error messages that show that it called the base version expecting a Draft with an int argument, e.g.:

        tx_context = atomic() if create_transaction else nullcontext()
        with tx_context:
    >       old_version_id = draft.version_id
    E       AttributeError: 'int' object has no attribute 'version_id'

Weird, gotcha. I might poke at this if I have time, but no need to block on it.

ormsbee · 2025-04-02T18:43:04Z

Self note: It's a super-edge case, but if someone uses one bulk_draft_changes_for to add and then delete an entity, we'd get an (old_version, new_version) entry of (None, None), which doesn't really mean anything because the Draft state was the same before and after and it was not a side-effect. Another weird edge case is if someone edits an entry to make it go from v1 -> v2 and then sets the draft version back to v1, in which case we'd get (v1, v1), but again with no side-effect.

Possible remedy: Make it so that our callback to generate side-effects first prunes out the entries that look like they would end up as side-effects, but which can't be because we haven't generated them yet.
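One way to picture that remedy, as a sketch only: prune_noop_records is a hypothetical helper operating on a plain dictionary of entity id to (old_version, new_version), not the code that landed in this PR.

```python
def prune_noop_records(records: dict) -> dict:
    """Drop direct-change records whose old and new versions are equal.

    Intended to run before side effects are generated, so that any
    remaining old == new records can only come from side effects.
    """
    return {
        entity_id: (old, new)
        for entity_id, (old, new) in records.items()
        if old != new
    }


direct_changes = {
    "added_then_deleted": (None, None),  # net effect: nothing happened
    "edited_then_reverted": (1, 1),      # v1 -> v2 -> v1
    "really_edited": (1, 2),
}
pruned = prune_noop_records(direct_changes)
```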

bradenmacdonald · 2025-04-03T20:20:12Z

A lot of our API methods, like create_unit_version() as one random example, accept an optional parameter like created_by: int | None = None,. But because it's optional, I've noticed that we haven't been very thorough in populating those fields when using Learning Core APIs within edx-platform. e.g. many of these libraries APIs don't even accept a user parameter so don't pass anything in to Learning Core.

I'm wondering if we can auto-set the created_by and similar fields from the current DraftChangeLogContext changed_by value. We'd still have a problem of making sure edx-platform explicitly creates a DraftChangeLogContext with the current user set, but there'd be a lot less manual passing of user IDs around through each level. Maybe the same for the changed_at values too.

bradenmacdonald · 2025-04-03T21:25:04Z

openedx_learning/apps/authoring/publishing/models/draft_log.py

+
+class DraftSideEffect(models.Model):
+    """
+    Model to track when a change in one Draft affects other Drafts.


This phrasing is confusing to me (is it accurate? because neither cause nor effect have any relationship to the Draft model that I can see). Could we say something like "Model to track when a draft change to an entity implicitly affects other entities such as parent containers" ?

I'll change the wording. I was thinking about it in the sense that the act of changing what Draft.version points to (by calling set_draft_version, e.g. a component going from version 1 to version 2) affects the thing that a different row's Draft.version points to (the container).

ormsbee · 2025-04-03T21:54:51Z

A lot of our API methods, like create_unit_version() as one random example, accept an optional parameter like created_by: int | None = None,. But because it's optional, I've noticed that we haven't been very thorough in populating those fields when using Learning Core APIs within edx-platform. e.g. many of these libraries APIs don't even accept a user parameter so don't pass anything in to Learning Core.

Yeah, I have an edx-platform branch where I added created_by in a number of places.

I'm wondering if we can auto-set the created_by and similar fields from the current DraftChangeLogContext changed_by value. We'd still have a problem of making sure edx-platform explicitly creates a DraftChangeLogContext with the current user set, but there'd be a lot less manual passing of user IDs around through each level. Maybe the same for the changed_at values too.

Yeah, that sounds useful. I'll play around with it.

ormsbee · 2025-04-09T17:50:16Z

@kdmccormick, @bradenmacdonald: This should be in a fully reviewable state now. Two high level things:

I'm punting on implicitly setting created_at/created_by. I agree that it'd be a nice use of the context, but this PR is already larger than I'm comfortable with.
I got rid of the singledispatch call and used isinstance for switching between Draft objects and id. It's not as elegant, but it doesn't mess up any of the tooling, and I think it's easier for people to understand.
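A sketch of what the isinstance-based approach can look like. Draft and DRAFTS are toy stand-ins and the signature is simplified; this is not the final code from the PR.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class Draft:
    """Toy stand-in for the Draft model (not the real Django model)."""
    entity_id: int
    version_id: Optional[int] = None


# Toy in-memory "table" of drafts keyed by entity id (an assumption).
DRAFTS: dict = {}


def set_draft_version(draft_or_id: Union[Draft, int], version_id: Optional[int]) -> None:
    """Accept either a Draft or an entity id and normalise at the top."""
    if isinstance(draft_or_id, Draft):
        draft = draft_or_id
    else:
        draft = DRAFTS.setdefault(draft_or_id, Draft(draft_or_id))
    draft.version_id = version_id


existing = Draft(entity_id=1)
set_draft_version(existing, 10)
set_draft_version(2, 20)
```

Since there is a single ordinary function with a Union-typed first parameter, editors and type checkers see one signature and need no special handling.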

I have a companion edx-platform PR that I need to rebase and update (it mostly just passes a few extra args, like who is resetting something).

ormsbee · 2025-04-11T02:51:55Z

The edx-platform counterpart to this PR is: openedx/edx-platform#36513

ormsbee · 2025-04-11T16:59:40Z

Rebasing this now (need to account for the new migration added to the publishing app).

kdmccormick · 2025-04-16T19:12:45Z

openedx_learning/apps/authoring/publishing/api.py

+    Each publishable entity that is edited in this context will be tied to a
+    single DraftChangeLogRecord, representing the cumulative changes made to
+    that entity. Upon closing of the context, side effects of these changes will
+    be calculated, which may result in more DraftChangeLogRecords being created
+    or updated. The resulting DraftChangeLogRecords and DraftChangeSideEffects
+    will be tied together into a single DraftChangeLog, representing the
+    collective changes to the learning package that happened in this context.
+    All changes will be committed in a single atomic transaction.
+
+    Example::
+
+        with bulk_draft_changes_for(learning_package.id):
+            for section in course:
+                update_section_drafts(learning_package.id, section)
+
+    If you make a change to an entity *without* using this context manager, then
+    the individual change (and its side effects) will be automatically wrapped
+    in a one-off change context. For example, this::
+
+        update_one_component(component.learning_package.id, component)
+
+    is identical to this::
+
+        with bulk_draft_changes_for(component.learning_package.id):
+            update_one_component(component.learning_package.id, component)


@ormsbee @bradenmacdonald About to merge with this new docstring. Lmk if you have any suggested edits, or we can edit it later if I merge before you get a chance to review.

ormsbee · 2025-04-16T19:56:48Z

@kdmccormick: Thank you for pushing this PR over the line!

ormsbee force-pushed the draft_log2 branch 2 times, most recently from bca7747 to 0ac4531 Compare March 31, 2025 22:10

ormsbee marked this pull request as ready for review April 1, 2025 16:39

ormsbee mentioned this pull request Apr 11, 2025

feat: record the user for library content writes openedx/edx-platform#36513

Merged

ormsbee and others added 17 commits April 16, 2025 13:41

temp: assorted pylint/mypy fixups, because my brain isn't working well enough for anything else at the moment

6cc7dc2

docs: rewrite docstring for DraftChangeLog to be more comprehensive

7e5b928

temp: be more specific in notation

5023ebd

test: fix mypy errors

554b265

test: more linting fixes

7f558aa

chore: import sorting

f6d6192

fix: handle edge case around resetting to previous draft versions, add comments

3aa8360

temp: minor comment fixes

6c895ef

test: add tests

141a4fb

temp: linter fixups

3b8a273

temp: add more type annotations

fce7154

fix: broken test was passing in entity version ID isntead of entity ID

951afb6

refactor: update for rebase-introduced migration and API change

643758d

refactor: no, really, changing the migrations now

7b65c81

chore: no_pii annotations on draft models

7063f5d

fix: enrich type annotation of exit_callbacks

26cbd1e

docs: get_containers_with_entity TODO followup link

a4edf0d

kdmccormick force-pushed the draft_log2 branch from 3aca6cb to a4edf0d Compare April 16, 2025 17:42

kdmccormick added 2 commits April 16, 2025 14:36

refactor: simplify active_change_log conditional

b0f911a

docs: rm outdated comment

c7aa0a2

kdmccormick approved these changes Apr 16, 2025

docs: document bulk_draft_changes_for

bbc4dc2

kdmccormick force-pushed the draft_log2 branch from b70a1b7 to bbc4dc2 Compare April 16, 2025 19:17

build: increment version, 0.22 -> 0.23

36642ea

kdmccormick merged commit 443c3d6 into openedx:main Apr 16, 2025
11 checks passed

ormsbee deleted the draft_log2 branch April 16, 2025 19:56

bradenmacdonald mentioned this pull request Apr 17, 2025

Plan for backend changes to support Units in Libraries openedx/frontend-app-authoring#1697

Closed

16 tasks

ormsbee mentioned this pull request Sep 4, 2025

version bump 0.28.0 open-craft/openedx-learning#20

Closed
