Skip to content

Conversation

@ormsbee
Copy link
Contributor

@ormsbee ormsbee commented Dec 4, 2023

This commit normalizes the Media/MIME type in RawContent to be a foriegn key to a new table rather than a text string. This was done for a couple of reasons:

  1. Recent problems with edx-platform's courseware_studentmodule table have driven home the point that we should try to reduce index sizes where possible, and RawContent is likely to grow into a very large table in time.
  2. Media type designations can sometimes change, and that's much easier to do as a data migration if it's in its own lookup table, rather than affecting millions of rows in RawContent.

In order to support these changes, I've also added LRU caching for MediaType lookup... which in turn required a test helper that will clear out the values between test calls. That's the rationale behind our new openedx_learning.lib.test_utils.TestCase.

One major thing that this PR does that is a big unusual at this stage is that it rewrites the database migrations for oel_contents and oel_components. I thought this was safe because we haven't actually included these apps into edx-platform yet, and it was cleaner than writing migration code on a non-nullable field.

This commit normalizes the Media/MIME type in RawContent to be a foriegn
key to a new table rather than a text string. This was done for a couple
of reasons:

1. Recent problems with edx-platform's courseware_studentmodule table
   have driven home the point that we should try to reduce index sizes
   where possible, and RawContent is likely to grow into a very large
   table in time.
2. Media type designations can sometimes change, and that's much easier
   to do as a data migration if it's in its own lookup table, rather
   than affecting millions of rows in RawContent.

In order to support these changes, I've also added LRU caching for
MediaType lookup... which in turn required a test helper that will clear
out the values between test calls. That's the rationale behind our new
openedx_learning.lib.test_utils.TestCase.
@ormsbee ormsbee force-pushed the normalize-mimetype branch from 082f313 to 1a2a27f Compare December 4, 2023 19:28
@ormsbee
Copy link
Contributor Author

ormsbee commented Dec 4, 2023

@kdmccormick, @bradenmacdonald: The MIME type refactoring I mentioned earlier.

@ormsbee ormsbee requested a review from kdmccormick December 4, 2023 19:29
Copy link
Contributor

@bradenmacdonald bradenmacdonald left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks nice! I didn't test the code, but I read it and this makes sense to me.

No problem changing existing migrations if they're not in use yet.


# Second call pulls from cache instead of the database
with self.assertNumQueries(0):
contents_api.get_media_type_id(mime_type_str)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: It would be good to check the actual return value too, to make sure it's the same as the prior call. I guess other things are covered by how media_type is used throughout the test suite / app ?

if self.suffix:
return f"{base}+{self.suffix}"
return base

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: It's not necessary but it could be nice to have a clean function that forces the field values to lowercase, and checks for invalid characters or at least disallows /

@ormsbee ormsbee merged commit b629e49 into openedx:main Dec 5, 2023
@ormsbee ormsbee deleted the normalize-mimetype branch December 5, 2023 14:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants