feat: adds content tagging app, models, and API #32518

pomegranited · 2023-06-20T07:32:05Z

Description

Adds the models, APIs, and permissions/rules for openedx.features.content_tagging, which builds on the oel_tagging app added by openedx/openedx-learning#57.

These changes are the first step towards allowing Course Authors to tag content objects (courses, units, libraries) in edx-platform, but none of these changes are user-facing yet.

Supporting information

Closes openedx/modular-learning#63

Testing instructions

This change should be entirely covered by unit tests (still have a few lines left to cover).

But since there are no public-facing APIs yet, if you want to do any further verification, you can test the functionality from the devstack and shell.

Get the master openedx/devstack up and running.
Pull this branch into edx-platform
Install dependencies and update migrations by running paver install_prereqs in the LMS shell (from devstack dir, run make lms-shell) and CMS shell (make studio-shell).
Run the migrations added here by running ./manage.py lms migrate in the LMS shell.

Create file cms/envs/private.py which contains the following. You may need to restart Studio for these changes to take effect: make dev.restart-devserver.studio

from .devstack import FEATURES
# Org-based course authoring restrictions
FEATURES['ENABLE_CREATOR_GROUP'] = True
FEATURES['STUDIO_REQUEST_EMAIL'] = 'your@email.com'

Go to the Studio home page and sign up a new user: http://localhost:18010/
Expand the "Becoming a Course Creator in Studio" box to request course creation access (this is optional -- an "unrequested" entry is created even if you don't request one, and you can approve that below.)
Login to Studio Django Admin in another window as a superuser (http://localhost:18010/admin, edx@example.com / edx)
Create a new Organization, e.g. with Name: OpenCraft, Short Name: OC
Navigate to Course creators (http://localhost:18010/admin/course_creators/coursecreator/), mark the request granted, and select the OpenCraft organization as the only org you can create courses for.

From here, you'll have to play around in the shell, because we only have Django Admin capabilities for ContentTaxonomies right now, and you have to be global staff to access Django Admin, which doesn't let us test out our org restrictions.

For example:

# make studio-shell
root@lms:/edx/app/edxapp/edx-platform# ./manage.py cms shell

from django.contrib.auth import get_user_model
# user you created
org_user = get_user_model().objects.get(username='openedx')
# demo users automatically created
learner = get_user_model().objects.get(username='honor')
staff_user = get_user_model().objects.get(username='staff')

from openedx.features.content_tagging import models, rules, api

from organizations.models import Organization
oc = Organization.objects.get(short_name="OC")
org_taxonomy = api.create_taxonomy(name="OpenCraft taxonomy", org_owners=[oc])
org_taxonomy
# <ContentTaxonomy: ContentTaxonomy object (2)>
assert org_user.has_perm('oel_tagging.add_taxonomy', org_taxonomy)
assert org_user.has_perm('oel_tagging.change_taxonomy', org_taxonomy)
assert not learner.has_perm('oel_tagging.add_taxonomy')
assert not learner.has_perm('oel_tagging.change_taxonomy', org_taxonomy)
assert not learner.has_perm('oel_tagging.add_taxonomy', org_taxonomy)
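
The assertions above exercise org-scoped permissions. As a rough plain-Python illustration of the kind of check they imply (this is not the code in content_tagging/rules.py; the User and Taxonomy dataclasses, the creator_orgs attribute, and can_change_taxonomy are hypothetical stand-ins, and the rule itself is an assumption inferred from the assertions):

```python
from dataclasses import dataclass, field


@dataclass
class User:
    """Hypothetical stand-in for a Django user."""
    username: str
    is_staff: bool = False
    # Short names of the orgs this user may create content for (assumption).
    creator_orgs: set = field(default_factory=set)


@dataclass
class Taxonomy:
    """Hypothetical stand-in for a taxonomy owned by one or more orgs."""
    name: str
    org_owners: set = field(default_factory=set)


def can_change_taxonomy(user: User, taxonomy: Taxonomy) -> bool:
    """Assumed rule: global staff, or a content creator for an owning org."""
    if user.is_staff:
        return True
    return bool(user.creator_orgs & taxonomy.org_owners)


org_taxonomy = Taxonomy("OpenCraft taxonomy", org_owners={"OC"})
org_user = User("openedx", creator_orgs={"OC"})
learner = User("honor")
staff_user = User("staff", is_staff=True)
```

Under these assumptions, org_user and staff_user pass the check for org_taxonomy while learner does not, which mirrors the has_perm assertions in the shell session.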

Deadline

None

Other information

Depends on openedx/openedx-learning#57 and openedx/openedx-learning#60

Data migrations added here can be easily rolled back.

Course creator groups added by #26616.

Author to do before merge:

Update dependency: https://github.com/openedx/edx-platform/pull/32518/files#r1243237190
Finish test coverage: https://github.com/openedx/edx-platform/pull/32518/checks?check_run_id=14577165965

openedx-webhooks · 2023-06-20T07:32:10Z

Thanks for the pull request, @pomegranited! Please note that it may take us up to several weeks or months to complete a review and merge your PR.

Feel free to add as much of the following information to the ticket as you can:

supporting documentation
Open edX discussion forum threads
timeline information ("this must be merged by XX date", and why that is)
partner information ("this is a course on edx.org")
any other information that can help Product understand the context for the PR

All technical communication about the code itself will be done via the GitHub pull request interface. As a reminder, our process documentation is here.

Please let us know once your PR is ready for our review and all tests are green.


pomegranited · 2023-06-27T07:05:54Z

requirements/edx/github.in

+# FIXME JV - move to base.in once pypi package created
+git+https://github.com/open-craft/openedx-learning.git@jill/taxonomy-api#egg=openedx-learning==0.1


To do: deploy a release for openedx-learning to pypi and move this to base.in.

openedx/features/content_tagging/models.py

lms/envs/common.py

openedx/features/content_tagging/api.py

openedx/features/content_tagging/models.py

bradenmacdonald · 2023-06-27T20:48:42Z

openedx/features/content_tagging/models.py

+        return queryset
+
+
+class ContentTaxonomy(Taxonomy):


I'm not convinced that it makes sense to make this a subclass of Taxonomy. Because it's not a true subclass, right? When ObjectTag calls self.taxonomy.validate_object_tag(self), there's no virtual dispatch involved. Regardless of whether there's a ContentTaxonomy or not, the self.taxonomy property always returns a Taxonomy instance, and only the base class's validate_object_tag() will be called. You would need to use something like django-polymorphic to get this to work (which is not what I am recommending; it's too much magic and we have enough dependencies already).

Oh bummer. You're absolutely right. I had fooled myself into thinking this was ok in my tests, because when I created the invalid tags, their taxonomies are ContentTaxonomy instances. But they're not after they're refetched from the database :(

I guess I have to rethink this architecture then..
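
To make the problem concrete, here is a toy, plain-Python analogy of what happens (it is not Django code; the TAXONOMY_ROWS store and the save, load_taxonomy, and refetch helpers are invented for illustration). The key assumption it models is that following the relation from an ObjectTag always builds a base Taxonomy, whatever class originally created the row:

```python
class Taxonomy:
    def __init__(self, pk, name):
        self.pk = pk
        self.name = name

    def validate_object_tag(self, object_tag):
        return "Taxonomy rules"


class ContentTaxonomy(Taxonomy):
    def validate_object_tag(self, object_tag):
        return "ContentTaxonomy rules"


# Toy "database": a saved row keeps column values only, not the Python class.
TAXONOMY_ROWS = {}


def save(taxonomy):
    TAXONOMY_ROWS[taxonomy.pk] = {"name": taxonomy.name}


def load_taxonomy(pk):
    # Mimics following a relation that points at Taxonomy: the result is
    # always built as the base class.
    return Taxonomy(pk, TAXONOMY_ROWS[pk]["name"])


class ObjectTag:
    def __init__(self, taxonomy):
        self.taxonomy_id = taxonomy.pk
        self._cached_taxonomy = taxonomy  # the instance assigned in memory

    @property
    def taxonomy(self):
        if self._cached_taxonomy is None:
            self._cached_taxonomy = load_taxonomy(self.taxonomy_id)
        return self._cached_taxonomy

    def refetch(self):
        """Drop the in-memory instance, as if reloaded from the database."""
        self._cached_taxonomy = None

    def validation_used(self):
        return self.taxonomy.validate_object_tag(self)


content_taxonomy = ContentTaxonomy(pk=2, name="OpenCraft taxonomy")
save(content_taxonomy)
tag = ObjectTag(content_taxonomy)
before = tag.validation_used()  # subclass instance is still cached in memory
tag.refetch()
after = tag.validation_used()   # rebuilt as the base class
```

In this sketch the tag uses the ContentTaxonomy validation only while the original instance is cached; after a refetch it silently falls back to the base class, which is how a test can pass while the real behavior is wrong.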

Oh, I think we can have the same problem with system-defined taxonomies

Is there a way to check whether a model is an instance of ContentTaxonomy (with isinstance, maybe)?

I remember that validate_object_tag was originally meant to live in the API. Could we check the instance type there and call one function or the other accordingly?

The exact same thing has happened to me before... sorry I didn't spot this sooner.

I didn't suggest any solutions because I didn't want to bias you one way or another, but I think there are still some nice ways to achieve a similar architecture without too many changes. One thing to think about for example would be if there is only one database/django model for Taxonomy but it has a field called type and that type determines the runtime behavior, and loads a different class or plugin to handle the implementation details. Basically, put your subclasses and hierarchy into pure python classes and keep the SQL data model simpler or even the same for all of them. But there are other ways to go too, that's just one idea.

@bradenmacdonald @pomegranited

It still uses only one query, but that query does an additional JOIN per registered subclass. So whether or not this is acceptable may depend on how many subclasses there are. If it's 1-2, should be fine. But if there are many, the query performance will be slower, even though it's still technically a single query
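
A rough way to see the cost is to model multi-table inheritance with raw SQL. The sketch below uses the standard-library sqlite3 module; the table and column names are illustrative assumptions, and the query is a hand-written approximation of the "one LEFT JOIN per subclass" shape, not the SQL that select_subclasses actually emits:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE taxonomy (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE contenttaxonomy (
        taxonomy_ptr_id INTEGER PRIMARY KEY REFERENCES taxonomy(id)
    );
    CREATE TABLE systemdefinedtaxonomy (
        taxonomy_ptr_id INTEGER PRIMARY KEY REFERENCES taxonomy(id)
    );
    INSERT INTO taxonomy VALUES
        (1, 'Plain'), (2, 'OpenCraft taxonomy'), (3, 'Language');
    INSERT INTO contenttaxonomy VALUES (2);
    INSERT INTO systemdefinedtaxonomy VALUES (3);
    """
)

# Every registered subclass adds one more table to join against.
SUBCLASS_TABLES = ["contenttaxonomy", "systemdefinedtaxonomy"]


def build_query(subclass_tables):
    """Build a single SELECT with one LEFT JOIN per subclass table."""
    columns = ["t.id", "t.name"]
    columns += [f"{table}.taxonomy_ptr_id" for table in subclass_tables]
    joins = [
        f"LEFT JOIN {table} ON {table}.taxonomy_ptr_id = t.id"
        for table in subclass_tables
    ]
    return (
        f"SELECT {', '.join(columns)} FROM taxonomy AS t "
        f"{' '.join(joins)} ORDER BY t.id"
    )


def fetch_with_subclass(connection, subclass_tables):
    """Return (id, name, most specific table) for every taxonomy row."""
    resolved = []
    for pk, name, *child_ids in connection.execute(build_query(subclass_tables)):
        kind = "taxonomy"
        for table, child_id in zip(subclass_tables, child_ids):
            if child_id is not None:
                kind = table
        resolved.append((pk, name, kind))
    return resolved


query = build_query(SUBCLASS_TABLES)
rows = fetch_with_subclass(conn, SUBCLASS_TABLES)
```

The query stays a single round trip, but its JOIN count grows linearly with the number of subclass tables, which is the trade-off being discussed.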

Adding more context with this in mind. For system-defined taxonomies, we were thinking of creating a subclass for each taxonomy, since each one had its special way of validating (validate_object_tag). With the current approach, in the short term it would already be slow.

But, I've been thinking that at the system taxonomies level, the type field (Language, Author, etc) could be implemented to determine which validation to run. That way we would only have the SystemDefinedTaxonomy subclass and it would not be necessary to create a subclass for each Taxonomy.

So far we will have two subclasses: ContentTaxonomy and SystemDefinedTaxonomy

@pomegranited that seems fine with me in principle, as long as the code and APIs are carefully constructed to make sure select_subclasses is used. But this is really more of an @ormsbee question so I'd like to hear his thoughts.

I haven't used it before, but as long as the number of subclasses is relatively small, I'm fine with a few joins.

Adding more context with this in mind. For system-defined taxonomies, we were thinking of creating a subclass for each taxonomy, since each one had its special way of validating (validate_object_tag). With the current approach, in the short term it would already be slow.

What are the other types of validation that have to happen on ObjectTag for other taxonomy types? The reason I ask is that I think it's a little odd for ObjectTags to look to their parent Taxonomy subclass to validate themselves, basically looking upward into the thing containing them. It also seems like unnecessary coupling, since we might want to tag more than one type of thing with the same Taxonomy. In fact we're effectively already doing that here with CourseKey and UsageKey.

Maybe it's ObjectTag that needs the subclassing? It could be made abstract, and have subclasses like CourseTag, BlockTag, etc? And instead of having a completely generic object_id field, we make the object_id return a different real field value in the various subclasses (sort of like what pk does for Django models)? Then we could have the actual field type there–e.g. a CourseKeyField.
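
A minimal plain-Python sketch of that shape, assuming nothing about the eventual models: the class names follow the suggestion above, but the constructor arguments and the example key strings are purely illustrative, and real Django models would declare fields (e.g. a CourseKeyField) rather than plain attributes:

```python
from abc import ABC, abstractmethod


class ObjectTag(ABC):
    """Abstract tag: each subclass decides which real field backs object_id."""

    def __init__(self, value):
        self.value = value

    @property
    @abstractmethod
    def object_id(self):
        """Return the identifier of the tagged object."""


class CourseTag(ObjectTag):
    def __init__(self, value, course_key):
        super().__init__(value)
        self.course_key = course_key  # stand-in for a typed course key field

    @property
    def object_id(self):
        return self.course_key


class BlockTag(ObjectTag):
    def __init__(self, value, usage_key):
        super().__init__(value)
        self.usage_key = usage_key  # stand-in for a typed usage key field

    @property
    def object_id(self):
        return self.usage_key


tags = [
    CourseTag("Physics", "course-v1:OC+PHY101+2023"),
    BlockTag("Physics", "block-v1:OC+PHY101+2023+type@vertical+block@unit1"),
]
object_ids = [tag.object_id for tag in tags]
```

Callers can treat every tag uniformly through object_id while each subclass keeps a properly typed field, much like pk resolves to whichever field is the primary key.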

What are the other types of validation that have to happen on ObjectTag for other taxonomy types?

In the case of system-defined taxonomies, the validations that are added are more on the tag side than on the object.

Language: Verify if the language is on Django LANGUAGES

Organization: Validate that the organization exists and is active.

Author: Validate that the User exists and is active.

Later we can add more taxonomies that have their own validation
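
As a sketch of how a single type field could select these validations without one model subclass per taxonomy (plain Python; the LANGUAGES, ORGANIZATIONS, and USERS stand-in data, the VALIDATORS registry, and validate_tag_value are all hypothetical, and real code would consult Django settings and models instead):

```python
from dataclasses import dataclass

# Stand-in data (assumptions), replacing Django settings and database lookups.
LANGUAGES = [("en", "English"), ("es", "Spanish")]
ORGANIZATIONS = {"OC": {"active": True}, "OLD": {"active": False}}
USERS = {"staff": {"is_active": True}, "retired": {"is_active": False}}


def validate_language(value):
    """The language code must appear in LANGUAGES."""
    return value in {code for code, _name in LANGUAGES}


def validate_organization(value):
    """The organization must exist and be active."""
    org = ORGANIZATIONS.get(value)
    return org is not None and org["active"]


def validate_author(value):
    """The user must exist and be active."""
    user = USERS.get(value)
    return user is not None and user["is_active"]


# The stored type string picks the validator at runtime.
VALIDATORS = {
    "language": validate_language,
    "organization": validate_organization,
    "author": validate_author,
}


@dataclass
class SystemDefinedTaxonomy:
    name: str
    type: str  # hypothetical field holding the taxonomy type

    def validate_tag_value(self, value):
        validator = VALIDATORS.get(self.type)
        if validator is None:
            raise ValueError(f"No validator registered for type {self.type!r}")
        return validator(value)


language_taxonomy = SystemDefinedTaxonomy("Language", "language")
org_taxonomy = SystemDefinedTaxonomy("Organization", "organization")
author_taxonomy = SystemDefinedTaxonomy("Author", "author")
```

Adding a new system taxonomy then means registering one more validator rather than adding another table to join against.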

@ormsbee @bradenmacdonald

It still uses only one query, but that query does an additional JOIN per registered subclass. So whether or not this is acceptable may depend on how many subclasses there are. If it's 1-2, should be fine.

Yeah, as @ChrisChV noted, there's going to be a lot of these, so using InheritanceManager like I've done in openedx/openedx-learning#60 is not a good idea.

The reason I ask is that I think it's a little odd for the ObjectTag to look to their parent Taxonomy subclass to validate themselves–basically looking upward into the thing containing them.

I agree that ObjectTag.is_valid is an awkward-looking property -- I added it as a convenience method, because we're going to need the REST API to return an is_valid flag for object tags, so that the frontend can highlight invalid tags for the content authors to correct.

In the design stage, we pushed the ObjectTag validation into the Taxonomy class so we could avoid having to subclass both Taxonomies and ObjectTags, and link them together somehow (see below for a proposed way to do this). This made Taxonomy.validate_object_tag() strange, but it was intended to keep the class complexity down. But that decision was clearly flawed.

Maybe it's ObjectTag that needs the subclassing? It could be made abstract, and have subclasses like CourseTag, BlockTag, etc?

Ok sure -- there weren't any requirements in the spec that made those particular distinctions necessary, however..

We could subclass ObjectTag to encapsulate the "closed taxonomy tags" vs "free text tags" distinction. To do this, we need a way to connect a Taxonomy with its appropriate ObjectTag subclass. To try this out, I drafted open-craft/openedx-learning#2; see Taxonomy.object_tag_classes for an example of how this could work.

There, I also allowed for custom ObjectTag classes stored with a Taxonomy that could be used to validate system tags for the different system taxonomy use cases listed above. We could replace the SystemTaxonomy subclasses and their validation with class methods on ObjectTag subclasses instead. Python is bad at polymorphism across class methods, but with care, we can do it, e.g.:

Taxonomy subclass LanguageTaxonomy.get_tags() was going to return a dynamically-determined list of Tags created from the django settings.LANGUAGES.
LanguageObjectTag could implement a class method to do the same thing.

AuthorTaxonomy has too many available tags to fetch them all in a list, so it would implement autocomplete_tags() instead, and just return the first N dynamically-created Tags with the given prefix.
AuthorObjectTag could implement a class method to do the same thing.
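
The two cases above could look roughly like this in plain Python. This is only a sketch of the class-method approach: the LANGUAGES and USERNAMES stand-in data and the method signatures are assumptions, tags are reduced to plain strings, and the draft in open-craft/openedx-learning#2 may differ:

```python
from itertools import islice

# Stand-in data (assumptions) in place of Django settings and the user table.
LANGUAGES = [("en", "English"), ("es", "Spanish"), ("fr", "French")]
USERNAMES = ["alice", "alfred", "albert", "bob", "staff"]


class ObjectTag:
    @classmethod
    def get_tags(cls):
        """Return every available tag value for this kind of tag."""
        raise NotImplementedError(f"{cls.__name__} must implement get_tags()")

    @classmethod
    def autocomplete_tags(cls, prefix, count=10):
        """Default: filter the full list; cls dispatches to the subclass."""
        matches = [tag for tag in cls.get_tags() if tag.startswith(prefix)]
        return matches[:count]


class LanguageObjectTag(ObjectTag):
    @classmethod
    def get_tags(cls):
        # Tags are derived dynamically from the configured languages.
        return [name for _code, name in LANGUAGES]


class AuthorObjectTag(ObjectTag):
    @classmethod
    def get_tags(cls):
        raise NotImplementedError("Too many authors to list; use autocomplete_tags()")

    @classmethod
    def autocomplete_tags(cls, prefix, count=10):
        # Return only the first `count` matches instead of materializing all.
        matches = (name for name in sorted(USERNAMES) if name.startswith(prefix))
        return list(islice(matches, count))


language_tags = LanguageObjectTag.get_tags()
spanish_matches = LanguageObjectTag.autocomplete_tags("S")
author_matches = AuthorObjectTag.autocomplete_tags("al", count=2)
```

Because each class method receives cls, the inherited autocomplete_tags on LanguageObjectTag calls that subclass's get_tags(), which is the "with care" polymorphism mentioned above.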

So, what do you think of something like open-craft/openedx-learning#2 instead?

Also, if we do this, it'd be best to stop returning ObjectTag Django Models from the API altogether, and instead use the models as data to populate proper python classes and subclasses, as @bradenmacdonald originally suggested.

If we find that we still need Taxonomy subclasses, we could add a Taxonomy.taxonomy_class field+property like Taxonomy.object_tag_classes, and load these Taxonomy subclasses on the fly in the API using a generator, as shown with get_object_tags().

@ormsbee @bradenmacdonald @ChrisChV

Ok, this is looking really good. I'm going to close this PR in favor of #32661, so can we continue this conversation there?

openedx/features/content_tagging/rules.py


* updates requirement to use WIP branch for openedx-learning
* import get_taxonomy and get_taxonomies from the oel_tagging API
* rename content_tagging.api.get_taxonomies to get_taxonomies_for_org
* adds tests to verify that API returns expected Taxonomy subclasses
* adds tests to verify object tag validation uses the correct Taxonomy subclass

pomegranited · 2023-07-06T04:22:10Z

Closed in favor of #32661

openedx-webhooks · 2023-07-06T04:22:14Z

@pomegranited Even though your pull request wasn’t merged, please take a moment to answer a two question survey so we can improve your experience in the future.

openedx-webhooks added the open-source-contribution PR author is not from Axim or 2U label Jun 20, 2023

pomegranited mentioned this pull request Jun 26, 2023

Adds Taxonomy, Tag, ObjectTag models and APIs openedx/openedx-learning#57

Merged

3 tasks

pomegranited force-pushed the jill/add-content-tagging branch 3 times, most recently from cf85dbb to 21ab993 Compare June 26, 2023 22:59

pomegranited added 3 commits June 27, 2023 12:36

feat: adds content tagging app, models, admin, and API

2f70278

refactor: moves is_content_creator

d27c18c

from cms.djangoapps.contentstore.helpers to common.djangoapps.student.auth

feat: adds permissions/rules for content_tagging

456979a

pomegranited force-pushed the jill/add-content-tagging branch from 21ab993 to 456979a Compare June 27, 2023 04:04

pomegranited marked this pull request as ready for review June 27, 2023 07:02

pomegranited requested review from a team and bradenmacdonald June 27, 2023 07:02

pomegranited commented Jun 27, 2023


bradenmacdonald reviewed Jun 27, 2023


pomegranited added 7 commits June 28, 2023 22:29

style: group tagging INSTALLED_APPS together

3ade910

perf: rearrange ContentTaxonomy.validate_object_tag

d51e148

to do the low-effort checks first.

style: import oel_tagging API methods more concisely

d0c446b

style: adds type annotations to rules

a0e33e1

fix: filter ContentTaxonomies by org, independent of the enabled flag

7278da0

Merge branch 'master' into jill/add-content-tagging

df61b49

pomegranited mentioned this pull request Jun 29, 2023

Ensure the correct Taxonomy subclass is used openedx/openedx-learning#60

Closed

pomegranited closed this Jul 6, 2023

pomegranited mentioned this pull request Jul 10, 2023

Allow custom Taxonomy, ObjectTag subclasses to customize tagging behavior openedx/openedx-learning#62

Merged

pomegranited deleted the jill/add-content-tagging branch November 1, 2024 02:14


5 participants
