Skip to content

Conversation

@ChrisChV
Copy link
Contributor

This PR adds an ADR to define the representation of the tags and the pagination in the single taxonomy view api

Reference issue: openedx/modular-learning#75

@openedx-webhooks openedx-webhooks added the open-source-contribution PR author is not from Axim or 2U label Aug 11, 2023
@openedx-webhooks
Copy link

openedx-webhooks commented Aug 11, 2023

Thanks for the pull request, @ChrisChV! Please note that it may take us up to several weeks or months to complete a review and merge your PR.

Feel free to add as much of the following information to the ticket as you can:

  • supporting documentation
  • Open edX discussion forum threads
  • timeline information ("this must be merged by XX date", and why that is)
  • partner information ("this is a course on edx.org")
  • any other information that can help Product understand the context for the PR

All technical communication about the code itself will be done via the GitHub pull request interface. As a reminder, our process documentation is here.

Please let us know once your PR is ready for our review and all tests are green.

**Cons**

- The children would not have pagination, in the long run there may be cases in which
the branch has hundreds of children, and they would still all be brought.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we need to mitigate this, we could enforce a "maximum number of tags per branch" during tag import.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would not be sure about this, if for the MVP we are already thinking of importing large taxonomies, it would not work to set limits (even if they are large)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can ask the product folks, but I don't think this restriction is going to work. This whole approach assumes that there are a lot of root tags and very few children per root, but I think the opposite will generally be more common: only a few root tags and potentially thousands of child or grandchild tags. Think of the tree of life used in Biology - it has only three root tags (Bacteria, Achaea, and Eukaryota) but there are over 1 million species in the taxonomy. Of course that's an extreme case, but I think that shape of tree is more common than many root tags and few children. See also LabXchange, or Amazon product categories, or anything like that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh Yes, you are right, it makes sense. I think we can mitigate the "Cons" of "Get the branch in another call"


- The children would not have pagination, in the long run there may be cases in which
the branch has hundreds of children, and they would still all be brought.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with your chosen option @ChrisChV , that makes total sense.

There's two more use cases implied by the UI:

tag_search_sort

Sort A-Z or Z-A

This sorts the top-level tags, not the child tags underneath (cf figma comment).

Search for tags in a taxonomy

This one's a little more complicated since it could be done a couple of ways.

Options I see:

  1. We support tag search on the backend, and return a subset of matching tags in the format proposed here.
    It means a backend API hit every time someone searches for a tag, but will scale.
  2. We constrain the number of tags allowed in a taxonomy for MVP, so that the API can return all the tags in one page.
    This allows the frontend to handle tag search all on its own, which is performant, but doesn't scale.

@ChrisChV and @bradenmacdonald How do you think we should handle this?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pomegranited What's the eventual limit on taxonomy size going to be, post-MVP? If it's fairly high I think we should just do the backend version now. If we're never going to allow huge taxonomies, let's do frontend.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bradenmacdonald Yeah, I don't think there's an upper bound planned for post-MVP, and there's already huge taxonomies being discussed, as you noted. So I agree: using the backend for search is the safest (and simplest) option.

Copy link
Contributor Author

@ChrisChV ChrisChV Aug 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bradenmacdonald @pomegranited Considering taxonomies like this, I also agree with the search option in the backend. But another concern arises, in the interface to tag an object, will all the tags be brought in as initially thought? I think this page should use this same view (with pagination) and the same search

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ChrisChV Given the huge taxonomies shown, I think we'll have to find a different approach for that page. We can't bring in all the tags as we initially thought. It would be too big.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking of using the same view for both screens, since the screen to tag an object must also be paginated, I'm going to ask UX about this

@mphilbrick211
Copy link

Hi @ChrisChV! Is this pull request ready for review?

@ChrisChV
Copy link
Contributor Author

Hi @mphilbrick211, yes it's ready for review. @bradenmacdonald It's ready for your review

Copy link
Contributor

@bradenmacdonald bradenmacdonald left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I like most of your ideas but I'm not sure if this approach is going to work. Let's discuss with UX/product and see what they say.

**Cons:**

- In the UI there is the functionality *Expand all*, another view would have to
be made to handle this functionality in a scalable way.
Copy link
Contributor

@bradenmacdonald bradenmacdonald Aug 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I personally like the "Get the branch in another call" approach the most. It's simple: first load only the first page of root tags, and then the user can load additional pages of root tags or expand any root tag and load [the first page of] its children. And so on, recursively.

If the main con is

In the UI there is the functionality Expand all, another view would have to be made to handle this functionality in a scalable way.

I think we can easily work around that by disabling the "Expand all" option for taxonomies that have > 1,000 tags.

Likewise, for taxonomies with < 1,000 tags we can pre-load all the data from the whole taxonomy.

But don't go changing the ADR just yet - let's ask on Slack for input.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I asked on Figma.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated the document with what was discussed


- In the UI there is the functionality *Expand all*, another view would have to
be made to handle this functionality in a scalable way.
- A user could make many calls; every time a parent is opened.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's fine, these calls should be extremely fast and performant to serve.

**Cons**

- The children would not have pagination, in the long run there may be cases in which
the branch has hundreds of children, and they would still all be brought.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can ask the product folks, but I don't think this restriction is going to work. This whole approach assumes that there are a lot of root tags and very few children per root, but I think the opposite will generally be more common: only a few root tags and potentially thousands of child or grandchild tags. Think of the tree of life used in Biology - it has only three root tags (Bacteria, Achaea, and Eukaryota) but there are over 1 million species in the taxonomy. Of course that's an extreme case, but I think that shape of tree is more common than many root tags and few children. See also LabXchange, or Amazon product categories, or anything like that.

- List of root tags that can be expanded to show children tags.
- This list can be sorted alphabetically: A-Z (default) and Z-A
- The user can expand all root tags.
- The user can search for tags.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to clarify in this ADR how this search is going to work, and how it interacts with pagination.

My suggestion is something like this:

  1. On the backend when serving this API, if there is a search term like "cat", filter all the tags to only those that contain the string "cat" as well as their ancestor tags.
  2. If the total number that matched is less than ~200, send them all to the frontend in a JSON tree structure.
  3. If the total number that matched is more than that, send only [the first page of] root tags to the frontend. Each root tag can be expanded by another API call, and/or additional pages of root tags can be loaded.

But let's discuss this with product and UX team before making too many changes here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bradenmacdonald I have updated the document with what was discussed.

I've been thinking, if we are going to return complete taxonomies that have less than 1000 tags, we can do a search in the frontend with these taxonomies. I have not added it to the document, because I think that many conditionals are already being added and it will complicate the implementation. What do you think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think so too. Although I also don't want to complicate things too much by having two different implementations of search, based on the taxonomy size. Maybe there's some way we can make the "< 1,000 so load all and search via frontend" just a subset / special case of the general version. Something like "if all tags are loaded into the frontend, do a local filter, else do a server-side search and refresh the tags shown" would work in both cases.

@bradenmacdonald
Copy link
Contributor

@mphilbrick211 Is the "FC" / funded contribution label missing from this repo, or has it been removed in general? This is an FC PR.

@mphilbrick211
Copy link

Hi @bradenmacdonald - looks like the label isn't in this repo.

For FC projects, would you mind please putting the FC-XXXX ID number in the PR title? That will also help me know which project it is.

@bradenmacdonald bradenmacdonald changed the title docs: ADR for pagination and repr of single taxonomy view api docs: ADR for pagination and repr of single taxonomy view api [FC-0030] Aug 22, 2023
@bradenmacdonald bradenmacdonald added the FC Relates to an Axim Funded Contribution project label Aug 22, 2023
@bradenmacdonald
Copy link
Contributor

@mphilbrick211 Yep, we try to do that generally, sorry we missed it here. I've added it now and created the FC label too.

Copy link
Contributor

@bradenmacdonald bradenmacdonald left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, just some minor cleanups of the text. Are you happy with this @ChrisChV ?

It was taken into account that taxonomies commonly have the following characteristics:

- It have few root tags.
- It have a very large number of children for each tag.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- It have a very large number of children for each tag.
- It may have a very large number of children for each tag.

@bradenmacdonald
Copy link
Contributor

FYI @ormsbee , please review this ADR for pagination of tags if you are interested.

@ChrisChV
Copy link
Contributor Author

Hi @bradenmacdonald, I think we can merge this ADR

@bradenmacdonald bradenmacdonald merged commit ea311e1 into openedx:main Sep 12, 2023
@openedx-webhooks
Copy link

@ChrisChV 🎉 Your pull request was merged! Please take a moment to answer a two question survey so we can improve your experience in the future.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

FC Relates to an Axim Funded Contribution project open-source-contribution PR author is not from Axim or 2U

Projects

Archived in project

Development

Successfully merging this pull request may close these issues.

5 participants