[Discussion] Tag Usage & Tagging Strategy #73

Closed
BethanyG opened this issue Feb 19, 2020 · 5 comments
Labels
needs discussion (The fix for this issue needs discussion), question (Further information is requested)

Comments

@BethanyG
Member

BethanyG commented Feb 19, 2020

Please review this PR comment as background: #69 (comment)


The TL;DR

We are going to be tagging literally everything, and now is probably the time to have a discussion about our thoughts and architecture around it.

Some Background

  • We've decided to use a Django add-in library called taggit to manage tagging for our backend API endpoints. We currently use the default configuration.

  • Taggit's default configuration uses Django's contenttypes framework and the Django ORM's generic relations to model a many-to-many relationship between the items being tagged and their tags.

  • We can change this behavior to be more direct and more explicit by using what is called a through model, but this requires tracking tags for, say, Resources in a resources_tags table, and tags for Hangouts in a hangouts_tags table.
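
To make the two layouts concrete, here is a minimal sketch that uses Python's built-in sqlite3 module in place of the Django ORM. The table names follow the ones mentioned in this thread, but the column names, the content type id, and the sample rows are simplified assumptions for illustration, not the exact schema taggit generates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE resources_resource (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE taggit_tag (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    CREATE TABLE django_content_type (
        id INTEGER PRIMARY KEY, app_label TEXT, model TEXT
    );

    -- Generic layout (the default): one shared table. The tagged row is
    -- located through the (content_type_id, object_id) pair, not a real FK.
    CREATE TABLE taggit_taggeditem (
        id INTEGER PRIMARY KEY,
        tag_id INTEGER REFERENCES taggit_tag(id),
        content_type_id INTEGER REFERENCES django_content_type(id),
        object_id INTEGER
    );

    -- Through-model layout: one table per tagged model, with a direct FK.
    CREATE TABLE resources_tags (
        id INTEGER PRIMARY KEY,
        tag_id INTEGER REFERENCES taggit_tag(id),
        content_object_id INTEGER REFERENCES resources_resource(id)
    );

    INSERT INTO resources_resource VALUES (1, 'Intro to Django');
    INSERT INTO taggit_tag VALUES (1, 'python'), (2, 'web');
    INSERT INTO django_content_type VALUES (7, 'resources', 'resource');
    INSERT INTO taggit_taggeditem (tag_id, content_type_id, object_id)
        VALUES (1, 7, 1), (2, 7, 1);
    INSERT INTO resources_tags (tag_id, content_object_id)
        VALUES (1, 1), (2, 1);
""")


def tags_generic(resource_id):
    """Tags for a resource via the shared table: needs the content type join."""
    rows = conn.execute(
        """
        SELECT t.name
        FROM taggit_tag t
        JOIN taggit_taggeditem ti ON ti.tag_id = t.id
        JOIN django_content_type ct ON ct.id = ti.content_type_id
        WHERE ct.app_label = 'resources' AND ct.model = 'resource'
          AND ti.object_id = ?
        ORDER BY t.name
        """,
        (resource_id,),
    )
    return [name for (name,) in rows]


def tags_through(resource_id):
    """Tags for a resource via the per-model table: a single direct join."""
    rows = conn.execute(
        """
        SELECT t.name
        FROM taggit_tag t
        JOIN resources_tags rt ON rt.tag_id = t.id
        WHERE rt.content_object_id = ?
        ORDER BY t.name
        """,
        (resource_id,),
    )
    return [name for (name,) in rows]
```

Both functions return the same tags for the same resource. The difference is where the link lives: the generic layout needs the content type lookup for every query, while the through layout needs a new table for every model we decide to tag.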

What this means

(not an exhaustive list of points, but a start)

  • We can "tag" any model (resources, hangouts, people, discussions, notes) in a relatively straightforward manner by adding a "field" that points to taggit's TaggableManager class. See this for basic usage. The TaggableManager takes care of looking up and saving the appropriate entries into the three associated "general" tables used for tagging.

  • We can use common taggit code to pull associated tags for an item, update those tags, or remove those tags, without having to write additional ORM code.

  • We can use common code to serialize and de-serialize tags for our API.

  • Pulling tags across many items or endpoints could get expensive, due to the generic relations used. See this warning from the docs.

  • Someone cannot easily trace the relationship between the models being tagged and the tags/taggeditems/contenttypes from looking at the DB: there are no explicit links from, for example, the resources_resource table to the taggit_taggeditem table.

  • Custom serializer code had to be written to properly serialize the tags into the format we wanted for Resources. We'll need to use the same code for any other endpoint that uses tags, and that code may still have bugs in it.
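
The traceability point can be checked at the database level. The sketch below again uses plain sqlite3 with a simplified, assumed schema (not taggit's exact DDL) and asks SQLite which tables each tagging table declares foreign keys to. The helper name referenced_tables is made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE resources_resource (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE taggit_tag (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    CREATE TABLE django_content_type (
        id INTEGER PRIMARY KEY, app_label TEXT, model TEXT
    );

    -- Generic layout: object_id is a bare integer with no FK constraint.
    CREATE TABLE taggit_taggeditem (
        id INTEGER PRIMARY KEY,
        tag_id INTEGER REFERENCES taggit_tag(id),
        content_type_id INTEGER REFERENCES django_content_type(id),
        object_id INTEGER
    );

    -- Through-model layout: the tagged row is an explicit FK target.
    CREATE TABLE resources_tags (
        id INTEGER PRIMARY KEY,
        tag_id INTEGER REFERENCES taggit_tag(id),
        content_object_id INTEGER REFERENCES resources_resource(id)
    );
""")


def referenced_tables(table):
    """Return the sorted names of tables that `table` has foreign keys to."""
    rows = conn.execute(f"PRAGMA foreign_key_list({table})").fetchall()
    # Index 2 of each foreign_key_list row is the referenced table name.
    return sorted({row[2] for row in rows})


generic_links = referenced_tables("taggit_taggeditem")
through_links = referenced_tables("resources_tags")
```

In this sketch, generic_links contains only django_content_type and taggit_tag, so nothing in the schema points at resources_resource. through_links does include resources_resource, which is the explicit link a through model would give us.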

Some Questions

(again, not exhaustive, but a start)

  • Do we re-configure taggit to use explicit relations for each item being tagged? Pros? Cons?

  • Do we want an explicit Tags endpoint to handle things like translating, filtering, pulling relations, etc.?

  • What makes sense for how we'll be using tags?

  • Because of the queries involved, will we be setting ourselves up for performance issues?

  • Some additional related concerns in the conversation here: Issues 43 & 67 #69 (comment)

  • Some side debate about generic relations here

@BethanyG added the question and needs discussion labels Feb 19, 2020
@BethanyG BethanyG mentioned this issue Feb 19, 2020
@lpatmo
Member

lpatmo commented Feb 20, 2020

Thanks for these detailed notes and questions! :) Will let these questions sink in/give this some more thinking before I try to add to the discussion.

@BethanyG
Member Author

Additional thought about a common tags table: context may or may not be harder to track. As a "quick-and-dirty" example, a "nighttime" tag may be relevant in the Hangouts context, but completely useless in the Resources context, and might have to be filtered out. Or not. Depending on how the tables/queries/endpoints are set up.

In any case, scenarios like the one above are what I'd love to suss out and discuss before we add in additional complexity for other endpoints.
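
As a rough illustration of the "nighttime" scenario, here is one way a shared tags table could still be filtered per context. It is a sketch in plain sqlite3 with a simplified, assumed schema; the hangouts/hangout content type, the ids, and the sample tags are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE taggit_tag (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    CREATE TABLE django_content_type (
        id INTEGER PRIMARY KEY, app_label TEXT, model TEXT
    );
    CREATE TABLE taggit_taggeditem (
        id INTEGER PRIMARY KEY,
        tag_id INTEGER REFERENCES taggit_tag(id),
        content_type_id INTEGER REFERENCES django_content_type(id),
        object_id INTEGER
    );

    INSERT INTO taggit_tag VALUES
        (1, 'nighttime'), (2, 'python'), (3, 'beginner');
    INSERT INTO django_content_type VALUES
        (7, 'resources', 'resource'), (8, 'hangouts', 'hangout');
    INSERT INTO taggit_taggeditem (tag_id, content_type_id, object_id) VALUES
        (1, 8, 10),  -- nighttime on a hangout
        (2, 7, 1),   -- python on a resource
        (3, 7, 1),   -- beginner on a resource
        (3, 8, 10);  -- beginner on a hangout
""")


def tags_in_context(app_label, model):
    """Names of tags that are attached to at least one item of this type."""
    rows = conn.execute(
        """
        SELECT DISTINCT t.name
        FROM taggit_tag t
        JOIN taggit_taggeditem ti ON ti.tag_id = t.id
        JOIN django_content_type ct ON ct.id = ti.content_type_id
        WHERE ct.app_label = ? AND ct.model = ?
        ORDER BY t.name
        """,
        (app_label, model),
    )
    return [name for (name,) in rows]
```

With this data, "nighttime" shows up for hangouts but not for resources, while "beginner" shows up in both. Whether that kind of filtering belongs in a query, in a dedicated Tags endpoint, or in per-model tables is exactly the open question.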

@bengineerdavis

bengineerdavis commented Feb 26, 2020

Can we format tags as their own models? That way, we could create appropriate associations through foreign keys since each API currently is a direct derivative/translation of a Django model. Would this be an option for us?

@BethanyG
Member Author

BethanyG commented Mar 4, 2020

@bengineerdavis - TL;DR it's complicated.

Tags are already their own "model" -- they just aren't attached to what they are tagging as a Foreign Key relation. Not right now.

This Medium post helped me think through things a bit. As did this Django documentation on generic keys and contenttypes.

Finally, we went the route of customizing taggit (you can see the several failed PRs and the crazy stuff there) because we needed to make Unicode render better in tags and add a GUID for each tag, so this taggit documentation goes through a bit of that.

And it was ... involved, at least it was for me.

So I think we are in a state now where we want to think through any more hidden bombs, and also decide if this is what we really want to stick with, if that makes sense?

@BethanyG
Member Author

Closing this in favor of a discussion.

3 participants