Implement quota tracking options per ObjectStore. #14047

Closed
wants to merge 9 commits into from

jmchilton
Member

@jmchilton jmchilton commented Jun 9, 2022

Redoing #10221 / #10977 but building on #14044.

Overview

#6552 implemented the ability for admins to assign job outputs to different object stores at runtime (this could take into account tool/workflow injected parameters or just be based on user, tool, destination, cluster state, etc.). But all the stored data would consume the same quota, regardless of the object store selected.

This pull request allows different object stores or different groups of object stores to have different quotas or no quota at all. This enables use cases such as sending jobs to cheaper storage when a user's quota is getting near full, or allowing admins to set up tool and/or workflow parameters that send job outputs to higher quality, more redundant storage based on user-selected options or user preferences.

This is a substantial step forward toward allowing scratch-space histories. While I suspect we want to implement some higher-level convenience functions and interfaces around that (per-history preferences, object store preference types), I think that would all be based on these abstractions, which allow even more flexibility for admins who require it.

Implementation

This adds the quota tag to XML/YAML object store declarations, which allows specifying a "quota source label" for each object store in a nested object store, or disabling quota altogether on an object store.

The following quota block would assign all this storage to a quota source labelled with s3.

        <backend id="dynamic_s3" type="disk" weight="0">
            <quota source="s3" />
            <files_dir path="${temp_directory}/files_dynamic_s3"/>

Whereas this would disable quota usage for this object store altogether.

        <backend id="temp_disk" type="disk" weight="0">
            <quota enabled="false" />
            <files_dir path="${temp_directory}/files_cloud_scratch"/>

To implement this, a new table/model, UserQuotaSourceUsage, has been added to track a user's usage per quota source label. Object stores that do not have a source label are still tracked using the User model's disk_usage attribute. I've updated all the scripts that recalculate user usage.
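
As a rough illustration of the routing described above, here is a minimal Python sketch. It is not the code from this PR: `QuotaConfig`, `parse_quota`, and `UsageSketch` are hypothetical names, the sample XML repeats the two excerpts above with a closing `</backend>` tag added so they parse, and treating a missing `<quota>` element as "enabled, no label" is an assumption.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict
from typing import NamedTuple, Optional


class QuotaConfig(NamedTuple):
    """Hypothetical holder for what a <quota> element declares."""
    source: Optional[str]  # None means no label: usage goes to the default counter
    enabled: bool


def parse_quota(backend_xml: str) -> QuotaConfig:
    """Read the <quota> child of a single <backend> element."""
    backend = ET.fromstring(backend_xml)
    quota = backend.find("quota")
    if quota is None:
        # Assumption: no <quota> element means quota is enabled with no label.
        return QuotaConfig(source=None, enabled=True)
    enabled = quota.get("enabled", "true").strip().lower() != "false"
    return QuotaConfig(source=quota.get("source"), enabled=enabled)


class UsageSketch:
    """Stand-in for the two places usage is recorded, per the description."""

    def __init__(self) -> None:
        self.disk_usage = 0  # plays the role of User.disk_usage
        self.quota_source_usage = defaultdict(int)  # plays the role of UserQuotaSourceUsage

    def add(self, config: QuotaConfig, size: int) -> None:
        if not config.enabled:
            return  # quota disabled: the data is stored but never counted
        if config.source is None:
            self.disk_usage += size
        else:
            self.quota_source_usage[config.source] += size


S3_BACKEND = """<backend id="dynamic_s3" type="disk" weight="0">
    <quota source="s3" />
    <files_dir path="${temp_directory}/files_dynamic_s3"/>
</backend>"""

SCRATCH_BACKEND = """<backend id="temp_disk" type="disk" weight="0">
    <quota enabled="false" />
    <files_dir path="${temp_directory}/files_cloud_scratch"/>
</backend>"""

usage = UsageSketch()
usage.add(parse_quota(S3_BACKEND), 1024)       # counted under the "s3" label
usage.add(parse_quota(SCRATCH_BACKEND), 4096)  # ignored, quota disabled
usage.add(QuotaConfig(None, True), 512)        # unlabelled store, default counter
```

In this sketch the unlabelled store is the only one that touches `disk_usage`, which mirrors the statement that unlabelled object stores keep using the existing attribute.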

UI + API

The quota dialog adds the option to pick a quota source label from those defined on the object stores, though this option only appears if quota source labels are configured.

[Screenshot: Screen Shot 2020-09-28 at 8 45 33 PM]

Likewise, by default the quota meter is unaffected but when multiple quota source labels are configured the meter becomes a link that shows the usage of each quota source.

[Screenshot: Screen Shot 2022-06-16 at 7 13 19 PM]

A new API /api/users/<user_id|current>/usage enables this.
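
For readers who want to poke at the endpoint, a small Python sketch of a client follows. Only the path comes from this PR. The helper names `usage_url` and `fetch_usage`, the example host, and the use of an `x-api-key` header for authentication are assumptions, and the response is returned as parsed JSON without assuming any particular fields.

```python
import json
import urllib.request
from urllib.parse import quote


def usage_url(base_url: str, user_id: str = "current") -> str:
    """Build the usage URL for an encoded user id, or for "current"."""
    return f"{base_url.rstrip('/')}/api/users/{quote(user_id, safe='')}/usage"


def fetch_usage(base_url: str, api_key: str, user_id: str = "current"):
    """Fetch per-quota-source usage; the response shape is not assumed here."""
    request = urllib.request.Request(
        usage_url(base_url, user_id),
        headers={"x-api-key": api_key},  # assumption: API key passed as a header
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Building the URL needs no network access; the host is a placeholder.
print(usage_url("https://galaxy.example.org/"))
```

Calling `fetch_usage("https://galaxy.example.org", "<api key>")` would perform the actual request against a server that has this change.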

Abstractions for #4840

While this PR adds significant complexity related to recalculating a User's quota, it does reduce the duplication, adds tests (made more useful by having fewer paths through the quota recalculation code), and brings object store information into the calculation. I think this is all stuff that would be needed for #4840 and is currently missing.

Part of this establishes a pattern for how to exclude certain datasets from the usage calculation, both when a dataset is being added (included in #4840) and when usage is recalculated (not included in #4840).
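
The idea of one exclusion rule shared by both paths can be sketched in Python as follows. This is an illustration only, not the implementation in this PR: `StoredDataset`, `counts_toward_quota`, `add_to_usage`, and `recalculate_usage` are hypothetical names, and the real recalculation may be done very differently (for example in SQL).

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class StoredDataset:
    """Hypothetical view of a dataset plus the quota settings of its object store."""
    size: int
    quota_source: Optional[str]  # None means the default (unlabelled) quota
    quota_enabled: bool = True


def counts_toward_quota(dataset: StoredDataset) -> bool:
    """Single exclusion rule, used both when adding and when recalculating."""
    return dataset.quota_enabled


def add_to_usage(usage: Dict[Optional[str], int], dataset: StoredDataset) -> None:
    """Incremental path: called when a dataset is added."""
    if counts_toward_quota(dataset):
        usage[dataset.quota_source] = usage.get(dataset.quota_source, 0) + dataset.size


def recalculate_usage(datasets: Iterable[StoredDataset]) -> Dict[Optional[str], int]:
    """Recalculation path: rebuild totals from scratch with the same rule."""
    usage: Dict[Optional[str], int] = {}
    for dataset in datasets:
        add_to_usage(usage, dataset)
    return usage


datasets = [
    StoredDataset(100, None),
    StoredDataset(50, "s3"),
    StoredDataset(999, None, quota_enabled=False),  # excluded on both paths
    StoredDataset(25, "s3"),
]
totals = recalculate_usage(datasets)
```

Because both paths go through `counts_toward_quota`, incremental totals and recalculated totals cannot drift apart over which datasets are excluded.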

The API endpoints for disk usage across object stores and the UI entry point for displaying that information will hopefully both enable a more robust implementation of #4840.

@jmchilton jmchilton force-pushed the quota_4 branch 2 times, most recently from dc187ac to d9eb2cf Compare June 11, 2022 16:32
@jmchilton jmchilton force-pushed the quota_4 branch 8 times, most recently from ed3b4c3 to b1665a7 Compare June 20, 2022 16:24
@jmchilton jmchilton changed the title [WIP] Implement quota tracking options per ObjectStore. Implement quota tracking options per ObjectStore. Jun 20, 2022
@github-actions github-actions bot added this to the 22.09 milestone Jun 20, 2022
Contributor

@davelopez davelopez left a comment

This looks awesome!
Just some minor comments below from my limited point of view 😅

client/src/components/ObjectStore/DescribeObjectStore.vue (outdated, resolved)
lib/galaxy/managers/users.py (outdated, resolved)
lib/galaxy/webapps/galaxy/services/history_contents.py (outdated, resolved)
lib/galaxy/webapps/galaxy/services/history_contents.py (outdated, resolved)
@bgruening
Member

This one needs a rebase unfortunately.

@mvdbeek mvdbeek marked this pull request as draft October 17, 2022 08:54
@dannon dannon modified the milestones: 23.0, 23.1 Jan 10, 2023
@jmchilton jmchilton force-pushed the quota_4 branch 3 times, most recently from 3e5bfc2 to b273ebf Compare February 14, 2023 17:10
@jmchilton
Member Author

Merged as part of #14073

@jmchilton jmchilton closed this Mar 17, 2023