Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding resource_organization_id, person_organization_id, piperson_organization_id to cloudfact tables #1956

Conversation

eiffel777
Copy link
Contributor

Description

This adds the Resource, PI, and Person organization ID columns for the cloud realm. These columns are used in the xsede module at this point in time.

Motivation and Context

We want to track the organizations that a person, resource and PI belong to.

Tests performed

Tested in docker

Checklist:

  • The pull request description is suitable for a Changelog entry
  • The milestone is set correctly on the pull request
  • The appropriate labels have been added to the pull request

@eiffel777 eiffel777 added the Category:Cloud Cloud Realm label Dec 4, 2024
@eiffel777 eiffel777 self-assigned this Dec 4, 2024
@aaronweeden aaronweeden modified the milestones: 11.0.1 ACCESS p2, 11.0.0 ACCESS p2 Dec 4, 2024
jpwhite4
jpwhite4 previously approved these changes Dec 4, 2024
Copy link
Contributor

@aaronweeden aaronweeden left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does there need to be a PR into xdmod-xsede that updates these files to add the piperson_organization_id column?

configuration/etl/etl_action_defs.d/cloud_access/access_cloud_metrics_aggregation.json
configuration/etl/etl_action_defs.d/cloud_access/access_cloud_metrics_aggregation_by_day.json
configuration/etl/etl_tables.d/cloud_jetstream/jetstream_cloudfact_by_.json
configuration/etl/etl_tables.d/cloud_jetstream/jetstream_cloudfact_by_day.json

@@ -27,8 +27,10 @@
"year": "${:YEAR_VALUE}",
"${AGGREGATION_UNIT}": "${:PERIOD_VALUE}",
"host_resource_id": "sr.resource_id",
"resource_organization_id": "sr.resource_organization_id",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like the Cloud realm uses service_provider instead of resource_organization_id — why is resource_organization_id needed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed this. It should probably change to using resource_organization_id in the future since there is a common group by for service provider that uses that column

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment on line 6 says Euca — should that be Cloud?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It probably should but I will change that some other time in a different PR. I want to keep this PR focused on just the organization ids.

…umns and remove resource_organization_id column from cloud realm
Co-authored-by: Aaron Weeden <31246768+aaronweeden@users.noreply.github.com>
@eiffel777
Copy link
Contributor Author

Does there need to be a PR into xdmod-xsede that updates these files to add the piperson_organization_id column?

configuration/etl/etl_action_defs.d/cloud_access/access_cloud_metrics_aggregation.json
configuration/etl/etl_action_defs.d/cloud_access/access_cloud_metrics_aggregation_by_day.json
configuration/etl/etl_tables.d/cloud_jetstream/jetstream_cloudfact_by_.json
configuration/etl/etl_tables.d/cloud_jetstream/jetstream_cloudfact_by_day.json

@aaronweeden This was added in https://github.com/ubccr/xdmod-xsede/pull/530

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When comparing this file with xdmod-xsede: configuration/etl/etl_action_defs.d/cloud_access/access_cloud_metrics_aggregation.json, it looks like to compute num_sessions_ended, this file uses start_event_type_id while the xdmod-xsede file uses end_event_type_id.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is best addressed in a separate PR.

@@ -29,6 +29,7 @@
"host_resource_id": "sr.resource_id",
"account_id": "sr.account_id",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this need to be COALESCE(sr.account_id, -1)? That's how it is in xdmod-xsede: configuration/etl/etl_action_defs.d/cloud_access/access_cloud_metrics_aggregation.json. It looks like there are some other fields like person_id that are going from a nullable column to a non-nullable column so there might need to be some more COALESCEs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There probably are other fields COALESCEs should be put on, but I think is better put in a new PR since that affects fields not related to this PR.

Co-authored-by: Aaron Weeden <31246768+aaronweeden@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Category:Cloud Cloud Realm
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants