-
Notifications
You must be signed in to change notification settings - Fork 68
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add cloud user and system account group by #797
Conversation
…adding people and usernames. adding updated regression and integration tests
…oud-user-group-by
…m database migration file
…ith generic cloud data to account for event_id being generated on the event table instead of generic_cloud_staging_event
…t in post_ingest_updates
*/ | ||
protected function writeJsonPartialConfigFile($directory, $name, array $data) | ||
{ | ||
$json = Json::prettyPrint(json_encode($data)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
$json = Json::prettyPrint(json_encode($data)); | |
$json = json_encode($data, JSON_PRETTY_PRINT); |
The Json::prettyPrint is terribly inefficient and only exists because the JSON_PRETTY_PRINT setting was not in php 5.3. We no longer support php 5.3 so Json::prettyPrint is deprecated and should not be used for new code.
return $file; | ||
} | ||
}); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need some error checking here. What happens if $partialConfigFile
is an empty array?
*/ | ||
public function execute() | ||
{ | ||
$this->setCloudRolesFile(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please explain why this config migration exists. You should be able to update the roles.d/cloud.json file and the RPM will install it in the config directory.
$rolesConfigFolder = $this->config->getPartialFilePaths('roles'); | ||
|
||
if($cloudFile = array_search(CONFIG_DIR."/roles.d/cloud.json", $rolesConfigFolder) === false){ | ||
throw new Exception("cloud.json file not found in roles.d folder"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens to this exception? Will it completely kill the upgrade process? If so how does the user recover?
private function addCloudRolesGroupBy($groupBy, $role) | ||
{ | ||
if(!array_key_exists($role, $this->cloudRolesFile['+roles'])){ | ||
throw new Exception("Role not found in cloud.json file"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens to this exception? Will it completely kill the upgrade process? If so how does the user recover?
"namespace": "ETL\\Ingestor", | ||
"options_class": "IngestorOptions", | ||
"truncate_destination": false, | ||
"enabled": true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
enabled: true
is the default and does not need to be specified
"name" : "Cloud DB", | ||
"config" : "datawarehouse", | ||
"schema": "mod_shredder", | ||
"create_schema_if_not_exists": "true" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the setting is the same as the default then you should not override it. E.g. create_schema_if_not_exists is specifed as true in the defaults at the top of the file.
@@ -4,7 +4,7 @@ USE modw_cloud; | |||
-- somme OpenStack events. The events that already exist in the staging and event | |||
-- need to have their mappings updated. Some of the updates include mapping the | |||
-- compute.instance.power_on.start event to POWER_ON_START event and | |||
-- compute.instance.resume.start to REQUEST_RESUME and compute.instance.resume.end | |||
-- compute.instance.resume.start to REQUEST_RESUME and compute.instance.resume.end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a general rule I prefer to not have changes that only change whitespace. This is because it makes it more difficult for code reviewers to review correctly. For example, in this case, this file shows in the list of changed files relating to adding the user and system group bys. However there is no need fo this file to change at all for the purpose of this pull request.
…oud-user-group-by
* @param string $name The config file name (without ".json"). | ||
* @param array $data The data to store in the config file. | ||
*/ | ||
protected function writeJsonPartialConfigFile($directory, $name, array $data) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question the need for another dedicated function that writes a json config file. I note that we already have a JSON::saveFile()
that does exactly the same thing. The code that calls this function already knows about how to construct the path to the file. In fact you even use the sister JSON::loadFile()
function a few lines before calling this one. See classes/OpenXdmod/Migration/Version800To810/ConfigFilesMigration.php
lines 36 and 51
Json::loadFile(CONFIG_DIR."/roles.d/cloud.json");
…ace with using JSON::savefile
…d by Ben in a previous PR
LGTM once you address @jpwhite4's outlying comment. |
…oud-user-group-by
Don't forget to squash and merge |
* adding person and username group bys for cloud data and pipeline for adding people and usernames. adding updated regression and integration tests * removing unneeded migration code. adding function write out partial config files * update config migration script and remove unneeded use statements from database migration file * documetation updates and remove erroneous commits * formatting changes * removing unneeded function from user and systemaccount group bys * moving generation of event_id to event table instead of staging table * added comments and changed query for event_asset table when dealing with generic cloud data to account for event_id being generated on the event table instead of generic_cloud_staging_event * adding trucate_destination to etl.d file instead of truncate statement in post_ingest_updates * moving post ingest sql update action back to original location in pipeline file * documetation updates * fixing style issues and updating test artifacts for unit tests * fixes for passing unit and style tests * remove extra spaces to pass unit tests * adding new group bys to test artifact * addressing comments from @jpwhite4 * removing defaults from etl pipeline files * updating tests * updating jobs_cloud_generic to be correct * updating cloud person and username tests to use updated anonymized data * removing unnecessary function to write out partial config files. replace with using JSON::savefile * re-adding hide_sql_warning_codes in jobs_cloud_generic that were added by Ben in a previous PR * changing $value to $unused to pass linter
Description
This pull request adds group by's for the person and system username for the cloud realm. The person group by will show the first and last name of a person associated with a username. The System Username group by will show the username of the person associated with the data being shown. In order to use these group by's usernames from the cloud event logs files are loaded into the modw.systemaccount and modw.person tables using pipelines added to jobs_cloud_generic.json and jobs_cloud_openstack.json. The new pipelines take actions from existing pipelines used for adding usernames from job files and modifies them for use in the cloud realm. In order for the first and last name to be shown when using the User group by the
xdmod-import-csv
command with the-t names
option should be used, which is the same method for loading full names for Jobs data.Each VM session will now have a person and system username associated with it. A sessions may have multiple events by multiple people during its run. We have choosen to use the person who started the session as the person who is responsible for the session and the resources used by it.
A new ConfigFilesMigration.php file has made for upgrading from a previous installation of xdmod as the new group by's need to be added to the roles.json file. This file checks to see if the cloud.json file exists in the CONFIG_DIR/roles.d folder and if it does the group by's are added to the public and default roles. If the cloud.json file is not found an exception is thrown. Since the cloud.json file exists in the roles.d folder the writeJsonConfigFile function in /classes/OpenXdmod/Migration/ConfigFilesMigration.php cannot be used so I added a function called writeJsonPartialConfigFile that will write a partial json config file to whatever partial config file subdirectory that is specified. When upgrading all previous events in the database will default to -1 for the person and system username. If a person re-ingests their events those events will be updated to have the correct person and system username.
The user group by for the cloud realm was already created but was not active. To activate it entries were added for in to the roles.json and datawarehouse.json files.
Tests performed
Manually tested in docker and new regressions tests for each group by have been added and the integration tests have been modified to account for the new group by's.
Types of changes
Checklist: