Modular Acls: ETL Support #97
@@ -0,0 +1,4 @@
CALL ${DESTINATION_SCHEMA}.add_module_with_version('xdmod', 'XDMoD', 6, 5, 0, '');
Is this code trying to define the version numbers of the installed modules? The version numbers specified here are never going to be correct since this code will not be backported to the 6.5.0 release.
That is definitely what the code is doing. One clarification I'd like to make is that the versions are stored historically ( in the module_versions table with a reference to the current version in modules ). So we can either leave the call as it currently is while adding a line to call add_module_version, thereby having a record of both versions, or we can just update it to whatever the current version is just prior to integrating the code into the system.
Surely the lines that define version numbers for the (optional) modules should be stored in the source code repos for the respective modules. We also already define the version number in the build.json files. Does this change mean that we now have to maintain the same information in two different locations?
I definitely agree that the data specific to a particular module should reside within that module. That was certainly an oversight on my part and I'll work on moving the module-specific data into its respective modules. As for duplicating data in build.json and somewhere else, that's something I definitely don't want to do. I'll see if there's a way we can utilize the module's build.json information in constructing the module creation sql. I'm currently working with Jeff to figure out the best way to have these sql statements executed on module install.
@@ -14,7 +14,7 @@
}
}
},
"acls": [
"acls-xdmod-management": [
{
"#": "AclTableManagement is meant to be run more than once in it's lifetime. Doing so will keep the associated tables up to date.",
Please state when the acl-xdmod-management is intended to be run. I assume that you should run it every time the software is updated to a new version or a new module is installed.
I'll update the description to state more exactly when it is intended to be run, which is basically whenever the structure of one of the tables it manages changes. A new module installation may not trigger a run, as hopefully it is just inserting data into the existing tables, not modifying their structure.
@@ -16,7 +20,7 @@ public function handle()
to input the values to the best of your knowledge.
The phrase "to the best of your knowledge" is redundant.
bin/acl-xdmod-import
Outdated
fi

# Parse the command line arguments.
while [[ $# -gt 0 ]]
How come you did not use getopt to parse the options?
Because this bit of code accomplished what I needed. Are there some common cases where this will fall down / I should look at getopt?
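For comparison, a minimal sketch of the same kind of loop written with the POSIX getopts built-in (the external getopt utility from util-linux additionally handles long options). The option letters here, -p with a pipeline name and -g as a plain flag, and the function name parse_args are assumptions for illustration, not the script's actual interface. Two cases where a hand-rolled loop commonly needs extra care are bundled short options such as -gpNAME and an option whose required argument is missing; getopts covers both.

```sh
# Hypothetical option parser: -p <pipeline> selects a pipeline, -g sets a flag.
# Runs in a subshell so its variables do not leak into the caller.
# Prints the parsed values; exits with status 2 on a usage error.
parse_args() (
    pipeline=""
    flag_g=0
    OPTIND=1
    # The leading ':' in the option string enables silent error reporting,
    # so missing arguments and unknown options are handled in the case below.
    while getopts ":p:g" opt; do
        case "$opt" in
            p) pipeline="$OPTARG" ;;
            g) flag_g=1 ;;
            :) echo "option -$OPTARG requires an argument" >&2; exit 2 ;;
            \?) echo "unknown option -$OPTARG" >&2; exit 2 ;;
        esac
    done
    echo "pipeline=$pipeline g=$flag_g"
)
```

For example, parse_args -gpacls-import prints pipeline=acls-import g=1, while parse_args -p fails with status 2.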
@@ -96,7 +96,7 @@ public function initialize(EtlOverseerOptions $etlOverseerOptions = null)
$etlTableKey = key($this->etlDestinationTableList);
if ( count($this->etlDestinationTableList) > 1 ) {
$msg = $this . " does not support multiple ETL destination tables, using first table with key: '$etlTableKey'";
$logger->warning($msg);
$this->logger->warning($msg);
This looks like a bug fix for ETL code. If so then does it really belong in this pull request?
-- currently supported modules.
-- =============================================================================

CALL ${DESTINATION_SCHEMA}.add_module_with_version('xdmod', 'XDMoD', 6, 5, 0, '');
Again. I would like to see this file being autogenerated by the build process.
With the latest commit, files in the configuration folder ( as specified on a module by module basis ) can now take advantage of templating to get access to module version information. This includes the whole version string as well as each part of the version broken out so it can be referenced independently.
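As an illustration of the idea, a minimal sketch that derives the call from a single version string so the numbers are not maintained by hand. The function name render_module_call is hypothetical, the argument order is inferred from the add_module_with_version call quoted above, and the version string is not validated; the real templating lives in the build process and may look quite different.

```sh
# Hypothetical helper: split MAJOR.MINOR.PATCH[-PRERELEASE] and print the
# matching add_module_with_version call. Runs in a subshell to keep its
# variables local. No validation is performed on the version string.
render_module_call() (
    name="$1"; display="$2"; version="$3"
    case "$version" in
        *-*) pre="${version#*-}"; core="${version%%-*}" ;;
        *)   pre=""; core="$version" ;;
    esac
    major="${core%%.*}"      # text before the first dot
    rest="${core#*.}"        # text after the first dot
    minor="${rest%%.*}"
    patch="${rest#*.}"
    printf "CALL \${DESTINATION_SCHEMA}.add_module_with_version('%s', '%s', %s, %s, %s, '%s');\n" \
        "$name" "$display" "$major" "$minor" "$patch" "$pre"
)
```

Calling render_module_call xdmod XDMoD 6.5.0 prints the same CALL line that appears in the diff.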
06dd549 to 703707f
This latest commit moves as much data as is feasible out of the sql files and into json files ( although the sql files are still present in case they are needed for reference / alternative methods ). I'm going to begin looking at how we can hook into the build process and snag the module version so that we don't have to have it defined in more than one place.
classes/OpenXdmod/Build/Config.php
Outdated
{
$MAJOR = 1;
$MINOR = 2;
$MICRO = 3;
conventionally this should be called patch I believe (http://semver.org/)
'micro' ( and its associated function ) have been renamed 'patch' as per semver.
bd7c2e8 to 356b0cf
},
"acls-xdmod-management": [
{
"#": "AclTableManagement is meant to be run more than once in it's lifetime. Doing so will keep the associated tables up to date.",
Please change this comment to be more specific about when the AclTableManagement should be run.
For example:
AclTableManagement should always be run after any changes are made to the definition files listed below.
["usr","metric_explorer",null,null,null],
["usr","report_generator",null,null,null],
["usr","about_xdmod",null,null,null],
["usr","app_kernels",null,null,null],
We should not reference optional sub-modules in the main framework
Absolutely, this has been resolved in the recent commits.
["about_xdmod","About","10000","0","XDMoD.Module.About","CCR.xdmod.ui.aboutXD","","About",null, "xdmod" ],
["job_viewer","Job Viewer","5000","0","XDMoD.Module.JobViewer","CCR.xdmod.ui.jobViewer","View detailed job-level metrics","Job Viewer",null, "xdmod" ],
["app_kernels","App Kernels","400","0","XDMoD.Module.AppKernels","CCR.xdmod.ui.appKernels","Displays data reflecting the reliability and performance of grid resources","App Kernels", null,"xdmod" ],
["app_kernel_viewer","App Kernel Viewer","100","0","","","","","app_kernels","xdmod" ],
Should not see app kernel tabs in here.
Absolutely, this has been resolved in the recent commits.
["my_allocations","Allocations","350","0","XDMoD.Module.Allocations","CCR.xdmod.ui.AllocationViewer","Displays your allocation usage","Allocations Tab",null, "xdmod" ],
["compliance","Compliance","2000","0","XDMoD.Module.Compliance","CCR.xdmod.ui.complianceTab","","Compliance Tab",null, "xdmod" ],
["custom_query","Custom Queries","3000","0","XDMoD.Module.CustomQueries","CCR.xdmod.ui.customQuery","","Custom Queries Tab",null, "xdmod" ],
["sci_impact","Sci Impact","4000","0","XDMoD.Module.SciImpact","CCR.xdmod.ui.impact","Scientific Impact by user, organization, and project","Sci Impact Tab",null, "xdmod"]
Should not see sci impact tab here. They are xsede-specific
Absolutely, this has been resolved in the recent commits.
@@ -0,0 +1,3697 @@
INSERT INTO statistics (module_id, realm_id, name, display, alias, unit, decimals, formula, description)
This file should not be in the pull request (nor should any of the other SQL files that are autogenerated).
nod removed in bfa9434 ( removed all sql that isn't explicitly being used by the import pipeline. ) pulled / deployed to my local box and ran a quick import to make sure things were still working as expected. So far so good.
classes/OpenXdmod/Build/Config.php
Outdated
for ($i = 1; $i < $length; $i++) {
switch ( $i ) {
case $MAJOR:
$this->versionMajor = trim($matches[$i]);
No need to trim here since the regex cannot match any characters that would be trimmed.
classes/OpenXdmod/Build/Config.php
Outdated
$this->versionMajor = trim($matches[$i]);
break;
case $MINOR:
$this->versionMinor = trim($matches[$i]);
No need to trim here since the regex cannot match any characters that would be trimmed.
classes/OpenXdmod/Build/Config.php
Outdated
$this->versionMinor = trim($matches[$i]);
break;
case $PATCH:
$this->versionPatch = trim($matches[$i]);
No need to trim here since the regex cannot match any characters that would be trimmed.
classes/OpenXdmod/Build/Config.php
Outdated
$this->versionPatch = trim($matches[$i]);
break;
case $PRE_RELEASE:
$this->versionPreRelease = trim($matches[i]);
No need to trim here since the regex cannot match any characters that would be trimmed.
classes/OpenXdmod/Build/Packager.php
Outdated
$destination = implode(
array($this->getPackageDir(), "configuration"),
DIRECTORY_SEPARATOR
);
I would prefer if the implode parameters were reversed.
From http://php.net/implode:
Note:
implode() can, for historical reasons, accept its parameters in either order. For consistency with explode(), however, it may be less confusing to use the documented order of arguments.
classes/OpenXdmod/Build/Packager.php
Outdated
$destination = implode(
array_merge(array($this->getPackageDir(), "configuration", "etl"), $directoryParts, array($this->config->getName())),
DIRECTORY_SEPARATOR
);
Another reversed implode.
classes/OpenXdmod/Build/Packager.php
Outdated
$fileName = pathinfo($file, PATHINFO_FILENAME);
$subDirectory = substr($baseDest, 0, strpos($baseDest, $fileName) - 1);

$destinationFile = implode(array($destination, $subDirectory, $fileName), DIRECTORY_SEPARATOR);
Another reversed implode.
3f2d457 to 5a3d1d3
A couple of relatively minor changes. Looks much improved.
bin/acl-config
Outdated
use Xdmod\Config;

$opts = array(
'r' => 'dryrun',
Please change the key here to 't' to be consistent with other XDMoD scripts.
{
parent::__construct($currentVersion, $newVersion);

$this->baseDirectory = realpath(BASE_DIR.DIRECTORY_SEPARATOR.'..');
Please add spaces around concatenation operator here and in the lines below.
},
{
"name": "visible",
"type": "boolean",
Use tinyint(1) instead of boolean because MySQL converts boolean under the hood and the information schema actually contains tinyint(1). This will cause continual alter table statements to be generated until I address the conversions.
"class": "ExecuteSql"
}
},
"#": "",
Remove unused comment
* Ensure that the tables / data exists that will support the Acl subsystem
* going into version 7.0.
**/
class DatabaseMigration extends \OpenXdmod\Migration\DatabasesMigration
Name needs to change to DatabasesMigration
bin/acl-import
Outdated
@@ -0,0 +1,5 @@
#!/usr/bin/env php
Why do you have a php script that executes a subshell to run a program? Surely this could be a shell script with the following content:
acl-etl -pacls-import -g
Is this wrapper really needed? You could just call the acl-etl script directly.
bin/acl-xdmod-management
Outdated
@@ -0,0 +1,8 @@
#!/usr/bin/env php
Why do you have a php script that executes a subshell to run a program? Surely this could be a shell script with the following content:
acl-etl -pacls-xdmod-management -g
Is this wrapper even needed? Surely you can just call the acl-etl script directly?
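A minimal sketch of what such a shell wrapper could look like, written as a function so that both wrappers can share it. The function name run_pipeline and the ACL_ETL override variable are assumptions for illustration; the -p and -g options are taken from the command lines quoted above.

```sh
# Hypothetical shared wrapper: run acl-etl for one pipeline.
# ACL_ETL may name a specific acl-etl executable (an assumption made here
# for testability); otherwise acl-etl is looked up on PATH.
run_pipeline() {
    "${ACL_ETL:-acl-etl}" "-p$1" -g
}

# bin/acl-import would then reduce to:            run_pipeline acls-import
# bin/acl-xdmod-management would then reduce to:  run_pipeline acls-xdmod-management
```

The function returns the exit status of acl-etl unchanged, so callers can still detect a failed run.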
{
$scripts = array(
'acl-config' => array(),
'acl-import' => array(
The acl-import script does not accept any command line arguments. Why are you passing arguments here and what are they supposed to do?
public function migrateTables()
{
$scripts = array(
'acl-xdmod-management' => array(
acl-xdmod-management does not accept commandline arguments. Why are they being passed?
$hadError = strpos($output, 'error') !== false;

if ($hadError == true) {
$this->logger->debug(<<<MSG
You need to log errors at a higher level than debug.
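One way to make failures harder to miss, sketched in shell rather than in the PHP of the migration: rely on each script's exit status instead of searching its output for the word 'error', and report the failure on stderr before stopping. The function name run_steps is hypothetical, and the sketch assumes the acl-* scripts exit non-zero on failure, which is not confirmed here.

```sh
# Run each command in order and stop at the first one that fails.
# Assumes every step signals failure through a non-zero exit status.
run_steps() {
    for step in "$@"; do
        if ! "$step"; then
            echo "ERROR: $step failed" >&2
            return 1
        fi
    done
    return 0
}
```

A caller would pass the script names in the order they must run, e.g. run_steps acl-xdmod-management acl-config acl-import, and abort the upgrade if the function returns non-zero.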
$this->console->displayMessage($sectionMessage);
$this->console->displayBlankLine();

$cmd = "$scriptName -g";
These scripts do not accept commandline arguments so they will never see the -g option. Is this a problem?
@@ -25,4 +25,20 @@ public static function unlockMethod($classOrObject, $methodName)
$method->setAccessible(true);
return $method;
}
Why has this unlockProperty function been added in this pull request? It does not appear to be used anywhere.
It had been used in the now removed PackagerTemplatingTest class. We had a whole conversation about it at the time and how it belonged in the TestHelper class not in the Test Class itself. Which would also explain why it's still here. I'll go ahead and remove it.
- Added all files / code required to support the ACL ETL process.
- Added documentation where appropriate

ETL Table Definitions (Creation / Management of Tables):
- configuration/etl/etl_tables.d/acls/*.json

ETL SQL Files ( Initial Population of Tables )
- configuration/etl/etl_sql.d/acls/xdmod/*.sql

ETL Pipelines:
- configuration/etl/etl.d/acls-xdmod-import.json
  - responsible for the import / population of acl tables.
- configuration/etl/etl.d/acls-xdmod-management.json
  - responsible for the creation / management of the acl tables.

bash scripts:
- bin/acl-xdmod-import
  - provides users with a wrapper around the acls-xdmod-import pipeline.
- bin/acl-xdmod-management
  - provides users with a wrapper around the acls-xdmod-management pipeline.

xdmod-setup Menu Items:
- Acl Setup: XDMoD
  - gathers information from the user and calls bin/acl-xdmod-import; reports any errors encountered.
- Acl Import: XDMoD
  - gathers information from the user and calls bin/acl-xdmod-management; reports any errors encountered.

Documentation:
- Each SQL statement is documented as to its purpose and when it should be executed.
- Added overview documentation for the tables being added
- Added documentation that explains the new xdmod-setup Menu Items.
cloud_jobs.json should not be removed.
@smgallo jobs_[cloud|hpc].json re-added. hpc is minus the "XdcdbJobRecordIngestor" which has been added to a jobs_hpc.json file in the xsede repo ( under the same pipeline name ).
@ryanrath I need to give this some thought because the order of execution is important (the jobs ingestor action has to run before the aggregator action) and when the actions are added to the pipeline they will be added to the end. Let's discuss tomorrow.
- This tool transforms information stored in human readable configuration files ( roles.json and datawarehouse.json ) into a database representation that can be referenced by the system at run time.
- It provides the ability to perform basic validation of the configuration files that it uses, communicating to the user when an invalid file has been found.
- It also provides a 'dryrun' option which allows a user to see which records will be added to their database without actually inserting them.
- Added a help option that displays the general usage of acl-config along with the various options both short and long with a brief description of each.
- Added a function to Packager.php that will create a directory (modules.d) / file ( <module_name>.json ) that contains information from a module's build.json file. This enables information from build.json to outlive the Packaging process and to signal that a particular module is installed.

New Helper Class: Roles
- This class's sole purpose for being is to expose various private portions of the abstract class aRole that are needed for consistent use of Config ( i.e. the 'extend' functionality ).
- Also provides a 'module' aware version of 'getConfig' that will only retrieve the values from a particular module's files instead of the amalgamated whole.

Placeholder json file for new modules config category
- Need at least an empty json document for our merging config file system to work properly. This one is for modules.d.

Adding modules.json to be included when xdmod is packaged
- Added modules.json to the list of files to be included when the module 'xdmod' is packaged.

Cleaning Up acls-xdmod-management ETL Pipeline
- Removed statistics_hierarchy as it doesn't exist anymore.
- Removed second instance of module_versions as it doesn't need to be processed twice.

Adding 'type' to the set of sections processed for roles
- We also want to retrieve information on type as this has been added to ccr-private-xdmod/conf.d/modules/xsede/configuration/roles.json

Rework the way in which the data is gathered
- Reworked the way in which data is gathered for both roles and datawarehouse. It should now be guaranteed that the information in each '$module' section is information that only pertains to that module, not the merged whole.

Make sure xdmod ( default module ) is processed first
- It's imperative that the default module be processed before other 3rd party modules. To that end, the algorithm for processing a module was extracted into its own function. Now we can test for the existence of the default module, processing it first and removing it from the results, which can then be processed as normal.

Adding the acl-* scripts in the bin directory
- added acl-config and acl-xdmod-management to the files list for rpm builds.

Added a warning to not modify the auto-generated file.
- Added a warning to the auto-generated file so that users know not to modify the file, or at least that if they do, it's at their own risk.
- Added a conditional to acl-config to strip the newly added warning message from the modules data.

Updated installed_on -> packaged_on
- Updated per conversation with @jtpalmer

Updated createModuleFile to use CCR\Json
- swapped out file_put_contents for \CCR\Json::saveFile as we get pretty printing and error handling this way.

Removed processModuleFiles as it is no longer needed
- the purpose of processModuleFiles was to process module specific templates, allowing the injection of information contained within build.json ( module name, version etc. ) into data that would then be consumed by some sort of etl process. Now that we're generating files for each module (modules.d/<module>.json) this feature is no longer required.

Adding a new configuration directory / file
- Added a configuration directory / file set and the corresponding section to utilize it in acl-config. This new section is hierarchies.json/hierarchies.d, the contents of which will be processed into the 'hierarchies' table.
- There are some accompanying changes in ccr-private xdmod to roles to support acl_hierarchies records.

*** ( changes inadvertently squashed into this commit )
- Removed the json data files that are no longer needed ( etl/etl_data.d/acls/xdmod/*.json ).
- Removed PackagerTemplatingTest.php as the function is no longer being used / has been removed.
***

NOTES:
These changes were added after going through which acl tables were populated via which configuration files:
- modules.json
  - modules
  - module_versions
- datawarehouse.json
  - realms
  - group_bys
  - statistics
  - *acl_group_bys ( via being referenced when processing roles.json )
- roles.json
  - acls
  - tabs
  - acl_tabs
  - *acl_group_bys ( utilizing information from datawarehouse.json )

The following tables are populated via sql scripts on install / upgrade and via the system thereafter ( i.e. we have no concept of these tables being modified via a config file ).
- user_acls
- user_acl_group_by_parameters ( UserRoleParameters )

This left the following tables that needed to be populated:
- hierarchies
- acl_types
- acl_hierarchies

Of these three, two can be handled by incorporating some additional data into the roles.json file (as they're related to acls, acl_types and acl_hierarchies). This leaves hierarchies, which we now populate via:
- configuration/hierarchies.[d|json]

So the net change for a module writer is some additional information in roles.json and one new file ( if their module has any hierarchies it utilizes ).

acl-* script changes
- re-added acl-import due to needing to populate two tables via sql script:
  - user_acls
  - user_acl_group_by_parameters
- ultimately re-wrote acl-import / acl-xdmod-management to be based off of the bash3boilerplate project. This provides some nice logging, arg parsing, error handling functionality that we don't have to maintain / think about.
  - b3bp file main.sh kept in a new dir bin/deps
- ripped out the duplicated code in acl-import / acl-xdmod-management and placed it in a 'parent' script called 'acl-etl' located in bin/deps
- fixed syntax error in acl-config: missing semi
- removed unused sql: update_module_version.sql

Modified acl-import ETL pipeline
- no longer using the structured file imports but we do have a few sql files.
- modified the defaults to reflect no longer using StructuredFileIngestor but instead using ExecuteSql.
- Removed all actions that are no longer being used, which leaves:
  - user_acls.sql
  - user_acl_group_by_parameters.sql

Added a DatabaseMigration from 6.6.0 to 7.0.0
- This migration executes acl-xdmod-management, acl-import and acl-config in order to ensure that the database tables are set up / populated correctly.

Added a new AclConfig Setup Item / Removed unused
- Added a new SetupItem that handles executing the acl-config script. This new item was appended to the end of the DatabaseSetup section.
- Removed a DatabaseMigration that is no longer used.

Acl Documentation Update
- Updated the Acl documentation to bring it up to date.

Changed the order in which the Acl steps run
- AclConfig is responsible for populating the definition tables and as such needs to be executed *before* AclImport which populates some of the relation tables.

Updates per code review comments by @smgallo
- Changed '-r|--dryrun' to '-t|--dryrun'.
- Code format cleanup to surround string concat operations with spaces.
- removed empty comment in acls-import.json
- Updated 'boolean' column declarations to utilize 'tinyint(1)'

Updating acl-config linker require
- Replacing with the standard linker path used in the other php scripts in this directory. This is so that it is replaced with the appropriate value during the install process.

Ensure verifyJsonSyntax checks for existence
- Before trying to read either the config file or config dir, ensure the file / dir exists then proceed with the required operations.

Updated acl scripts
- removed bash script deps directory as it is no longer needed
- add acl-etl which serves as a basic wrapper script for running sections
- update acl-[import|xdmod-management] to be php scripts instead of bash. They now call acl-etl with their respective sections.
- Simplified the logic in Acl[Config|Import].php

Fixing up issues with acl-config
- Just making the script more robust / taking care of a few things like missing global includes.

relaxing the memory_limit for etl_overseer
- Ran into memory exceptions while testing the acls-import / acls-xdmod-management sections.

removing the ini_set('memory_limit')
- Steve is looking at introducing a fix for the underlying memory jump problem in EtlLock.php. This won't need to be here once that goes in.

Updating the DatabaseMigration for 660to700
- Updated to use the new style scripts
- Updated the order in which the tables are populated ( i.e. the order the scripts are executed. )

Updating DatabaseMigration name
- Migration doesn't work when it's not named correctly.

Updated the comment stripping for ExecuteSql
- Per conversation with @smgallo, removing c-style comments so that we don't accidentally strip query hints.

Updates to 660To700 Database Migration
- Updated the name so that it's actually executed during xdmod-upgrade
- Updated the logging levels so that users will actually see the logging when appropriate
- added an abnormal exit if the script encounters an error so that the process can be attempted again after the issue is resolved.

Updates to acl-config and roles.json
- Added a section to acl-config that ensures that the public user is present in the system.
- Updated roles.json with the additional information required for the acl population process.

Updating the acl-import / xdmod-acl-management scripts per @jpwhite4 review comments:
- Changing the convenience scripts acl-import / xdmod-acl-management to simple bash scripts.
- Updated the calling of the convenience scripts so that they no longer provide additional arguments as none are needed.

Removing unused TestHelper function
- The only test that utilized this function is no longer present.

Removing Public User Code
- Having a database representation of Public User needs code changes to operate correctly. The removed code will be re-added to acl_xduser.

Adding acl specific changes to the xdmod-setup expect script
- Adding the lines required to handle the updated Database section of xdmod-setup.

re-added jobs_*.json
- jobs_hpc.json had the XdcdbJobRecordIngestor section removed as the specified db (tgcdbmirror) is not currently available in open xdmod installs.

Updates to jobs_* pipelines
- renamed jobs_hpc to jobs_common
- renamed the xdcdb-jobs pipeline -> jobs-common
- removed the files that the xsede specific actions utilized as they are no longer needed here.

Updating the default section name
- need to update the pipeline defaults so that it references jobs-common as opposed to xdcdb-jobs.
Modular Acls: ETL Support (ubccr#97)
* ACL ETL
- Added all files / code required to support the ACL ETL process.
- Added documentation where appropriate.

ETL Table Definitions ( Creation / Management of Tables ):
- configuration/etl/etl_tables.d/acls/*.json

ETL SQL Files ( Initial Population of Tables ):
- configuration/etl/etl_sql.d/acls/xdmod/*.sql

ETL Pipelines:
- configuration/etl/etl.d/acls-xdmod-import.json
  - responsible for the import / population of acl tables.
- configuration/etl/etl.d/acls-xdmod-management.json
  - responsible for the creation / management of the acl tables.

bash scripts:
- bin/acl-xdmod-import
  - provides users with a wrapper around the acls-xdmod-import pipeline.
- bin/acl-xdmod-management
  - provides users with a wrapper around the acls-xdmod-management pipeline.

xdmod-setup Menu Items:
- Acl Setup: XDMoD
  - gathers information from the user, calls bin/acl-xdmod-import and reports any errors encountered.
- Acl Import: XDMoD
  - gathers information from the user, calls bin/acl-xdmod-management and reports any errors encountered.

Documentation:
- Each SQL statement is documented as to its purpose and when it should be executed.
- Added overview documentation for the tables being added.
- Added documentation that explains the new xdmod-setup Menu Items.

* Initial Commit of Acl Configuration Tool
- This tool transforms information stored in human readable configuration files ( roles.json and datawarehouse.json ) into a database representation that can be referenced by the system at run time.
- It provides the ability to perform basic validation of the configuration files that it uses, communicating to the user when an invalid file has been found.
- It also provides a 'dryrun' option which allows a user to see which records will be added to their database without actually inserting them.
- Added a help option that displays the general usage of acl-config along with the various options, both short and long, with a brief description of each.
- Added a function to Packager.php that will create a directory ( modules.d ) / file ( <module_name>.json ) that contains information from a module's build.json file. This enables information from build.json to outlive the Packaging process and to signal that a particular module is installed.

New Helper Class: Roles
- This class's sole purpose is to expose various private portions of the abstract class aRole that are needed for consistent use of Config ( i.e. the 'extend' functionality ).
- Also provides a 'module' aware version of 'getConfig' that will only retrieve the values from a particular module's files instead of the amalgamated whole.

Placeholder json file for new modules config category
- Need at least an empty json document for our merging config file system to work properly. This one is for modules.d.

Adding modules.json to be included when xdmod is packaged
- Added modules.json to the list of files to be included when the module 'xdmod' is packaged.

Cleaning Up acls-xdmod-management ETL Pipeline
- Removed statistics_hierarchy as it doesn't exist anymore.
- Removed second instance of module_versions as it doesn't need to be processed twice.

Adding 'type' to the set of sections processed for roles
- We also want to retrieve information on type as this has been added to ccr-private-xdmod/conf.d/modules/xsede/configuration/roles.json

Rework the way in which the data is gathered
- Reworked the way in which data is gathered for both roles and datawarehouse. It should now be guaranteed that the information in each '$module' section is information that only pertains to that module, not the merged whole.

Make sure xdmod ( default module ) is processed first
- It's imperative that the default module be processed before other 3rd party modules. To that end, the algorithm for processing a module was extracted into its own function. Now we can test for the existence of the default module, processing it first and removing it from the results, which can then be processed as normal.

Adding the acl-* scripts in the bin directory
- added acl-config and acl-xdmod-management to the files list for rpm builds.

Added a warning to not modify the auto-generated file.
- Added a warning to the auto-generated file so that users know not to modify the file, or at least that if they do, it's at their own risk.
- Added a conditional to acl-config to strip the newly added warning message from the modules data.

Updated installed_on -> packaged_on
- Updated per conversation with @jtpalmer

Updated createModuleFile to use CCR\Json
- swapped out file_put_contents for \CCR\Json::saveFile as we get pretty printing and error handling this way.

Removed processModuleFiles as it is no longer needed
- the purpose of processModuleFiles was to process module specific templates, allowing the injection of information contained within build.json ( module name, version etc. ) into data that would then be consumed by some sort of etl process. Now that we're generating files for each module ( modules.d/<module>.json ) this feature is no longer required.

Adding a new configuration directory / file
- Added a configuration directory / file set and the corresponding section to utilize it in acl-config. This new section is hierarchies.json / hierarchies.d, the contents of which will be processed into the 'hierarchies' table.
- There are some accompanying changes in ccr-private xdmod to roles to support acl_hierarchies records.

*** ( changes inadvertently squashed into this commit )
- Removed the json data files that are no longer needed ( etl/etl_data.d/acls/xdmod/*.json ).
- Removed PackagerTemplatingTest.php as the function is no longer being used / has been removed.

*** NOTES:
These changes were added after going through which acl tables were populated via which configuration files:
- modules.json
  - modules
  - module_versions
- datawarehouse.json
  - realms
  - group_bys
  - statistics
  - *acl_group_bys ( via being referenced when processing roles.json )
- roles.json
  - acls
  - tabs
  - acl_tabs
  - *acl_group_bys ( utilizing information from datawarehouse.json )

The following tables are populated via sql scripts on install / upgrade and via the system thereafter ( i.e. we have no concept of these tables being modified via a config file ):
- user_acls
- user_acl_group_by_parameters ( UserRoleParameters )

This left the following tables that needed to be populated:
- hierarchies
- acl_types
- acl_hierarchies

Of these three, two can be handled by incorporating some additional data into the roles.json file, as they're related to acls ( acl_types and acl_hierarchies ). This leaves hierarchies, which we now populate via:
- configuration/hierarchies.[d|json]

So the net change for a module writer is some additional information in roles.json and one new file ( if their module utilizes any hierarchies ).

acl-* script changes
- re-added acl-import due to needing to populate two tables via sql script:
  - user_acls
  - user_acl_group_by_parameters
- ultimately re-wrote acl-import / acl-xdmod-management to be based off of the bash3boilerplate project. This provides some nice logging, arg parsing and error handling functionality that we don't have to maintain / think about.
  - b3bp file main.sh kept in a new dir bin/deps
- ripped out the duplicated code in acl-import / acl-xdmod-management and placed it in a 'parent' script called 'acl-etl' located in bin/deps
- fixed syntax error in acl-config: missing semi
- removed unused sql: update_module_version.sql

Modified acl-import ETL pipeline
- no longer using the structured file imports but we do have a few sql files.
- modified the defaults to reflect no longer using StructuredFileIngestor but instead using ExecuteSql.
- Removed all actions that are no longer being used, which leaves:
  - user_acls.sql
  - user_acl_group_by_parameters.sql

Added a DatabaseMigration from 6.6.0 to 7.0.0
- This migration executes acl-xdmod-management, acl-import and acl-config in order to ensure that the database tables are set up / populated correctly.

Added a new AclConfig Setup Item / Removed unused
- Added a new SetupItem that handles executing the acl-config script. This new item was appended to the end of the DatabaseSetup section.
- Removed a DatabaseMigration that is no longer used.

Acl Documentation Update
- Updated the Acl documentation to bring it up to date.

Changed the order in which the Acl steps run
- AclConfig is responsible for populating the definition tables and as such needs to be executed *before* AclImport, which populates some of the relation tables.

Updates per code review comments by @smgallo
- Changed '-r|--dryrun' to '-t|--dryrun'.
- Code format cleanup to surround string concat operations with spaces.
- removed empty comment in acls-import.json
- Updated 'boolean' column declarations to utilize 'tinyint(1)'

Updating acl-config linker require
- Replacing with the standard linker path used in the other php scripts in this directory. This is so that it is replaced with the appropriate value during the install process.

Ensure verifyJsonSyntax checks for existence
- Before trying to read either the config file or config dir, ensure the file / dir exists, then proceed with the required operations.

Updated acl scripts
- removed bash script deps directory as it is no longer needed
- added acl-etl, which serves as a basic wrapper script for running sections
- updated acl-[import|xdmod-management] to be php scripts instead of bash. They now call acl-etl with their respective sections.
- Simplified the logic in Acl[Config|Import].php

Fixing up issues with acl-config
- Just making the script more robust / taking care of a few things like missing global includes.

relaxing the memory_limit for etl_overseer
- Ran into memory exceptions while testing the acls-import / acls-xdmod-management sections.

removing the ini_set('memory_limit')
- Steve is looking at introducing a fix for the underlying memory jump problem in EtlLock.php. This won't need to be here once that goes in.

Updating the DatabaseMigration for 660To700
- Updated to use the new style scripts
- Updated the order in which the tables are populated ( i.e. the order in which the scripts are executed ).

Updating DatabaseMigration name
- Migration doesn't work when it's not named correctly.

Updated the comment stripping for ExecuteSql
- Per conversation with @smgallo, removed the stripping of c-style comments so that we don't accidentally strip query hints.

Updates to 660To700 Database Migration
- Updated the name so that it's actually executed during xdmod-upgrade
- Updated the logging levels so that users will actually see the logging when appropriate
- added an abnormal exit if the script encounters an error so that the process can be attempted again after the issue is resolved.

Updates to acl-config and roles.json
- Added a section to acl-config that ensures that the public user is present in the system.
- Updated roles.json with the additional information required for the acl population process.

Updating the acl-import / xdmod-acl-management scripts per @jpwhite4 review comments:
- Changing the convenience scripts acl-import / xdmod-acl-management to simple bash scripts.
- Updated the calling of the convenience scripts so that they no longer provide additional arguments, as none are needed.

Removing unused TestHelper function
- The only test that utilized this function is no longer present.

Removing Public User Code
- Having a database representation of Public User needs code changes to operate correctly. The removed code will be re-added to acl_xduser.

Adding acl specific changes to the xdmod-setup expect script
- Adding the lines required to handle the updated Database section of xdmod-setup.

re-added jobs_*.json
- jobs_hpc.json had the XdcdbJobRecordIngestor section removed as the specified db ( tgcdbmirror ) is not currently available in open xdmod installs.

Updates to jobs_* pipelines
- renamed jobs_hpc to jobs_common
- renamed the xdcdb-jobs pipeline -> jobs-common
- removed the files that the xsede specific actions utilized as they are no longer needed here.

Updating the default section name
- need to update the pipeline defaults so that they reference jobs-common as opposed to xdcdb-jobs.
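For reviewers who want a concrete picture of the "default module first" ordering described in the commit message, here is a minimal sketch in Python. It is not the acl-config implementation (which is PHP); the function name `process_modules`, the callback signature and the module names other than `xdmod` are all hypothetical and exist only for illustration.

```python
from typing import Callable, Dict, List

# 'xdmod' is the default module named in the commit message.
DEFAULT_MODULE = "xdmod"


def process_modules(
    modules: Dict[str, dict],
    process: Callable[[str, dict], None],
) -> List[str]:
    """Process the default module first, then the rest in their given order.

    Returns the order in which modules were processed.
    """
    remaining = dict(modules)  # copy so the caller's mapping is not mutated
    order: List[str] = []

    # Test for the default module, process it first and remove it
    # from the set that is handled by the normal loop below.
    if DEFAULT_MODULE in remaining:
        process(DEFAULT_MODULE, remaining.pop(DEFAULT_MODULE))
        order.append(DEFAULT_MODULE)

    for name, config in remaining.items():
        process(name, config)
        order.append(name)

    return order


# Hypothetical third-party module names, used only for the demonstration.
seen: List[str] = []
order = process_modules(
    {"module_a": {}, "xdmod": {}, "module_b": {}},
    lambda name, config: seen.append(name),
)
```

Extracting the per-module work into a single callable is what makes the "default first, everything else as normal" split possible without duplicating logic.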
Adding in the files required to bootstrap / support the new modular Acl system.
Description
There are two ETL pipelines: one that creates / manages the acl tables and one that imports / populates them.
Included are the table definition files ( etl/etl_tables.d/acls/*.json ) and the scripts for populating them ( etl/etl_sql.d/acls/*.sql ). Where possible, the scripts only insert a record when no existing record matches the values being inserted.
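The "insert only when no matching record exists" behaviour can be illustrated with a small, self-contained sketch. It uses Python's built-in sqlite3 module as a stand-in for the real database, and the `modules` table layout ( `name`, `display` ) and the helper name `insert_module_if_missing` are assumptions made for the example rather than the schema or scripts from this PR.

```python
import sqlite3


def insert_module_if_missing(conn: sqlite3.Connection, name: str, display: str) -> None:
    """Insert a row only if no row with the same values already exists."""
    conn.execute(
        """
        INSERT INTO modules (name, display)
        SELECT ?, ?
        WHERE NOT EXISTS (
            SELECT 1 FROM modules WHERE name = ? AND display = ?
        )
        """,
        (name, display, name, display),
    )


# In-memory database with a hypothetical, simplified 'modules' table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE modules (id INTEGER PRIMARY KEY, name TEXT, display TEXT)"
)

# Running the same insert twice leaves a single row behind.
insert_module_if_missing(conn, "xdmod", "XDMoD")
insert_module_if_missing(conn, "xdmod", "XDMoD")
count = conn.execute("SELECT COUNT(*) FROM modules").fetchone()[0]
```

Because the guard is part of the statement itself, the population scripts can be re-run ( for example after a failed upgrade ) without creating duplicate rows.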
Motivation and Context
This PR serves as the basis for the rest of the Acl system and provides a baseline for the remaining PRs that the work has been split into.
Tests performed
Manual tests on a dev cloud instance, installing locally.