Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Round Trip Conversions and Testing #405

Closed
wants to merge 101 commits into from
Closed

Round Trip Conversions and Testing #405

wants to merge 101 commits into from

Conversation

howieavp76
Copy link

Committer Notes

This PR provides round trip testing of conversions from XML->JSON->XML. It has been refactored to work within the CI/CD pipeline and to match the structure/template of Bash code for other scripts. It is now config driven, temporal artifacts are published as CI/CD artifacts for troubleshooting, and the readme is updated with the latest info on how to run the scripts. Unnecessary Python packages were removed and .gitignores updated to reflect dynamic builds in CI/CD.

Some tests do fail but the failures appear to be accurate. Some work is still needed in the conversions and adding these tests to the CI/CD pipeline will assist with troubleshooting.

This PR is related to #342 and #343 Issues.

All Submissions:

  • [ x] Have you followed the guidelines in our Contributing document?
  • [x ] Have you checked to ensure there aren't other open Pull Requests for the same update/change?
  • [x ] Have you squashed any non-relevant commits and commit messages? [instructions]

Changes to Core Features:

  • [ x] Have you added an explanation of what your changes do and why you'd like us to include them?
  • [ x] Have you written new tests for your core changes, as applicable?
  • [x ] Have you included examples of how to use your new feature(s)?

Remove some JSON testing custom code and pip installs for Python in favor of AJV-CLI.  Added XPath to the XML comparison code. Testing for XMLDIFF.
No longer using several external libraries
FIxed git ignores
Changed from -v check to something more compatible
Remove some JSON testing custom code and pip installs for Python in favor of AJV-CLI.  Added XPath to the XML comparison code. Testing for XMLDIFF.
No longer using several external libraries
FIxed git ignores
Changed from -v check to something more compatible
Fixed the -v statements to use a more portable -z statement.
Removes XMLDiff

Adds exit code to the Python code, does error count tracking
Working for all NIST 800-53 checks (FedRAMP currently bombing)
Printf messages are now color coded for improved readability in large log files
Round trip files now written to the Temp directory, this directory added to build artifacts in CI/CD
Build files for roundtrip now carry their base naming conventions for troubleshooting
Typo on base image
Trying to get it working end to end with local file
Fix CI/CD bug finding Python/Saxon scripts
Get etree.parse to work in CI/CD
Use the supplied build directory
Adds Maven to the Circle CI image
@howieavp76
Copy link
Author

@david-waltermire-nist I haven't refactored to use your colors yet. Wanted to get the base functionality over so that you and @wendellpiez could see the errors in the conversions. Did my best to match your approach on the scripts and to use the same dynamic logic for parsing files.

@david-waltermire
Copy link
Contributor

I had to make a new PR (#410) to address the following issues:

  • The CircleCI configuration in this PR was out-of-date with the current config in master. The new round-trip job also didn't make use of the build directory to store all files produced by the build. Instead a temp directory was used in the repo directory, which complicates management of deployed files later in the build. The necessary changes were made in Issue343 roundtrip fixes #410 to bring the configuration in sync with the other build jobs. Round-trip files will get published as artifacts now without any additional scripting being needed.
  • There were a bunch of extra files checked in that needed to be removed, because they are IDE, generated, or dependency files that shouldn't be checked in. These files were located in .vscode, build/ci-cd/temp, and build/ci-cd/python/saxon9he.jar.
  • The script, build/ci-cd/python/xmlComparison.py, would not run on the SP 800-53 catalog due to the presence of an XML comment in the file and a recursion depth causing an errors. After analyzing the code, the program was found to be kn efficient in the use of memory, due to the creation of unneeded class instances. After some attempts to fix the script, I instead rewrote the script to use a simple recursive function to do the analysis. At the same time, I reduced the amount of logged output from the script to focus on reporting only errors, and to provide an XPath pointing to the location of the problem in the source file.
  • The build/ci-cd/roundTripXML.sh script needed to be rewritten to make it more extensible beyond catalog and profile content, and to fix pathing issues running Saxon that were causing the script not to function properly. The script was also updated to use the common xml_transform bash routine used in the other sripts instead of inline execution of Java. The JSON->XML->JSON functionality was removed, since the comparison script doesn't support this pathway. The validation of content was also removed since this is handled in other scripts in the CI/CD workflow, and is duplicative work.
  • The build/ci-cd/config/content configuration file was restored to the original version instead of the version checked in this PR, since it was breaking other aspects of the build that is responsible for content generation.
  • There is still an issue with using the -it:start CLI option for running Saxon transforms. A template named "start" is not present in the generated versions of the JSON->XML content converter. This PR added this name to the the XML->JSON and the JSON->XML converter XSLT files, but these changes are lost as soon as the converters are regenerated. I will check with @wendellpiez about addressing this problem in the converter generation. This change is only needed in the JSON->XML converters, and not the XML->JSON converters.

@howieavp76
Copy link
Author

@david-waltermire-nist - A few clarifications on the PR from the comments above:

  • When testing locally in my branch, it was publishing the artifacts to the build directory in CI/CD. It would write them locally and the last step would do a copy from the /temp directory to the build artifacts directory.
  • I think there was an issue with Git caching old files. The .gitignore in my branch should have excluded the /temp directory and all *.jar files. I did not catch that going over. I have removed the cache and re-initialized Git so that doesn't happen again.
  • I am not sure what pathing issue was encountered with Saxon. I used a local .jar file when testing on my machine and then a dynamic build with Maven when running in CI/CD. Both were able to work in my branch. I did run into pathing issues with the ** in the build/ci-cd/config/content file where it would not parse properly on my Mac. I also ran into issues with the FedRAMP content not validating from that file. I would assume that is now working from your refactor of the Python script.
  • I am aware of the -it:start CLI issue. Wendall and I did some troubleshooting to get that to work and I reminded him we would need a fix in the transforms on Gitter.

When you are free, I would like to chat further on how I can best remedy going forward to ensure we meet your expectations in the last sprint.

@secautobuilder
Copy link

@howieavp76 - Some comments on your comments.

@david-waltermire-nist - A few clarifications on the PR from the comments above:

  • When testing locally in my branch, it was publishing the artifacts to the build directory in CI/CD. It would write them locally and the last step would do a copy from the /temp directory to the build artifacts directory.

It should have written the build artifacts directly to the build directory. No need for a temp directory or a copy. This is how all the other scripts work.

  • I think there was an issue with Git caching old files. The .gitignore in my branch should have excluded the /temp directory and all *.jar files. I did not catch that going over. I have removed the cache and re-initialized Git so that doesn't happen again.

No need. I already cleaned all the extraneous files up. I also had to deal with a few git merges that left the commit history a bit of a mess. I had to resolve a ton of conflicts as a result. Spent about an hour working though all of that. In the future, you will want to use git rebase upstream/master instead of doing a git pull/merge upstream master to avoid that type of issue. Using git rebase keeps the commit history linear. Merging creates a branched revision tree.

  • I am not sure what pathing issue was encountered with Saxon. I used a local .jar file when testing on my machine and then a dynamic build with Maven when running in CI/CD. Both were able to work in my branch. I did run into pathing issues with the ** in the build/ci-cd/config/content file where it would not parse properly on my Mac. I also ran into issues with the FedRAMP content not validating from that file. I would assume that is now working from your refactor of the Python script.

I asked during the last team meeting to not check in the Saxon JAR. This can be downloaded using Maven. This is how the rest of the build is working. No need to do anything special for the round trip script. You had the CircleCI config halfway setup to do this by installing maven and running the Saxon-HE dependency retrieval. Unfortunately, you didn't use that retrieved version of Saxon in the script.

  • I am aware of the -it:start CLI issue. Wendall and I did some troubleshooting to get that to work and I reminded him we would need a fix in the transforms on Gitter.

I fixed the issue with the templates. This is an honest mistake. No worries on this template issue.

When you are free, I would like to chat further on how I can best remedy going forward to ensure we meet your expectations in the last sprint.

There is no need to discuss this any further, as there are no actions for you to take. I rewrote the Python script and batch file, and reworked the CircleCI config to get it all working. My new code has been pulled from PR #410 into master. It's running in the CI/CD pipeline now. @wendellpiez and I worked through the round trip issues, and everything is building clean now with the round trip validations all working.

I had to do this work last week to give @wendellpiez the needed time to work through the conversion issues. Without completing this work myself, we would not be able to make our release deadline later this week. We couldn't wait any longer to get this functionality working and deployed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants