Test data optimization ADR (GSA#700)

* Introduce test data optimization ADR * Correction: only scaffold invalid test data file * Provide more detail in ADR * Date correction * Add solution & consequences * Add reference to GSA#710. --------- Co-authored-by: A.J. Stein <aj@gsa.gov>
brian-ruf · Nov 8, 2024 · 75ae758 · 75ae758
1 parent 00a24fa
commit 75ae758
Showing 1 changed file with 33 additions and 0 deletions.
diff --git a/documents/adr/0006-test-data-optimization-ADR.md b/documents/adr/0006-test-data-optimization-ADR.md
@@ -0,0 +1,33 @@
+# 6. Test data optimization ADR 
+Date: 2024-09-20
+
+## Status
+Proposed
+
+## Context
+Currently, all tests for constraints used by the constraint validation tool rely on the same invalid test data file (`ssp-all-INVALID.xml`). This results in conflicts between many constraints, preventing the tests from failing properly when using a shared invalid test data file. An example of this conflict is:
+```xml
+<categorization system="https://doi.org/10.6028/NIST.SP.800-60v2r1">
+  <information-type-id>C.2.8.12</information-type-id>
+</categorization>
+```
+We have one constraint, `information-type-has-categorization`, that checks the existence of the `categorization` element. In the fail case, we need to remove this `categorization` block from the invalid test data file entirely in order to make the fail case fail successfully (categorization element doesn't exist). We have another constraint, `information-type-system`, that checks if the `system` attribute is a valid value. In the fail case, the `information-type-system` constraint cannot check the `system` attribute because the entire `categorization` element is removed to satisfy the `information-type-has-categorization` constraint fail case. This conflict makes it impossible for both of these constraints to share the same invalid test data file when validating the constraints.
+
+## Possible Solutions
+1. Do nothing, continue to maintain test architecture as-is.
+
+2. Currently, the test runner has a feature where the positive (`-PASS.yaml`) and negative (`-FAIL.yaml`) tests are scaffolded for a given constraint by running `npm run constraint` in a terminal window. These yaml test files reference the corresponding test data files that the constraint should be validated against. We can add functionality to this existing feature to prompt a user if they would like to generate new invalid test data (`-all-INVALID.xml`) as well, or use the existing invalid test data file for that model (e.g., ssp-all-INVALID.xml). If a user chooses to generate new invalid test data, we can scaffold a new invalid test data file, specific to the constraint, that uses a lean template. This template would contain only the necessary context to validate the constraint, and schema validation would be disabled to ensure that the invalid test data is as lean as possible. The yaml test files can then automatically reference the newly generated invalid test data file to validate the constraint. If a user chooses to use the existing invalid test data file for that model, then the yaml test files will reference the existing invalid test data file for that model to validate the constraint. Adding this functionality to the `npm run constraint` command would make sure that all necessary content to validate a constraint is generated at the same time, and that the yaml test files reference the correct test data file. This solution would solve the problem of constraints not successfully passing the fail case when referencing the same invalid test data file by adding the flexbility to isolate the testing of a constraint by scaffolding a new invalid test data file specific to the constraint. The current method of validating the pass case of all constraints in the same valid test data file (`ssp-all-VALID.xml`) will not be changed or impacted by this solution.
+
+## Decision
+No decision has currently been made. 
+
+## Consequences
+Solution 1: 
+- Current average test execution time is maintained.  
+- Test suite cannot be realistically enhanced whenever there is a conflicting requirement, which will be often.  
+- Development and maintenance of individual tests and overall test suite will slow and eventually halt. One such example recently occurred and our developers had to apply this approach to resolve test failures in [GSA/fedramp-automation#710](https://github.com/GSA/fedramp-automation/pull/710).
+
+Solution 2:
+- Longer test run time.  
+- Developers can continue current velocity of test development or increase, but not stop with this workaround.  
+- The test architecture achieves meaningful flexibility to permit development and testing of conflicting test requirements simultaneously.