Tracking control identity via catalog of origin #538
First, I would encourage - as a best practice among profile creation tools - that all conflicts be called to the attention of the profile creator at the time of creation, with the intention of deconflicting the profile's control references before it is ever processed for resolution. I like the @canonical-id in theory, but am concerned that this requires catalog creators to have another ID to manage for their catalog, and that management would need to be under a clear set of guidelines. For example, after NIST SP 800-53r4 was released, there was an update with (mostly) corrections and a few tweaks released at some point during its first year. Most of the controls did not change at all. A few did. If they have the same @canonical-id value, we would fail to correctly process the few controls that changed. If the @canonical-id changed, we would fail to correctly process the majority of controls that did not change. Also, we would have to make the @canonical-id required or it will not be present for a given catalog. We may need to require it to be unique as well. At that point, we should just use the UUID instead, if we are going to take this approach. However, I would like to suggest an alternate approach:
This would allow the resulting catalog to unambiguously trace back to the control's origin without reliance on potentially missing or misleading @canonical-id values from a catalog. It also puts the responsibility on the profile creator to explicitly handle any conflicts. Sadly, while I like the idea of directives that indicate accepting the first or last, I think that only works for XML as JSON processing cannot be trusted to process import statements in their original sequence. |
As for how conflicts should be managed by tools, whether while authoring a profile, or resolving one, I see the following functional steps (which ignore current tool capabilities for the sake of this discussion):
BARE MINIMUM TARGET
DESIRED MINIMUM TARGET
ADVANCED TARGET
For all of the above, if we adopt either the @canonical-id approach suggested by @wendellpiez, or the @import-id approach suggested in my post above (or both), those could be valid methods of addressing duplication. My main goal is that a human should always be responsible for explicitly addressing duplicate control references. There is too much effort that goes into satisfying controls to risk an unintended de-duplication action by a processing tool. |
As an observer, this issue/feature seems overly complicated. IRL users can just point to a git (subversion etc) version/release. I don't think it's helpful for OSCAL to worry about a solved/outsource-able problem. Version management should not be part of the schema (outside of a UUID or timestamped hash), it should be an updatable only via |
@JJediny this issue is about how the syntax in an OSCAL profile is interpreted to create a new catalog. It is about a specific issue in that process where the OSCAL profile specifies the same control (by OSCAL control ID) more than once, creating a conflict. First, I am unaware of how GitHub could help with this within OSCAL XML or JSON content, and would welcome a proof-of-concept from you as it would save us having to build tools. Also, OSCAL is intended to be used both within Internet-connected environments and in "offline" environments, such as may be required for classified processing, where there would be no access to a public site such as GitHub. |
if this is about conflicts in the namespace, I'm confused why versioning is a topic? |
@JJediny, sorry for any confusion. While @wendellpiez does use the word version a couple times, he does so more loosely. Version may not be the best word, as it's not a versioning topic, per se. It's more about a profile resolving multiple controls with the same control ID (from different import sources) and determining whether they are just duplicates, or something that started out as a duplicate, but was changed by an upstream profile, or something from a different catalog that just happens to have the same control ID. |
Sorry, I admit I was well off base with my original understanding of this issue, which I now (think?) I understand to be: if a standard (e.g. NIST 800-53, SOC2, ISO 27001) has conflicting namespaces, whether across standards or across versions of itself, there is an issue? If so, it seems to warrant a canonical index of the It would make it cleaner/easier if each said framework published its controls with ids as versions over making that a part of the schema/user level to modify. |
@brianrufgsa, import statements in the JSON are in an array so that order can be respected. This is one of the features of Metaschema, that it ensures predictability of ordering even in the JSON. (Mostly: there are exceptions around the edges.) Mostly, I really like the thinking/discussion here. Although I can also see the issue is apparently complex enough that we are probably going to need mockups/demos. |
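The ordering point above can be illustrated with a minimal Python sketch. The profile shape and the file names `profileY.json` and `catalogZ.json` are simplified assumptions for illustration, not the actual OSCAL JSON format. It shows only that a JSON array keeps its order when parsed, so an import sequence expressed as an array can be respected.

```python
import json

# Hypothetical, heavily simplified profile; real OSCAL profiles carry more structure.
profile_json = """
{
  "profile": {
    "imports": [
      {"href": "profileY.json"},
      {"href": "catalogZ.json"}
    ]
  }
}
"""


def import_order(text):
    """Return the import hrefs in the order they appear in the document."""
    doc = json.loads(text)
    # JSON arrays are ordered, and json.loads maps them to Python lists,
    # so iteration follows the original document sequence.
    return [imp["href"] for imp in doc["profile"]["imports"]]


hrefs = import_order(profile_json)
```

Object members, by contrast, carry no ordering guarantee in JSON itself, which is presumably why the imports are modeled as an array.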
@JJediny so @brianrufgsa is correct; I should probably have used a different word, maybe 'variant'. You are right though that this is effectively a persistent namespace for control identification. And yes, a canonical index would be a huge help, maybe essential at some level. I also agree with the design goal of solving this upstream from users. |
Regarding the global scope of a |
@JJediny I realize the whole point of OSCAL is to be as machine-readable as possible, thus we want to automate our activities as much as possible, including de-conflicting of controls during an import. Here is why I assert a human should always have final responsibility for control de-confliction:
Conflicting requirement IDs represent an ambiguity of functional requirements. Machines are not yet smart enough to accurately deconflict such ambiguities using judgement and reasoning. They can detect and address exact duplicates. They can assess the degree of difference for non-exact duplicates. They can even recommend changes. They should do everything they can to enable a human to understand the conflict, present options, and take action once a human has rendered a decision. But ultimately, a human needs to "own" the final deconfliction decisions. |
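As a rough sketch of that division of labor, the Python below separates exact duplicates (which a tool can handle) from everything else (which it only scores and hands to a person). The control shape (`id`, `statement`), the function name `classify_clash`, and the status labels are assumptions made for illustration; none of them come from OSCAL or any existing tool.

```python
from difflib import SequenceMatcher


def classify_clash(first, second):
    """Compare two controls that share an ID and report what a tool can know.

    Returns a dict with a 'status' ('exact-duplicate' or 'needs-human-review')
    and a 'similarity' score between 0 and 1 for the statement text.
    """
    if first["id"] != second["id"]:
        raise ValueError("controls do not share an ID; nothing to compare")
    if first == second:
        # Exact duplicates are the one case a tool can safely address alone.
        return {"status": "exact-duplicate", "similarity": 1.0}
    # For anything else, only measure the difference; the decision stays human.
    ratio = SequenceMatcher(None, first["statement"], second["statement"]).ratio()
    return {"status": "needs-human-review", "similarity": ratio}


# Hypothetical controls with the same ID but different wording.
original = {"id": "a1", "statement": "Lock accounts after three failed logins."}
modified = {"id": "a1", "statement": "Lock accounts after five failed logins."}

report = classify_clash(original, modified)
```

The similarity score is only a hint for prioritizing review. It says nothing about whether the two requirements are functionally equivalent.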
The different merge behaviors are designed to enable more and less assertive resolutions of conflicts. More assertive resolutions have the advantage of succeeding in producing valid catalogs for a wider range of inputs including (nominally) ambiguous inputs (given a way to resolve such ambiguities). Less assertive resolutions have the advantage of exposing problems in profiles rather than resolving them. This is good when a better solution to such a problem is easily found upstream. (Fix the input so there is no clash to resolve.)
The question of control identity - recognizing that AC-2 in one import actually clashes with another -- is essential to the 'merge' and 'use-first' methods, but not the 'keep' method, which amounts to straight up GIGO. So one solution could be to remove the merge options besides 'keep', and let the devil take the hindmost. Instead of supporting any merging of controls, we would rely on tooling and perhaps mandatory error reporting, in resolution, to help authors deconflict the control imports. Even a canonical-id or "namespacing" mechanism has a problem, however, when two different catalogs have controls with clashing identifiers. Addressing that, unfortunately, implies a feature for reassigning ID values in the result. Which would open another can of worms. |
Noting today that this issue applies also to parameters (which, like controls, have the potential to clash with other parameters with the same ID, on multiple imports) and potentially to groups. |
I think it's possibly more basic than a merge need. A
That control identifier ( |
Agreeing with @bradh. This is more general than simply merging; it is also about addressability from higher layers. My main question at this point is whether a top-level |
@wendellpiez I can see why that would work OK for a NIST 800-53 style approach, where controls are versioned at the document level. However my control source versions them at the control level, and the document is revised many times (I think 8 times in 2019), although most controls are unchanged from document revision to document revision. Chasing document revisions is part of the problem I'm hoping to address, so I'd prefer to have version as a separate attribute. I could (potentially) have it at the control level (basically incorporating revision in, so the |
@bradh Would it be possible to version the document at the document level, but add a property at the control level that indicates which document version the control was last updated by? |
I do have last-changed and revision properties in my controls: https://github.com/bradh/ism-oscal/blob/master/Australian_Government_Information_Security_Manual_NOV19_catalog.xml#L57 for an example. That is obviously non-standard though. |
"Standards" are relative and many-layered. If implemented consistently and documented, a solution like this can have the effect of a standard for a user community. I'd like to promote consistent extension-by-restriction whenever possible, as it keeps the baseline schema simpler for everyone. (Moving the problem.) That's actually not an absolute 'no' fwiw from me as I also feel we need to keep an eye on these things. If everyone has the same need we don't want a dozen ways to do it, either. That's the flip side. |
@wendellpiez Has this issue been fully addressed in PR #559? |
@david-waltermire-nist no this has not been fully addressed, but remains an issue to be worked out in testing. Ultimately we should have unit tests showing pathological inputs so we can detect both intended and unintended control (identity) collisions across imported catalog(s) and/or profile(s). For now this is a tracking issue for this problem, with notes for possible approaches. |
Implemented a means to support this in liboscal-java v1.0.4 through use of an identifier mapper. This needs to be added to the Profile Resolution specification (see #1196). |
User Story:
In specifying and implementing profile resolution (#508, #509) we have exposed a requirement to support tracking 'control identity' more robustly through profile resolution. This is so that when profiles import profiles, the controls they import can be correctly matched with controls from other import pathways (such as source catalogs or other profiles of the same source catalogs).
A simple design extension, with two new flags (one each for control and catalog) could address this.
Details
A profile, call it profileX:
source profileY:
source catalogZ:
Note that profileX selects control a1 twice: once in modified form (from profileY) and once in its original form in CatalogZ.
There are three "combination rules" for merging: "keep", "use-first" and "merge". Keep is easy - it says, keep both a1 controls and detect the clash downstream. This option is presumably most useful for dev/testing, although if a profile is written correctly there is no error, hence no harm in it. (In this case, the second import could exclude the control as it isn't actually wanted from the catalog.)
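The three combination rules can be sketched in Python as follows. This is only an illustration under stated assumptions: controls are plain dicts with hypothetical `id` and `props` fields, identity is judged by ID alone (which is exactly the weakness this issue discusses), and the semantics shown for "merge" (add properties not yet present, first import wins on conflicts) are assumed rather than taken from the specification.

```python
def combine(controls, method="keep"):
    """Combine an ordered list of imported controls under one of three rules."""
    if method == "keep":
        # Keep everything, clashes included; detection happens downstream.
        return list(controls)
    if method not in ("use-first", "merge"):
        raise ValueError(f"unknown combine method: {method}")
    result = {}  # dicts preserve insertion order, so import order is kept
    for control in controls:
        cid = control["id"]
        if cid not in result:
            result[cid] = {"id": cid, "props": dict(control["props"])}
        elif method == "merge":
            # Assumed merge semantics: add unseen props, first import wins.
            for key, value in control["props"].items():
                result[cid]["props"].setdefault(key, value)
        # Under "use-first", later controls with a known ID are dropped.
    return list(result.values())


# Hypothetical imports: a1 arrives modified (via profileY) and original (catalogZ).
imported = [
    {"id": "a1", "props": {"label": "A-1 (modified)"}},
    {"id": "a1", "props": {"label": "A-1", "family": "A"}},
    {"id": "a2", "props": {"label": "A-2"}},
]

kept = combine(imported, "keep")
first = combine(imported, "use-first")
merged = combine(imported, "merge")
```

Both "use-first" and "merge" depend on deciding that two controls are the same control, which is why ID-only matching becomes a problem once IDs can collide across sources.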
But how do we determine that the two 'a1' controls are the same for purposes of the "use-first" and "merge" options? This can be dramatized by examining the catalog that results from resolving the imported ProfileY:
When we combine this with catalogZ, we have no way of knowing here that control 'a1' originated from the same catalog (with id="xyz123"), and is not some totally different source.
This problem is compounded by the likelihood that a catalog @id does not persist across different released/revised/published versions of the same catalog, so it is not reliable as a disambiguator.
Proposal
If our original source catalogZ has:
By propagating the value of the catalog's canonical-id to the controls, the results of resolving ProfileY could look like this:
Now the fact that 'a1' derives from catalogZ in both cases, can be determined (a value of ZZZZZ as an @origin-id or on an ancestor catalog/@canonical-id), and a resolution of ProfileX can look like this (assuming the merge method 'use first' is applied):
The origin-id attribute would then persist -- as tracking the catalog of origin, it is not rewritten by subsequent profile resolution steps as it might be needed any time down stream.
The same issue arises with groups for merging purposes under merge/as-is.
NOTE:
This design permits correct use-first or merging behavior, but it does not rewrite IDs; thus if controls with colliding IDs are imported from two different sources, they will (correctly) not be merged and validation errors will presumably result. So it is still necessary to see to it that IDs do not clash between catalogs to be combined.
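A minimal Python sketch of this behavior, under loudly stated assumptions: controls and catalogs are plain dicts, the field names `origin-id` and `canonical-id` mirror the proposed flags, and the second catalog with canonical-id `QQQQQ` is hypothetical. It is not the resolution algorithm itself, only an illustration of matching on origin plus ID.

```python
from collections import Counter


def resolve_with_origin(catalog):
    """Copy a catalog's controls, stamping each with its catalog of origin.

    An origin-id that is already present is preserved, never rewritten.
    """
    origin = catalog["canonical-id"]
    return [{**c, "origin-id": c.get("origin-id", origin)} for c in catalog["controls"]]


def use_first(controls):
    """Keep the first control seen for each (origin-id, id) pair."""
    seen = set()
    kept = []
    for control in controls:
        key = (control["origin-id"], control["id"])
        if key not in seen:
            seen.add(key)
            kept.append(control)
    return kept


def colliding_ids(controls):
    """Return IDs that still occur more than once; these would fail validation."""
    counts = Counter(c["id"] for c in controls)
    return sorted(cid for cid, n in counts.items() if n > 1)


catalog_z = {"canonical-id": "ZZZZZ", "controls": [{"id": "a1", "title": "Original a1"}]}
# Result of resolving profileY: a1 was modified, but its origin-id persisted.
from_profile_y = [{"id": "a1", "title": "Modified a1", "origin-id": "ZZZZZ"}]
# Hypothetical unrelated catalog that happens to reuse the ID a1.
other_catalog = {"canonical-id": "QQQQQ", "controls": [{"id": "a1", "title": "Unrelated a1"}]}

combined = use_first(
    from_profile_y + resolve_with_origin(catalog_z) + resolve_with_origin(other_catalog)
)
clashes = colliding_ids(combined)
```

Here the two a1 controls from catalogZ collapse into the first one, while the unrelated a1 survives and is reported as a clash, since IDs are not rewritten.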
Further note:
We could also specify operations on document metadata and/or back matter, to track how a profile is made. So the metadata of a result catalog (even if only in memory as a profile is resolved) will say something about its sources, and upstream catalogs could be referenced.
Goals:
Dependencies:
Linked to both #508 and #509.
The solution should also be unit tested.
Acceptance Criteria