Improve identifier: sourceSystemId #54
Comments
identifierType defines a really generic way to handle identifiers (which is good, since we use it across all our standards). I am not sure I fully understand what you mean by "improve" in this context, but I can react to your proposals:
My take is that, given the extreme flexibility we need for identifierType, the current implementation is acceptable. I am, however, fully in favour of improvements should we find some. |
Sadly, I agree... I was hoping you had a magical solution. identifierType contains id and sourceSystemId. Dream scenario: sourceSystemId is a URI (hence no need for a curated list). However, not every source system has a URI at the moment, nor will every one have a URI in the foreseeable future. |
Hi all. What do you think of this?
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Define the enum type for sourceSystemId -->
  <xs:simpleType name="sourceSystemEnumType">
    <xs:restriction base="xs:token">
      <xs:enumeration value="SYSTEM_A"/>
      <xs:enumeration value="SYSTEM_B"/>
      <xs:enumeration value="SYSTEM_C"/>
    </xs:restriction>
  </xs:simpleType>
  <!-- Define an id token type with a length restriction -->
  <xs:simpleType name="idToken">
    <xs:restriction base="xs:token">
      <xs:maxLength value="50"/>
    </xs:restriction>
  </xs:simpleType>
  <!-- Define the union type that combines enum and restricted-length token -->
  <xs:simpleType name="sourceSystemUnionType">
    <xs:union memberTypes="sourceSystemEnumType idToken"/>
  </xs:simpleType>
  <!-- Example element using the union type -->
  <xs:element name="record">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="id" type="xs:string"/>
        <xs:element name="sourceSystemId" type="sourceSystemUnionType"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
This way, you could define commonly known source systems in sourceSystemEnumType, while allowing freeform IDs for all other systems. You would have to get consensus from the working group on the entries that go into sourceSystemEnumType. Subsequent expansion of those entries would be a minor change to the spec. Anything else would be a major change.
The proposed solution is backwards-compatible with current systems, i.e., anything goes. So where is the improvement? The improvement comes for consuming systems that wish to constrain allowed values: there, you can create validators that insist that only the entries of the enumeration are used. In the Java world, a rudimentary example might look something like this:
Java Validator Example
import java.util.Arrays;
import java.util.List;

public class SourceSystemIdValidator {
    // The values would be read from the XSD ...
    private static final List<String> VALID_ENUMS = Arrays.asList("SYSTEM_A", "SYSTEM_B", "SYSTEM_C");

    // Record is the consuming application's own class exposing getSourceSystemId().
    public static void validate(Record record) throws IllegalArgumentException {
        if (!VALID_ENUMS.contains(record.getSourceSystemId())) {
            throw new IllegalArgumentException("Invalid sourceSystemId value: " + record.getSourceSystemId());
        }
    }
}
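The comment in the validator above notes that the values would be read from the XSD. A minimal sketch of one way to do that with the JDK's built-in DOM parser, assuming the schema text is available as a string. The class and method names (SourceSystemEnumReader, readEnumValues) are hypothetical and not part of the proposal:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: collects the xs:enumeration values of one named xs:simpleType.
class SourceSystemEnumReader {
    private static final String XS_NS = "http://www.w3.org/2001/XMLSchema";

    static List<String> readEnumValues(String xsd, String typeName) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // needed so the xs: prefix resolves to XS_NS
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xsd)));

            List<String> values = new ArrayList<>();
            NodeList simpleTypes = doc.getElementsByTagNameNS(XS_NS, "simpleType");
            for (int i = 0; i < simpleTypes.getLength(); i++) {
                Element simpleType = (Element) simpleTypes.item(i);
                if (!typeName.equals(simpleType.getAttribute("name"))) {
                    continue; // skip idToken, the union type, and so on
                }
                NodeList enums = simpleType.getElementsByTagNameNS(XS_NS, "enumeration");
                for (int j = 0; j < enums.getLength(); j++) {
                    values.add(((Element) enums.item(j)).getAttribute("value"));
                }
            }
            return values;
        } catch (Exception e) {
            // Parsing problems are wrapped so callers need no checked-exception handling.
            throw new IllegalStateException("Could not read enum values from XSD", e);
        }
    }
}
```

Calling readEnumValues(xsd, "sourceSystemEnumType") on the schema proposed above would yield the three placeholder entries, which could then replace the hard-coded VALID_ENUMS list.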
An implementation at the DB level could look something like this:
SQL Implementation Example
CREATE TABLE my_agricultural_imported_data (
    id VARCHAR(255) NOT NULL,
    source_system_id VARCHAR(10) NOT NULL,
    CONSTRAINT chk_source_system_id CHECK (
        source_system_id IN ('SYSTEM_A', 'SYSTEM_B', 'SYSTEM_C')
    )
);
|
@montanajava
<xs:simpleType name="sourceSystemEnumType">
  <xs:restriction base="xs:token">
    <xs:enumeration value="SYSTEM_A"/>
    <xs:enumeration value="SYSTEM_B"/>
    <xs:enumeration value="SYSTEM_C"/>
  </xs:restriction>
</xs:simpleType>
What if I enter "SYSTEM_a"? Is it a typo that should really be "SYSTEM_A", or is it actually another system and I am using the freeform flexibility given to me? There is no way to differentiate. |
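The question above can be checked directly. A minimal sketch using the JDK's javax.xml.validation API, assuming a trimmed copy of the proposed schema with sourceSystemId declared as a top-level element for brevity. The class and method names (UnionTypeDemo, isSchemaValid) are hypothetical:

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

// Hypothetical demo: validates a single sourceSystemId value against the union type.
class UnionTypeDemo {
    // Trimmed copy of the proposed schema (assumption: element declared at top level).
    static final String XSD =
          "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:simpleType name='sourceSystemEnumType'><xs:restriction base='xs:token'>"
        + "<xs:enumeration value='SYSTEM_A'/><xs:enumeration value='SYSTEM_B'/>"
        + "<xs:enumeration value='SYSTEM_C'/></xs:restriction></xs:simpleType>"
        + "<xs:simpleType name='idToken'><xs:restriction base='xs:token'>"
        + "<xs:maxLength value='50'/></xs:restriction></xs:simpleType>"
        + "<xs:simpleType name='sourceSystemUnionType'>"
        + "<xs:union memberTypes='sourceSystemEnumType idToken'/></xs:simpleType>"
        + "<xs:element name='sourceSystemId' type='sourceSystemUnionType'/>"
        + "</xs:schema>";

    static boolean isSchemaValid(String sourceSystemId) {
        Schema schema;
        try {
            SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
        } catch (SAXException e) {
            throw new IllegalStateException("Schema itself is broken", e);
        }
        // Demo only: the value is not XML-escaped, so it must not contain markup characters.
        String xml = "<sourceSystemId>" + sourceSystemId + "</sourceSystemId>";
        try {
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // the instance violates the schema
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Under these assumptions both "SYSTEM_A" and "SYSTEM_a" are schema-valid, because the idToken member of the union accepts any token of up to 50 characters. Only a value longer than 50 characters is rejected, so the schema alone cannot tell a typo from a genuinely different system.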
You are exactly right. The proposal is a compromise.
The check can be performed by the provisioning and/or by the consuming
system, but not with the XSD -- it would have to be done with another XSD
or another technology. The XSD here specifies what is _possible_. And what
we are making possible is one of three things, depending upon the need:
a. anything goes. The enum buys you nothing here.
b. restricted. Two parties can agree that only those entries in the enum
are valid. This approach would be predicated on having the requisite system
IDs registered in the enum.
c. restricted with an option for freeform entries in exceptional cases.
This will be few architects' favoured approach, but it is a viable option
for those situations where a "contract" between provisioning and consuming
parties can only be partially agreed upon.
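The three options above could be sketched as a policy check on the consuming side. This is one possible reading, not part of the proposal: it assumes that option c is implemented as an explicit list of exceptions agreed between the two parties, and the names SourceSystemPolicy, Mode, accepts and agreedExceptions are hypothetical:

```java
import java.util.Set;

// Hypothetical consumer-side policy covering options a, b and c.
class SourceSystemPolicy {
    enum Mode { ANYTHING_GOES, RESTRICTED, RESTRICTED_WITH_EXCEPTIONS }

    // Placeholder entries from the proposal; real entries would need working-group consensus.
    static final Set<String> REGISTERED = Set.of("SYSTEM_A", "SYSTEM_B", "SYSTEM_C");

    static boolean accepts(Mode mode, String sourceSystemId, Set<String> agreedExceptions) {
        switch (mode) {
            case ANYTHING_GOES:
                // Option a: the schema's 50-character limit is the only constraint,
                // and that is assumed to have been checked already by XSD validation.
                return true;
            case RESTRICTED:
                // Option b: only registered enum entries pass.
                return REGISTERED.contains(sourceSystemId);
            case RESTRICTED_WITH_EXCEPTIONS:
                // Option c (assumed reading): enum entries plus explicitly agreed freeform IDs.
                return REGISTERED.contains(sourceSystemId)
                        || agreedExceptions.contains(sourceSystemId);
            default:
                throw new IllegalArgumentException("Unknown mode: " + mode);
        }
    }
}
```

With an explicit exception list, a value such as "SYSTEM_a" is rejected under options b and c unless the parties have deliberately registered it, which is one way to separate typos from intentional freeform entries.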
…On Tue, Nov 26, 2024 at 6:34 AM Ambrogio Foletti ***@***.***> wrote:
@montanajava <https://github.com/montanajava>
Hey! Thanks for the proposal.
I may be missing something, but I fail to understand how you can implement
a real check if the sourceSystemId is both an enumType AND freeform at the
same time.
<xs:simpleType name="sourceSystemEnumType">
<xs:restriction base="xs:token">
<xs:enumeration value="SYSTEM_A"/>
<xs:enumeration value="SYSTEM_B"/>
<xs:enumeration value="SYSTEM_C"/>
</xs:restriction>
</xs:simpleType>
What if I enter "SYSTEM_a"? Is it a typo and should really be "SYSTEM_A"
or is it actually another system and I am using the freeform flexibility
given to me? There is no way to differentiate.
|
Understood. This could of course result in data that conforms to the eCH standard but not to the tighter restrictions imposed by the specific use case, which is in my opinion perfectly acceptable. |
identifierType contains id and sourceSystemId.
sourceSystemId accepts a token of length 50. Can we improve? For example: use a code list, demand a UID (for CH-sources) or UID?
https://github.com/blw-ofag-ufag/eCH-0261/blob/2e70dda84d9971ca7fe699da64985de70d643e9a/src/eCH-0261-1-0.xsd#L373C1-L384C17