Add script to check for common model issues #1124
This change adds a script to test for common issues with the default FtM model:

* Divergent types: Multiple properties with the same name, but different types
* Divergent labels: Multiple properties with the same name, but different labels
* Label collisions: Multiple properties with different names, but using the same label

These issues can cause problems downstream, for example in Aleph: divergent types can cause errors when querying multiple Elasticsearch indexes, and divergent labels result in a confusing user experience.
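To make the three checks concrete, here is a minimal sketch. It does not use the actual FtM model API: the `Prop` dataclass and the function names are hypothetical stand-ins, the property types are assumed, and the `ExampleA`/`ExampleB` entries are invented for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Prop:
    """Hypothetical stand-in for a schema property (not the FtM class)."""

    schema: str
    name: str
    label: str
    type: str

    @property
    def qname(self) -> str:
        return f"{self.schema}:{self.name}"


def divergent_types(props):
    """Property names that are declared with more than one type."""
    by_name = defaultdict(list)
    for prop in props:
        by_name[prop.name].append(prop)
    return {n: ps for n, ps in by_name.items() if len({p.type for p in ps}) > 1}


def divergent_labels(props):
    """Property names that are declared with more than one label."""
    by_name = defaultdict(list)
    for prop in props:
        by_name[prop.name].append(prop)
    return {n: ps for n, ps in by_name.items() if len({p.label for p in ps}) > 1}


def label_collisions(props):
    """Labels that are shared by properties with different names."""
    by_label = defaultdict(list)
    for prop in props:
        by_label[prop.label].append(prop)
    return {l: ps for l, ps in by_label.items() if len({p.name for p in ps}) > 1}


# Illustrative data only: types are assumed, ExampleA/ExampleB are invented.
PROPS = [
    Prop("CallForTenders", "authority", "Name of contracting authority", "entity"),
    Prop("Sanction", "authority", "Authority", "string"),
    Prop("ExampleA", "alias", "Name", "name"),
    Prop("ExampleB", "name", "Name", "name"),
]

for name, props in divergent_labels(PROPS).items():
    print(name, [p.qname for p in props])
```

With this sample data, `authority` is reported for both divergent types and divergent labels, and the label `Name` is reported as a collision.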
…tAward:nutsCode`
    "criteria",
    "procedure",
]
Question: should we be concerned that "authority" shows up both as a divergent type and as a divergent label?
What "divergent types" means: There are two properties with the name "authority" that have different types (haven’t checked it, but probably one has the `entity` type and the other `name`, or something like that).
What "divergent labels" means: There are two properties with the same name, but they use different labels in the UI (e.g. `CallForTenders:authority` has the label "Name of contracting authority" while `Sanction:authority` has the label "Authority").
We should be concerned about all of these issues. However, there are some issues that can only be resolved with breaking changes, so I’ve added them to the ignore list for now (otherwise it would break CI).
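A minimal sketch of how an ignore list can keep CI green for known issues while still failing on new ones. The names `IGNORED_DIVERGENT_TYPES`, `unignored`, and `exit_code` are hypothetical, and the contents of the list and of `found` are assumed for illustration.

```python
# Hypothetical ignore list: property names with known issues that can only
# be fixed by a breaking change. The contents here are assumed.
IGNORED_DIVERGENT_TYPES = {"authority"}


def unignored(issues, ignored):
    """Return only the issues whose key is not on the ignore list."""
    return {key: props for key, props in issues.items() if key not in ignored}


def exit_code(*issue_groups):
    """Return 1 if any group still contains issues, else 0 (for CI)."""
    return 1 if any(issue_groups) else 0


# Made-up findings: one known issue and one newly introduced issue.
found = {
    "authority": ["CallForTenders:authority", "Sanction:authority"],
    "exampleProp": ["ExampleA:exampleProp", "ExampleB:exampleProp"],
}
remaining = unignored(found, IGNORED_DIVERGENT_TYPES)
print(remaining, exit_code(remaining))
```

Known issues stay visible in the ignore list, so they can be removed from it once a breaking release makes the fix possible.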
        collisions[label] = props

    return collisions
The three functions above: test_divergent_types, _labels, _collisions all use the same basic pattern. Would it be worth pulling this out into a generic function to reduce duplication?
While they use the same structure, they do different things, and abstracting the generic structure would probably require passing a bunch of parameters or predicates as lambdas. I’m not sure this would make it easier to understand/maintain tbh
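For comparison, this is roughly what the generic variant under discussion might look like. It is only a sketch to illustrate the trade-off, not a recommendation and not the code in this PR: `find_conflicts`, the `Prop` tuple, and the sample data (including the types) are all hypothetical.

```python
from collections import defaultdict, namedtuple
from operator import attrgetter

# Hypothetical stand-in for a schema property.
Prop = namedtuple("Prop", ["qname", "name", "label", "type"])


def find_conflicts(props, group_by, compare):
    """Group props by `group_by` and keep groups whose `compare` values differ."""
    groups = defaultdict(list)
    for prop in props:
        groups[group_by(prop)].append(prop)
    return {
        key: members
        for key, members in groups.items()
        if len({compare(p) for p in members}) > 1
    }


def divergent_types(props):
    return find_conflicts(props, attrgetter("name"), attrgetter("type"))


def divergent_labels(props):
    return find_conflicts(props, attrgetter("name"), attrgetter("label"))


def label_collisions(props):
    # Note the inverted roles: group by label, compare names.
    return find_conflicts(props, attrgetter("label"), attrgetter("name"))


# Illustrative data only; the types are assumed.
SAMPLE = [
    Prop("CallForTenders:authority", "authority", "Name of contracting authority", "entity"),
    Prop("Sanction:authority", "authority", "Authority", "string"),
]
```

The shared loop disappears, but each check is now defined only by which pair of predicates it passes, which is the readability cost mentioned above.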
    for prop in props:
        print(f" * {prop.qname}")
    print()
Is it worth extracting this into a function to reduce replication?
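One possible shape for such a helper, as a sketch. The names `format_issues` and `print_issues` and the heading text are hypothetical; the only assumption about the property objects is that they expose `qname`, as in the snippet above.

```python
from types import SimpleNamespace


def format_issues(heading, issues):
    """Build the report lines for one kind of issue.

    `issues` maps a key (property name or label) to a list of objects
    that have a `qname` attribute.
    """
    lines = []
    for key, props in issues.items():
        lines.append(f"{heading}: {key}")
        for prop in props:
            lines.append(f" * {prop.qname}")
        lines.append("")  # blank line between groups
    return lines


def print_issues(heading, issues):
    """Print the report for one kind of issue."""
    for line in format_issues(heading, issues):
        print(line)


# Example with stand-in property objects.
example = {
    "authority": [
        SimpleNamespace(qname="CallForTenders:authority"),
        SimpleNamespace(qname="Sanction:authority"),
    ]
}
print_issues("Divergent labels", example)
```

Separating formatting from printing keeps the output easy to check in a test without capturing stdout.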
I have also updated some schema definitions to use consistent labels and types (where types are compatible and the change is not a breaking change).
Current issues: