Erratic behaviour with nested entity and choose in combination with flushWith #366
Comments
I don't think I can confirm. First of all, But even after removing that line, executing Can you be more specific as to what should be the expected outcome?
|
There's almost definitely something fishy going on... When shuffling the file list, catching the
While they're stable when iterating in fixed (directory) order:
It's also not limited to the
Which originates from a - seemingly - inconspicuous |
It is expected that 610 and 650 get a "type" : [ "Keyword" ] or "type" : [ "Concept" ] based on the second indicator of that field. The internal inconsistency is shown in these two cases: (DE-605)HT003176544.json:
Even though 610 and 650 go through the same workflow, in this example 610 provides the type and 650 does not. We only have one 610 test file. Although it is not shown in the captured example, 650 does not always work either and can even give different results in different test files. It can then happen that HT003176544 and HT020202475 have different results with this workflow. We saw the case that: While HT003176544: (DE-605)HT020202475.json
Your second case:
Could this be due to the fact that the source data has duplicates?
Münster appears multiple times. |
Sorry for intervening late: the tests are not working and we have an issue for that. For the rest of the problem I can't say because I haven't checked that on my own yet. |
I suspect both issues have exactly the same root cause. Your tests don't give reliable results (seems to be the I was able to reproduce the erratic behaviour with a reduced test case ( Possible outcomes for
Test scenarios:
Control test with only
Which leads to the following questions:
|
So here's a heavily reduced test case that hopefully still catches the essence of this issue (at least w.r.t. questions 2 and 4, maybe 3). You can run it with the following Gradle invocation from the

../gradlew --rerun-tasks test --tests org.metafacture.metamorph.Issue366Test -Dissue366.verbose=true -Dissue366.iter=100

It shows that option 0 (the issue at hand) fails 50% of the time (misses

Input:
{"10012":{"a":"V"}}

Option 0:
<metamorph xmlns="http://www.culturegraph.org/metamorph" version="1">
<rules>
<entity name="r" flushWith="100??">
<choose flushWith="100??.a">
<data name="k" source="100?2.a">
<constant value="K"/>
</data>
</choose>
<data name="v" source="100??.a" />
</entity>
</rules>
</metamorph>

Output 0:
{"r":{"v":"V"}} // 47/100
{"r":{"k":"K","v":"V"}} // 53/100

Option 1:
<metamorph xmlns="http://www.culturegraph.org/metamorph" version="1">
<rules>
<entity name="r" flushWith="100??">
<choose flushWith="100??">
<data name="k" source="100?2.a">
<constant value="K"/>
</data>
</choose>
<data name="v" source="100??.a" />
</entity>
</rules>
</metamorph>

Diff 1:
--- option0.xml
+++ option1.xml
@@ -1,7 +1,7 @@
<metamorph xmlns="http://www.culturegraph.org/metamorph" version="1">
<rules>
<entity name="r" flushWith="100??">
- <choose flushWith="100??.a">
+ <choose flushWith="100??">
<data name="k" source="100?2.a">
<constant value="K"/>
</data>

Output 1:
{"r":{"v":"V","k":"K"}}

Option 2:
<metamorph xmlns="http://www.culturegraph.org/metamorph" version="1">
<rules>
<entity name="r" flushWith="100??">
<choose flushWith="100??.a">
<data name="k" source="100??.a">
<constant value="K"/>
</data>
</choose>
<data name="v" source="100??.a" />
</entity>
</rules>
</metamorph>

Diff 2:
--- option0.xml
+++ option2.xml
@@ -2,7 +2,7 @@
<rules>
<entity name="r" flushWith="100??">
<choose flushWith="100??.a">
- <data name="k" source="100?2.a">
+ <data name="k" source="100??.a">
<constant value="K"/>
</data>
</choose>

Output 2:
{"r":{"k":"K","v":"V"}}

Option 3:
<metamorph xmlns="http://www.culturegraph.org/metamorph" version="1">
<rules>
<entity name="r" flushWith="100??">
<choose flushWith="100?2.a">
<data name="k" source="100?2.a">
<constant value="K"/>
</data>
</choose>
<data name="v" source="100?2.a" />
</entity>
</rules>
</metamorph>

Diff 3:
--- option0.xml
+++ option3.xml
@@ -1,12 +1,12 @@
<metamorph xmlns="http://www.culturegraph.org/metamorph" version="1">
<rules>
<entity name="r" flushWith="100??">
- <choose flushWith="100??.a">
+ <choose flushWith="100?2.a">
<data name="k" source="100?2.a">
<constant value="K"/>
</data>
</choose>
- <data name="v" source="100??.a" />
+ <data name="v" source="100?2.a" />
</entity>
</rules>
</metamorph>

Output 3:
{"r":{"k":"K","v":"V"}}
|
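The per-output counts quoted above (47/100 vs. 53/100) come from running the same transformation repeatedly and tallying the distinct results. A minimal sketch of such a tally follows. It is not the code of `Issue366Test`; the class name `OutcomeTally` and the stand-in supplier are hypothetical, and a real check would have the supplier run the Metamorph rules and return the serialized output.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Supplier;

// Hypothetical helper, not part of Metafacture.
class OutcomeTally {

    // Runs the given transformation `iterations` times and counts how often
    // each distinct output string occurs.
    static Map<String, Integer> tally(Supplier<String> transformation, int iterations) {
        Map<String, Integer> counts = new TreeMap<>();
        for (int i = 0; i < iterations; i++) {
            counts.merge(transformation.get(), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Stand-in for a real Metamorph run (assumption): it simply alternates
        // between the two outputs reported for option 0.
        int[] call = {0};
        Supplier<String> fakeRun = () -> (call[0]++ % 2 == 0)
                ? "{\"r\":{\"v\":\"V\"}}"
                : "{\"r\":{\"k\":\"K\",\"v\":\"V\"}}";

        int iterations = 100;
        tally(fakeRun, iterations).forEach((output, n) ->
                System.out.println(output + " // " + n + "/" + iterations));
    }
}
```

A deterministic transformation would yield a map with a single entry; more than one entry for identical input is the symptom described in this issue.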
* Disallow unsupported levels.
* Add tests.
Working with lobid-resources I found a bug.
https://github.com/hbz/lobid-resources/blob/ccc34fc8590340c6fb72d4d9c884746090b7fbf8/src/main/resources/alma/common/subjects.xml#L4-L60
To add a type specific to the second indicator of a MARC field in 610 and 650, I have introduced a choose collector with sameEntity="true" reset="true" flushWith="650??.a" (or 610??.a). The wrapping parent entity also has flushWith conditions. Both the flushWith attributes and the source data references use wildcards.
When I run the lobid morph to test with:
mvn failsafe:integration-test -Dit.test=AlmaMarc21XmlToLobidJsonTest
Results for this specific transformation process seem to be erratic.
They are inconsistent internally as well as between different test runs. However, the inconsistencies do not occur on every run.
hbz/lobid-resources@ccc34fc
@dr0i saw it live. When we changed the flushWith attribute of choose to the entity level (610?? instead of 610??.a), the bug does not appear.
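To make the difference between the entity-level pattern (610??) and the subfield-level pattern (610??.a) concrete, here is a small sketch of single-character wildcard matching. It assumes that `?` stands for exactly one character, as the patterns in this issue suggest. It is not Metafacture's own matcher, and `WildcardSketch` is a hypothetical name used only for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical illustration only; Metafacture uses its own matching code.
class WildcardSketch {

    // Translates a wildcard pattern into a regex: '?' matches any single
    // character, every other character is matched literally.
    static Pattern toRegex(String wildcard) {
        StringBuilder regex = new StringBuilder();
        for (char c : wildcard.toCharArray()) {
            regex.append(c == '?' ? "." : Pattern.quote(String.valueOf(c)));
        }
        return Pattern.compile(regex.toString());
    }

    static boolean matches(String wildcard, String path) {
        return toRegex(wildcard).matcher(path).matches();
    }

    public static void main(String[] args) {
        // Entity-level pattern against an entity name.
        System.out.println(matches("100??", "10012"));
        // Subfield-level patterns against the path of subfield a.
        System.out.println(matches("100??.a", "10012.a"));
        System.out.println(matches("100?2.a", "10012.a"));
        // A different second indicator is not matched by the narrower pattern.
        System.out.println(matches("100?2.a", "10017.a"));
    }
}
```

Under this assumption, both the broad and the narrow subfield pattern match the same literal of the reduced test input, so several flushWith and source conditions apply to a single incoming value.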