Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change related.issn to array and delete - #1828 #1916

Merged
merged 7 commits into from
Apr 16, 2024
Merged

Conversation

TobiasNx
Copy link
Contributor

@TobiasNx TobiasNx commented Sep 29, 2023

Resolves #1828

ISSN is defined as array in the context therefore I changed related.issn to an array.

"issn" : {
"@id" : "http://purl.org/ontology/bibo/issn",
"@container" : "@set"
},

Also it has a - which should be deleted.

@TobiasNx TobiasNx requested a review from dr0i September 29, 2023 08:13
@dr0i
Copy link
Member

dr0i commented Sep 29, 2023

checks failed, reassigning

@dr0i dr0i removed their request for review September 29, 2023 08:26
@TobiasNx
Copy link
Contributor Author

[error] Test tests.IndexIntegrationTest.testResultCount[(83) {Q=q.all:07206763}] failed: expected:<[0]> but was:<[1]>, took 0.076 sec
...
[error] Test tests.IndexIntegrationTest.testResultCount[(84) {Q=q.all:0720\-6763}] failed: expected:<[1]> but was:<[0]>, took 0.091 sec

@blackwinter These are your tests. I have deleted the - in related.issn and turned issn into an array as defined in the context.
Could you have a look?

@TobiasNx
Copy link
Contributor Author

@dr0i could it be that with the change from string to array related.issn is not indexed anymore: #1267

@blackwinter
Copy link
Member

These tests illustrate that there's no normalization happening for ISSNs (in contrast to ISBNs). So if you changed the data, just adjust the tests accordingly (the new behaviour is the expected behaviour).

@TobiasNx
Copy link
Contributor Author

I updated one test. It seems that the hyphen normalization is not supported for related.issn

@dr0i
Copy link
Member

dr0i commented Sep 29, 2023

You didn't update:

[error] Test tests.IndexIntegrationTest.testResultCount[(84) {Q=q.all:0720-6763}] failed: expected:<[1]> but was:<[0]>

i.e. 0720-6763 cannot be queried. Idk if this is something that you want to work?

@TobiasNx
Copy link
Contributor Author

Idk if this is something that you want to work?

ISSNs are usually with - as ISBNs therefore ISSN with and without - should work.

@blackwinter
Copy link
Member

Also it has a - which should be deleted.

Why is that?

ISSNs are usually with - as ISBNs therefore ISSN with and without - should work.

So add both. It's an array now after all.

@TobiasNx
Copy link
Contributor Author

Also it has a - which should be deleted.

Why is that?

ISSNs are usually with - as ISBNs therefore ISSN with and without - should work.

So add both. It's an array now after all.

This is the same situation with ISBN isn't it? We only use the ISBN without -. The same is the "historical" situation with ISSN.

<data source="542[-ab][-1].[a]" name="$[ns-bibo]issn">
<replace pattern="-| " with=""/>
</data>

@blackwinter
Copy link
Member

Okay, I understand. But we don't want ISSN normalization in q.all, this is what got us in trouble with hbz IDs (see 2f45f28).

@TobiasNx
Copy link
Contributor Author

TobiasNx commented Sep 29, 2023

This can wait until after your holiday.
d167733

this seems to be the commit. also we we already normalize the issn for the record:

replace_all("issn[].*", "-","")

Also I do not see how this is any different to ISBN or why this is handled differently to ISBN:

{ "q.all:0702075558", /*->*/ 1 },
{ "q.all:07\\-0207\\-555\\-8", /*->*/ 1 },
{ "q.all:07206763", /*->*/ 0 },
{ "q.all:0720\\-6763", /*->*/ 1 },

There we also do not keep the hyphen.

@blackwinter
Copy link
Member

Sorry, that's not what I meant with "normalization". I was talking about the standardnumber filter.

As I said before:

just adjust the tests accordingly (the new behaviour is the expected behaviour)

@TobiasNx
Copy link
Contributor Author

TobiasNx commented Oct 2, 2023

Ah, I see what the reason was. related.isbn and realted.issn were not configured in the same way in the index-config,json as issn and isbn. I commited a change for the index config.

@dr0i dr0i removed their assignment Oct 20, 2023
@dr0i
Copy link
Member

dr0i commented Oct 20, 2023

@TobiasNx will adopt the test to reflect a related.issn query. Besides this the PR is good.

@TobiasNx
Copy link
Contributor Author

@dr0i could you have another look?

@TobiasNx TobiasNx requested a review from dr0i March 28, 2024 07:44
@TobiasNx TobiasNx assigned dr0i and unassigned TobiasNx Mar 28, 2024
@dr0i
Copy link
Member

dr0i commented Apr 2, 2024

Okay, I understand. But we don't want ISSN normalization in q.all, this is what got us in trouble with hbz IDs (see 2f45f28).

Do you ACK @TobiasNx ? We won't be able to query an ISSN with a hyphen.

@blackwinter but the same is OK for ISBN? (Look at 2f45f28 and test if q.all=HT0702075558 hits - it does!)

@blackwinter
Copy link
Member

We won't be able to query an ISSN with a hyphen.

You're not using the query field q.all, are you?

but the same is OK for ISBN? (Look at 2f45f28 and test if q.all=HT0702075558 hits - it does!)

HT0702075558 is not a valid hbz ID, these only have 9 digits (so far).

@TobiasNx
Copy link
Contributor Author

TobiasNx commented Apr 2, 2024

I am confused what is blocking the PR:
Original intention of this PR was:

  1. To change the related.issn-element to an array as the top-level issn-element already is (as well as isbn and related.isbn are). (3b710e2)
  2. Also to delete the hyphen as we already do with the issn-element (as well as with isbn and related.isbn). (3b710e2)
  3. To ensure that one can search for records with an issn with and without hyphen I changed the index config (18b55a7) and added a new test (d391be8).

This should be reflected in the 3 commits (not counting the merge).

Anything wrong with this?

PS: I see that the test still fails for the issn with hyphen using q.all:
[error] Test tests.IndexIntegrationTest.testResultCount[(84) {Q=q.all:0720\-6763}] failed: expected:<[1]> but was:<[0]>, took 0.078 sec

But with the isbn it works:

{ "q.all:0702075558", /*->*/ 1 },
{ "q.all:07\\-0207\\-555\\-8", /*->*/ 1 },
{ "q.all:07206763", /*->*/ 1 },
{ "q.all:0720\\-6763", /*->*/ 1 },
{ "q.all:HT072067630", /*->*/ 0 },

@blackwinter
Copy link
Member

I see that the test still fails for the issn with hyphen using q.all:

As I said before, you have to adjust the tests to match your change in the data. Basically, you have to switch the results for q.all:07206763 and q.all:0720-6763. You only did the first half so far.

These tests show that it's a behaviour change, but DigiBib (union catalogue) has to live with it I suppose.

@blackwinter
Copy link
Member

But with the isbn it works:

ISBNs receive special treatment via the digibib_standardnumber filter.

This also highlights the different behavion from q.all to the general lobid-search.
@TobiasNx
Copy link
Contributor Author

TobiasNx commented Apr 2, 2024

Talked to @blackwinter off board.

Queries with q.all are not the general lobid search but a specific DigiBib-Search.
I added issn and related.issn tests.

Now the test should work.

@blackwinter
Copy link
Member

Now the test should work.

There are still some failures. Will review when everything passes.

@TobiasNx
Copy link
Contributor Author

TobiasNx commented Apr 3, 2024

@dr0i can you have a look. It seems the indexIntegration tests work locally but the tests still seem to fail.

dr0i referenced this pull request Apr 8, 2024
Losing the ISSN match without the hyphen is unfortunate, but a small price to pay when compared to the incorrect matches for hbz IDs.
dr0i added 2 commits April 16, 2024 10:44
We use tabs.
The related.issn is now stored without hyphen.
@dr0i
Copy link
Member

dr0i commented Apr 16, 2024

@blackwinter you can have a look now

@blackwinter blackwinter removed their assignment Apr 16, 2024
@dr0i dr0i merged commit 5137d0f into master Apr 16, 2024
1 check passed
@dr0i dr0i deleted the 1828-relatedIssn branch April 16, 2024 10:05
@dr0i
Copy link
Member

dr0i commented Apr 16, 2024

Will be deployed next Monday.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

ISSN with -
3 participants