Long metadata author list (172 authors) capped at 69 authors #2061
Some more info: On the latest EML file (eml 1.5) when I search for the string it occurs 72 times. So I see two in the Resource Contacts section, and the others must be in the Resource Creators section. On the 1.3 file there are 172 instances of the string. I think we might have updated the IPT at some point recently, so perhaps there's something in the new update? |
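A plain string search counts every occurrence, whichever section it is in. A minimal sketch of a per-section count is below, assuming the agents are stored as `creator`, `contact` and `metadataProvider` elements in the EML file. The sample document and its namespace are made up for illustration and are not taken from the dataset in this issue.

```python
import xml.etree.ElementTree as ET

# Invented sample; a real eml.xml from an IPT resource is much larger.
SAMPLE_EML = """<eml:eml xmlns:eml="eml://ecoinformatics.org/eml-2.1.1">
  <dataset>
    <creator><individualName><surName>A</surName></individualName></creator>
    <creator><individualName><surName>B</surName></individualName></creator>
    <contact><individualName><surName>A</surName></individualName></contact>
  </dataset>
</eml:eml>"""


def local_name(tag: str) -> str:
    """Strip an XML namespace prefix like '{uri}creator' down to 'creator'."""
    return tag.rsplit("}", 1)[-1]


def count_agents(eml_xml: str) -> dict:
    """Count agent elements by type anywhere in the document."""
    counts = {"creator": 0, "contact": 0, "metadataProvider": 0}
    for elem in ET.fromstring(eml_xml).iter():
        name = local_name(elem.tag)
        if name in counts:
            counts[name] += 1
    return counts


if __name__ == "__main__":
    print(count_agents(SAMPLE_EML))
```

Running this on both the 1.3 and the 1.5 file would show whether the difference is entirely in the creators.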
@rukayaj Thank you for reporting the issue |
I was not able to reproduce the bug. I've tried the current development version (2.7.4-SNAPSHOT) and the same version 2.7.3 at different environments. |
Sure, here's the log file. |
Thanks. I've tried the IPT. Indeed, it does not save more than 69 agents. However, I don't see any related errors, either in the logs or in the browser. The IPT sends the data correctly and reports that it's saved. No idea for now what the issue could be; I've never seen anything like this in the IPT. |
Hmm strange. I wonder what the problem is... Anyway, I emailed you with details of how to access the backup we made of the IPT. |
IPT sends a pretty large request to save the data there, so it's possible that the large URL encoded request is hitting some limitations or restrictions, causing the truncation of data and resulting in only part of the data being saved. Tomcat may have limitations on the length of URLs. When the URL length exceeds the server's limit, it may truncate the request data. You can check the server configuration to determine the maximum URL length supported and we can compare it with the size of the URL encoded request you are sending. Also if you are using Apache HTTP Server as a front-end proxy for Tomcat, it may have its own configuration for limiting URL length. Ensure that the Apache HTTP Server configuration allows for a large enough URL length to handle your request data. |
I've only seen URL length limits for GET requests, but I assume this one is a POST. Some sort of web application firewall could be interfering with a request though. |
We have our IPT(s) deployed in k8s cluster with standard nginx ingress - so I also don't think the issue is there. Also if it was just cutting off parameters, I'd assume there would be an error because of an invalid request? |
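To compare the save request against whatever limits a server or proxy enforces, it helps to know roughly how big the request is. Below is a rough sketch that estimates the number of form parameters and the URL-encoded body size for a given number of agents. The field names, the `eml.creators[i]` naming and the 20-character values are assumptions for illustration only; the real IPT form may look different. On the Tomcat side, the HTTP connector attributes `maxPostSize` and `maxParameterCount` are the settings to compare against, though nothing in this thread establishes that either one is the cause.

```python
from urllib.parse import urlencode

# Hypothetical per-agent field names; the real IPT form fields may differ.
AGENT_FIELDS = ["firstName", "lastName", "organisation", "position", "email",
                "phone", "address", "city", "country", "role"]


def build_form_pairs(n_agents, fields=AGENT_FIELDS, value="x" * 20):
    """Build (name, value) pairs as a metadata form with n_agents might send."""
    return [(f"eml.creators[{i}].{field}", value)
            for i in range(n_agents) for field in fields]


def summarize_request(n_agents):
    """Return (parameter count, URL-encoded body size in bytes)."""
    pairs = build_form_pairs(n_agents)
    body = urlencode(pairs)
    return len(pairs), len(body.encode("ascii"))


if __name__ == "__main__":
    for n in (31, 69, 70, 172):
        params, size = summarize_request(n)
        print(f"{n:>3} agents -> {params} parameters, {size} bytes")
```

The numbers only become meaningful once the real field list is substituted, for example from the browser's network tab when saving the Basic Metadata page.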
What was the first time the issue happened? And what was the IPT version when it was fine the last time? |
Hmm we just applied your latest update and it doesn't work on that one. We're not sure which version it was actually working on unfortunately... Just that it was working at some time in the past. Are there any changes you've made to the IPT in the past year or so which might have caused an issue? |
There were many things. The biggest one is #1325, might be because of it, but I just can't reproduce it anywhere |
Our user is asking about this again:
So I think we probably need to take a look at this again if possible. @MichalTorma and I did update the IPT to the latest version and the problem still seems to be there. |
Oleksii adds: It seems that the limit is set for 31 authors and not more. |
@rukayaj Thank you. Last time I checked I couldn't fix the issue because I couldn't reproduce it on any available IPT (cloud one or local). |
@rukayaj Could you send me the data directory archive for this resource: https://ukraine.ipt.gbif.no/manage/resource.do?r=alienspeciesua1? (
So you added this ukraine dataset to one of your cloud ipts and it was working when you tried to add authors? When I tried it just now it seems like when you import from these archives (I tried https://ukraine.ipt.gbif.no/archive.do?r=redbookua2022) it doesn't actually import all the creators. Here is what I did:
Maybe it's worth having a call (and perhaps including Oleksii) so we can go through it step by step? |
This is the one from our back up from yesterday which should be the same as what is on the server: |
Thank you. Looking into it |
Yes, I still don't see anything like that, either on the local IPT or on https://ipt.gbif-uat.org (I can create an account for you there). If you're able to reproduce the issue on your test IPT, would it be possible to start it in debug mode so I can connect remotely? |
Another thing - as you also mentioned previously, it used to work before. So we can try to downgrade the test IPT to try to locate what version causes the issues? |
Hmm for downgrading I'd have to work out which version of the IPT we were using when the eml 1.3 file was created, I can take a look on Mon. I put the test ipt into debug mode using https://test.ipt.gbif.no/admin/config.do, is that what you meant? If you email me (rukayasj@uio.no) an acc for https://ipt.gbif-uat.org/ I will test it the way I did our test IPT. |
I've sent credentials |
No, I think you need some manual configs to run Tomcat in debug. |
Logged in, tried it, you're right, the dataset imports correctly with all the authors, and adding new authors isn't an issue. I wonder now whether there could be some limit in our deployment which stops tomcat creating larger file sizes? But then I would have thought it would cut off mid xml tag or something you know? All the xml is perfectly formed, it's just it isn't saving. So weird. And actually it can't be that because writing more info into the eml works, it's just literally adding more creators that doesn't. Just be aware we deploy to a k8s cluster using this helm chart https://github.com/gbif-norway/ipt-s3/tree/main/helm/ipt-s3, and in case more info is helpful:
So I can log into the test ipt pod and change tomcat to run in debug, but then I guess you need some kind of port to be opened to connect to as well? It might just be easier for us to temporarily make you a user acc so you can kubectl exec into our test ipt pod. Maybe this is not something for Friday evening though, let's pick it up on Mon if you have time. Thanks for your help, have a good weekend! |
Thank you. Yes, let's try to figure it out on Monday |
https://chat.openai.com/share/1cd41752-1c51-4206-80f9-50dedea49318 chatgpt suggests a few things, the one that jumps out to me is number 2 upping the pod resource limits - I want to try that because we're not being super generous with it right now. But I'll try on Monday, just adding this as a comment so I don't forget :) |
@rukayaj Did you manage to find out anything? |
I had something urgent I had to do this week, I still haven't tried upping the resources yet. I will try and do something this evening or tomorrow... |
Doubling the resources (see referenced issue) doesn't seem to have done the trick, unfortunately.
After:
Verified with kubectl describe pod:
These were my steps:
I also tried adding multiple creators at once at step 4. Then I repeated the whole process 3 times for that archive, and then I tried it with dwca-alienspeciesua1-v1.5.zip (freshly downloaded from the ukraine IPT), same result. I downloaded a new one because the old zip I had from last time didn't seem to work, not sure what that was about. I'm emailing you a username+pass so you can access our test IPT, can't remember if I did it before. |
So I suppose the next debugging step would be to try to run Tomcat in debug and open it so you can connect? If we can't figure this out, one thing we could do is move the Ukraine IPT over to your cloud; now that you have individual hosted IPTs for countries I think they would be happy with e.g. https://cloud.gbif.org/ua/. I am curious about what could possibly be causing it now though. How are you deploying there, also using kubernetes and helm? |
I'm going to try 2x the resource allocation again just to make sure... Edit: Nope, still not. Def not a resource problem then :( |
Yes, I think I need to try debugging. We will also likely need to analyze Tomcat's HTTP traffic to see what the IPT is receiving. I don't think we host dedicated IPTs for non-participant countries, but they can surely publish on https://cloud.gbif.org/eca for example. I'll need to clarify. We don't have kubernetes here for IPTs, we have a simple custom script that replaces war files in Tomcat. |
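Once a save request has been captured (for example copied from the browser's network tab), a small script can summarize what was actually sent, so it can be compared with what ends up in the EML. This is only a sketch: the `eml.creators` prefix and the indexed parameter naming are assumed for illustration and may not match the real IPT form.

```python
import re
from urllib.parse import parse_qsl

INDEX_RE = re.compile(r"\[(\d+)\]")


def summarize_body(body: str, prefix: str = "eml.creators") -> dict:
    """Summarize a captured URL-encoded form body.

    `prefix` is a hypothetical parameter-name prefix for creator fields.
    """
    pairs = parse_qsl(body, keep_blank_values=True)
    indices = set()
    for key, _value in pairs:
        match = INDEX_RE.search(key)
        if key.startswith(prefix) and match:
            indices.add(int(match.group(1)))
    return {
        "parameters": len(pairs),          # every field in the request
        "creators": len(indices),          # distinct creator indices seen
        "highest_index": max(indices) if indices else None,
    }


if __name__ == "__main__":
    captured = ("eml.creators%5B0%5D.lastName=A"
                "&eml.creators%5B1%5D.lastName=B"
                "&eml.creators%5B1%5D.email=")
    print(summarize_body(captured))
```

If the browser sends all the creators but fewer are saved, the loss happens on the server side of the request rather than in the form.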
Ok, I need to finish something else today but I'll try open it for you tomorrow or later today when I have time.
Hmm they were on there to begin with as far as I remember and decided it was better to have their own space as they have so many datasets.
👍 simpler is often better |
I'm working on giving you access now, but I think realistically I should probably only actually do it on Monday so I don't leave the port open all weekend - sorry this keeps getting delayed, and thanks for staying engaged and for your help. I will post again on Mon. |
Sent you an email with connection info, @mike-podolskiy90 |
Ok I'm struggling with this a bit 😅 but I'm guessing you're going on leave soon for Easter right? I'm off from tomorrow, so maybe it'll be best to pick it up again after. |
I will be working tomorrow, then we have 5 days of holidays here in Denmark. |
I have a ticket open with digital ocean (our service provider) about this by the way, it's a networking issue (I think) and it's still unresolved. I'll keep this thread updated. |
I'll give this one more day with digital ocean support and if they haven't come up with some way of fixing it I'll give up and we'll try something else. |
Nice work @mike-podolskiy90 and @MichalTorma :) And thank you very much for your patience with the networking issues @mike-podolskiy90. I think we can probably close this now? |
Glad this is solved! |
GBIF URL: https://www.gbif.org/dataset/36914742-56c5-4d54-a18a-6ab1e41b9240#contacts
IPT URL: https://ukraine.ipt.gbif.no/resource?r=alienspeciesua1
IPT version 2.7.3
One of our Ukrainian data providers has contacted me because they are unable to enter the full list of authors for this dataset. We can add up to 69, but when we try to save the 70th the IPT shows the successfully-saved message, yet going back to the Basic Metadata only the original 69 are shown. It is possible to edit any of authors 1-69, and those changes are persisted.
I believe that previously it was possible to have all 172 authors included in the metadata, but I just have to confirm that with the data provider.