library.kiwix.org is not updated anymore from new ZIMs #181

Closed
Popolechien opened this issue Apr 9, 2024 · 16 comments
Labels
bug Something isn't working

Comments

@Popolechien
Member

I see that https://download.kiwix.org/zim/wikipedia/wikipedia_ru_all_maxi_2024-04.zim has been completed and uploaded to download.kiwix.org, but the oldest zim (2023-11) is still there. The file is also not available on library.kiwix.org

This is only for the maxi flavour, however. Both mini and nopic are there.

@Popolechien Popolechien added the bug Something isn't working label Apr 9, 2024
@benoit74
Collaborator

benoit74 commented Apr 9, 2024

The library update job has indeed been failing for a few hours.

The problem is linked to the creation of zimit/youscribe_fr_primaire_2024-04.zim while the previous version of youscribe_fr_primaire is at other/youscribe_fr_primaire_2023-03.zim.

Someone (@RavanJAltaie?) obviously changed the warehouse path from other to zimit. Is it intentional? Should I move all youscribe_fr_primaire to zimit warehouse path? All youscribe? Please explain what is intended here.

In the meantime, I archived the new zimit/youscribe_fr_primaire_2024-04.zim so that the library can be updated again, and restarted the update job. Restoring the ZIM will be immediate; I just need precise instructions, and there is no need to run the farm recipe again. The library should be updated in roughly 15 to 20 minutes from now if there is no other issue.

@benoit74 benoit74 changed the title from "New zim not present in library.kiwix.org" to "library.kiwix.org is not updated anymore from new ZIMs" Apr 9, 2024
@benoit74
Collaborator

benoit74 commented Apr 9, 2024

For the techies, the log was:

2024-04-09 12:54:59,852 DEBUG [READ] 6846 other/youscribe_fr_college_2020-02.zim
2024-04-09 12:54:59,854 DEBUG [READ] 6847 other/youscribe_fr_lycee_2023-03.zim
2024-04-09 12:55:00,001 DEBUG [READ] 6848 other/youscribe_fr_lycee_2020-02.zim
2024-04-09 12:55:00,003 DEBUG [READ] 6849 zimit/youscribe_fr_primaire_2024-04.zim
2024-04-09 12:55:00,220 DEBUG >> is update youscribe_fr_primaire: 89c6e919-f9bc-c8e8-f82b-e9c00be55c9f
2024-04-09 12:55:00,220 DEBUG [READ] 6850 other/youscribe_fr_primaire_2023-03.zim
2024-04-09 12:55:00,225 ERROR FAILED. An error occurred: 'id'
2024-04-09 12:55:00,226 ERROR 'id'
Traceback (most recent call last):
  File "/usr/local/bin/library-maint", line 943, in entrypoint
    sys.exit(maint.run())
  File "/usr/local/bin/library-maint", line 742, in run
    self.readfs()
  File "/usr/local/bin/library-maint", line 527, in readfs
    logger.debug(f">> is update {alias}: {entry['id']}")
KeyError: 'id'

The problem is that we assume the list of ZIMs is alphabetically ordered and, to save processing time, we read only the data of the first ZIM for a given alias. I.e. zimit/youscribe_fr_primaire_2024-04.zim should have been processed after other/youscribe_fr_primaire_2023-03.zim, not before.
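To illustrate the condition that triggers this failure, here is a minimal sketch of a pre-flight check that flags any alias whose ZIMs are spread over more than one warehouse folder. It is not taken from library-maint: the function name `find_split_aliases` and the file name pattern (`<alias>_<YYYY-MM>.zim`, inferred from the paths in the log) are assumptions made for the example.

```python
import re
from collections import defaultdict
from pathlib import PurePosixPath

# Assumed naming pattern: <alias>_<YYYY-MM>.zim, as seen in the log paths.
ZIM_NAME = re.compile(r"^(?P<alias>.+)_(?P<period>\d{4}-\d{2})\.zim$")


def find_split_aliases(paths):
    """Return {alias: sorted folders} for aliases found in more than one folder."""
    folders_by_alias = defaultdict(set)
    for raw in paths:
        path = PurePosixPath(raw)
        match = ZIM_NAME.match(path.name)
        if match is None:
            continue  # not a dated ZIM file name; ignored in this sketch
        folders_by_alias[match["alias"]].add(str(path.parent))
    return {
        alias: sorted(folders)
        for alias, folders in folders_by_alias.items()
        if len(folders) > 1
    }


# Paths taken from the log excerpt in this comment.
paths = [
    "other/youscribe_fr_lycee_2023-03.zim",
    "other/youscribe_fr_lycee_2020-02.zim",
    "zimit/youscribe_fr_primaire_2024-04.zim",
    "other/youscribe_fr_primaire_2023-03.zim",
]
for alias, folders in find_split_aliases(paths).items():
    print(f"{alias}: found in {', '.join(folders)}")
```

With the paths from the log, only youscribe_fr_primaire is reported, since it is the only alias present in both other and zimit. A check of this kind could turn the bare KeyError into an explicit message naming the offending alias.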

@rgaudin is this a known limitation of the library update job or should I log a ticket? (might even be voluntary, I don't think we want to update the warehouse path without moving old ZIMs)

@benoit74
Collaborator

benoit74 commented Apr 9, 2024

The incident is resolved; the library is being updated again.

@benoit74 benoit74 closed this as completed Apr 9, 2024
@rgaudin
Member

rgaudin commented Apr 9, 2024

It's prohibited to change the warehouse path of an existing ZIM without informing operations. The content team knows this, but since it's not frequent, it has probably been forgotten.

That's a known limitation of the script since day 1.

@kelson42
Contributor

kelson42 commented Apr 9, 2024

It is concerning that things fail silently or at least without a very clear (alarm) message. We need to open a ticket to find a better solution.

@rgaudin
Member

rgaudin commented Apr 9, 2024

It doesn't fail silently. Jobs are in error. We just don't have alarms for this, and we definitely should.
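As a rough illustration of what such an alarm could look like, the sketch below wraps a job so that any failure is forwarded to a notification callback before the error is re-raised. Everything in it is hypothetical: `run_with_alert`, `failing_job` and the `notify` callback are names invented for the example, and nothing here reflects how the update job is actually deployed.

```python
import logging

logger = logging.getLogger("library-update")


def run_with_alert(job, notify):
    """Run job(); on any exception, call notify(message) and re-raise."""
    try:
        return job()
    except Exception as exc:
        message = f"library update failed: {type(exc).__name__}: {exc}"
        logger.error(message)
        notify(message)  # hypothetical hook: e-mail, chat webhook, pager...
        raise  # keep the job in error so existing status reporting still works


def failing_job():
    # Stand-in for the real update job; reproduces the KeyError from the log.
    entry = {}
    return entry["id"]


alerts = []
try:
    run_with_alert(failing_job, alerts.append)
except KeyError:
    pass
print(alerts)
```

The job still ends in error, as it does today; the only change is that someone is told about it when it happens.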

@Popolechien
Member Author

Reopening this ticket as I have the exact same issue with https://farm.openzim.org/recipes/wikipedia_ja_medicine: it was updated more than 6 hours ago, but only the old ZIM(s) are available in the library.

@benoit74
Collaborator

Same cause, same consequence.

We previously had zimit/editions-ganndal_fr_fo-livres_2024-04.zim and now we have other/editions-ganndal_fr_fo-livres_2024-04.zim.

I archived the offending other/editions-ganndal_fr_fo-livres_2024-04.zim and opened openzim/zim-requests#962

@RavanJAltaie @Popolechien you need to remember that it is NOT possible to change a ZIM warehouse folder in Zimfarm without first asking devs to move existing ZIMs to the new warehouse folder. A request has to be opened in zim-requests to ask for the change.

@rgaudin
Member

rgaudin commented Apr 15, 2024

This ZIM should NOT be in the public library at the moment. Its content is not free. And I don't understand how it ended up in the zimit folder since it's not using the zimit scraper...

@Popolechien
Member Author

Yeah, I'm a little surprised that ZIMs should move like this. I certainly haven't touched it, and with @RavanJAltaie being away most of the past couple of weeks I doubt she did (ganndal is quite specific and she did not take part in its creation). I suppose we don't have a log of operations?

@rgaudin
Member

rgaudin commented Apr 15, 2024

No, we don't. It's a long-standing feature request though. To be fair, I should have disabled the recipe (which was previously in dev) when we agreed it would not go public. I see that it is disabled at the moment, so someone did that. That's mysterious!

@kelson42
Contributor

What is the measure/issue to ensure tech people know about a failure before end users notice?

@benoit74
Collaborator

What is the measure/issue to ensure tech people know about a failure before end users notice?

#182

@RavanJAltaie

RavanJAltaie commented Apr 15, 2024

Unfortunately, I'm the one who changed the warehouse paths for both Ganndal and Youscribe Primaire.
I didn't know that changing the warehouse path would create a bug and that operations should be informed, as this is the first time I have done such a thing. I'm aware now and will take these rules into consideration.
@benoit74 I see that you've disabled the Youscribe Primaire recipe, is that for fixing the bug?

@RavanJAltaie

On another note, the Spanish version of Marxist.org succeeded and was pushed to the library, but it doesn't show up in the library. Is that because of the same bug? @benoit74

@benoit74
Collaborator

@benoit74 I see that you've disabled the Youscribe Primaire recipe, is that for fixing the bug?

Yes, see openzim/zim-requests#961

On another note, the Spanish version of Marxist.org succeeded and was pushed to the library, but it doesn't show up in the library. Is that because of the same bug?

Yes. I thought I had fixed the issue with editions-ganndal_fr_fo, but in fact the library update is still failing due to the return of zimit/youscribe_fr_primaire_2024-04.zim. I've archived the ZIM again.
