Bulbagarden Zimfarm rule doesn't work for several months #1690

Closed
Agetian opened this issue Nov 16, 2022 · 8 comments · Fixed by #1713

@Agetian

Agetian commented Nov 16, 2022

Problem

The following Zimfarm rule seems to be dysfunctional:
https://farm.openzim.org/recipes/bulbagarden
It has been failing for at least the last 7 months. Debug information shows that it fails with this error:

mwUrl [https://bulbapedia.bulbagarden.net/] is not valid.

This is apparent from the following pipeline debug info: https://farm.openzim.org/pipeline/ab6a3828030f716b6a3f3636/debug

If mwoffliner is run manually with parameters similar to those outlined in the rule, the scraping seems to start without this error (I haven't tried actually completing the job).

Reproducing steps

Not sure exactly how to reproduce this locally, but please check the Zimfarm information above; the rule fails consistently.

@Agetian Agetian added the bug label Nov 16, 2022
@kelson42 kelson42 transferred this issue from openzim/zimfarm Nov 16, 2022
@kelson42 kelson42 transferred this issue from openzim/zim-requests Nov 16, 2022
@rgaudin
Member

rgaudin commented Nov 16, 2022

Checked the latest run and the failure is not related to the URL. It looks like this should be a ticket for the mwoffliner repository.

I believe this repo is for requests only and errors are to be opened on the scraper's repo.

@Agetian
Author

Agetian commented Nov 16, 2022

Thank you for the clarification and move!
Indeed, I tried running the mwoffliner job locally for this website, and it eventually fails with a timeout for me, even with a high timeout value specified.

@kelson42
Collaborator

Indeed, my local test fails differently than the run on the Zimfarm. This needs investigation.

@uriesk
Collaborator

uriesk commented Jan 4, 2023

The api.php?action=visualeditor endpoint isn't available at bulbagarden, and an alternative API that returns the HTML, like rest.php/v1/page/Main_Page/html, is not available either.
That is presumably deliberate on the part of the website's owners.
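
For anyone who wants to repeat this check, here is a minimal sketch that builds the two endpoint URLs mentioned above and reports the HTTP status of each. It is not taken from mwoffliner. The `/w/` script path, the `paction=parse` and `format=json` parameters, and the function names `candidate_html_endpoints` and `probe` are assumptions made for illustration.

```python
from urllib.parse import quote, urlencode
import urllib.error
import urllib.request


def candidate_html_endpoints(base_url, title, script_path="/w/"):
    """Build the two HTML-rendering endpoint URLs for one page title.

    script_path is an assumption: many MediaWiki installs serve api.php
    and rest.php under /w/, but a given site may use another path.
    """
    parts = [base_url.rstrip("/")] + [p for p in script_path.split("/") if p]
    root = "/".join(parts) + "/"
    ve_query = urlencode({
        "action": "visualeditor",
        "paction": "parse",  # assumed parameters for an HTML parse request
        "page": title,
        "format": "json",
    })
    return {
        "visualeditor": root + "api.php?" + ve_query,
        "rest": root + "rest.php/v1/page/" + quote(title, safe="") + "/html",
    }


def probe(url, timeout=10.0):
    """Return the HTTP status code for url, or None if no response arrives."""
    request = urllib.request.Request(
        url, headers={"User-Agent": "endpoint-probe-sketch"}
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code  # e.g. 403 or 404 when the endpoint is blocked
    except OSError:
        return None  # DNS failure, refused connection, timeout, ...


if __name__ == "__main__":
    urls = candidate_html_endpoints(
        "https://bulbapedia.bulbagarden.net/", "Main_Page"
    )
    for name, url in urls.items():
        print(name, probe(url), url)
```

A 200 status from api.php is not conclusive on its own, because the MediaWiki action API reports an unknown or disabled module inside the JSON body rather than through the status code, so the response body still needs a look.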

@kelson42
Collaborator

kelson42 commented Jan 5, 2023

@uriesk If the visualeditor API is not available then indeed this looks bad. That said, I have updated the values at https://farm.openzim.org/recipes/bulbagarden (they were wrong anyway). Now it dies with "Cannot read properties of undefined (reading 'general')", but it should die with an error indicating that the necessary APIs are not available... something looks fishy here to me.

@kelson42 kelson42 added this to the 1.12.0 milestone Jan 5, 2023
@kelson42
Collaborator

kelson42 commented Jan 5, 2023

Looks like I left a few pieces of code behind when I removed the Local Parsoid code... I will remove the remaining part. I guess the error message will be better then.

@uriesk
Collaborator

uriesk commented Jan 5, 2023

When the veApi is not available, it falls back to local Parsoid; that's why you get that error.
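
To illustrate the behaviour being asked for in this thread, here is a rough sketch of renderer selection that fails with an explicit message instead of falling through to a removed code path. It is not mwoffliner's actual code. The capability names, the preference order, and the names `pick_renderer` and `NoRendererAvailableError` are all hypothetical.

```python
class NoRendererAvailableError(RuntimeError):
    """Raised when the wiki exposes none of the supported HTML APIs."""


# Hypothetical preference order; the real scraper may order these differently.
RENDERER_PREFERENCE = ("visualeditor", "rest")


def pick_renderer(capabilities):
    """Return the first available renderer name from the preference order.

    capabilities maps a hypothetical capability name to True or False,
    for example the outcome of probing each endpoint beforehand.
    """
    for name in RENDERER_PREFERENCE:
        if capabilities.get(name):
            return name
    # No silent fallback: say which APIs were tried and found missing.
    tried = ", ".join(RENDERER_PREFERENCE)
    raise NoRendererAvailableError(
        "None of the required APIs are available (tried: " + tried + ")"
    )
```

With both capabilities false, as reported for bulbagarden above, the call raises `NoRendererAvailableError` naming the APIs that were tried, which is closer to the error message the thread expects than a failed property read.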

@uriesk
Collaborator

uriesk commented Jan 5, 2023

We can do the fandom pokemon wiki for our pokemon bros instead... they are almost the same anyway.

openzim/zim-requests#545
