Filter and order languages in ZIM Language metadata #172

Closed
benoit74 opened this issue Mar 19, 2024 · 1 comment · Fixed by #181

Comments

@benoit74
Collaborator

When multiple languages are requested but only some of them are found, the scraper still sets all of them in the ZIM Language metadata (or at least it will once #170 is merged).

The scraper should in fact:

  • filter out languages for which no videos have been found
  • order languages by importance (in terms of the number of videos) inside the ZIM
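
The two steps above could be sketched roughly as follows. This is only an illustration, not the scraper's code: the function name `order_languages` and the `video_counts` dict (language code mapped to the number of videos found) are assumptions made for the example.

```python
def order_languages(video_counts: dict[str, int]) -> list[str]:
    """Drop languages with no videos and sort the rest by descending video count."""
    # Keep only languages for which at least one video was found.
    found = {lang: count for lang, count in video_counts.items() if count > 0}
    # Largest count first; ties are broken alphabetically so the result is deterministic.
    return sorted(found, key=lambda lang: (-found[lang], lang))


# Hypothetical counts: "de" was requested but no video was found for it.
counts = {"en": 120, "fr": 45, "de": 0, "es": 45}
print(order_languages(counts))  # ['en', 'es', 'fr']
```

The alphabetical tie-break is a design choice of this sketch; the issue does not say how ties should be handled.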

This is going to be a bit tricky because TED language codes differ from ISO-639-3 codes, so we will have to be careful about that.
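
One reason care is needed: several TED codes may collapse into a single ISO-639-3 code, so counts probably have to be summed per ISO code before ordering. A minimal sketch, assuming a hand-written `TED_TO_ISO` table (illustrative only, not the scraper's actual mapping) and a hypothetical `zim_language_metadata` helper that returns the comma-separated value:

```python
from collections import Counter

# Illustrative table only; the real mapping lives elsewhere and may differ.
TED_TO_ISO = {
    "en": "eng",
    "fr": "fra",
    "pt": "por",
    "pt-br": "por",
    "zh-cn": "zho",
    "zh-tw": "zho",
}


def zim_language_metadata(video_counts: dict[str, int]) -> str:
    """Build a comma-separated ISO-639-3 list, most-used language first."""
    totals = Counter()
    for ted_code, count in video_counts.items():
        iso_code = TED_TO_ISO.get(ted_code)
        # Assumption of this sketch: unknown codes and empty languages are skipped.
        if iso_code is None or count <= 0:
            continue
        # Sum counts so that "pt" and "pt-br" both contribute to "por".
        totals[iso_code] += count
    ordered = sorted(totals, key=lambda code: (-totals[code], code))
    return ",".join(ordered)


print(zim_language_metadata({"en": 10, "pt": 4, "pt-br": 8, "fr": 0}))  # por,eng
```

Silently skipping unmapped codes keeps the example short; raising an error or logging a warning might be preferable in the scraper itself.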

Note that once #171 is implemented, the exact list of languages will be dynamic.

@rgaudin
Member

rgaudin commented Mar 19, 2024

We've been using scraperlib to get one from the other. We even have a mapping table for those that can't be matched.
