Download from data.europa.eu fails due to complex "name" metadata #95

Open
alvarolopez opened this issue Oct 18, 2024 · 0 comments

alvarolopez commented Oct 18, 2024

In some cases, downloading data from data.europa.eu (implemented in #86) fails because the file name is internationalized: the 'name' metadata is a list of language-tagged values instead of a single string. In those cases, datahugger fails with Error: argument should be a str or an os.PathLike object where __fspath__ returns a str, not 'list'.

For instance, when downloading https://data.europa.eu/data/datasets/cordish2020projects the file info looks as follows:

{
    'link': 'https://cordis.europa.eu/data/cordis-h2020reports-json.zip',
    'name': [
        {'@language': 'lt-t-en-t0-mtec', '@value': 'Programos „Horizontas 2020“ ataskaitos santraukos'}, 
        {'@language': 'fi-t-en-t0-mtec', '@value': 'H2020-raportin tiivistelmät'}, 
        {'@language': 'et-t-en-t0-mtec', '@value': 'Programmi „Horisont 2020“ aruande kokkuvõtted'}, 
        {'@language': 'es-t-en-t0-mtec', '@value': 'Resúmenes de los informes H2020'}, 
        {'@language': 'sk-t-en-t0-mtec', '@value': 'Súhrny správ o programe Horizont 2020'}, 
        {'@language': 'lv-t-en-t0-mtec', '@value': 'Pamatprogrammas “Apvārsnis 2020” ziņojuma kopsavilkumi'}, 
        {'@language': 'pt-t-en-t0-mtec', '@value': 'Resumos do relatório do H2020'}, 
        {'@language': 'da-t-en-t0-mtec', '@value': 'Sammendrag af H2020-rapporten'}, 
        {'@language': 'fr-t-en-t0-mtec', '@value': 'Résumés du rapport H2020'}, 
        {'@language': 'nl-t-en-t0-mtec', '@value': 'H2020 Rapportsamenvattingen'}, 
        {'@language': 'ga-t-en-t0-mtec', '@value': 'Achoimrí tuarascála H2020'}, 
        {'@language': 'uk-t-en-t0-mtec', '@value': 'H2020 Резюме звіту'}, 
        {'@language': 'hu-t-en-t0-mtec', '@value': 'H2020 Jelentésösszefoglalók'}, 
        {'@language': 'mt-t-en-t0-mtec', '@value': 'Sommarji tar-Rapport H2020'}, 
        {'@language': 'sv-t-en-t0-mtec', '@value': 'Sammanfattningar av H2020-rapporten'},
        {'@language': 'bg-t-en-t0-mtec', '@value': 'Резюмета на доклада по „Хоризонт 2020“'}, 
        {'@language': 'de-t-en-t0-mtec', '@value': 'Zusammenfassungen des H2020-Berichts'}, 
        {'@language': 'pl-t-en-t0-mtec', '@value': 'Podsumowanie sprawozdania w sprawie programu „Horyzont 2020”'}, 
        {'@language': 'en', '@value': 'H2020 Report summaries'}, 
        {'@language': 'hr-t-en-t0-mtec', '@value': 'Sažeci izvješća u okviru Obzora 2020.'}, 
        {'@language': 'sl-t-en-t0-mtec', '@value': 'Povzetki poročila o programu Obzorje 2020'}, 
        {'@language': 'ro-t-en-t0-mtec', '@value': 'Rezumate ale raportului H2020'}, 
        {'@language': 'cs-t-en-t0-mtec', '@value': 'Shrnutí zpráv programu Horizont 2020'}, 
        {'@language': 'el-t-en-t0-mtec', '@value': 'Περιλήψεις της έκθεσης «Ορίζων 2020»'}, 
        {'@language': 'it-t-en-t0-mtec', '@value': 'Sintesi della relazione H2020'}
    ],
    'size': None, 
    'hash': None, 
    'hash_type': None
}
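
One possible way to handle this, sketched under assumptions: reduce the 'name' field to a single string before it is used as a path, preferring the entry whose '@language' is 'en' and falling back to the first available value. The helper name `pick_name` and its `preferred_lang` parameter are hypothetical and not part of datahugger; only the '@language' / '@value' keys come from the file info above.

```python
from typing import Any


def pick_name(name: Any, preferred_lang: str = "en") -> str:
    """Hypothetical helper: reduce a 'name' metadata field to one string.

    Accepts a plain string, a single {'@language', '@value'} dict,
    or a list of such dicts (the shape shown in this issue).
    """
    if isinstance(name, str):
        return name
    if isinstance(name, dict) and "@value" in name:
        return str(name["@value"])
    if isinstance(name, list):
        # Keep only entries that actually carry a value.
        entries = [e for e in name if isinstance(e, dict) and "@value" in e]
        # Prefer the requested language when present.
        for entry in entries:
            if entry.get("@language") == preferred_lang:
                return str(entry["@value"])
        # Otherwise fall back to the first tagged value.
        if entries:
            return str(entries[0]["@value"])
        # Last resort: a list of plain strings.
        for entry in name:
            if isinstance(entry, str):
                return entry
    raise ValueError(f"cannot derive a file name from {name!r}")


# Abbreviated version of the 'name' list from the file info above.
names = [
    {"@language": "fr-t-en-t0-mtec", "@value": "Résumés du rapport H2020"},
    {"@language": "en", "@value": "H2020 Report summaries"},
    {"@language": "it-t-en-t0-mtec", "@value": "Sintesi della relazione H2020"},
]
chosen = pick_name(names)
```

Note that the chosen value is a human-readable title without a file extension, so a real fix might instead derive the file name from the last path segment of 'link'; that choice is left open here.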