Big issues with Poland #64

Open
eguidotti opened this issue Oct 11, 2021 · 6 comments
Labels
bug Something isn't working

Comments

@eguidotti

Hi there,
The data for Poland (cases, deaths, tests) seem to be cumulative counts before 2020-11-23 and daily counts afterwards.
The nuts_2 column is also inconsistent: the same regions are referenced with different strings due to encoding issues.
It would be great to fix that by standardizing this file.
That would be really helpful as this is the only source for Poland at regional level I could find!
Many thanks,
Emanuele
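
For illustration, the cumulative/daily mix described above could be harmonised along these lines. This is only a minimal sketch under assumptions: the row layout `(region, day, value)` and the function name `to_cumulative` are hypothetical, and treating 2020-11-23 itself as the first daily row is a guess based on the description, not something confirmed by the dataset.

```python
from datetime import date

# Assumption taken from the report above: rows strictly before this date hold
# cumulative counts, rows on or after it hold daily counts.
SWITCH_DATE = date(2020, 11, 23)


def to_cumulative(rows, switch_date=SWITCH_DATE):
    """Return {(region, day): cumulative_value} for mixed cumulative/daily rows.

    `rows` is an iterable of (region, day, value) tuples (hypothetical layout).
    """
    running = {}  # latest cumulative value seen per region
    result = {}
    for region, day, value in sorted(rows, key=lambda r: (r[0], r[1])):
        if day < switch_date:
            # Already cumulative: take the value as reported.
            running[region] = value
        else:
            # Daily count: add it to the last known cumulative value.
            running[region] = running.get(region, 0) + value
        result[(region, day)] = running[region]
    return result


rows = [
    ("A", date(2020, 11, 21), 100),  # cumulative
    ("A", date(2020, 11, 22), 110),  # cumulative
    ("A", date(2020, 11, 23), 5),    # daily
    ("A", date(2020, 11, 24), 7),    # daily
]
cumulative = to_cumulative(rows)
```

Note that this only works if the same region is always referenced by the same string, which is exactly what the encoding issue breaks.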

@emptymalei emptymalei added the bug Something isn't working label Oct 18, 2021
@emptymalei
Contributor

Indeed. This is confirmed. We should have the cumulative values.

We will need to fix all files in this folder:
https://github.com/covid19-eu-zh/covid19-eu-data/tree/master/dataset/daily/pl

I believe we will need to fix the download and aggregation script too.

Added to my todo list.

@emptymalei
Contributor

This should be fixed now.

@eguidotti
Author

Mmm... it seems that tests are not cumulative and the issue with encodings is still there.
I would like to ask:

  • are tests daily tests or should they be cumulative numbers as well?
  • the column nuts_2 does not seem to correspond to the NUTS codes. Indeed, for Poland we have 17 NUTS2 codes but it seems to me that "Warszawski stołeczny" and "Mazowiecki regionalny" are provided together as the NUTS 1 "MAZOWIECKIE".
  • it is difficult to track the same region in time because it is referenced with different names due to encoding issues. Would it be possible to fix that?
  • in the nuts_2 column there are wrong entries such as (a) website addresses and (b) some names that do not correspond to any Polish region
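
The last two points could be approached roughly as follows. This is a sketch under assumptions, not the repository's code: it maps each raw nuts_2 string to one of the 16 voivodeship names (so "mazowieckie" stays a single entry rather than two NUTS 2 regions), guesses that the encoding damage is UTF-8 text misread as cp1252, and returns `None` for anything it cannot match, such as website addresses. The helper names are hypothetical.

```python
import unicodedata

# The 16 Polish voivodeships, used here as the canonical spellings.
VOIVODESHIPS = [
    "dolnośląskie", "kujawsko-pomorskie", "lubelskie", "lubuskie",
    "łódzkie", "małopolskie", "mazowieckie", "opolskie",
    "podkarpackie", "podlaskie", "pomorskie", "śląskie",
    "świętokrzyskie", "warmińsko-mazurskie", "wielkopolskie",
    "zachodniopomorskie",
]


def _key(name):
    """Build an accent-free, case-insensitive lookup key."""
    # "ł" has no Unicode decomposition, so it is replaced by hand.
    name = name.replace("ł", "l").replace("Ł", "L")
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold().strip()


CANONICAL = {_key(v): v for v in VOIVODESHIPS}


def _repair_mojibake(name):
    """Undo UTF-8 read as cp1252 (an assumed failure mode), if possible."""
    try:
        return name.encode("cp1252").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return name


def canonical_region(raw):
    """Return the canonical voivodeship name, or None for unknown entries."""
    for candidate in (raw, _repair_mojibake(raw)):
        match = CANONICAL.get(_key(candidate))
        if match is not None:
            return match
    return None
```

Rows for which `canonical_region` returns `None` (URLs, names that match no region) would then be dropped or inspected by hand rather than silently kept.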

Many thanks for your efforts!

@emptymalei
Contributor

emptymalei commented Oct 31, 2021

@eguidotti This is very hard to fix and manage. I decided to stop collecting data for Poland.
9f1c347

@emptymalei emptymalei reopened this Oct 31, 2021
@eguidotti
Author

Completely understandable. Thanks anyway.

@emptymalei
Contributor

emptymalei commented Nov 1, 2021

@eguidotti ❤️

If you need the original data, you can download all of it here:

https://arcgis.com/sharing/rest/content/items/a8c562ead9c54e13a135b02e0d875ffb/data

These are the daily data files.

There are some data before this, but as you have already spotted, there are many inconsistencies between the datasets.
