Why does readr::read_csv need timezone data and Olson database? #952

Closed
ttimbers opened this issue Jan 3, 2019 · 6 comments

@ttimbers

ttimbers commented Jan 3, 2019

This issue is closed, but I see no resolution: #320

I just ran into a similar situation today when teaching on a server. We were trying to use readr::read_csv to load this file: https://raw.githubusercontent.com/UBC-DSCI/dsci-100/master/materials/worksheet_01/marathon_small.csv

On one server, which has tzdata installed, it works fine. On another, which does not have tzdata installed, the following error message is given:

library(tidyverse)
marathon_small <- read_csv("marathon_small.csv")
Warning message in OlsonNames():
“no Olson database found”
Error: Unknown TZ UTC
Traceback:

1. read_csv("marathon_small.csv")
2. read_delimited(file, tokenizer, col_names = col_names, col_types = col_types, 
 .     locale = locale, skip = skip, comment = comment, n_max = n_max, 
 .     guess_max = guess_max, progress = progress)
3. col_spec_standardise(data, skip = skip, comment = comment, guess_max = guess_max, 
 .     col_names = col_names, col_types = col_types, tokenizer = tokenizer, 
 .     locale = locale)
4. guess_header(ds_header, tokenizer, locale)
5. guess_header_(datasource, tokenizer, locale)
6. default_locale()
7. locale()
8. check_tz(tz)
9. stop("Unknown TZ ", x, call. = FALSE)

This csv file is very simple (see snippet below). Why does readr::read_csv need timezone data and the Olson database? This GitHub issue appears to track the problem down to readr::read_csv depending on tzdata, but why? Naively, it doesn't seem necessary, and read.csv works fine on either server.

.csv snippet:

age,bmi,km5_time_seconds,km10_time_seconds,sex
25.0,21.6221160888672,NA,2798,female
41.0,23.905969619751,1210.0,NA,male
25.0,21.6407279968262,994.0,NA,male
35.0,23.5923233032227,1075.0,2135,male
34.0,22.7064037322998,1186.0,NA,male
45.0,42.0875434875488,3240.0,NA,female
33.0,22.5182952880859,1292.0,NA,male
58.0,25.2340793609619,NA,3420,male

The smallest reproducible example I can get down to is this:

library(tidyverse)
default_locale()

which on the problematic server gives:

Warning message in OlsonNames():
“no Olson database found”
Error: Unknown TZ UTC
Traceback:

1. default_locale()
2. locale()
3. check_tz(tz)
4. stop("Unknown TZ ", x, call. = FALSE)

So the problem likely stems from the read_* functions depending on default_locale() for the locale argument by default. Is this really necessary? Naively, I think not, but I could be very wrong. Is there a way to say locale = None/NA/etc?

This is not just a problem for me, as I have come across others experiencing the same problem...

@jimhester
Collaborator

Because readr reads date times using it.

Why doesn't your system have an Olson database?

@ttimbers
Author

ttimbers commented Jan 3, 2019

In our case, I think it is because tzdata wasn't always part of the docker-stacks for Jupyter. But that raises the question: is every machine readr is used on expected to have it?

We can fix it on our machines, but I opened this issue because I don't seem to be the only one who has come across this problem. I wonder whether read_* could skip the time zone check when the data being loaded contains no date-times, or whether read_* could be told to parse columns only as numbers/characters rather than as date-times.

@jimhester
Collaborator

jimhester commented Jan 3, 2019

Yes, every machine readr is used on is expected to have an Olson database. The docker containers you are using are too stripped down.

@ttimbers ttimbers closed this as completed Jan 3, 2019
@ttimbers
Author

ttimbers commented Jan 3, 2019

Thanks for the prompt response and explanation @jimhester!

@wangshun1121

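In a Docker container, installing tzdata solves this:

# Ubuntu (refresh the package index first; -y skips the confirmation prompt)
apt-get update && apt-get install -y tzdata
# CentOS
yum install -y tzdata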
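
Before or after installing, it can help to confirm whether the container actually has a zoneinfo (Olson) database. Below is a minimal sketch, not anything readr itself runs: the helper name has_zoneinfo is hypothetical, and /usr/share/zoneinfo is assumed as the conventional Linux location, which may differ on other systems.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a zoneinfo (Olson) database looks present.
# Assumption: the database lives in /usr/share/zoneinfo unless a directory
# is passed as the first argument.
has_zoneinfo() {
    dir="${1:-/usr/share/zoneinfo}"
    # A usable database should contain at least a UTC entry.
    [ -e "$dir/UTC" ]
}

if has_zoneinfo; then
    echo "zoneinfo database found"
else
    echo "zoneinfo database missing: install the tzdata package"
fi
```

On Debian/Ubuntu images the tzdata install may prompt for a region during a build; setting DEBIAN_FRONTEND=noninteractive for the apt-get command is the usual way to avoid that.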

@lock

lock bot commented Nov 5, 2019

This old issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with reprex) and link to this issue. https://reprex.tidyverse.org/

@lock lock bot locked and limited conversation to collaborators Nov 5, 2019