with_tz to support dataframes? #344

Closed
ijlyttle opened this issue Sep 8, 2015 · 7 comments
Contributor

ijlyttle commented Sep 8, 2015

Could we consider adapting with_tz() (and possibly force_tz()) to accept dataframes as well as vectors?

This is a follow-on to tidyverse/readr#230.

The idea would be that if the first arg is a vector, with_tz() would behave as it does currently.

If the first arg is a dataframe, with_tz() would loop over all the time-columns of the dataframe, and call with_tz() on them.
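A minimal sketch of the behaviour proposed here, assuming a wrapper rather than a change to lubridate itself. The name `with_tz_any` and the example data are hypothetical; only `with_tz()` and `is.POSIXt()` come from lubridate.

```r
library(lubridate)

# Hypothetical wrapper (not a lubridate function): vectors are passed straight
# to with_tz(); data frames have each date-time column converted in turn.
with_tz_any <- function(x, tzone = "UTC") {
  if (!is.data.frame(x)) {
    return(with_tz(x, tzone = tzone))
  }
  for (nm in names(x)) {
    if (is.POSIXt(x[[nm]])) {
      x[[nm]] <- with_tz(x[[nm]], tzone = tzone)
    }
  }
  x
}

# Made-up example data: one non-time column, one date-time column
df <- data.frame(
  id = 1:2,
  when = as.POSIXct(c("2015-09-08 12:00:00", "2015-09-08 18:30:00"), tz = "UTC")
)
out <- with_tz_any(df, tzone = "America/Chicago")
attr(out$when, "tzone")  # time zone now attached to the column
```

Non-time columns such as `id` pass through unchanged, and the underlying instants in `when` stay the same; only the display time zone differs.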

If you like, I could work up a PR.

Thanks,

Ian

Member

vspinu commented Sep 9, 2015

I think this sort of thing should be part of generic data manipulation packages.

What is the most concise pattern to do it right now?

for(nm in names(df))
    if(is.POSIXt(df[[nm]]))
       df[[nm]] <- with_tz(df[[nm]])

## or 

df <- do.call(cbind, lapply(df, function(x) if(is.POSIXt(x)) with_tz(x) else x))

@hadley any tricks with plyr/dplyr?

This is a job for an idempotent conditional map like:

df <- condmap(df, is.POSIXt, force_tz)
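For illustration, one way such a conditional map might look in base R. This is only a sketch: `condmap` is the hypothetical name used in this comment, not an existing function, and `is_time` and `set_tz_attr` are base R stand-ins (valid for POSIXct columns) for lubridate's `is.POSIXt()` and `with_tz()`.

```r
# Hypothetical helper; only the name `condmap` comes from the thread.
# Applies f to the elements of x for which predicate() is TRUE and
# leaves every other element untouched.
condmap <- function(x, predicate, f, ...) {
  for (i in seq_along(x)) {
    if (isTRUE(predicate(x[[i]]))) {
      x[[i]] <- f(x[[i]], ...)
    }
  }
  x
}

# Base R stand-ins (assumptions) for is.POSIXt() and with_tz() on POSIXct
is_time <- function(x) inherits(x, "POSIXt")
set_tz_attr <- function(x, tz) {
  attr(x, "tzone") <- tz
  x
}

# Made-up example data
df <- data.frame(
  label = c("a", "b"),
  when = as.POSIXct(c("2015-09-09 10:00:00", "2015-09-09 11:00:00"), tz = "UTC")
)
once <- condmap(df, is_time, set_tz_attr, tz = "Europe/Amsterdam")
twice <- condmap(once, is_time, set_tz_attr, tz = "Europe/Amsterdam")
identical(once, twice)  # applying the map a second time changes nothing
```

Because the assignment goes through `[[<-`, a data frame stays a data frame, and the map is idempotent whenever `f` is.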

In any case, if we are taking this path we probably should enhance other functions as well. with_tz is essentially a coercion function so it would make sense to have other coercion functions, and maybe parsers, to operate on a data frame.

Could you please go through the list of functions in lubridate and iden

If we are going on this path then we

Contributor Author

ijlyttle commented Sep 9, 2015

To answer a question that @hadley posed in another issue, maybe I can try to explain the problem I'm trying to solve.

When I know the columns of a dataframe, with_tz does a great job as-is. My particular problem is developing a shiny app with a csv parser (using readr) - where the app does not know the columns it is getting.

I know this is in a grey area between dplyr and lubridate, so I flipped a coin :)

My workaround is a utility function that borrows from dplyr and lubridate:

df_set_tz <- function(df, tz = "UTC"){

  fn_set_tz <- function(x){

    if (!identical(dplyr::type_sum(x), "time"))
      return(x) # do nothing

    lubridate::with_tz(x, tz)
  }

  dplyr::as_data_frame(lapply(df, fn_set_tz))
}

Member

lionel- commented Sep 10, 2015

Hi Vitalie,

> This is a job for an idempotent conditional map like:

This is exactly what purrr::map_if() is for ;)

> df <- do.call(cbind, lapply(df, function(x) if(is.POSIXt(x)) with_tz(x) else x))

I think you can replace the do.call() with df[] <- lapply(df, fun). The [<- method for data frames makes sure the object stays a data frame.
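Both suggestions can be sketched side by side. The example data and the target zone are made up for illustration; `map_if()` returns a list, so this sketch assigns its result into `df[]` in the same way as the `lapply()` version.

```r
library(lubridate)
library(purrr)

# Made-up example data
df <- data.frame(
  id = 1:2,
  when = as.POSIXct(c("2015-09-10 08:00:00", "2015-09-10 09:00:00"), tz = "UTC")
)

# Base R: assigning into df[] keeps the object a data frame
df_base <- df
df_base[] <- lapply(df_base, function(x) {
  if (is.POSIXt(x)) with_tz(x, tzone = "Asia/Tokyo") else x
})

# purrr: map_if() applies the function only where the predicate holds
df_purrr <- df
df_purrr[] <- map_if(df_purrr, is.POSIXt, function(x) with_tz(x, tzone = "Asia/Tokyo"))

class(df_base)
attr(df_purrr$when, "tzone")
```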

Member

vspinu commented Sep 10, 2015

Thanks @lionel- . Didn't know about purrr. I was almost there to start writing a functional vector manipulation library myself.

@josiekre

I think this is related, so I will add to the discussion here. I have a dataframe with a column of timestamps in UTC and another column with the local timezone. I need to create a new column with the timestamp converted to the local time.

library(dplyr)
library(lubridate)  # provides with_tz()
data <- data_frame(
  timestamp_utc = as.POSIXct(c('2015-11-18 03:55:04', '2015-11-18 03:55:08', 
                    '2015-11-18 03:55:10'), tz = "UTC"),
  local_tz = c('America/New_York', 'America/Los_Angeles', 
               'America/Indiana/Indianapolis')
  )

This doesn't work in a chain of other dplyr statements.

data %>% 
  mutate(timestamp_local = with_tz(timestamp_utc, tzone = local_tz))

This does but obviously is slow. (Answer from StackOverflow.)

data %>% 
  rowwise() %>%
  mutate(timestamp_local = with_tz(timestamp_utc, tzone = local_tz))
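Part of the difficulty is that a POSIXct vector carries a single `tzone` attribute, so one column cannot hold a different zone per element. A possible workaround, sketched here in base R under the assumption that a character column is acceptable, is to format each timestamp in its own zone. The variable names mirror the example above; `local_chr` is a name introduced for this sketch.

```r
timestamp_utc <- as.POSIXct(
  c("2015-11-18 03:55:04", "2015-11-18 03:55:08", "2015-11-18 03:55:10"),
  tz = "UTC"
)
local_tz <- c("America/New_York", "America/Los_Angeles",
              "America/Indiana/Indianapolis")

# Format every timestamp in the zone given on the same row.
# The result is a character vector, not a date-time vector.
local_chr <- vapply(
  seq_along(timestamp_utc),
  function(i) format(timestamp_utc[i], tz = local_tz[i], usetz = TRUE),
  character(1)
)
local_chr
```

The strings are suitable for display; any further date arithmetic would still need to go through the UTC column.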

I tend to use dplyr with readr, stringr, and lubridate. The fact that lubridate isn't always able to work with dataframes is tricky. The bummer is that I don't know enough about how dplyr is coded or how lubridate is to even guess how much work this would be.

@josiekre

I've moved this over to a different issue, #359. The more I read, the less it seems related to this one.

@vspinu vspinu closed this as completed in 857c07c Mar 6, 2016
Contributor Author

ijlyttle commented Mar 7, 2016

Cool! Thanks!
