Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Date column in mm/dd/yyyy format in salesforce report produces NA in results tibble #93

Closed
lorenze3 opened this issue Nov 2, 2020 · 8 comments
Assignees
Labels
bug Unintended behavior that should be corrected

Comments

@lorenze3
Copy link

lorenze3 commented Nov 2, 2020

With apologies for the no doubt numerous github issue style violations below, the issue I have is that when I pull a report for sales for with a date column, I'm getting a columns of NAs in R.

From salesforce's front end, I see a column Created Date with values in mm/dd/yyyy format. The resulting tibble has 0 non-NA values in the same column.

In a using sf_describe_report(), I can see the columns meta data has it described as a date. Based on cruising the utils-datatypes.R file I suspect that there is an underlying assumption that all date columns will be in yyyy-mm-dd format.

I'd love a pointer to the function(s) that are responsible for selecting data types . . . I thought I could do a quick rewrite of the sf_format_date() function, but either I don't understand the underlying issue or that function isn't called as part of the sf_run_report() chain of events.

Here's my actual R script:

#salesforcer test
library(salesforcer)
library(tidyverse)
sf_auth(security_token = 'super secret')
webformsReport <- "00O2G000006Rrj2UAC" #from URL, like this ...Report/<reportid here>/view?
#try a custom function here
results <- sf_run_report(webformsReport,verbose=T) #dates showing as NA . . .
report_details <- sf_describe_report(webformsReport)

And the relevant sf column metadata on Created Date column from the report_details list is:

> report_details$reportExtendedMetadata$detailColumnInfo$Lead.CreatedDate
$dataType
[1] "date"

$entityColumnName
[1] "Lead.CreatedDate"

$filterValues
list()

$filterable
[1] TRUE

$inactiveFilterValues
list()

$isLookup
[1] FALSE

$label
[1] "Created Date"

$uniqueCountable
[1] TRUE
 

Any advice appreciated!

@lorenze3
Copy link
Author

lorenze3 commented Nov 2, 2020

Not sure how I missed it the first time, but I now see that guess_types is an arg in sf_execute_report() that ultimately controls if readr::type_convert() is applied. For a local copy of sf_run_report() I have added ye olde as an arg and passed it as an arg in the call to sf_execute_report() inside the sf_run_report, which allows me to get useable data that I can convert later.

I think I see a use case for myself of being able to specify the type conversions by column to correct for issues like this. Any thoughts on the best way to add that functionality to the package? If I have time I could give it a spin, get to work for me, and then figure out pull requests to let it be reviewed for inclusion in your package.

@StevenMMortimer
Copy link
Owner

StevenMMortimer commented Nov 5, 2020

@lorenze3 Thanks for flagging. Yes, the interim solution is to use guess_types = FALSE. Reports format values differently than queries and other operations that the package performs, so I'll just have to add an exception to handle the case of date and date/time strings to this function:

salesforcer/R/utils-query.R

Lines 747 to 765 in acffaff

sf_guess_cols <- function(df, guess_types=TRUE, dataType=NULL){
if(guess_types){
if(is.null(dataType) || any(is.na(dataType)) || (length(dataType)== 0)){
df <- df %>%
type_convert(col_types = cols(.default = col_guess()))
} else {
col_spec <- sf_build_cols_spec(dataType)
# if numeric but contains Salesforce "-" then preemptively change to NA
if(grepl('i|n', col_spec)){
numeric_col_idx <- which(strsplit(col_spec, split=character(0))[[1]] %in% c("i", "n"))
df <- df %>%
mutate(across(all_of(numeric_col_idx), ~ifelse(.x == "-", NA_character_, .x)))
}
df <- df %>%
type_convert(col_types = col_spec)
}
}
return(df)
}

In this function, there is already a line of code to handle cases where reports return a dash "-" instead of a null value for integer and numeric columns. I'll do the same to handle any date fields.

@lorenze3
Copy link
Author

lorenze3 commented Nov 5, 2020

@StevenMMortimer Just clarifying (I see I wrote some lousy english) -- I did have to modify the sf_run_report function to be able to pass guess_types through to the sf_execute_report function. In my local copy I did this by adding the ellipses argument to sf_run_report and then passing that through to the call of sf_execute_report.

I might take a swing at modifying utils-query.R to add the date handling you described. IF you are ok with a dependency on the anytime package, I think I can directly copy your logic above use anytime::anydate() in the mutate call (line 758 in the block for replacing '-').

Thanks for the excellent package, btw. It's going to be a huge help at work.

@ckelly17
Copy link

Thanks to you both for this discussion. It helped me with a similar issue. It looks like this fix was implemented for sf_execute_report through guess_names = FALSE but not for sf_run_report. If I'm correct, sf_run_report is the function that allows for dynamic filtering. I'm trying to get around the API limit of 2000 records per request and am stuck between using sf_execute_report to get around the mm/dd/yyyy issue and sf_run_report to allow for filtering. Is there a way to have guess_names be an argument for sf_run_report as well?

@lorenze3
Copy link
Author

lorenze3 commented Jan 11, 2021

@ckelly17 --
<This was edited repeatedly until I got the codeblock to render reasonably well, sorry for the repeated attempts>

My workaround is a custom version of sf_run_report which adds the ... argument and passes it through to sf_get_report_instance_results. Full function pasted in below, but again my edits are only the ... two times: in the argument definition and the call to sf_get_report_instance_results. I guess that requires some commas to be added too.

sf_run_report2 <- function(report_id,
                          report_filters = NULL,
                          report_boolean_logic = NULL,
                          sort_by = character(0),
                          decreasing = FALSE,
                          top_n = NULL,
                          async = TRUE,
                          interval_seconds = 3,
                          max_attempts = 200,
                          wait_for_results = TRUE,
                          verbose = FALSE,
                          ...){
  
  # build out the body of the request based on the inputted arguments by starting 
  # with a simplified version and then adding to it based on the user inputted arguments
  request_body <- simplify_report_metadata(report_id, verbose = verbose)
  
  if(!is.null(report_filters)){
    stopifnot(is.list(report_filters))
    for(i in 1:length(report_filters)){
      report_filters[[i]] <- metadata_type_validator(obj_type = "ReportFilterItem", 
                                                     obj_data = report_filters[[i]])[[1]]
    }
    if(is.null(report_boolean_logic)){
      if(length(report_filters) > 1){
        report_boolean_logic <- paste((1:length(report_filters)), collapse=" AND ")
        message(sprintf(paste0("The argument `report_boolean_logic` was left NULL. ", 
                               "Assuming the report filters should be combined using 'AND' ", 
                               "like so: %s"), report_boolean_logic))
      } else {
        report_boolean_logic <- NA
      }
    } else {
      stopifnot(is.character(report_boolean_logic))
    }
  } else {
    # value must be null when filter logic is not specified
    report_boolean_logic <- NA
  }
  
  request_body$reportMetadata$reportBooleanFilter <- report_boolean_logic
  request_body$reportMetadata$reportFilters <- report_filters
  
  if(length(sort_by) > 0 & !(all(is.na(sort_by)))){
    if(length(sort_by) > 1){
      stop(paste0("Currently, Salesforce will only allow a report to be sorted ", 
                  "by, at most, one column."), call. = FALSE)
    } else {
      if(length(decreasing) < length(sort_by)){
        decreasing <- rep_len(decreasing, length.out = length(sort_by))  
      }
      sort_list_spec <- list()
      for(i in 1:length(sort_by)){
        sort_list_spec[[i]] <- list(sortColumn = sort_by[i], 
                                    sortOrder = if(decreasing[i]) "Desc" else "Asc")
      }
      request_body$reportMetadata$sortBy <- sort_list_spec
    }
  } else {
    # if there is no sortBy, then set it to NA, if there is, then leave it alone 
    # beause it is required when Top N is specified and the user might just want 
    # to use the existing sort order in the report
    if(is.null(request_body$reportMetadata$sortBy) || 
       is.na(request_body$reportMetadata$sortBy) || 
       length(request_body$reportMetadata$sortBy) == 0){
      request_body$reportMetadata$sortBy <- NA
    }
  }
  if(!is.null(top_n)){
    if(is.na(request_body$reportMetadata$sortBy)){
      stop(paste0("A report must be sorted by one column when requesting a ", 
                  "Top N number of rows."), call. = FALSE)
    } else if(length(request_body$reportMetadata$sortBy) > 1){
      stop(paste0("A report can only be sorted by one column when requesting a ", 
                  "Top N number of rows."), call. = FALSE)
    } else{
      # the direction is always 'Asc' because Salesforce doesn't accept 'Desc'. It 
      # relies on the 'sortOrder' element within the 'sortBy' element
      request_body$reportMetadata$topRows <- list(rowLimit = top_n, direction = "Asc")
    }
  }
  
  results <- sf_execute_report(report_id, 
                               async = async, 
                               report_metadata = request_body, 
                               verbose = verbose,...)
  
  # request the report results (still wait if async is specified)
  if(async){
    if(wait_for_results){
      status_complete <- FALSE
      z <- 1
      Sys.sleep(interval_seconds)
      while (z < max_attempts & !status_complete){
        if (verbose){
          if(z %% 5 == 0){
            message(paste0("Attempt to retrieve records #", z))
          }
        }
        Sys.sleep(interval_seconds)
        instances_list <- sf_list_report_instances(report_id, verbose = verbose)
        instance_status <- instances_list[[which(instances_list$id == results$id), "status"]]
        if(instance_status == "Error"){
          stop(sprintf("Report run failed (Report Id: %s; Instance Id: %s).", 
                       report_id, results$id), 
               call.=FALSE)
        } else {
          if(instance_status == "Success"){
            status_complete <- TRUE
          } else {
            # continue checking the status until success or max attempts
            z <- z + 1
          }
        }
      }
      results <- sf_get_report_instance_results(report_id, 
                                                results$id, 
                                                verbose = verbose,...)
    }
  }
  # if not aysnc and waiting for results, then sf_execute_report() will return 
  # the parsed dataset (if sync) or request details if async to check on the results 
  # without having the wrapper executing the wait. This is so users can leverage 
  # the simpler interface (i.e. providing function arguments) instead of researching 
  # the Salesforce documentation and building the reportMetadata property from scratch
  return(results)
}

--

@ckelly17
Copy link

Thanks! Will look into replicating.

@lorenze3
Copy link
Author

Crap -- just in case you aren't straight up copy and pasting that, I just saw that I had to ,... in a third place: the call to sf_execute_report that's used in the synchronous branch of the conditional. Which is actually the branch I use.

sorry for any confusion, and hope it works well for you.

@ckelly17
Copy link

Thanks! Copy/paste worked just fine. Really appreciate it!

@StevenMMortimer StevenMMortimer added the bug Unintended behavior that should be corrected label May 5, 2021
@StevenMMortimer StevenMMortimer self-assigned this Jun 3, 2021
@StevenMMortimer StevenMMortimer added this to the Release 0.2.3 milestone Jul 4, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Unintended behavior that should be corrected
Projects
None yet
Development

No branches or pull requests

3 participants