Parsing Failures #12

Closed
r-lister opened this issue Dec 11, 2018 · 10 comments
Labels
bug Unintended behavior that should be corrected

Comments

@r-lister

r-lister commented Dec 11, 2018

Hi Steven,

I'm having an issue loading an object from Salesforce. The problem is that one of the columns is parsed as a double and I need it to be a string. Here is my code:

salesforce_data <- sf_query_bulk(paste0("SELECT ",field_list," FROM my_object"), object_name = "my_object")

The object has about 130k rows, and I get a parsing failure on 254 of them where a double was expected:
Warning: 254 parsing failures.
  row      col expected               actual         file
19845 my_field a double ag:23842931048080057 literal data
19846 my_field a double ag:23842931012580057 literal data
19847 my_field a double ag:23842931069470057 literal data
19848 my_field a double ag:23842931028810057 literal data
19849 my_field a double ag:23842931028810057 literal data
..... ................. ........ .................... ............
See problems(...) for more details.

(254 more of the same)

All of the failures are in the same field. I would like the whole column to be brought through as a string so I don't lose the values in these rows, as they currently come through as null after the parsing failure.
Is there any way to specify what type to bring the columns through as? Or to bring them all through as strings?

Thanks,
Ryan
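
For anyone reading later, the warning above points to readr's problems() function. Here is a minimal sketch of how the failures could be inspected, assuming readr 2.0 or newer (for the I() literal-data syntax). The CSV text and the forced col_double() column are made up for illustration and are not taken from the package's code:

```r
library(readr)

# Hypothetical CSV text; the third value cannot be parsed as a double
csv_text <- "my_field\n1\n2\nag:23842931048080057\n"

# Force the column to double so the bad value triggers a parsing failure
res <- suppressWarnings(
  read_csv(I(csv_text), col_types = cols(my_field = col_double()))
)

# One row per failure, with columns such as row, col, expected and actual
probs <- problems(res)
print(probs)
```

The value that failed to parse comes back as NA in res, which matches the nulls described above.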

@StevenMMortimer
Owner

StevenMMortimer commented Dec 12, 2018

@r-lister I think we just need to give you the option to override or specify the col_types argument at this part in the code:

res <- read_csv(response_text)

Salesforce is a little goofy in that the columns don't always come back in the order that you specify them in the query, so your suggestion of having a TRUE/FALSE argument to control returning everything as a string might be the best option. I'll look into making that change this weekend.
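
As a rough sketch of what "return everything as a string" means at the readr level, the call below reads every column as character by setting a default column type. It assumes readr 2.0 or newer for I(), and the response_text value is a made-up stand-in for a Bulk API response; it is not necessarily how the package implements the option:

```r
library(readr)

# Hypothetical CSV text standing in for a Bulk API response
response_text <- "Id,my_field\n001,23842931048080057\n002,ag:23842931048080057\n"

# Read every column as character instead of letting readr guess the types
res <- read_csv(
  I(response_text),
  col_types = cols(.default = col_character())
)
print(res)
```

Because the default applies to all columns, this does not depend on the order in which Salesforce returns them.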

@r-lister
Author

@StevenMMortimer Thanks for looking into it so quickly!
Let me know how you get on with the change,

Thanks again,
Ryan

@r-lister
Author

r-lister commented Dec 12, 2018

Just to add to this while I have you: I often get a timeout error when running the sf_query_bulk() function.
Function's Time Limit Exceeded. Aborting Job Now
Error in match.arg(api_type) : 'arg' should be one of “Bulk 2.0”

It tells me to use the api_type of "Bulk 2.0", but if I try to run
sf_query( ..., api_type = "Bulk 2.0")
I get the error
Error in match.arg(api_type) : 'arg' should be one of “REST”, “SOAP”, “Bulk 1.0”

Is there any way to use Bulk 2.0?

@StevenMMortimer
Owner

@r-lister I'm going to move this into a separate issue and reply there.

@StevenMMortimer
Owner

@r-lister This has been fixed. If you want to force everything to be character columns, then just turn off the type guessing with the new argument guess_types. Here is an example that shows how you can turn off the guessing. The default is TRUE.

salesforce_data <- sf_query_bulk(paste0("SELECT ",field_list," FROM my_object"), 
                                 object_name = "my_object", 
                                 guess_types = FALSE)

To get this new feature you will have to re-install from GitHub using devtools::install_github('StevenMMortimer/salesforcer'), or you can wait until the next release on CRAN and install from there using install.packages('salesforcer').
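
If only one column needs to stay as text, one possible follow-up is to fetch everything as character with guess_types = FALSE and then re-guess the remaining columns with readr::type_convert(). This is a sketch under assumptions: the data frame below is hand-built to imitate an all-character result, and the Amount column is hypothetical:

```r
library(readr)

# Hypothetical all-character result, imitating guess_types = FALSE output
salesforce_data <- data.frame(
  my_field = c("23842931048080057", "ag:23842931048080057"),
  Amount   = c("10.5", "20"),
  stringsAsFactors = FALSE
)

# Keep my_field as character and let readr guess the other columns
converted <- type_convert(
  salesforce_data,
  col_types = cols(my_field = col_character())
)
str(converted)
```

Naming the column in cols() avoids relying on column order, which may differ from the order in the query.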

@r-lister
Author

Thanks @StevenMMortimer !
I installed the new version using devtools. The new feature lets me cut out loads of code I previously used to clean the csv, and it works like a charm now.

@Lulliter

> @r-lister This has been fixed. If you want to force everything to be character columns, then just turn off the type guessing with the new argument guess_types. Here is an example that shows how you can turn off the guessing. The default is TRUE.

Hi Steven, this might be related. I have an issue with sf_query_bulk returning incorrect values for some picklist and checkbox fields, which appears if I use guess_types = FALSE (which, by the way, works in R 3.5 but not in R 3.3).
Any idea? Do you think their being Contact custom fields has an impact?
Thanks!

Below is an example: ~VolunteerStatus, ~Speaker__c, and ~VIP__c all come back as FALSE, when I actually have other values in Salesforce.

1) USING guess_types = FALSE

Cont_ALL <- sf_query_bulk("SELECT OwnerId, AccountId, FirstName, LastName, ConnectionLevel__c, VIP__c, Speaker__c, GW_Volunteers__Volunteer_Status__c, npe01__Preferred_Email__c, MailingCity FROM Contact",
                          object_name, api_type = "Bulk 1.0", interval_seconds = 5,
                          max_attempts = 500, verbose = FALSE, guess_types = FALSE)

These are the wrong values

               ~FirstName, ~VolunteerStatus, ~Speaker__c, ~VIP__c, ~MailingCity,
                   "Adria",            FALSE,       FALSE,   FALSE,           NA,
              "Peter John",            FALSE,       FALSE,   FALSE,    "Yonkers",
                   "Renzo",            FALSE,       FALSE,   FALSE,    "Madison",
                  "Israel",            FALSE,       FALSE,   FALSE,   "Lawrence",
                 "Abdul R",            FALSE,       FALSE,   FALSE,    "HOUSTON"

==========================

2) USING guess_types = TRUE (default)

Cont_ALL_guess <- sf_query_bulk("SELECT OwnerId, AccountId, FirstName, LastName, ConnectionLevel__c, VIP__c, Speaker__c, GW_Volunteers__Volunteer_Status__c, npe01__Preferred_Email__c, MailingCity FROM Contact",
                                object_name, api_type = "Bulk 1.0", interval_seconds = 5,
                                max_attempts = 500, verbose = FALSE)

These are the correct values

               ~FirstName, ~VolunteerStatus, ~Speaker__c, ~VIP__c, ~MailingCity,
                   "Adria",         "Active",     "false", "false",           NA,
              "Peter John",               NA,      "true",  "true",    "Yonkers",
                   "Renzo",       "Inactive",     "false",  "true",    "Madison",
                  "Israel",               NA,      "true",  "true",   "Lawrence",
                 "Abdul R",       "Inactive",     "false", "false",    "HOUSTON" 

@StevenMMortimer
Owner

StevenMMortimer commented Dec 21, 2018

@Lulliter Thanks for making a note. I'll have to look into it some more, but first, can you uninstall and re-install the salesforcer and readr packages? Every minor version of R 3.x requires you to re-install your packages, and the read_csv() function was only recently updated to guess "true"/"false" as TRUE/FALSE. I believe the version of the readr package you have with R 3.3 is probably older and doesn't have the new functionality.

UPDATE: I've also just tested with a custom checkbox and a custom picklist on the Contact object and didn't see any issues.
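
A quick way to check this locally might look like the sketch below. It prints the installed readr version and asks readr which parser it would pick for lowercase "true"/"false" strings. The result depends on the installed version, so no expected output is shown:

```r
library(readr)

# Which readr version is installed? Older versions may guess differently.
readr_version <- packageVersion("readr")
print(readr_version)

# Ask readr which parser it would pick for lowercase true/false strings
guessed <- guess_parser(c("true", "false"))
print(guessed)
```

If the two R installations report different readr versions or different guesses, that would be consistent with the explanation above.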

@Lulliter

Thank you very much, I did and it is now working for me too - on R 3.5 which is where I was having the issue.

@danwwilson

I know this is closed, but just wanted to add this in the thread for documentation purposes. If you're using sf_query instead of sf_query_bulk you can still use the guess_types = FALSE parameter.
E.g.

queried_records <- sf_query(my_soql, 
    object_name = "Contact", 
    api_type = "Bulk 1.0", 
    guess_types = FALSE)

@StevenMMortimer StevenMMortimer added the bug Unintended behavior that should be corrected label Jun 26, 2019
@StevenMMortimer StevenMMortimer self-assigned this Jun 26, 2019