MSSQL Incorrect Translation for !is.na() #1239

Closed
kmishra9 opened this issue Apr 7, 2023 · 4 comments · Fixed by #1271

@kmishra9

kmishra9 commented Apr 7, 2023

When I use !is.na() in any context in queries submitted to my MSSQL Server, the ODBC driver reports a syntax error. The ~ in the generated SQL appears to be the cause; the most appropriate translation for !is.na(x) is probably x IS NOT NULL.

suppressPackageStartupMessages(library(dplyr))
suppressPackageStartupMessages(library(dbplyr))

lf <- lazy_frame(tibble(a = 1:10, b = 2), con = simulate_mssql())

# Fails
lf %>% mutate(c = !is.na(a))
#> <SQL>
#> SELECT *, CAST(IIF(~(`a` IS NULL), 1, 0) AS BIT) AS `c`
#> FROM `df`

# Succeeds
lf %>% mutate(c = is.na(a))
#> <SQL>
#> SELECT *, CAST(IIF((`a` IS NULL), 1, 0) AS BIT) AS `c`
#> FROM `df`

# Succeeds
lf %>% mutate(c = sql('CAST(IIF((`a` IS NOT NULL), 1, 0) AS BIT)'))
#> <SQL>
#> SELECT *, CAST(IIF((`a` IS NOT NULL), 1, 0) AS BIT) AS `c`
#> FROM `df`

Created on 2023-04-07 with reprex v2.0.2

Specifically, the error is:

Error in `collect()`:
! Failed to collect lazy table.
Caused by error:
! nanodbc/nanodbc.cpp:1752: 00000: [FreeTDS][SQL Server]Statement(s) could not be prepared.  [FreeTDS][SQL Server]Incorrect syntax near 'q01'.  [FreeTDS][SQL Server]Incorrect syntax near the keyword 'IS'. 
<SQL> 'SELECT TOP 11 *, CAST(IIF(~("a" IS NULL), 1, 0) AS BIT) AS "c"
FROM (
  SELECT *
  FROM "#dbplyr_005"
) "q01"'

but when substituting with raw SQL like above, things work as expected.

I'm not sure whether this indicates a broader issue with how the MSSQL implementation applies the "not" operator to the bit values that stand in for booleans (MSSQL Server has no boolean type, which is a facepalm), but I was definitely having trouble inverting a TRUE or 1 value in any way using ! via dbplyr.
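
For dbplyr versions that still produce the failing translation, one possible workaround that avoids hand-written SQL is to spell out the 0/1 result with if_else() instead of negating is.na(). This is only a sketch: returning an integer flag rather than a BIT/logical column is an assumption about what is acceptable downstream, and the rendered SQL should be inspected before relying on it.

```r
suppressPackageStartupMessages(library(dplyr))
suppressPackageStartupMessages(library(dbplyr))

lf <- lazy_frame(a = 1:10, b = 2, con = simulate_mssql())

# Avoid `!` entirely: return 0 when `a` is NULL and 1 otherwise.
# Note: the result is an integer flag, not a BIT/logical column.
workaround <- lf %>% mutate(c = if_else(is.na(a), 0L, 1L))

# Keep the rendered SQL as a string so it can be inspected.
workaround_sql <- as.character(sql_render(workaround))
cat(workaround_sql, "\n")
```

Because the negation is expressed by swapping the two branches, no ~ operator should appear in the generated query.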

@carlganz

carlganz commented Apr 7, 2023

FWIW any(is.na(X)) is also translated incorrectly for MSSQL, to MAX(CAST(IIF(("X" IS NULL), 1, 0) AS BIT)), which generates the error: Operand data type bit is invalid for max operator.
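
A possible way to sidestep the MAX-over-BIT error, offered as a sketch rather than a fix to the translation, is to aggregate an integer flag instead of a logical. The column name any_na is hypothetical, and the result comes back as 1/0 rather than TRUE/FALSE.

```r
suppressPackageStartupMessages(library(dplyr))
suppressPackageStartupMessages(library(dbplyr))

lf <- lazy_frame(a = 1:10, b = 2, con = simulate_mssql())

# Build an integer 1/0 flag per row and take its maximum, so MAX() is
# applied to an integer expression instead of a BIT.
any_na_query <- lf %>%
  summarise(any_na = max(if_else(is.na(a), 1L, 0L), na.rm = TRUE))

any_na_sql <- as.character(sql_render(any_na_query))
cat(any_na_sql, "\n")
```

A result of 1 then means at least one row has a NULL in the column.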

@mgirlich
Collaborator

I'm afraid I can't help much here. I don't know much about MSSQL and I don't have a database to test things with. The handling of boolean values in the code is a bit complicated, and fixing this kind of issue might require a new translation approach.

@ejneer
Contributor

ejneer commented Apr 29, 2023

Looks like the translation for ! simply puts the ~ in the wrong spot. The following works for me:

SELECT ~CAST(IIF(("year" IS NULL), 1, 0) AS BIT) AS "!is.na(year)"
FROM "flights"

@kmishra9 Can you give devtools::install_github("tidyverse/dbplyr#1271") a try to confirm this works for you?

The issue with any(is.na(X)) is slightly different and more involved so it probably needs a separate issue.
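
To see which translation an installed dbplyr produces without needing a live server, something like the following may help. It is a rough check that assumes the failing form is recognisable by the ~ sitting directly inside IIF( as in the error message above; the variable names are illustrative.

```r
suppressPackageStartupMessages(library(dplyr))
suppressPackageStartupMessages(library(dbplyr))

lf <- lazy_frame(a = 1:10, b = 2, con = simulate_mssql())

# Render the query from the original report against the simulated backend.
negated_sql <- as.character(sql_render(lf %>% mutate(c = !is.na(a))))

# The failing translation reported above places `~` inside IIF(...).
has_old_translation <- grepl("IIF(~", negated_sql, fixed = TRUE)

cat(negated_sql, "\n")
cat("Old translation still present:", has_old_translation, "\n")
```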

@kmishra9
Author

This also looks like it works now. Thanks for your work @ejneer
