-
-
Notifications
You must be signed in to change notification settings - Fork 71
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
merge.xts drops column on join with an xts with no rows #222
Comments
Thanks for the report! I would also note that the current behavior differs from x <- zoo(x = matrix(1:9, 3, 3), order.by = as.Date(17167:17169))
colnames(x) <- c("a","b","c")
y <- zoo(coredata(x), order.by = index(x)+5)
z <- y[FALSE]
merge(x, z, all = c(TRUE, FALSE))
a.x b.x c.x a.z b.z c.z
2017-01-01 1 4 7 NA NA NA
2017-01-02 2 5 8 NA NA NA
2017-01-03 3 6 9 NA NA NA
merge(z, x, all = c(FALSE, TRUE))
a.z b.z c.z a.x b.x c.x
2017-01-01 NA NA NA 1 4 7
2017-01-02 NA NA NA 2 5 8
2017-01-03 NA NA NA 3 6 9
merge(z, x)
a.z b.z c.z a.x b.x c.x
2017-01-01 NA NA NA 1 4 7
2017-01-02 NA NA NA 2 5 8
2017-01-03 NA NA NA 3 6 9
merge(x, z)
a.x b.x c.x a.z b.z c.z
2017-01-01 1 4 7 NA NA NA
2017-01-02 2 5 8 NA NA NA
2017-01-03 3 6 9 NA NA NA |
yes, the zoo merge function produces the expected output shape (number of rows and columns with NA padding). however, i think xts uses different rules for handling duplicate column names. |
edge case test for all inputs having zero rows x <- xts(x = matrix(1:9, 3, 3), order.by = as.Date(17167:17169))
colnames(x) <- c("a","b","c")
merge(x[F,], x[F,]) #expect [0,6] output
merge(x[F,], x[F,], x[F,]) #expect [0,9] output seg faults x <- xts(x = matrix(1:9, 3, 3), order.by = as.Date(17167:17169))
merge(x[F,], x, x[F,]) #faults if run multiple times |
Your first example matches the output from x <- xts(matrix(1:9, 3, 3), as.Date(17167:17169), dimnames = list(NULL, c("a","b","c")))
(x0 <- x[FALSE,])
## a b c
z <- as.zoo(x)
(z0 <- z[FALSE,])
## a b c
merge(z0, z0)
## Data:
## numeric(0)
##
## Index:
## Date of length 0
merge(x0, x0)
## Data:
## integer(0)
##
## Index:
## Date of length 0 @jaryan, @ggrothendieck, @zeileis, what do you think about this edge case where we merge two zoo objects with columns but zero observations? The result is an empty zoo object, but Ethan expects an object with zero observations and 6 columns. Merging this type of zoo object with at least one zoo object that has observations returns a zoo object with the same number of columns as the sum of the columns of the inputs. So I can understand Ethan's expectation. Thoughts? Your second example errors for me, but it doesn't crash the R session even under valgrind and EDIT: Of course, it crashed right after I posted this... never mind. merge(x0, x0, x0)
## Error in merge.xts(x0, x0, x0) :
## INTEGER() can only be applied to a 'integer', not a 'double'
sessionInfo()
## R version 4.2.1 (2022-06-23)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 20.04.5 LTS
##
## Matrix products: default
## BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] stats graphics grDevices datasets utils methods base
##
## other attached packages:
## [1] tinytest_1.3.1 xts_0.12.2.2 zoo_1.8-11
##
## loaded via a namespace (and not attached):
## [1] compiler_4.2.1 parallel_4.2.1 tools_4.2.1 bspm_0.3.10 grid_4.2.1 lattice_0.20-45
|
some color on my perspective:
|
This was wrong for a couple edge cases. Merging two or more zero-length objects with column names returned an empty object. That's consistent with zoo, but it's inconsistent with set algebra. This change keeps the number of columns for zero-length objects, so we have to process column names. (I also made sure that the result's index has the attributes from the first argument, 'x'). See #222.
testing on a3e9fb2 x <- xts(matrix(1:9, 3, 3), as.Date(17167:17169), dimnames = list(NULL, c("a","b","c")))
merge(x, 1:3) #works fine
merge(x, x, 1:3) #errors rest looks great! |
Also check both the coredata and the index, not just the coredata. See #222.
This would return 0 columns for zero-length xts objects that had dims, which could cause a segfault. See #222.
Thanks for the testing and feedback again! These are fixed now. I added the examples in your previous comment to the unit tests, and I'll add these too. Can you share more of the tests you're running? It would be great if you could add them to |
those issues all came up running through my analytics suite. will add tests if i think of anything else current patch works for all my scenarios! Thank you! |
Awesome news! Thanks for testing! |
Description
a join should always produce the union of column form the joined tables. The various types of joins alter the rows produced. specifically, a left join should always output the number of rows in the left table, and the number of columns form both the left and right tables
merge.xts
does not produce the correct output columns when one of the input xts object contains no rowsI tried to find the problem, but it looks like the bulk of the code is implemented in C and beyond my abilities. let me know if i can help
Expected behavior
Expected output is that the the columns from the empty table are included, with the value specified by the
fill
parameterMinimal, reproducible example
Session Info
The text was updated successfully, but these errors were encountered: