Array of 1 not represented as list in R #69

Open
svdwoude opened this issue Mar 11, 2019 · 6 comments

@svdwoude

When we parse the example below with yaml.load

key: value
array1:
  - item
array2:
  - item1
  - item2

we get

str(yaml::yaml.load("
key: value
array1:
  - item
array2:
  - item1
  - item2
"))
#> List of 3
#>  $ key   : chr "value"
#>  $ array1: chr "item"
#>  $ array2: chr [1:2] "item1" "item2"

I would expect an option that keeps array1 as a list of 1 in this case, so the result shows that it was an array of length 1 in the source.

For reference, this is what I get when I do the same thing for JSON with jsonlite:

str(jsonlite::fromJSON('
{
  "key": "value",
  "array1": ["item"],
  "array2": ["item1", "item2"]
}
', simplifyVector = FALSE))
#> List of 3
#>  $ key   : chr "value"
#>  $ array1:List of 1
#>   ..$ : chr "item"
#>  $ array2:List of 2
#>   ..$ : chr "item1"
#>   ..$ : chr "item2"

Note: when using jsonlite::read_json the simplifyVector argument is set to FALSE by default

Could we add a similar simplifyVector = FALSE option to force the R object to represent the original structure?

(applies to rstudio/plumber#390 )

@viking viking added the feature label Mar 13, 2019
@viking
Contributor

viking commented Mar 13, 2019

The easiest way to accomplish what you want at this moment is to use a combination of a custom handler and setting the as.named.list parameter to FALSE:

str(yaml::yaml.load("{ test: [123, 456] }", handlers = list(seq = function(x) x), as.named.list = FALSE))
#> List of 1
#>  $ :List of 2
#>  ..$ : int 123
#>  ..$ : int 456
#>  - attr(*, "keys")=List of 1
#>  ..$ : chr "test"

It might be a good idea to add a more user-friendly parameter, though.

@viking
Contributor

viking commented Mar 13, 2019

Check out the documentation for as.named.list, since you might not need it. By default, yaml.load will coerce keys to strings, but there's no requirement that a map key has to be a string. If you turn off as.named.list, coercion is not performed.

@hantonita

str(yaml::yaml.load("{ test: [123, 456] }", handlers = list(seq = function(x) x)))

is a good solution for this issue (I do still need the list names), but as mentioned above it would be nice to have an option similar to jsonlite's simplifyVector = FALSE that inserts this handler behind the scenes. That would make the feature much more intuitive to use.
A good place for this could be read_yaml(), since it is a convenience function with the same objective as yaml.load_file(). @viking would you accept a PR implementing this feature?
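
A minimal sketch of what "inserting this handler behind the scenes" could look like, assuming a hypothetical wrapper named yaml_load_keep_seq. The wrapper and its name are illustrations only and are not part of the yaml package; it relies solely on the existing handlers argument of yaml::yaml.load.

```r
library(yaml)

# Hypothetical helper, NOT part of the yaml package: adds an identity
# handler for "seq" unless the caller already supplied one, so YAML
# sequences stay lists instead of being simplified to vectors.
yaml_load_keep_seq <- function(string, handlers = list(), ...) {
  if (is.null(handlers[["seq"]])) {
    handlers[["seq"]] <- function(x) x
  }
  yaml::yaml.load(string, handlers = handlers, ...)
}

res <- yaml_load_keep_seq("
key: value
array1:
  - item
array2:
  - item1
  - item2
")

# array1 and array2 are both lists; names are kept because
# as.named.list is left at its default.
str(res)
```

A caller who passes their own seq handler keeps it, so the wrapper only changes the default behaviour.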

@viking
Contributor

viking commented Mar 25, 2019

I don't like the idea of only implementing it in read_yaml. That function only exists as a wrapper for those who wish to have a readr-like interface. I'm also not sure about the parameter name simplifyVector. There's more than just simplification of vectors going on. This feature idea is more complex than it seems on the surface.

The above solution works by disabling coercion of YAML sequences from lists to vectors by way of a custom handler that does nothing. Having a user-friendly option just for disabling coercion of YAML sequences seems a bit too limited in scope to deserve its own parameter. I think having a more general way to disable default handlers would be more appropriate. Maybe something like:

yaml.load("{ test: [123, 456] }",  pristine = c("seq"))

The pristine parameter (or something similar) could be used to disable default handlers for the specified YAML types.

@spgarbet
Member

I can't see much of anything going on in the default handlers except for "seq". I feel like I'm missing something when reading the code.

spgarbet added a commit that referenced this issue Feb 23, 2022
@salim-b
Contributor

salim-b commented Jun 18, 2023

I think it would be important for the default settings to let a YAML input survive a yaml::read_yaml() -> yaml::write_yaml() round trip unharmed.

Currently, length-1 YAML sequences get "simplified" to scalars during a round trip, which is probably not what most people expect:

tmp_file <- tempfile(fileext = ".yml")

cat("key: value
array1:
  - item
array2:
  - item1
  - item2
",
file = tmp_file)

# before yaml round trip
readLines(con = tmp_file) |> cat(sep = "\n")
#> key: value
#> array1:
#>   - item
#> array2:
#>   - item1
#>   - item2

yaml::yaml.load_file(input = tmp_file) |> yaml::write_yaml(file = tmp_file)

# after yaml round trip
readLines(con = tmp_file) |> cat(sep = "\n")
#> key: value
#> array1: item
#> array2:
#> - item1
#> - item2

Created on 2023-06-18 with reprex v2.0.2
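
For comparison, here is a sketch of the same round trip using the identity seq handler suggested earlier in this thread. It is done in memory with yaml::as.yaml rather than through a temp file, and it assumes the default as.yaml settings. This is a workaround under those assumptions, not a change to the package defaults.

```r
library(yaml)

src <- "key: value
array1:
  - item
array2:
  - item1
  - item2
"

# Identity handler: leave YAML sequences as R lists.
keep_seq <- list(seq = function(x) x)

parsed   <- yaml::yaml.load(src, handlers = keep_seq)
out      <- yaml::as.yaml(parsed)
reparsed <- yaml::yaml.load(out, handlers = keep_seq)

# The length-1 sequence is still a list after the round trip.
str(reparsed$array1)
identical(parsed, reparsed)
```

Because array1 stays a list of 1 rather than a length-1 character vector, it is written back as a sequence instead of a scalar.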

salim-b added a commit to rpkg-dev/pkgpurl that referenced this issue Jun 19, 2023
- avoid default YAML sequence simplification, cf. vubiostat/r-yaml#69
- use base R pipe and anonymous fns where possible
- refactor code and improve readability
- extend `process_pkg()` doc
- other tweaks