When parsing HTML tables, it is frequently the case that non-unique column names appear, e.g. when column names are multi-row and the first row spans multiple columns.
It would be nice if html_table could expose .name_repair as an argument to pass through to as_tibble. As it stands, html_table hard-codes .name_repair = "minimal" in its call to as_tibble, so users have to add an extra as_tibble(.name_repair = "unique") step to pipelines that parse more complicated HTML tables.
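For illustration, here is a minimal sketch of the workaround as it looks today. The table markup is made up for this example, and the sketch assumes an rvest version whose html_table() returns a tibble with minimally repaired names, as described above.

```r
library(rvest)
library(tibble)

# Made-up table: "Name" spans two header rows, "Height" spans two columns.
page <- minimal_html('
  <table>
    <tr><th rowspan="2">Name</th><th colspan="2">Height</th></tr>
    <tr><th>(ft)</th><th>(m)</th></tr>
    <tr><td>A</td><td>100</td><td>30.5</td></tr>
  </table>')

# With .name_repair = "minimal" hard-coded, duplicated names are kept as-is.
raw <- html_table(html_element(page, "table"))

# The extra step this issue would like to avoid: repair the names afterwards.
fixed <- as_tibble(raw, .name_repair = "unique")
names(fixed)
```

If html_table() accepted .name_repair and passed it through, the second as_tibble() call would no longer be needed.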
Packages that are in the business of making tibbles may even want to expose the .name_repair argument and pass it through to tibble() or as_tibble(). For example, this is the approach planned for readxl, which reads rectangular data out of Excel workbooks.
As @rhalbersma indicated, this is triggered by a bunch of th cells that have a rowspan > 1 and one with a colspan > 1. Ideally, the name repair would default to treating the second-row th cells as suffixes to "Height", giving us the unique column names "Height_(ft)" and "Height_(m)".
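A rough sketch of what such suffix-based repair could look like, in base R. The helper merge_header_rows() is hypothetical (it is not part of rvest or tibble), and the two header vectors are assumed to be already expanded for rowspan and colspan.

```r
# Hypothetical helper, not part of rvest or tibble: combine two header rows
# into a single vector of column names.
merge_header_rows <- function(top, sub) {
  stopifnot(length(top) == length(sub))
  # A cell with rowspan > 1 shows the same text in both rows; keep it as-is.
  # Otherwise append the second-row text as a suffix.
  ifelse(top == sub | sub == "", top, paste(top, sub, sep = "_"))
}

top <- c("Name", "Height", "Height")  # first header row, colspan expanded
sub <- c("Name", "(ft)", "(m)")       # second header row, rowspan expanded

new_names <- merge_header_rows(top, sub)
new_names
```

This only covers two header rows and does not guarantee uniqueness on its own, so a final "unique" name repair would still be a sensible fallback.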
Exposing .name_repair in html_table would be in line with the recommendation from https://www.tidyverse.org/blog/2018/11/tibble-2.0.0-pre-announce/