Current support is for a single header row, which means that where tables have an explicit multiple-row header (defined with <th> tags), html_table has no way to handle it. The tables in this page are such an example. Would it be desirable to support an option for squashing multiple header rows?
hadley changed the title from "Parsing multiple <th> rows header with html_table" to "Support multiple <th> rows" on Dec 14, 2020
I think this problem is a bit too general for rvest to tackle — figuring out exactly how to represent this sort of data in R is an open question, I think.
Having come across a similar problem today, I'm wondering if there is any proposal to implement a solution to this issue? I wonder if perhaps a nested list type approach might work well for this? Particularly with the new unnest_wider and unnest_longer functions, this might be conducive to a successful workflow.
Alternatively, is there an rvest method that I am unfamiliar with that will provide access to both of the headers?
A naive approach might be to concatenate the "first" header with the "second" header (in the example given, this would produce column names such as "Legislative election - Last"), which would be easier to process.
If we can decide on an approach that would work nicely (either list-based or concatenated header rows), I might be able to donate some time to implement a solution.
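To make the concatenation idea concrete, here is a rough sketch of how it could be done today outside of html_table. The table markup, the helper names `expand_header_row` and `squash_headers`, and the " - " separator are all made up for illustration; this is not an existing rvest feature. It assumes header cells only use colspan (rowspan is not handled) and that every header row expands to the same number of columns.

```r
library(rvest)

# Hypothetical table with a two-row <th> header; the first row uses colspan.
html <- minimal_html('
<table>
  <thead>
    <tr><th>Party</th><th colspan="2">Legislative election</th></tr>
    <tr><th></th><th>Last</th><th>Next</th></tr>
  </thead>
  <tbody>
    <tr><td>A</td><td>2016</td><td>2020</td></tr>
    <tr><td>B</td><td>2017</td><td>2021</td></tr>
  </tbody>
</table>')

tbl <- html_element(html, "table")

# Hypothetical helper: turn one header <tr> into a character vector,
# repeating each label as many times as its colspan says.
expand_header_row <- function(tr) {
  cells <- html_elements(tr, "th")
  labels <- html_text2(cells)
  spans <- as.integer(html_attr(cells, "colspan", default = "1"))
  rep(labels, times = spans)
}

# Hypothetical helper: paste header rows together column by column,
# skipping the separator when either part is empty.
squash_headers <- function(rows, sep = " - ") {
  Reduce(function(top, bottom) {
    ifelse(top == "" | bottom == "",
           paste0(top, bottom),
           paste(top, bottom, sep = sep))
  }, rows)
}

header_rows <- lapply(html_elements(tbl, "thead tr"), expand_header_row)
col_names <- squash_headers(header_rows)

# Read the body cells by hand and attach the squashed names.
body_rows <- lapply(
  html_elements(tbl, "tbody tr"),
  function(tr) html_text2(html_elements(tr, "td"))
)
df <- as.data.frame(do.call(rbind, body_rows), stringsAsFactors = FALSE)
names(df) <- col_names

print(col_names)
```

For this made-up table the squashed names come out as "Party", "Legislative election - Last" and "Legislative election - Next". All body columns stay as character here, so type conversion would still be a separate step.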