Support multiple <th> rows #286

Closed
geotheory opened this issue Oct 1, 2020 · 2 comments
Labels
feature a feature request or enhancement table 🏓

Comments

@geotheory

Current support is for a single header row, so when a table has an explicit multi-row header (defined with <th> tags), html_table has no way to handle it. The tables in this page are one such example. Would it be desirable to support an option for squashing multiple header rows?

@hadley hadley changed the title Parsing multiple <th> rows header with html_table Support multiple <th> rows Dec 14, 2020
@hadley hadley added feature a feature request or enhancement table 🏓 labels Dec 14, 2020
@hadley
Member

hadley commented Dec 19, 2020

I think this problem is a bit too general for rvest to tackle — figuring out exactly how to represent this sort of data in R is an open question, I think.

@hadley hadley closed this as completed Dec 19, 2020
@dansharkey

dansharkey commented Sep 8, 2023

Having come across a similar problem today, I'm wondering whether there is any proposal to implement a solution to this issue. Perhaps a nested-list approach might work well? Particularly with the new unnest_wider and unnest_longer functions, this could make for a workable workflow.
Alternatively, is there an rvest method that I am unfamiliar with that provides access to both header rows?
A naive approach might be to concatenate the "first" header with the "second" header (in the example given this would produce column names such as "Legislative election - Last"), which would be easier to process.
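
A minimal sketch of the concatenation idea, for illustration only. The toy table is made up, `squash_header()` is a hypothetical helper and not part of rvest, and the sketch assumes the header rows use only `colspan` (no `rowspan`), so every header row expands to the same number of columns.

```r
library(rvest)

# Toy table with a two-row <th> header (invented for this example).
page <- minimal_html('
  <table>
    <tr><th></th><th colspan="2">Legislative election</th></tr>
    <tr><th>Party</th><th>Last</th><th>Next</th></tr>
    <tr><td>A</td><td>2019</td><td>2023</td></tr>
    <tr><td>B</td><td>2018</td><td>2022</td></tr>
  </table>
')
tbl <- html_element(page, "table")

# Hypothetical helper: collapse all header rows into one vector of names.
squash_header <- function(table, sep = " - ") {
  # Header rows are taken to be rows with <th> cells and no <td> cells.
  header_rows <- html_elements(table, xpath = ".//tr[th and not(td)]")

  # Repeat each cell's text across the columns it spans.
  expand_row <- function(row) {
    cells <- html_elements(row, "th")
    span <- as.integer(html_attr(cells, "colspan", default = "1"))
    rep(html_text2(cells), times = span)
  }

  # Join an upper and a lower header row, skipping empty cells.
  combine <- function(upper, lower) {
    ifelse(upper == "", lower,
           ifelse(lower == "", upper, paste(upper, lower, sep = sep)))
  }

  Reduce(combine, lapply(header_rows, expand_row))
}

col_names <- squash_header(tbl)

# Read the table without a header, drop the two header rows, apply the names.
df <- html_table(tbl, header = FALSE)
df <- df[-(1:2), ]
names(df) <- col_names
```

Here `col_names` holds "Party", "Legislative election - Last" and "Legislative election - Next". Because the header rows are read in as data before being dropped, the remaining columns stay character and would need converting afterwards.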

If we can decide on an approach that would work nicely (either list-based or concatenated header rows), I might be able to donate some time to implement a solution.
