Use colnames() rather than names() in gather()? #138

jarodmeng · 2015-11-12T17:00:05Z

We're trying to implement gather() and spread() for SQL databases using S3 methods. However, gather() cannot be extended this way because it uses names() to get the column names. For SQL database backends, the table object is a list, so names() returns the names of its list elements rather than the column names.

Is it possible to use colnames() instead, so that it works for both data frames and SQL backends?
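
To illustrate the difference, here is a minimal sketch using a mock object. The class name mock_tbl_sql and its fields (src, vars) are hypothetical stand-ins, not dplyr's actual internals. It assumes only base R behavior: colnames() uses names() for data frames and falls back to dimnames() for other objects, so an S3 dimnames method is enough to make colnames() report column names.

```r
# A local data frame: names() and colnames() agree.
local_df <- data.frame(id = 1:2, x = c("a", "b"), y = c(1.5, 2.5))

# Hypothetical stand-in for a SQL-backed table: a list holding a
# connection placeholder and the remote column names.
remote <- structure(
  list(src = "connection placeholder", vars = c("id", "x", "y")),
  class = "mock_tbl_sql"
)

# S3 method: base colnames() calls dimnames() for non-data-frame objects.
dimnames.mock_tbl_sql <- function(x) list(NULL, x$vars)

names(local_df)     # column names of the data frame
colnames(local_df)  # the same column names
names(remote)       # names of the list elements, not the columns
colnames(remote)    # column names, via the dimnames method
```

With a method like this in place, code that calls colnames() would see the same kind of result for both objects, whereas code that calls names() would not.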

hadley · 2015-11-12T19:13:20Z

It would be better to use dplyr's tbl_vars(). But what SQL are you going to generate? I always assumed gather/spread in SQL would be prohibitively difficult.

jarodmeng · 2015-11-12T21:54:09Z

Since names() is called in the generic gather function, switching to colnames() would preserve the behavior for data frames while allowing gather() to be extended to SQL backends.

I implemented gather for SQL backends by building a bunch of lazy dots (mutate dots to create flag columns to indicate whether a row matches a key_col value, summarize dots to aggregate the product of the newly created flags and value_col, and finally select dots to only select those key_col value columns). It actually works fairly quickly and reliably.
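
A rough sketch of the flag, aggregate, and select steps described above, written in base R against a local data frame rather than a SQL backend. The data and the column names (id, key, value) are made up for illustration, and this is not the actual lazy-dots implementation; it only shows the shape of the computation under those assumptions.

```r
# Hypothetical long-format data: one row per (id, key) pair.
long <- data.frame(
  id    = c(1, 1, 2, 2),
  key   = c("a", "b", "a", "b"),
  value = c(10, 20, 30, 40),
  stringsAsFactors = FALSE
)

key_values <- unique(long$key)

# Step 1 ("mutate"): one 0/1 flag column per key value,
# marking the rows whose key matches that value.
flags <- sapply(key_values, function(k) as.numeric(long$key == k))

# Step 2 ("summarize"): per id, sum the product of each flag and value.
products <- flags * long$value
wide <- aggregate(as.data.frame(products), by = list(id = long$id), FUN = sum)

# Step 3 ("select"): keep the id and the key-value columns only.
wide <- wide[, c("id", key_values)]
print(wide)
```

On a SQL backend, the same steps would presumably translate to a grouped query that sums flag-times-value expressions, which may explain why it runs reasonably fast.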

hadley closed this as completed in 3819854 on Dec 30, 2015
