[Experimental] An AbstractDataFrame that's a composite type with columns as type members #471
This is related to #451.
`CDataFrame` is an `AbstractDataFrame` made of composite types created on the fly. Columns are directly type members. This has some advantages, including column access as `df.colA`.

It has some disadvantages. Most notably, columns cannot be added in place with `df["newcol"] = something`. We would need an API that treats DataFrames as immutable. For example, to add a column, I used this: `newdf = tdataframe(olddf, newcol = something)`.

Here are the results of some tests in `test/cdataframe.jl`:
The composite-style indexing is nearly as fast as indexing with the raw vectors.
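As a rough illustration of the idea (not the code in this PR), here is a minimal sketch in current Julia that uses a `NamedTuple` as a stand-in for a composite type whose fields are the columns. The helper name `with_columns` is hypothetical and only mirrors the immutable "build a new frame to add a column" style described above. It assumes a Julia version with named tuples (1.0 or later).

```julia
# Hypothetical illustration only: a NamedTuple stands in for a composite type
# generated on the fly, with one field per column.
df = (colA = [1, 2, 3], colB = [1.5, 2.5, 3.5])

# Columns are fields, so they are reached with dot syntax.
total = sum(df.colA)

# Hypothetical helper: instead of mutating `table`, build and return a new
# table that carries the extra columns. The original is left untouched.
with_columns(table::NamedTuple; newcols...) = merge(table, (; newcols...))

newdf = with_columns(df, colC = df.colA .* 2)

println(propertynames(df))     # column names of the original table
println(propertynames(newdf))  # column names of the extended table
```

Because the set of fields is part of the type, adding a column necessarily produces a value of a different type, which is why an immutable-style API fits this design.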
Note that I didn't implement everything needed for it to be an AbstractDataFrame. Here are some things that do work:
I'm not sure this is a good idea, but we do need some way to get these speed advantages and also to get `df.colA`.