Update StatsBase.df to dof #1097

Merged
merged 1 commit into from
Oct 7, 2016

ararslan
Member

@ararslan ararslan commented Oct 6, 2016

Why is this even defined here?

@ararslan
Member Author

ararslan commented Oct 7, 2016

We should tag a patch release once this is merged, but we'll have to cherry-pick around the nullable change.

@ararslan ararslan merged commit e4ab277 into master Oct 7, 2016
@ararslan ararslan deleted the aa/dof branch October 7, 2016 00:50
ararslan added a commit that referenced this pull request Oct 7, 2016
(cherry picked from commit e4ab277)
@simonster
Contributor

This is defined here for models fit on DataFrames because we wrap them in a DataFrameStatisticalModel/DataFrameRegressionModel. We do this to allow models to be fit on DataFrames without requiring explicit package support, but while still keeping track of the mapping between coefficients and columns.
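As a rough illustration of the wrapping described above, here is a minimal Julia sketch. It is not the DataFrames source: `ToyModel`, `WrappedModel`, and the locally defined `dof`, `coef`, and `coefnames` are hypothetical stand-ins for the StatsBase generics and the `DataFrameStatisticalModel`/`DataFrameRegressionModel` wrappers, and the rule that the toy model has one degree of freedom per coefficient is an assumption made only for this example.

```julia
# Hypothetical stand-in for the abstract model type (the real one lives in StatsBase).
abstract type StatisticalModel end

# A toy model that knows nothing about data frames or column names.
struct ToyModel <: StatisticalModel
    coefs::Vector{Float64}
end

# Assumption for this sketch: one degree of freedom per coefficient.
dof(m::ToyModel) = length(m.coefs)
coef(m::ToyModel) = m.coefs

# Wrapper that keeps the fitted model together with the column name
# behind each coefficient.
struct WrappedModel{M<:StatisticalModel} <: StatisticalModel
    model::M
    coefnames::Vector{String}
end

# Delegation: generic functions are forwarded to the inner model, so the
# modeling package needs no explicit support for the wrapper.
dof(w::WrappedModel) = dof(w.model)
coef(w::WrappedModel) = coef(w.model)

# Only the wrapper knows the coefficient-to-column mapping.
coefnames(w::WrappedModel) = w.coefnames

w = WrappedModel(ToyModel([1.5, -0.3]), ["(Intercept)", "x"])
```

Because every forwarded function has to be named in the wrapper's package, a rename of a generic such as `df` to `dof` also has to be applied there, which is presumably why this pull request touches DataFrames at all.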

@ararslan
Member Author

ararslan commented Oct 7, 2016

@simonster Oh okay, makes sense. Thanks for the explanation!

@nalimilan
Member

Yes, these should be moved to a separate package together with the model frame code.

@simonster
Contributor

Maybe? From a usability perspective, it seems like it may be hard for people to figure out that to fit a GLM on a DataFrame, you need GLM, DataFrames, and then some other package that makes them talk to each other.

@nalimilan
Member

We could still reexport it by default. The idea is 1) to shrink the large code base of DataFrames, and 2) to allow fitting models from alternative data sources (TypedTables, databases...).

@ararslan
Member Author

ararslan commented Oct 7, 2016

Or perhaps GLM could reexport it?

@nalimilan
Member

Possibly, but that would mean adding this to every modeling package, and there are more of those than there are data sources.

@ararslan
Member Author

ararslan commented Oct 7, 2016

Ah yeah. Well, I can start the process of migration... Repo name? StatModels.jl, in keeping with the name of the file in which this stuff is defined here? That's pretty general though.

@nalimilan
Member

The name needs to be general since that's a general framework. StatModels isn't bad.

Though before doing that work, make sure everybody agrees with that plan by asking for comments somewhere. I think I remember some opposition.

Cc: @dmbates @kleinschmidt @andreasnoack

@ararslan
Member Author

ararslan commented Oct 7, 2016

Oh sure, I wouldn't take anything out of here before getting comments. Also cc @johnmyleswhite.

If we're going to do this I also think it would be a good time to rethink how formulas are specified. It would be nice if base Julia could stop making a special case to parse ~ as an infix macro for the sake of formulas. Doug brought up alternative syntax on the mailing list, wherein he proposed using pairs instead. Anyway, something to think about when we get there.

@kleinschmidt
Contributor

Might make sense to discuss this under #1018...

nalimilan pushed a commit that referenced this pull request Jul 8, 2017
nalimilan pushed a commit that referenced this pull request Jul 8, 2017
rofinn pushed a commit that referenced this pull request Aug 17, 2017
nalimilan pushed a commit that referenced this pull request Aug 25, 2017
quinnj pushed a commit that referenced this pull request Sep 2, 2017