AutoML on Titanic dataset samples #18

Merged


LionelMassoulard
Collaborator

Add two scripts that run AutoML on the Titanic dataset. They can be used as samples.
I included a simple script and an "advanced" one with more options.

Documentation on the "advanced" options is still needed.

gfournier changed the title from "first commit : sample scripts" to "AutoML on Titanic dataset samples" on Oct 1, 2019
gfournier merged commit b4a1f59 into societe-generale:master on Oct 1, 2019
gfournier pushed a commit that referenced this pull request Feb 28, 2020
gfournier added a commit that referenced this pull request Feb 28, 2020
* bump version to 0.1.3

* bump version to 0.1.4-dev

* massive black reformatting (#18)

* fix bug when type_of_problem is set (#21)

* fix bug when type_of_problem is set

* add default

* add specific test

* Fix numericalencoder (#22)

* fix NumericalEncoder with default values

* fix NumericalEncoder with default values

* Fix doc (#26)

* ignore .bat file to create doc

* fix doc

* comment in English

* clean test

* requirements pandas >= 0.23 (#27)

* node -> nodes (was deprecated and is now absent) (#31)

* Add test picklable (#23)

* test numerical encoder is picklable

* test numerical encoder is picklable

* test target encoder is picklable

* black

* black

* remove warning printing

* add test : unpickled object behaves like original object

* improve auto ml doc (#30)

* re-index 'fit_params' that are indexable

* change version 0.1.5

* conversion model to json (#36)

* v0.1.6-dev

* add conversion model to json : 'param_from_sklearn_model' + corresponding tests

* add new numpy type to be cast to python type

* test if object can be json serialized

* refactoring of columns selection (#29)

* add conversion model to json : 'param_from_sklearn_model' + corresponding tests

* change wrapper, 'drop_used_columns' and 'drop_unused_columns'

* temp : remove useless attribute

* temp : fix test

* allow selector to select by type of variable among TypeOfVariables.CAT / TEXT / NUM

* change default for numerical encoder

* change test

* temp : new test

* renaming

* comments

* clean docstring

* change text models

* change 'base' models

* change corresponding tests

* add numpy array support

* fix tests

* fix test

* fix random_forest_addins columns_to_use

* fix Targetencoder

* fix special case when no column to pass to the model

* typo

* fix get_feature_names

* allow not to raise when shape between fit and transform differs

* corresponding tests

* cleaning

* fix doc + default

* rename

* fix registration

* black reformat

* clean

* add helper method

* temp add fitting test

* clean

* add test : try to fit model

* add custom default hyper-parameters

* fix inf

* clean

* add test not inf CdfScaler

* cap number of components to number of rows

* fix seed by default

* clean

* make CdfScaler robust to very small, almost equal values

* test very close and very small values

* cleaning

* misc

* add min_count param

* more data in test

* remove cast of strings that can be parsed

* corresponding test

* change version 1.0.0

* dev version 1.0.1-dev

* change version 0.2.0

* dev 0.2.1-dev
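
The "test if object can be json serialized" item above could look roughly like the sketch below. It uses only the standard library, and the helper name `is_json_serializable` is hypothetical, not taken from the repository.

```python
import json


def is_json_serializable(obj):
    """Return True if json.dumps accepts obj, False otherwise.

    Hypothetical helper for illustration; not the project's actual code.
    """
    try:
        json.dumps(obj)
    except (TypeError, ValueError, OverflowError):
        # TypeError: unsupported type (e.g. set, arbitrary object)
        # ValueError: circular reference
        return False
    return True


# Plain dicts, lists, numbers, strings and None serialize fine.
print(is_json_serializable({"n_estimators": 100, "max_depth": None}))
# Sets are not JSON types, so json.dumps rejects them.
print(is_json_serializable({"columns": {"a", "b"}}))
```

A check like this is useful in a test after converting a model's parameters to plain Python types: any value left in a non-JSON type is caught before the result is written out.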