fix failing test for pandas 0.25 #15

LionelMassoulard · 2019-09-07T12:27:34Z

Fix failing test for pandas 0.25: change the sparse-to-dense conversion when dtype == object
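The diff itself is not shown on this page, so the snippet below is only an illustrative sketch of a sparse-to-dense conversion that tolerates non-sparse (for example object) columns. It is not the code merged in this pull request. `to_dense_frame` is a hypothetical helper name, and the sketch assumes unique column names and a pandas version that exposes `pd.SparseDtype` and the `.sparse` Series accessor.

```python
import pandas as pd


def to_dense_frame(df):
    """Return a copy of ``df`` in which every sparse column is densified.

    Hypothetical helper (not taken from the aikit code base).
    Columns are handled one by one because the frame-level ``.sparse``
    accessor is only available when *every* column is sparse, which is
    not the case once an object column is mixed in.
    Assumes column names are unique.
    """
    dense_columns = {}
    for name in df.columns:
        col = df[name]
        if isinstance(col.dtype, pd.SparseDtype):
            # Series-level accessor: converts only this sparse column
            col = col.sparse.to_dense()
        dense_columns[name] = col
    return pd.DataFrame(dense_columns, index=df.index)
```

A possible usage, assuming pandas >= 1.0 for `pd.arrays.SparseArray`:

```python
df = pd.DataFrame({
    "num": pd.arrays.SparseArray([0.0, 1.0, 0.0]),
    "txt": ["a", "b", "c"],  # non-sparse column, left untouched
})
dense = to_dense_frame(df)
```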

* fix seed
* new test CdfScaler

* Bump version to 0.1.1
* Fix bug automl block search (#10)
* fix bug when no elements to iterate on
* remove useless space
* Categorical handling (WIP) (#9)
* add failing test for categories
* add a function that can replace categorical columns by object columns
* recognize 'category' as a CAT type of variable
* add get_rid_of_categories; modify the numericalencoder and targetencoder transformers; add a test for guess_type_of_variables
* add a get_rid_of_categories in the fit_transform of targetencoder
* add test of targetencoder with categorical dtype
* add test of numericalencoder with categorical dtype
* modify test_guesss_type_of_variable
* add a test checking that the numerical encoder does not turn categorical columns containing ints into numerical columns; for now, the test fails
* modify the code so that the numericalencoder and the targetencoder work correctly; add tests
* changes addressing the pull request comments
* remaining changes for the pull request
* clean commit
* Dispatch groups (#7)
* Block Search + other (#2)
* add make_pipeline function (works like sklearn)
* fix type "_if_fitted" -> "_already_fitted"
* add handling of columns_to_encode == "--object--" in target encoder
* corresponding test
* add Numerical encoder test for "columns_to_encode == '--object--' "
* expose command argument parser outside, to be able to add new arguments.
* change WordVectorizer in char mod distributions + fix bug in HyperRangeBetaInt
* change default behavior : encode "columns_to_encode == '--object--' "
* remove 'bug' (double return)
* allow text preprocessors to concat their inputs
* add 'RandomTrainTestCv' and 'IndexTrainCv' cv-like object.
* same api as a regular cv object ...
* ... but only one split
* add 'use_for_block_search' attribute + filter models based on that
* add block search iterator
* automl config : models_to_keep_block_search
* fix typo in test
* ignore Warning in test
* move 'function_has_named_argument' from .transformers.model_wrapper to .tools.helper_functions
* cleaning
* dispatch and split the groups variable to the estimator
* add groups to methods + dispatch it to estimators within the pipeline
* test on cross validation and pipeline to check the passing of groups
* remove useless import
* remove useless
* fix X -> lastX
* debug help
* fix after merge
* make sure benchmark can be computed
* input np.inf as well as np.nan
* spaces
* don't split and tokenize if not needed
* new tests auto-ml, when only numerical values
* allow scoring to return multiple values
* allow cross_validation to be in Parallel
  # Conflicts:
  # aikit/cross_validation.py
* add a custom CV for groups
* froze init param
* allow additional function to be computed
* read additional results
* allow guiding to be done on an "additional metric"
* typo
* add name of excel print
* test if name of columns has changed
* Clean load (#12)
* remove config.json
* fix loading
* remove nltk additional path
* accelerate code using map and dict (#13)
* accelerate code using map and dict
* accelerate concatenation code
* Update categories
* fix test new columns name (#15)
* fix seed
* new test CdfScaler
* Ml graph improve (#8)
* new helpers function (merge node and subbranch search)
* fix ordering in graph from edges
* generalize the notion of model graph
* change name representation
* Block Search + other (#2)
* add make_pipeline function (works like sklearn)
* fix type "_if_fitted" -> "_already_fitted"
* add handling of columns_to_encode == "--object--" in target encoder
* corresponding test
* add Numerical encoder test for "columns_to_encode == '--object--' "
* expose command argument parser outside, to be able to add new arguments.
* change WordVectorizer in char mod distributions + fix bug in HyperRangeBetaInt
* change default behavior : encode "columns_to_encode == '--object--' "
* remove 'bug' (double return)
* allow text preprocessors to concat their inputs
* add 'RandomTrainTestCv' and 'IndexTrainCv' cv-like object.
* same api as a regular cv object ...
* ... but only one split
* add 'use_for_block_search' attribute + filter models based on that
* add block search iterator
* automl config : models_to_keep_block_search
* fix typo in test
* ignore Warning in test
* fix type : TransformToBlockManager
* add number of output utils function
* spaces
* new tests with impossible graphs
* fix merged
* fix notebook error
* add list test
* remove useless import
* spaces
* fix docstring
* merge 2 loops
* remove duplicate edge
* add a few plotting functions (#14)
* add a few plotting functions
* add assert
* bump version 0.1.2
* DEV bump version
* doc typo (#16)
* Add matplotlib, seaborn to test requirements
* Fixes on dataset load from public URL
* Fix dataset path load unit test

LionelMassoulard added 3 commits September 7, 2019 14:24

change conversion "sparse to dense" when Object dtype

6ecd3cb

raise when no y

c2d1dcc

fix change

98a30b2

gfournier approved these changes Sep 17, 2019

gfournier merged commit 936475b into societe-generale:master Sep 23, 2019

gfournier mentioned this pull request Sep 23, 2019

Issue on sparse dataframe with pandas >= 0.24.0 #4

Closed

gfournier pushed a commit that referenced this pull request Oct 1, 2019

* fix test new columns name (#15)

17f8a9a

* fix seed
* new test CdfScaler

gfournier pushed a commit that referenced this pull request Oct 1, 2019

* fix test new columns name (#15)

cb3ac85

* fix seed
* new test CdfScaler
