Dear Taccio Yamamoto,
This project is remarkable. I have a question and would appreciate your help.
My data is creditcard.csv from the Kaggle fraud-detection dataset. The preprocessing follows https://github.com/yazanobeidi/fraud-detection/blob/master/project.ipynb
When I run the following code, I get the error below.
for sampler in [None, rus, ros, sm]:
    if not sampler:
        pipeline = Pipeline([('lr', linear_model.LogisticRegression(n_jobs=-1))])
    else:
        pipeline = Pipeline([('sampler', sampler), ('lr', linear_model.LogisticRegression(n_jobs=-1))])
Fitting 3 folds for each of 10 candidates, totalling 30 fits
IndexError Traceback (most recent call last)
in ()
10 parameters = {'lr_C':[.0001, .001, .01, .1, .5, .75, 1, 2.5, 5, 10]}
11 gscv = GridSearchCV(cv=3, error_score='raise', n_jobs=-1,scoring=aucpr, verbose=True,estimator=pipeline, param_grid=parameters)
---> 12 gscv.fit(XX_train, YY_train)
13 print(gscv.best_estimator_)
14 print(gscv.best_score_)
/usr/local/lib/python2.7/dist-packages/sklearn/model_selection/_search.pyc in fit(self, X, y, groups, **fit_params)
636 error_score=self.error_score)
637 for parameters, (train, test) in product(candidate_params,
--> 638 cv.split(X, y, groups)))
639
640 # if one choose to see train score, "out" will contain train score info
/usr/local/lib/python2.7/dist-packages/sklearn/model_selection/_split.pyc in split(self, X, y, groups)
330 n_samples))
331
--> 332 for train, test in super(_BaseKFold, self).split(X, y, groups):
333 yield train, test
334
/usr/local/lib/python2.7/dist-packages/sklearn/model_selection/_split.pyc in split(self, X, y, groups)
93 X, y, groups = indexable(X, y, groups)
94 indices = np.arange(_num_samples(X))
---> 95 for test_index in self._iter_test_masks(X, y, groups):
96 train_index = indices[np.logical_not(test_index)]
97 test_index = indices[test_index]
/usr/local/lib/python2.7/dist-packages/sklearn/model_selection/_split.pyc in _iter_test_masks(self, X, y, groups)
624
625 def _iter_test_masks(self, X, y=None, groups=None):
--> 626 test_folds = self._make_test_folds(X, y)
627 for i in range(self.n_splits):
628 yield test_folds == i
/usr/local/lib/python2.7/dist-packages/sklearn/model_selection/_split.pyc in _make_test_folds(self, X, y)
611 for test_fold_indices, per_cls_splits in enumerate(zip(*per_cls_cvs)):
612 for cls, (_, test_split) in zip(unique_y, per_cls_splits):
--> 613 cls_test_folds = test_folds[y == cls]
614 # the test split can be too big because we used
615 # KFold(...).split(X[:max(c, n_splits)]) when data is not 100%
IndexError: too many indices for array
I look forward to your reply. Thanks.
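One possible cause, offered as a guess rather than a confirmed diagnosis: the failing line `test_folds[y == cls]` indexes a 1-D array with a mask built from `y`, so it raises this IndexError if `y` is 2-D, for example a column vector of shape `(n_samples, 1)`. The sketch below uses NumPy only and assumes `YY_train` has that shape, which the traceback does not confirm. The helper `as_1d_labels` is hypothetical and not part of scikit-learn.

```python
import numpy as np


def as_1d_labels(y):
    """Flatten a single-column label array to 1-D (hypothetical helper)."""
    y = np.asarray(y)
    if y.ndim == 2 and y.shape[1] == 1:
        return y.ravel()
    return y


# Assumption: labels stored as a column vector, shape (n_samples, 1).
y_column = np.array([[0], [0], [1], [0], [1], [0]])

# A 1-D array with one entry per sample, like test_folds in the traceback.
test_folds = np.zeros(len(y_column), dtype=int)

# Indexing a 1-D array with a 2-D boolean mask raises IndexError.
try:
    test_folds[y_column == 1]
    mask_failed = False
except IndexError:
    mask_failed = True

# After flattening, the mask is 1-D and the indexing succeeds.
y_flat = as_1d_labels(y_column)
selected = test_folds[y_flat == 1]
```

If this assumption holds, passing a flattened label array, for example `gscv.fit(XX_train, np.ravel(YY_train))`, may avoid the error. Checking `YY_train.shape` first would show whether it applies.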