Error Generating Pipeline Code during Fit #233
Comments
I'm trying to replicate your issue, but I'm not sure where you're getting the |
No, something I wrote. I will post the output dataframes. I'm re-running Thanks for looking at it Keith On Aug 22, 2016 9:11 AM, "Daniel" notifications@github.com wrote:
Ok, replicated the error with a smaller dataset, this time on Windows. Including a reduced data-set that also generates the error. Code:
Result: Python 2.7.11 |Anaconda 4.0.0 (64-bit)| (default, Feb 16 2016, 09:58:36) [MSC v.1500 64 bit (AMD64)] on win32
GP Progress: 0%| | 0/120 [00:00<?, ?pipeline/s]
Attaching screenshot and datafiles. |
Your problem seems identical to #234. I'm guessing the shape of your labels is |
I read that and have tried explicitly reshaping to (n,). The problem does On Aug 22, 2016 9:06 PM, "Daniel" notifications@github.com wrote: |
Here's another version which explicitly reshapes the target array. The problem occurs with numpy array input with the label array of shape (n,). This example uses X2.csv and y2.csv posted with my earlier comment. Keith Code:
Results (Note shape of X and y):
Thank you for the bug report, @KeithBrodie! There seems to be an issue with our "compile to sklearn Pipeline" functionality for Python 2.7. We need to dig into it soon and see what we can find out. In the meantime, we thoroughly tested on Python 3.5 and TPOT should run without a hitch there. |
Hi @KeithBrodie, Thank you for sharing your data and a reproducible example so we could figure out what's going on. From looking at your data, it looks like your predicted target is continuous, which is a regression problem. At the moment, TPOT only supports classification problems. We plan to add support for regression problems in the next release (0.6), hopefully within a couple weeks. |
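For anyone hitting the same thing, a quick check on the target before calling fit() can show whether it looks like a regression problem. This is only a rough pure-Python heuristic, not something TPOT or scikit-learn does internally; the name `looks_continuous` and the `max_classes` threshold are made up for illustration.

```python
def looks_continuous(y, max_classes=20):
    """Heuristic: True if the target values look like a regression target.

    Hypothetical helper for illustration; not part of TPOT or scikit-learn.
    """
    values = list(y)
    # Any non-integer float (e.g. 3.7) suggests a continuous target.
    if any(isinstance(v, float) and not v.is_integer() for v in values):
        return True
    # Many distinct values also point to regression rather than classes.
    return len(set(values)) > max_classes


print(looks_continuous([0.52, 1.25, 3.7]))  # fractional values
print(looks_continuous([0, 1, 1, 0]))       # a handful of class labels
```

scikit-learn also ships a more thorough check, `sklearn.utils.multiclass.type_of_target`, which is probably the better choice when scikit-learn is already imported.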
Thanks, sorry about wasting your time, and thanks for TPOT, totally cool. On Aug 23, 2016 10:55 AM, "Randy Olson" notifications@github.com wrote:
Not a waste at all! You helped us realize that we could output a more useful failure message when users pass data in a format that scikit-learn can't handle. |
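As a rough idea of what an earlier, clearer failure could look like, here is a minimal sketch of an input check that runs before any fitting starts. It is not TPOT's actual validation; `check_fit_inputs` and its messages are hypothetical, and it only covers two of the cases discussed in this thread (mismatched lengths and labels of shape (n, 1)).

```python
def check_fit_inputs(X, y):
    """Fail early with a readable message instead of a late TypeError.

    Hypothetical helper for illustration; not TPOT's real validation.
    """
    rows = list(X)
    labels = list(y)
    # Features and labels must describe the same number of samples.
    if len(rows) != len(labels):
        raise ValueError(
            "X has %d rows but y has %d labels." % (len(rows), len(labels))
        )
    # Entries such as [0] or (1,) mean y has shape (n, 1), not (n,).
    if any(isinstance(v, (list, tuple)) for v in labels):
        raise ValueError("y looks two-dimensional; flatten it to shape (n,).")
    return rows, labels


try:
    check_fit_inputs([[1.0, 2.0], [3.0, 4.0]], [[0], [1]])
except ValueError as err:
    print("Rejected before fitting:", err)
```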
Hi @KeithBrodie, we just released TPOT v0.6 today. Try upgrading TPOT via pip and using it on your regression data set. Usage docs: link |
Worked - very cool. Thanks |
While fitting an X, y pair supplied as pandas DataFrames, TPOT crashed while generating the pipeline code.
Context of the issue
Ubuntu 16.04 LTS
Python 2.7.12
deap 1.02
TPOT 0.5.0
Code is a copy of the example code from the documentation replacing the dataset with one generated locally.
Process to reproduce the issue
Call the fit() function with training data
TypeError is raised
Expected result
Fit method to complete without crashing
Current result
GP Progress: 0%| | 0/120 [00:00<?, ?pipeline/s]
GP Progress: 10%|# | 12/120 [00:00<00:00, 118.37pipeline/s]
GP Progress: 13%|#3 | 16/120 [1:19:04<10:16:46, 355.83s/pipeline]
GP Progress: 14%|#4 | 17/120 [1:19:22<7:16:51, 254.48s/pipeline]
GP Progress: 15%|#5 | 18/120 [1:19:23<5:03:10, 178.34s/pipeline]
GP Progress: 16%|#5 | 19/120 [1:22:02<4:50:39, 172.67s/pipeline]
GP Progress: 17%|#6 | 20/120 [1:22:02<3:21:31, 120.91s/pipeline]
Traceback (most recent call last):
  File "/home/northwood/Dropbox/AutoDex/Extractor/tp1.py", line 31, in <module>
    pipeopt.fit(X_train, y_train)
  File "/usr/local/lib/python2.7/dist-packages/tpot/tpot.py", line 307, in fit
    self._fitted_pipeline = self._toolbox.compile(expr=self._optimized_pipeline)
  File "/usr/local/lib/python2.7/dist-packages/tpot/tpot.py", line 431, in _compile_to_sklearn
    sklearn_pipeline = generate_pipeline_code(expr_to_tree(expr))
  File "/usr/local/lib/python2.7/dist-packages/tpot/export_utils.py", line 80, in expr_to_tree
    for node in ind:
TypeError: 'NoneType' object is not iterable
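The last frame suggests that the expression handed to expr_to_tree was None, so the for loop had nothing to iterate over. The snippet below reproduces that error in isolation and sketches a guard that would give a clearer message. It is an assumption-laden illustration, not TPOT's code: `expr_to_tree_checked` is a hypothetical stand-in that only collects nodes into a list.

```python
def expr_to_tree_checked(expr):
    """Hypothetical guard around tree construction; not TPOT's real code."""
    if expr is None:
        raise ValueError(
            "No pipeline was produced during optimization; "
            "check that the training data suits the task."
        )
    # Stand-in for the real conversion: just collect the nodes.
    return [node for node in expr]


# Iterating over None directly raises the same TypeError as in the traceback.
try:
    for node in None:
        pass
except TypeError as err:
    message = str(err)
    print(message)
```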
Possible fix
I don't know.
Screenshot