How to use TPOT in the text domain? #544

Closed
ben0it8 opened this issue Aug 8, 2017 · 1 comment

ben0it8 commented Aug 8, 2017

Hello,

Is it possible to use TPOT for text classification?
Given a labeled corpus (e.g. label-document pairs), I'd like to train a classifier that infers the label of an unseen document.

Thanks,
Oliver

@weixuanfu
Contributor

I think this issue is related to #507. We are working on a configurable grammar in #523 that will add support for text classification. For now, you can transform the text into a numeric matrix using CountVectorizer, TfidfVectorizer, or HashingVectorizer before passing it to TPOTClassifier.
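
A minimal sketch of that workaround, assuming scikit-learn's `TfidfVectorizer` and the `generations` / `population_size` / `random_state` arguments of `TPOTClassifier`. The helper names `vectorize` and `fit_tpot_on_text`, the `max_features` cap, and the conversion to a dense array are illustrative choices and not something prescribed in this thread; dense conversion is only practical for small vocabularies.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split


def vectorize(train_docs, test_docs, max_features=5000):
    """Fit TF-IDF on the training documents only, then transform both splits."""
    vectorizer = TfidfVectorizer(max_features=max_features)
    X_train = vectorizer.fit_transform(train_docs)  # learns the vocabulary
    X_test = vectorizer.transform(test_docs)        # reuses the same vocabulary
    return X_train, X_test, vectorizer


def fit_tpot_on_text(docs, labels, generations=5, population_size=20):
    """Hypothetical helper: vectorize raw documents, then run a TPOT search."""
    # Imported lazily so the vectorization step works without TPOT installed.
    from tpot import TPOTClassifier

    train_docs, test_docs, y_train, y_test = train_test_split(
        docs, labels, test_size=0.25, random_state=42
    )
    X_train, X_test, vectorizer = vectorize(train_docs, test_docs)

    tpot = TPOTClassifier(
        generations=generations,
        population_size=population_size,
        random_state=42,
    )
    # Assumption: dense input for simplicity; this can be memory-heavy.
    tpot.fit(X_train.toarray(), y_train)
    return tpot, vectorizer, tpot.score(X_test.toarray(), y_test)
```

To label an unseen document afterwards, transform it with the same fitted vectorizer before calling `predict`, e.g. `tpot.predict(vectorizer.transform([new_doc]).toarray())`.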

@ben0it8 ben0it8 closed this as completed Aug 15, 2017