Switch from TensorFlow to PyTorch #115
Conversation
.travis.yml
Outdated
@@ -44,20 +50,30 @@ before_install:
- python --version
- pip --version

# Install PyTorch for Linux with Python 3.6 and no CUDA
# Install PyTorch for Linux and no CUDA
Since this installs PyTorch for both Python versions.
👍
fonduer/learning/disc_learning.py
Outdated
@@ -1,132 +1,84 @@
from __future__ import absolute_import, division, print_function, unicode_literals
I believe since we're only supporting Python 3.6+ we shouldn't need these, right?
👍
fonduer/learning/disc_learning.py
Outdated
import torch.optim as optim

from .classifier import Classifier
from .utils import LabelBalancer, reshape_marginals
Use explicit imports, not relative ones, e.g. from fonduer.learning.classifier import Classifier
👍
fonduer/learning/classifier.py
Outdated
@@ -90,6 +92,7 @@ def score(
f_beta = 0.0
return p, r, f_beta

# TODO: need update, this only works for debugging labeling functions now
def error_analysis(
Can you add a docstring?
👍
fonduer/learning/disc_learning.py
Outdated
from fonduer.learning.classifier import Classifier
from fonduer.learning.utils import LabelBalancer, reshape_marginals


def SoftCrossEntropyLoss(input, target):
Can you add a docstring?
Done.
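For reference, here is a minimal sketch of what a documented soft cross-entropy loss could look like in PyTorch; this is an illustrative version, not necessarily the exact implementation merged in this PR:

import torch.nn.functional as F


def SoftCrossEntropyLoss(input, target):
    """Compute the cross entropy between logits and soft (probabilistic) targets.

    :param input: Unnormalized scores (logits) of shape (batch_size, num_classes).
    :param target: Target probabilities of shape (batch_size, num_classes),
        where each row sums to one (e.g. marginals from labeling functions).
    :return: Mean cross entropy over the batch.
    """
    # log_softmax normalizes the logits; multiplying by the soft targets and
    # summing over classes gives the per-example cross entropy.
    log_probs = F.log_softmax(input, dim=1)
    return -(target * log_probs).sum(dim=1).mean()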
fonduer/learning/disc_learning.py
Outdated
and not torch.cuda.is_available()
):
self.model_kwargs["host_device"] = "CPU"
self.logger.info("GPU is not available, switching to CPU...")
We notify when using CPU; should we also log when GPU is being used? Maybe just at DEBUG level.
Add one for GPU as well.
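A standalone sketch of that device-selection logic with the extra DEBUG-level log suggested above; the helper name is hypothetical and the handling is simplified compared to the class in the diff:

import logging

import torch

logger = logging.getLogger(__name__)


def _resolve_host_device(requested):
    """Return the device to use, logging the fallback or the selection."""
    if requested.upper() == "GPU" and not torch.cuda.is_available():
        logger.info("GPU is not available, switching to CPU...")
        return "CPU"
    # Per the review suggestion, note the chosen device at DEBUG level.
    logger.debug("Using device: %s", requested)
    return requested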
@@ -1,194 +1,118 @@
from __future__ import absolute_import, division, unicode_literals
No need for __future__ imports.
Done.
current_dir = new_dir
tries += 1

return config
This should just be the same config in utils, right? Shouldn't be duplicated?
Add tests to tests/utils/test_config.py to make sure it's behaving as expected.
We will also want to update the docs to show these parameters like we do for features.
done!
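A sketch of the kind of test requested for tests/utils/test_config.py; the import path and the parameter names checked below are assumptions and should be adjusted to match the shared config helper once it lives in utils:

from fonduer.utils.config import get_config  # hypothetical import path


def test_learning_defaults_present():
    """The merged config should expose learning parameters with sane defaults."""
    config = get_config()
    assert "learning" in config
    # Spot-check a few commonly used training knobs (names are assumed).
    for key in ("lr", "batch_size", "n_epochs"):
        assert key in config["learning"]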
from scipy.sparse import issparse

from fonduer.learning.disc_learning import NoiseAwareModel
from fonduer.learning.disc_models.rnn.config import get_config
Probably should use utils config. See previous comment.
done!
return {v: k for k, v in self.d.iteritems()}


def mention_to_tokens(mention, token_type="words", lowercase=False):
Docstring?
done!
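An illustrative sketch of the requested docstring (and a possible body) for mention_to_tokens; the mention.sentence attribute used here is an assumption about the data model, not necessarily how the PR implements it:

def mention_to_tokens(mention, token_type="words", lowercase=False):
    """Extract tokens from the sentence that a mention appears in.

    :param mention: The mention whose containing sentence is tokenized.
    :param token_type: Which token attribute of the sentence to return,
        e.g. "words", "lemmas", or "pos_tags".
    :param lowercase: If True, lowercase the returned tokens.
    :return: A list of tokens.
    """
    tokens = getattr(mention.sentence, token_type)  # assumed attribute
    return [t.lower() if lowercase else t for t in tokens]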
I'm also seeing this warning when running
This avoids the metaclass conflict of NoiseAwareModel. See #115
docs/user/learning.rst
Outdated
The different learning parameters are explained in this section.

[TODO] give descriptions for the following::
Can you help give some descriptions here?
LGTM. Any improvements we can make to the docs are always helpful, but we can look at that more earnestly in a future PR.
In this PR, we switch the learning part to PyTorch and support Logistic Regression and LSTM.
One thing to be addressed here is that the input of the training/prediction functions needs to be (candidates, features) instead of just features.
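A hedged usage sketch of that API change, where the training and prediction calls take (candidates, features) instead of features alone; the class name, keyword arguments, and placeholder variables are illustrative rather than the exact signatures in the diff:

from fonduer.learning import LogisticRegression  # assumed export

# Placeholders for objects produced earlier in the pipeline.
train_cands, F_train, train_marginals = ..., ..., ...
dev_cands, F_dev = ..., ...

disc_model = LogisticRegression()
# Training and prediction now receive candidates alongside the feature matrix.
disc_model.train((train_cands, F_train), train_marginals, n_epochs=50, lr=0.001)
marginals = disc_model.marginals((dev_cands, F_dev))  # assumed method name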