Switch from Tensorflow to Pytorch #115

senwu · 2018-08-22T09:41:40Z

In this PR, we switch the learning part to PyTorch and support Logistic Regression and LSTM.

One thing to be addressed here: the input to the training and prediction functions needs to be (candidates, features) instead of features alone.

lukehsiao · 2018-08-22T17:38:32Z

.travis.yml

@@ -44,20 +50,30 @@ before_install:
 - python --version
 - pip --version

+# Install PyTorch for Linux with Python 3.6 and no CUDA


# Install PyTorch for Linux and no CUDA

Since this is doing both versions

lukehsiao · 2018-08-22T17:42:48Z

fonduer/learning/disc_learning.py

@@ -1,132 +1,84 @@
+from __future__ import absolute_import, division, print_function, unicode_literals


I believe since we're only supporting Python 3.6+ we shouldn't need these, right?
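
To illustrate the point, here is a minimal sketch (not part of the PR) of the Python 3 defaults that make these `__future__` imports redundant. The variable names are made up for illustration.

```python
# Python 3 defaults that make the __future__ imports redundant.

# division: "/" is always true division, "//" is floor division.
half = 1 / 2
floored = 1 // 2

# print_function: print is an ordinary function and can be passed around.
emit = print

# unicode_literals: plain string literals are already text (str), not bytes.
text = "caf\u00e9"
is_text = isinstance(text, str)

# absolute_import: "import foo" never resolves implicitly against the
# current package in Python 3, so no __future__ switch is needed for that.
```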

lukehsiao · 2018-08-22T17:43:49Z

fonduer/learning/disc_learning.py

+import torch.optim as optim
+
+from .classifier import Classifier
+from .utils import LabelBalancer, reshape_marginals


Use absolute imports, not relative ones, e.g. from fonduer.learning.classifier import Classifier

lukehsiao · 2018-08-22T17:44:53Z

fonduer/learning/classifier.py

@@ -90,6 +92,7 @@ def score(
                f_beta = 0.0
            return p, r, f_beta

+    # TODO: need update, this only works for debugging labeling functions now
    def error_analysis(


Can you add a docstring?

lukehsiao · 2018-08-22T17:45:14Z

fonduer/learning/disc_learning.py


-from fonduer.learning.classifier import Classifier
-from fonduer.learning.utils import LabelBalancer, reshape_marginals
+def SoftCrossEntropyLoss(input, target):


Can you add a docstring?
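
As an illustration of what a documented version might look like, here is a framework-free sketch. It assumes, from the name only, that the function computes cross-entropy against soft (probabilistic) targets averaged over the batch. The name `soft_cross_entropy` and the pure-Python implementation are hypothetical and are not the PR's PyTorch code.

```python
import math
from typing import Sequence


def soft_cross_entropy(
    logits: Sequence[Sequence[float]],
    targets: Sequence[Sequence[float]],
) -> float:
    """Mean cross-entropy between softmax(logits) and soft targets.

    :param logits: Unnormalized scores, one row per example and one
        column per class.
    :param targets: Target distributions with the same shape as
        ``logits``; each row is assumed to sum to 1.
    :return: The loss averaged over all examples.
    """
    total = 0.0
    for row_logits, row_targets in zip(logits, targets):
        # Log-sum-exp with the max subtracted for numerical stability.
        m = max(row_logits)
        log_z = m + math.log(sum(math.exp(x - m) for x in row_logits))
        # log softmax of class j is (logit_j - log_z).
        total -= sum(t * (x - log_z) for x, t in zip(row_logits, row_targets))
    return total / len(logits)
```

With hard (one-hot) targets this reduces to the usual cross-entropy, which is one way to sanity-check an implementation.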

lukehsiao · 2018-08-22T20:58:14Z

fonduer/learning/disc_learning.py

+            and not torch.cuda.is_available()
+        ):
+            self.model_kwargs["host_device"] = "CPU"
+            self.logger.info("GPU is not available, switching to CPU...")


We log when falling back to CPU. Should we also log when the GPU is being used? Maybe just at DEBUG level.

Add one for GPU as well.
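
A minimal sketch of the suggestion, assuming a helper that decides the device and logs either outcome. The function name and its parameters are hypothetical; in the PR the availability check is `torch.cuda.is_available()`, which is passed in as a boolean here so the sketch runs without PyTorch.

```python
import logging

logger = logging.getLogger(__name__)


def resolve_host_device(requested: str, cuda_available: bool) -> str:
    """Return the device to use, logging the outcome either way.

    Both parameters are hypothetical; real code would typically pass
    torch.cuda.is_available() as ``cuda_available``.
    """
    if requested == "GPU" and not cuda_available:
        # Same message and level as in the hunk under review.
        logger.info("GPU is not available, switching to CPU...")
        return "CPU"
    if requested == "GPU":
        # Quieter level for the non-surprising case, as the review suggests.
        logger.debug("Using GPU...")
        return "GPU"
    logger.debug("Using CPU...")
    return "CPU"
```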

lukehsiao · 2018-08-22T20:59:48Z

fonduer/learning/disc_models/logistic_regression.py

@@ -1,194 +1,118 @@
+from __future__ import absolute_import, division, unicode_literals


No need for __future__

lukehsiao · 2018-08-22T21:02:17Z

fonduer/learning/disc_models/rnn/config.py

+        current_dir = new_dir
+        tries += 1
+
+    return config


This should just be the same config as in utils, right? It shouldn't be duplicated.

Add tests to tests/utils/test_config.py to make sure it's behaving as expected.

We will also want to update the docs to show these parameters like we do for features.
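
For context, the tail of the hunk (`current_dir = new_dir`, `tries += 1`, `return config`) suggests a bounded walk up the directory tree looking for a config file. The following is only a sketch of that pattern under that assumption; the function name, the file name `.example-config.yaml`, and the `max_tries` default are hypothetical and not taken from the PR.

```python
import os
from typing import Optional


def find_config_file(
    start_dir: str,
    filename: str = ".example-config.yaml",  # hypothetical file name
    max_tries: int = 5,  # hypothetical bound on how far up to look
) -> Optional[str]:
    """Search start_dir and its parents for filename; return its path or None."""
    current_dir = os.path.abspath(start_dir)
    tries = 0
    while tries < max_tries:
        candidate = os.path.join(current_dir, filename)
        if os.path.isfile(candidate):
            return candidate
        new_dir = os.path.dirname(current_dir)
        if new_dir == current_dir:
            # Reached the filesystem root; nothing further to check.
            break
        current_dir = new_dir
        tries += 1
    return None
```

Keeping a single helper like this in one module, and importing it wherever settings are read, is what avoids the duplication raised above and gives the tests one place to target.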

lukehsiao · 2018-08-22T21:04:05Z

fonduer/learning/disc_models/rnn/lstm.py

+from scipy.sparse import issparse
+
+from fonduer.learning.disc_learning import NoiseAwareModel
+from fonduer.learning.disc_models.rnn.config import get_config


Probably should use utils config. See previous comment.

lukehsiao · 2018-08-22T21:04:47Z

fonduer/learning/disc_models/rnn/utils.py

+        return {v: k for k, v in self.d.iteritems()}
+
+
+def mention_to_tokens(mention, token_type="words", lowercase=False):
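
Note that `dict.iteritems()` exists only in Python 2. If `self.d` is a plain dict (an assumption; the hunk does not show its definition), this line raises AttributeError on Python 3.6+. A minimal sketch of the Python 3 form, using a hypothetical stand-in class:

```python
class SymbolTable:
    """Hypothetical stand-in; only the attribute name `d` comes from the diff."""

    def __init__(self):
        # Assumed to be a plain dict mapping symbol -> integer index.
        self.d = {}

    def add(self, symbol):
        # Assign the next free index to unseen symbols.
        if symbol not in self.d:
            self.d[symbol] = len(self.d)
        return self.d[symbol]

    def reverse(self):
        # dict.items() is the Python 3 replacement for dict.iteritems().
        return {v: k for k, v in self.d.items()}
```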


lukehsiao · 2018-08-22T21:19:44Z

I'm also seeing this warning when running make docs:

WARNING: autodoc: failed to import module 'fonduer.learning'; the following exception was raised:
Traceback (most recent call last):
  File "/home/lwhsiao/repos/fonduer/.venv/lib/python3.6/site-packages/sphinx/ext/autodoc/importer.py", line 152, in import_module
    __import__(modname)
  File "/home/lwhsiao/repos/fonduer/fonduer/learning/__init__.py", line 1, in <module>
    from fonduer.learning.disc_models.logistic_regression import LogisticRegression
  File "/home/lwhsiao/repos/fonduer/fonduer/learning/disc_models/logistic_regression.py", line 8, in <module>
    from fonduer.learning.disc_learning import NoiseAwareModel
  File "/home/lwhsiao/repos/fonduer/fonduer/learning/disc_learning.py", line 26, in <module>
    class NoiseAwareModel(Classifier, nn.Module):
TypeError: metaclass conflict: the metaclass of a derived class must be a (non-strict) subclass of the metaclasses of all its bases
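
For readers unfamiliar with this error, here is a generic sketch of how it arises and one common way to resolve it. The classes are hypothetical and this is not necessarily the cause in the doc build above: Python raises the error whenever a class inherits from bases whose metaclasses are unrelated.

```python
class MetaA(type):
    """First, unrelated metaclass."""


class MetaB(type):
    """Second, unrelated metaclass."""


class A(metaclass=MetaA):
    pass


class B(metaclass=MetaB):
    pass


def try_combine():
    """Attempt to subclass A and B directly; return the error text, if any."""
    try:
        class C(A, B):  # neither MetaA nor MetaB is a subclass of the other
            pass
    except TypeError as exc:
        return str(exc)
    return None


class MetaAB(MetaA, MetaB):
    """A metaclass derived from both satisfies the constraint."""


class D(A, B, metaclass=MetaAB):
    pass
```

Calling `try_combine()` returns the same "metaclass conflict" message as in the traceback, while `D` is created without error because `MetaAB` is a subclass of the metaclasses of all its bases.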

lukehsiao · 2018-08-23T00:44:12Z

docs/user/learning.rst

+
+The different learning parameters are explained in this section.
+
+[TODO] give descriptions for the following::


Can you help give some descriptions here?

lukehsiao

LGTM. Any improvements we can make to the docs are always helpful, but we can look at that more earnestly in a future PR.

senwu added 5 commits August 14, 2018 17:36

update the setup

c12f538

update travis for PyTorch install on linux

61d8844

support python3.7

246e00d

update lxml requirement

a24b273

update namedtuple

f47e759

senwu requested a review from lukehsiao August 22, 2018 09:41

lukehsiao added the enhancement New feature or request label Aug 22, 2018

lukehsiao added this to the v0.3.0 milestone Aug 22, 2018

add LSTM

76c9530

senwu force-pushed the pytorch branch from 902a8df to 76c9530 Compare August 22, 2018 18:21

senwu added 2 commits August 22, 2018 11:59

update timeout on travis

59241c8

merge master

893d22b

lukehsiao reviewed Aug 22, 2018

update travis comment

c6a1770

Add doc structure for config

31ead39

lukehsiao changed the title ~~Pytorch~~ Switch from Tensorflow to Pytorch Aug 22, 2018

senwu and others added 4 commits August 22, 2018 14:31

address comments

9ca4225

address comments and add learning config test

4eaae90

Merge branch 'pytorch' of https://github.com/HazyResearch/fonduer int…

0e92239

…o pytorch

Include doc build in test

cd7ae99

lukehsiao added a commit that referenced this pull request Aug 23, 2018

Mock the Classifier during doc build

91bf10a

This avoids the metaclass conflict of NoiseAwareModel. See #115

Mock the Classifier during doc build

b1530d1

This avoids the metaclass conflict of NoiseAwareModel. See #115

lukehsiao force-pushed the pytorch branch from 91bf10a to b1530d1 Compare August 23, 2018 00:41

lukehsiao reviewed Aug 23, 2018

add some descriptions about LSTM params

738ff1b

lukehsiao approved these changes Aug 23, 2018

lukehsiao assigned senwu Aug 23, 2018

add some descriptions about LSTM params

7d6a6d5

senwu force-pushed the pytorch branch from 738ff1b to 7d6a6d5 Compare August 23, 2018 01:07

senwu and others added 3 commits August 22, 2018 18:09

update docs

0dca75c

Add Python 3.7

512fc88

Update CHANGELOG

98e7143

lukehsiao force-pushed the pytorch branch from 5357c36 to 98e7143 Compare August 23, 2018 01:14

Add sphinx theme

2a8f7b4

senwu merged commit 6817fdd into master Aug 23, 2018

senwu deleted the pytorch branch August 23, 2018 04:07

senwu mentioned this pull request Aug 23, 2018

Change the base learning class to pytorch #6

Closed
