I tried to implement the "iterative parameter mixing" strategy for distributed training of the structured perceptron:
The idea is the following: each shard trains its own perceptron for one epoch, then the learned weights are averaged (mixed) and sent back to the shards as the starting point for the next epoch. So communication should involve only transferring learned weights, and each shard can have its own training data.
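To make the mixing step concrete, it boils down to averaging the per-shard weight matrices; a minimal sketch (the function name and the optional per-shard weighting are illustrative, not part of this PR):

```python
import numpy as np

def mix_weights(shard_coefs, shard_weights=None):
    # Uniform averaging of the weight matrices learned on each shard
    # (the "mixing" step); non-uniform weights, e.g. proportional to
    # shard size, would also fit here.
    return np.average(shard_coefs, axis=0, weights=shard_weights)
```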
ParallelStructuredPerceptron is an attempt to reimplement StructuredPerceptron in terms of OneEpochPerceptrons. It has an n_jobs parameter, and ideally it should use multiprocessing or multithreading for faster training (numpy/scipy release the GIL, and the bottleneck is the dot product, isn't it?). But I didn't manage to make multiprocessing work without copying each shard's X/y/lengths on every iteration, so n_jobs=N currently just creates N OneEpochPerceptrons and trains them sequentially.
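Roughly, the sequential fallback amounts to the loop below; OneEpochPerceptron's interface (fit_one_epoch, coef_) and the helper names are assumptions for illustration, not the actual API in this PR:

```python
import numpy as np

def fit_ipm_sequential(shards, n_epochs, make_perceptron):
    # shards: list of (X, y, lengths) tuples, one per "job".
    # Train one epoch per shard, then mix (average) the learned weights
    # and restart every shard from the mixed weights.
    perceptrons = [make_perceptron() for _ in shards]
    coef = None
    for epoch in range(n_epochs):
        for p, (X, y, lengths) in zip(perceptrons, shards):
            if coef is not None:
                p.coef_ = coef.copy()        # start from the mixed weights
            p.fit_one_epoch(X, y, lengths)   # assumed one-epoch training method
        coef = np.mean([p.coef_ for p in perceptrons], axis=0)
    return coef
```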
Ideally, I want OneEpochPerceptron to be easy to use with IPython.parallel in a distributed environment, and ParallelStructuredPerceptron to be easy to use on a single machine.
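For the distributed case I imagine something along these lines (only a sketch: it assumes each engine already holds its shard in X, y, lengths plus a OneEpochPerceptron instance named percep, and that the class exposes coef_ and a one-epoch fit method):

```python
from IPython.parallel import Client
import numpy as np

def train_one_epoch(coef):
    # Runs on an engine; X, y, lengths and percep are assumed to be in the
    # engine's namespace already, so only the weights cross the wire.
    if coef is not None:
        percep.coef_ = coef
    percep.fit_one_epoch(X, y, lengths)
    return percep.coef_

rc = Client()
view = rc[:]                    # one engine per shard
coef = None
for epoch in range(10):
    shard_coefs = view.apply_sync(train_one_epoch, coef)  # push mixed weights, train, collect
    coef = np.mean(shard_coefs, axis=0)                   # mix
```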
Issues with the current implementation:
The sequence_ids shuffling method is changed so that ParallelStructuredPerceptron and StructuredPerceptron learn exactly the same weights given the same random_state.
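The gist of the change, simplified (names here are illustrative):

```python
import numpy as np

rng = np.random.RandomState(random_state)
# Both implementations shuffle whole-sequence indices with the same RNG,
# so they visit sequences (and apply updates) in the same order.
order = rng.permutation(n_sequences)
```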
With n_jobs=1, ParallelStructuredPerceptron is about 10% slower than StructuredPerceptron on my data; I think we could merge these classes when (and if) ParallelStructuredPerceptron is ready.