[WIP v1 - deprecated] Add learning of target features #1710
Conversation
Great! Do I read correctly that you implemented the target shifting as in OpenNMT-lua? I no longer think this was a good idea and I don't think it was proven to be any better than just keeping the features aligned. What do you think?
True.

This implementation seemed OK and fit for the task, at least at first glance. I'm only beginning to run some 'realistic' tests, so I don't have enough feedback for now. Do you have specific results in mind? I agree that for greedy search the shift should not be necessary. But what would you propose for the beam search case, where the feature would highly depend on the predicted token?
The concern is that, yes, target features do depend on the predicted token, but in many cases they also depend on the attended source context. For the latter, it seems better to keep the features aligned so that attention values are valid for the predictions of both the word and its features. How important the context is versus the predicted token may depend on the type of features. I don't have specific results in mind; this is just something that came to mind recently and I'm raising it here. Of course, if you have the time and resources, it could be interesting to compare shift vs. no shift.
That's a valid concern, though it's not an easy one to wrap one's head around. The best would indeed be to experiment. I also agree it might depend on the type of features, so I'll add a flag to make the shift optional and facilitate such tests.
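To make the two options being debated concrete, here is a minimal sketch of what "shift" versus "no shift" means for a target feature sequence. The helper name `shift_features` and the `<blank>` padding token are hypothetical illustrations, not OpenNMT-py API:

```python
# Hypothetical sketch of target feature shifting (as in OpenNMT-lua):
# with the shift, the feature at step t is emitted one step after its
# word, so the model has already predicted the word when it predicts
# the feature. Without the shift, word and feature stay aligned, so
# the same attention context serves both predictions.

PAD = "<blank>"  # assumed padding symbol for the first shifted position

def shift_features(tokens, features, shift=True):
    """Return (tokens, features), optionally delaying features one step."""
    if not shift:
        return tokens, features  # aligned: features[t] describes tokens[t]
    # shifted: features[t] describes tokens[t-1]; pad the first slot
    return tokens, [PAD] + features[:-1]

tokens = ["the", "Cat", "sat"]
feats = ["low", "cap", "low"]  # e.g. a capitalization feature

print(shift_features(tokens, feats, shift=True))
print(shift_features(tokens, feats, shift=False))
```

With `shift=True` the case feature for "Cat" is emitted at the step after "Cat" is predicted, which helps beam search condition features on the chosen word, at the cost of misaligning the features with the attention context.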
Could someone with write access check the solution?

Hey @eduamf, thanks for your interest in this topic.
This feature has been asked about quite a lot, and might be useful in several tasks, so here it is!

Done:
- `Generator` class to handle multiple generators
- `OnmtBatch` class inheriting from `torchtext.data.Batch` to allow feature shifting

TODO
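For readers unfamiliar with the multi-generator idea: predicting target features alongside words means the decoder's hidden state must be projected into several distributions, one per output stream. The sketch below illustrates that pattern; the class name `MultiGenerator` and its signature are assumptions for illustration, not the code from this PR:

```python
import torch
import torch.nn as nn

class MultiGenerator(nn.Module):
    """Hypothetical sketch: one linear projection + log-softmax per
    output stream (the word vocabulary plus one vocabulary per
    target feature)."""

    def __init__(self, hidden_size, vocab_sizes):
        super().__init__()
        self.projections = nn.ModuleList(
            nn.Linear(hidden_size, v) for v in vocab_sizes
        )

    def forward(self, hidden):
        # Returns a list of log-probability distributions, one per output.
        return [torch.log_softmax(p(hidden), dim=-1) for p in self.projections]

# Word vocabulary of 100 plus a 4-way case feature, batch of 2.
gen = MultiGenerator(16, [100, 4])
outs = gen(torch.randn(2, 16))
print([tuple(o.shape) for o in outs])  # [(2, 100), (2, 4)]
```

Each distribution then gets its own loss term during training, and decoding picks a word and its features jointly at every step.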