
Data Preprocessing #1

Open · Cartus opened this issue Sep 8, 2019 · 5 comments


Cartus commented Sep 8, 2019

Hi, thanks for the great work!

I'm trying to run the code, but I don't know how to preprocess the AMR corpus. Could you explain the data preprocessing steps?

@Amazing-J (Owner)

Our baseline input can be the same linearized AMR graph as in Konstas et al.
Only the concept nodes are retained as input to the Transformer model.
-train_src          # concept node sequence
-train_structure1   # first token of the path from Xi to Xj
-train_structure2   # second token of the path from Xi to Xj
........ (and so on for the remaining path positions)
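
A minimal sketch of the idea behind these structure files (not the authors' preprocessing code): for every ordered pair of concepts (Xi, Xj), take the shortest path between them in the AMR graph and write the k-th label of that path into the k-th structure file. The toy graph, the "self"/"None" tokens, the "-of" suffix for edges walked in reverse, the truncation length, and the flattened line layout are all illustrative assumptions, not the repo's exact format.

# Sketch only -- assumptions as noted above.
import networkx as nx

# Toy AMR for "The boy wants to go":
# (w / want-01 :ARG0 (b / boy) :ARG1 (g / go-01 :ARG0 b))
g = nx.DiGraph()
g.add_edge("want-01", "boy", label=":ARG0")
g.add_edge("want-01", "go-01", label=":ARG1")
g.add_edge("go-01", "boy", label=":ARG0")

concepts = list(g.nodes())          # the concept node sequence for -train_src
undirected = g.to_undirected()

def path_labels(src, dst):
    """Edge labels along the shortest path; '-of' marks an edge walked backwards."""
    if src == dst:
        return ["self"]
    nodes = nx.shortest_path(undirected, src, dst)
    labels = []
    for u, v in zip(nodes, nodes[1:]):
        if g.has_edge(u, v):
            labels.append(g[u][v]["label"])
        else:
            labels.append(g[v][u]["label"] + "-of")
    return labels

# One path per ordered concept pair, row-major over the pair matrix.
paths = [path_labels(xi, xj) for xi in concepts for xj in concepts]

max_len = 4                         # assumed number of structure files / path truncation
structure_lines = [
    " ".join(p[k] if k < len(p) else "None" for p in paths)
    for k in range(max_len)
]

print(" ".join(concepts))           # -> one line of -train_src
print(structure_lines[0])           # -> one line of -train_structure1
print(structure_lines[1])           # -> one line of -train_structure2

Presumably the real files are line-aligned, i.e. each AMR contributes one line to -train_src and one line to each -train_structureK.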


Cartus commented Sep 9, 2019

Hi @Amazing-J,

Thank you for your prompt reply!

For the concept node sequence, I can use NeuralAmr (https://github.com/sinantie/NeuralAmr) to get the linearized sequence.

I also have two questions. The first is how to construct the structural sequences. The second is: since the model requires sub-word units produced by BPE, how should the concept node sequence be generated under this setting?


dungtn commented Sep 23, 2019

Hi @Amazing-J,

Thank you for releasing the code! As @Cartus pointed out, could you provide the code for running BPE over the source, i.e., the linearized AMRs?

Best!


dungtn commented Sep 24, 2019

Assuming that I've done the right thing for BPE by running

# learn a 10k-merge BPE model on the training source, then apply it to the dev source
subword-nmt learn-bpe -s 10000 < ...LDC2015E86/training_source > codes.bpe
subword-nmt apply-bpe -c codes.bpe < ...LDC2015E86/dev_source > dev_source_bpe

then I still got this error:

FileNotFoundError: [Errno 2] No such file or directory: ...LDC2015E86/data_vocab.pt

How can I generate this file?


dungtn commented Sep 24, 2019

Alright, I found out that I also have to run preprocess.sh. Thanks!
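
For anyone hitting the same error: once preprocess.sh has produced data_vocab.pt, a quick sanity check is to load it with torch. This assumes the repo follows OpenNMT-py conventions, where the vocabulary file is a torch-serialised container of fields; the exact layout may differ between versions.

# Sanity check only; assumes an OpenNMT-py-style torch-serialised vocab file.
import torch

obj = torch.load("data_vocab.pt")
print(type(obj))
# Depending on the OpenNMT-py version this is a dict of fields
# or a list of (name, field) pairs.
pairs = obj.items() if isinstance(obj, dict) else obj
for name, field in pairs:
    print(name, type(field).__name__)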
