non-ascii characters throw errors in FB13 #2

talkhaldi · 2019-11-08T08:35:31Z

Hi,

Thank you for your work and for publishing the code. I'm trying to run the triple_classification example for FB13 as described in the README, but I get the following error:

Traceback (most recent call last):
  File "run_bert_triple_classifier.py", line 847, in <module>
    main()
  File "run_bert_triple_classifier.py", line 556, in main
    train_examples = processor.get_train_examples(args.data_dir)
  File "run_bert_triple_classifier.py", line 120, in get_train_examples
    self._read_tsv(os.path.join(data_dir, "train.tsv")), "train", data_dir)
  File "run_bert_triple_classifier.py", line 173, in _create_examples
    ent_lines = f.readlines()
  File "/mnt/orange/ubrew/data/opt/python/lib/python3.6/encodings/ascii.py", line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 27: ordinal not in range(128)

Is this behavior expected?
I can add an encoding="utf-8" argument to the open() call, but then it becomes extremely slow. How did you deal with this issue?
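
For context, a likely explanation (an assumption, since the thread does not state the reporter's locale settings): in Python 3, open() in text mode without an encoding argument uses locale.getpreferredencoding(False), and on Python 3.6 under a C/POSIX locale that resolves to ASCII. The sketch below only reports which encoding open() would pick on the current machine; the helper name default_text_encoding is made up for illustration.

```python
import codecs
import locale
import sys


def default_text_encoding():
    """Return the normalized codec name that open() uses in text mode
    when no encoding argument is given (hypothetical helper)."""
    raw_name = locale.getpreferredencoding(False)
    # codecs.lookup normalizes aliases, e.g. "ANSI_X3.4-1968" becomes "ascii".
    return codecs.lookup(raw_name).name


print("Python", ".".join(map(str, sys.version_info[:2])))
print("Default text encoding for open():", default_text_encoding())
```

If this prints ascii, any file containing non-ASCII bytes will fail to decode unless an encoding is passed explicitly.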

yao8839836 · 2019-11-12T06:30:53Z

@talkhaldi

Hi, I didn't see this problem in my Python 3.5 or 3.6 environment.

You may want to try this:

import sys
reload(sys)
sys.setdefaultencoding('utf-8')

Zhizhizhi997 · 2020-12-22T18:44:13Z

@yao8839836 I ran into this problem as well. I think the method you suggested only works in Python 2, not Python 3. I changed

with open(os.path.join(data_dir, "entity2text.txt"), 'r')
to
with open(os.path.join(data_dir, "entity2text.txt"), 'r', encoding="utf")

This seems to work.
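
As a self-contained illustration of the change above, here is a minimal sketch that reads a file with an explicit UTF-8 encoding so the result no longer depends on the system locale. The helper read_entity_lines and the sample line are made up for this example and are not taken from the repository; only the file name entity2text.txt comes from the thread. The byte 0xc3 from the traceback is the lead byte of two-byte UTF-8 sequences such as "é" (0xc3 0xa9), which is what the sample line contains.

```python
import os
import tempfile


def read_entity_lines(data_dir, filename="entity2text.txt"):
    """Read all lines of a text file, decoding as UTF-8 regardless of
    the system locale (hypothetical helper, not the repository's code)."""
    path = os.path.join(data_dir, filename)
    with open(path, "r", encoding="utf-8") as f:
        return f.readlines()


# Demo with a made-up sample line written as raw UTF-8 bytes.
with tempfile.TemporaryDirectory() as demo_dir:
    with open(os.path.join(demo_dir, "entity2text.txt"), "wb") as f:
        f.write("/m/demo\tCaf\u00e9 example\n".encode("utf-8"))
    demo_lines = read_entity_lines(demo_dir)

print(demo_lines)
```

Note that "utf" is accepted by Python as an alias for the UTF-8 codec, so the encoding="utf" spelling used above behaves the same as encoding="utf-8".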
