For some inputs, the HFST Analyzer module throws IOExceptions. More often it only prints IO Exception; I guess it depends on where the error occurs, i.e. on how many words have already been written to the stdin of the dead process.
Example output:
IO Exception: null
IO Exception: null
IO Exception: null
IO Exception: null
IO Exception: null
IO Exception: null
IO Exception: null
java.io.IOException: Broken pipe
at java.io.FileOutputStream.writeBytes(Native Method)
at java.io.FileOutputStream.write(FileOutputStream.java:326)
at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
at sun.nio.cs.StreamEncoder.implFlush(StreamEncoder.java:297)
at sun.nio.cs.StreamEncoder.flush(StreamEncoder.java:141)
at java.io.OutputStreamWriter.flush(OutputStreamWriter.java:229)
at hu.nytud.hfst.Analyzer$WorkerProcess.write(Analyzer.java:218)
at hu.nytud.hfst.Analyzer$WorkerProcess.run(Analyzer.java:206)
at java.lang.Thread.run(Thread.java:745)
The culprit is the very long token Pécs-Nagykanizsa-Graz-Aussee-Ischl-Salzburg-Zürich-Luzern-Rigire-Zürich-München-Linz-Bécs-Győr-Mohács-Pécs, but presumably other inputs could trigger the error as well. What is strange is that if I run hfst-lookup with the same parameters that GATE runs it with, the input is processed without a hitch.
The full example: I first ran the input through quntoken (quntoken qterror.txt) and extracted the non-whitespace tokens from its output. The resulting file is qterror.tokens.txt. I then ran hfst-lookup on it, as described above, and got no errors. Next I tried it with GATE and got the aforementioned problems. I also printed all tokens sent to HFST-Wrapper, and they are exactly the same as the contents of qterror.tokens.txt. So the error must be somewhere in the wrapper.
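To illustrate the failure mode suspected here (writing to the stdin of a worker process that has already died), below is a minimal sketch in Java. It is not the wrapper's actual code: the class DeadWorkerCheck and its methods are hypothetical, and the Unix commands cat and true merely stand in for a live and a dead hfst-lookup process. It flushes after every token and checks whether the child is still alive, so the first token that could not be delivered can be identified.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the HFST wrapper.
public class DeadWorkerCheck {

    /**
     * Writes tokens, one per line, to the child's stdin. Returns the index of
     * the first token that could not be delivered, or -1 if all were written.
     */
    public static int feed(Process child, List<String> tokens) {
        BufferedWriter out = new BufferedWriter(
                new OutputStreamWriter(child.getOutputStream(), StandardCharsets.UTF_8));
        for (int i = 0; i < tokens.size(); i++) {
            if (!child.isAlive()) {
                return i; // child already exited: report instead of writing
            }
            try {
                out.write(tokens.get(i));
                out.newLine();
                out.flush(); // flush per token so a failure maps to this token
            } catch (IOException e) {
                return i; // e.g. "Broken pipe": child died between check and write
            }
        }
        return -1;
    }

    /** Starts the command, optionally waits for it to exit, then feeds the tokens. */
    public static int run(List<String> command, List<String> tokens, boolean waitForExit) {
        try {
            // The child's stdout is not read in this sketch; that is only safe for small inputs.
            Process child = new ProcessBuilder(command).start();
            try {
                if (waitForExit) {
                    child.waitFor();
                }
                return feed(child, tokens);
            } finally {
                child.destroy();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("Pécs", "Győr", "Mohács");
        // "cat" stays alive and reads its stdin (assumes a Unix-like system).
        System.out.println(run(Arrays.asList("cat"), tokens, false));
        // "true" exits immediately; we wait for that before writing.
        System.out.println(run(Arrays.asList("true"), tokens, true));
    }
}
```

On a Unix-like system the first call should report -1 (all tokens delivered) and the second 0 (nothing delivered). Logging the returned index in the wrapper would show which token was being sent when the worker went away, which might help narrow down whether the long token kills the process.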
Example input from the Hungarian Webcorpus: ioexception.input.txt