Downloading from Github Releases sometimes times out #9

Closed

micky-git opened this issue Jan 19, 2021 · 8 comments

@micky-git commented Jan 19, 2021

tokenizer = Tokenizer.load("en")

ValueError: error decoding response body: operation timed out
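
Since the failure is an intermittent download timeout, a simple retry wrapper is one possible stopgap. This is only a sketch: load_with_retry is a hypothetical helper, not part of nlprule, and it assumes the timeout keeps surfacing as the ValueError shown above.

import time
from nlprule import Tokenizer

def load_with_retry(lang, attempts=3, delay=5):
    # Retry the automatic download a few times before giving up.
    for i in range(attempts):
        try:
            return Tokenizer.load(lang)
        except ValueError:  # "error decoding response body: operation timed out"
            if i == attempts - 1:
                raise
            time.sleep(delay)

tokenizer = load_with_retry("en")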

@bminixhofer (Owner) commented Jan 19, 2021

That's an issue with file hosting via GitHub Releases. You're the second person to report this; I might check whether it's possible to host the files elsewhere.
In the meantime, the workaround mentioned in #6 (comment) works: download the files manually from the Releases page and load them with:

tokenizer = Tokenizer("path/to/en_tokenizer.bin")

bminixhofer changed the title from "can't load tokenizer" to "Downloading from Github Releases sometimes times out" on Jan 19, 2021
@micky-git (Author)

gunzipping failed: Custom { kind: UnexpectedEof, error: "failed to fill whole buffer" }

@bminixhofer (Owner) commented Jan 20, 2021

Can you paste your code? And make sure the versions of the binary and the library match; maybe that's the issue.

@micky-git (Author)

version info:
platform: Windows 64-bit & macOS Big Sur
rustc 1.49.0 (e1884a8e3 2020-12-29)
Python 3.7.6
conda 4.9.2
nlprule 0.3.0
tokenizer: https://github.com/bminixhofer/nlprule/releases/download/0.3.0/en_tokenizer.bin.gz
IDE: Spyder

code here (Python 3.7):

# -*- coding: utf-8 -*-
"""
Created on Tue Jan 19 18:41:01 2021
@author: A
"""

from nlprule import Tokenizer, Rules, SplitOn
tokenizer = Tokenizer.load("en_tokenizer.bin")
rules = Rules.load("en", tokenizer, SplitOn([".", "?", "!"]))
rules.correct("He wants that you send him an email.")
rules.correct("Thanks for your’s and Lucy’s help.")
rules.correct("I can due his homework.")
suggestions = rules.suggest("She was not been here since Monday.")
for s in suggestions:
    print(s.start, s.end, s.text, s.source, s.message)

output here:

runfile('D:/ai/test/helloworld/test.py', wdir='D:/ai/test/helloworld')
Traceback (most recent call last):

File "D:\ai\test\helloworld\test.py", line 8, in <module>
tokenizer = Tokenizer.load("en_tokenizer.bin")

PanicException: gunzipping failed: Custom { kind: UnexpectedEof, error: "failed to fill whole buffer" }

@micky-git (Author)

btw: I unzipped it with Bandizip (Windows 64-bit): http://www.bandisoft.com/
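
If the archive itself is suspect, one quick check (a diagnostic sketch, assuming en_tokenizer.bin.gz is in the working directory) is to read it to the end with Python's gzip module; a truncated download raises EOFError, which would match the UnexpectedEof panic above.

import gzip

# A partial download raises EOFError ("Compressed file ended before the
# end-of-stream marker was reached") when read to the end.
with gzip.open("en_tokenizer.bin.gz", "rb") as f:
    data = f.read()
print(len(data), "bytes decompressed OK")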

@micky-git (Author)

I tried:

tokenizer = Tokenizer.load("en_tokenizer.bin")  # en_tokenizer.bin copied to the working directory
tokenizer = Tokenizer.load("en_tokenizer.bin")  # en_tokenizer.bin copied to the nlprule package directory

tokenizer = Tokenizer.load("D:\ai\test\helloworld\en_tokenizer.bin")
tokenizer = Tokenizer.load("D:/ai/test/helloworld/en_tokenizer.bin")
tokenizer = Tokenizer.load(r"D:\ai\test\helloworld\en_tokenizer.bin")
tokenizer = Tokenizer.load(f"D:\ai\test\helloworld\en_tokenizer.bin")
tokenizer = Tokenizer.load(f"en_tokenizer.bin")
tokenizer = Tokenizer.load("../en_tokenizer.bin")

I don't quite understand what argument .load expects.

@bminixhofer (Owner)

Thanks for the code. So .load takes a language code, e.g. "en" or "de", as input and downloads the binary automatically. Since the download does not seem to work in your case, you can load the file manually with the constructor of the Tokenizer, not with .load.

This code should work:

tokenizer = Tokenizer("en_tokenizer.bin") # no .load, just Tokenizer(..)!

@bminixhofer (Owner)

The binaries are now significantly smaller, so I hope this won't happen anymore. I'm closing this for now; please comment / reopen if there are any more issues with timeouts.
