[FR] Add --offline
#317
Comments
After taking a look at the code, it seems just setting

#!/usr/bin/env python3
##
import os
import tiktoken
import sys
##
# Point tiktoken at a persistent cache directory.
HOME = os.environ["HOME"]
tiktoken_cache_dir = f"{HOME}/tmp/tiktoken_cache"
os.environ["TIKTOKEN_CACHE_DIR"] = tiktoken_cache_dir
##
def num_tokens_from_message(message, model="gpt-4"):
    # Load the encoding for the model and count the encoded tokens.
    encoding = tiktoken.encoding_for_model(model)
    num_tokens = len(encoding.encode(message))
    return num_tokens

# Read the message from stdin and print its token count.
message = sys.stdin.read()
num_tokens = num_tokens_from_message(message, "gpt-4")
print(num_tokens)

And I have ~0.26s latency for counting a single token!
This kind of latency is terrible.
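
To see where the ~0.26s goes, the import, the encoding load, and the encode call can be timed separately. This is a minimal sketch, not a benchmark from the thread: `time_call` and `profile_token_count` are hypothetical helper names, and only `tiktoken.encoding_for_model` and `encoding.encode` come from the script above.

```python
import importlib
import time


def time_call(fn, *args, **kwargs):
    """Run fn once and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def profile_token_count(message, model="gpt-4"):
    """Time the three stages of the script above separately.

    Hypothetical helper: requires tiktoken to be installed, and the
    encoding load may touch the network if the cache is empty.
    """
    tiktoken, t_import = time_call(importlib.import_module, "tiktoken")
    encoding, t_load = time_call(tiktoken.encoding_for_model, model)
    tokens, t_encode = time_call(encoding.encode, message)
    return {
        "import_s": t_import,
        "load_s": t_load,
        "encode_s": t_encode,
        "num_tokens": len(tokens),
    }
```

Calling `profile_token_count("hello")` in a fresh interpreter returns the per-stage timings, which shows whether the cost is in startup or in the encoding itself.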
@NightMachinery The issue of

import os
os.environ["TIKTOKEN_CACHE_DIR"] = "path/to/tiktoken_dir"
# rest of tiktoken code
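
Before relying on the cache directory offline, it may help to check that the vocabulary file is actually present. The sketch below rests on two assumptions that should be verified against the installed tiktoken version: that cache files are named by the SHA-1 hex digest of the source URL, and that `CL100K_URL` is the URL fetched for the encoding in question. `cache_key` and `is_cached` are hypothetical helper names.

```python
import hashlib
import os

# Assumption: the URL tiktoken downloads the cl100k_base vocabulary from.
CL100K_URL = "https://openaipublic.blob.core.windows.net/encodings/cl100k_base.tiktoken"


def cache_key(url):
    """Assumed cache file name: SHA-1 hex digest of the source URL."""
    return hashlib.sha1(url.encode()).hexdigest()


def is_cached(url, cache_dir):
    """Return True if a file for this URL already exists in cache_dir."""
    return os.path.isfile(os.path.join(cache_dir, cache_key(url)))
```

For example, `is_cached(CL100K_URL, os.environ["TIKTOKEN_CACHE_DIR"])` returning `False` would suggest the next load needs a network connection.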
@nkilm Please report latency. The latency is still terrible even with the file not deleted, isn't it?
There are some inconvenient workarounds for using this software without making an internet connection (which adds considerable latency on unstable networks). This use case should see official support. I propose adding the latest versions of the tokenizers to the pip package, and just using them without checking for updates when the user supplies --offline. Of course, the tokenizers can be updated any time the user doesn't use this flag.

To summarize, I propose two changes:
1. Bundle the latest tokenizers with the package so that, once it is installed with pip, it is ready to go.
2. Add --offline.

PS: I have skimmed the workaround, and while it works for offline usage, I am not sure it would solve the latency issue on an online machine. The workaround also needs many manual steps; it is not a script we can run once and be done with.
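
The proposed behavior could be sketched roughly as follows. Everything here is hypothetical and is not tiktoken's API: `resolve_vocab`, the `bundled_dir` layout, and the `fetch_latest` callback are names invented only to illustrate the proposal.

```python
import os


def resolve_vocab(name, offline, bundled_dir, fetch_latest):
    """Return the path of the vocabulary file to use (hypothetical sketch).

    offline=True:  use the copy shipped with the package, never call fetch_latest.
    offline=False: let fetch_latest(name) check for and return an updated file.
    """
    if offline:
        bundled = os.path.join(bundled_dir, f"{name}.tiktoken")
        if not os.path.isfile(bundled):
            raise FileNotFoundError(f"no bundled tokenizer for {name!r}")
        return bundled
    # Without the flag, updating is allowed (this may use the network).
    return fetch_latest(name)
```

With this shape, the offline path never touches the network, so its latency would not depend on connection quality.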
Related: .tiktoken file gets deleted automatically on Linux #279