Adding /get_tokenizer to api_server for lm-evaluation-harness ease integration. #2643
base: main
Conversation
In these cases, wouldn't the user of lm-evaluation-harness be able to figure out the tokenizer ahead of time?
I would like to wait a bit on this PR to see how popular this request is. Thank you for posting! @AguirreNicolas
Force-pushed from cfa4b85 to f89a2b1
Hi @simon-mo,

My team and I want to provide support for […]; that is, to use […]. The issue we are encountering is in […] under the current […]. For this reason, we believe it would be valuable to move forward with the current PR. Moreover, I was thinking of updating the current PR to add a boolean option on the vLLM side to enable or disable this endpoint. Finally, we would open another PR in lm-evaluation-harness along the lines of:

```python
elif self.tokenizer_backend == "vllm":
    import json
    import os
    import shutil

    import requests
    import transformers  # noqa: E401

    # Fetch the tokenizer files (as JSON objects) from the vLLM server
    response = requests.get(base_url + "/get_tokenizer")
    tokenizer_objects = json.loads(response.content)

    # Save each object as a JSON file inside an ephemeral directory
    for key, value in tokenizer_objects.items():
        # Create the folder if it does not exist
        if not os.path.exists(self.tokenizer_ephimeral_path):
            os.mkdir(self.tokenizer_ephimeral_path)
        with open(
            os.path.join(self.tokenizer_ephimeral_path, key + ".json"), "w"
        ) as f:
            json.dump(value, f)

    # Load the tokenizer from the saved files
    self.tokenizer = transformers.AutoTokenizer.from_pretrained(
        self.tokenizer_ephimeral_path
    )
    self.vocab_size = self.tokenizer.vocab_size
    self.end_of_text_token_id = self.tokenizer.eos_token_id
    eval_logger.debug(
        f"Tokenizer objects available: {str(tokenizer_objects.keys())}"
    )

    # Clean up the ephemeral directory once the tokenizer is loaded
    try:
        shutil.rmtree(self.tokenizer_ephimeral_path)
        eval_logger.debug(
            f"Ephemeral '{self.tokenizer_ephimeral_path.name}' directory removed successfully."
        )
    except OSError as e:
        raise RuntimeError(
            f"Error removing '{self.tokenizer_ephimeral_path.name}' directory: {e}"
        )
```

What do you think? Would you consider it valuable for us to put time into vLLM on this issue?
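For context, here is a minimal sketch of what the server side of /get_tokenizer could look like. This is an illustration only, not the implementation in this PR: the FastAPI app object, the hard-coded MODEL_NAME, and the temporary save directory are assumptions; in vLLM's api_server the model would come from the parsed engine arguments.

```python
import json
import os
import tempfile

from fastapi import FastAPI
from transformers import AutoTokenizer

app = FastAPI()

# Assumption: the served model is hard-coded here for illustration; in vLLM's
# api_server it would come from the parsed engine/model arguments instead.
MODEL_NAME = "facebook/opt-125m"
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)


@app.get("/get_tokenizer")
async def get_tokenizer():
    """Return the served model's tokenizer files as a dict of JSON objects."""
    tokenizer_objects = {}
    with tempfile.TemporaryDirectory() as tmp_dir:
        tokenizer.save_pretrained(tmp_dir)
        for file_name in os.listdir(tmp_dir):
            # Only the JSON files (tokenizer.json, tokenizer_config.json, ...)
            # are needed to rebuild the tokenizer on the client side.
            if not file_name.endswith(".json"):
                continue
            with open(os.path.join(tmp_dir, file_name)) as f:
                tokenizer_objects[os.path.splitext(file_name)[0]] = json.load(f)
    return tokenizer_objects
```

The keys of the returned dict match the file names the client writes back to disk before calling AutoTokenizer.from_pretrained on the resulting directory.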
Force-pushed from 8fe1af7 to 3f3d7fc
This pull request has been automatically marked as stale because it has not had any activity within 90 days. It will be automatically closed if no further activity occurs within 30 days. Leave a comment if you feel this pull request should remain open. Thank you!
This pull request has merge conflicts that must be resolved before it can be merged.
OpenAI already supports logprobs for chat models in most cases. In parallel, lm-evaluation-harness is working on adding support for this feature. In order to run an evaluation of an endpoint backed by vLLM, and similarly to the OpenAI-native case that uses tiktoken, the tokenizer will be needed.

The simple way would be to return the HF <organization>/<repo> string and then rely on AutoTokenizer.from_pretrained. But given that some use cases, like […], make that insufficient, I propose to add a get_tokenizer endpoint to the api_server so that it can be used, for example, as in the sketch below.

The next step would be to have logprobs supported in vLLM for chat cases in the OAI entrypoint, and then a PR on lm-evaluation-harness to pick up the vLLM tokenizer feature (like the code above).
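As a minimal client-side sketch of how such an endpoint could be consumed: this assumes a vLLM api_server exposing the proposed /get_tokenizer route at http://localhost:8000, and the temporary-directory handling is illustrative rather than part of this PR.

```python
import json
import os
import tempfile

import requests
from transformers import AutoTokenizer

# Assumption: a vLLM api_server with the proposed /get_tokenizer endpoint
# is running locally on port 8000.
base_url = "http://localhost:8000"

response = requests.get(base_url + "/get_tokenizer")
tokenizer_objects = response.json()

# Write each returned object to <name>.json and load the tokenizer from disk.
with tempfile.TemporaryDirectory() as tmp_dir:
    for name, obj in tokenizer_objects.items():
        with open(os.path.join(tmp_dir, name + ".json"), "w") as f:
            json.dump(obj, f)
    tokenizer = AutoTokenizer.from_pretrained(tmp_dir)

print(tokenizer.eos_token_id, tokenizer.vocab_size)
```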