Local model error #100

Closed
lzw-lzw opened this issue Nov 12, 2023 · 12 comments

Comments

lzw-lzw commented Nov 12, 2023

Hello, thank you for the excellent framework. When I tried to run the local model following the tutorial, I encountered the following problem: ValueError: llama-2-7b-chat-hf is not registered. Please register with the .register("llama-2-7b-chat-hf") method provided in LLMRegistry registry. What could be the reason for this? Thanks.

@chenweize1998 (Collaborator)

Are you using the latest code? Did you set llm_type: local for all the agents in your config?
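
For reference, a minimal sketch of the relevant config excerpt. Only the llm_type and model fields are confirmed in this thread; the surrounding structure is an assumption, so check the example configs in the repo:

agents:
  - agent_type: solver            # hypothetical field; copy from an existing example config
    llm:
      llm_type: local             # must be set for every agent
      model: llama-2-7b-chat-hf   # must match the MODEL_NAME of the local server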

lzw-lzw (Author) commented Nov 12, 2023

I am using the latest code, and after setting llm_type to local, a new error appears: KeyError: 'Could not automatically map llama-2-7b-chat-hf to a tokeniser. Please use tiktoken.get_encoding to explicitly get the tokeniser you expect.'

@chenweize1998 (Collaborator)

I think this problem was fixed in a previous commit. Are you using the latest code from the GitHub repo? And did you install AgentVerse with pip install -e .?

lzw-lzw (Author) commented Nov 12, 2023

Thanks for your patient reply. I am using the latest code from the GitHub repo, and I installed AgentVerse with pip install -e . After that, I changed MODEL_PATH and MODEL_NAME to the path of my llama-2-7b-chat-hf and "llama-2-7b-chat-hf", and ran run_local_model_server.sh. Then I created a directory under brainstorming containing a config.yaml with llm_type: local and model: llama-2-7b-chat-hf, and ran python3 agentverse_command/main_tasksolving_cli.py --task tasksolving/brainstorming/llama-2-7b-chat-hf, at which point this error occurred.

chenweize1998 added a commit that referenced this issue Nov 12, 2023
@chenweize1998 (Collaborator)

Just made some updates to the code. Please check if it's working correctly now. At the moment I don't have access to a machine with a GPU, so I'm unable to fully run the process with a local LLM. If the issue persists, I'll try to find a GPU machine for further debugging.

lzw-lzw (Author) commented Nov 12, 2023

I'm sorry, it still doesn't work properly. I'll wait for your next try. Thanks!

xymou commented Nov 13, 2023

I can get it running locally, but the output seems incorrect: the first prompt keeps repeating, and no response is generated.

@chenweize1998 (Collaborator)

Pull the latest code and try again. It works fine on my GPU machine now. After launching the FastChat service, check that it's running correctly with the following command:
curl http://127.0.0.1:5000/v1/models

It should return something like:

{"object":"list","data":[{"id":"llama-2-7b-chat-hf","object":"model","created":1699856748,"owned_by":"fastchat","root":"llama-2-7b-chat-hf","parent":null,"permission":[{"id":"modelperm-7bcKCjaRGuVKoeajAXkSgP","object":"model_permission","created":1699856748,"allow_create_engine":false,"allow_sampling":true,"allow_logprobs":true,"allow_search_indices":true,"allow_view":true,"allow_fine_tuning":false,"organization":"*","group":null,"is_blocking":false}]}]}

After confirming the service is running, run the benchmark script with the following command:
python agentverse_command/benchmark.py --task tasksolving/commongen/llama-2-7b-chat-hf --dataset_path data/commongen/commongen_hard.jsonl

This should execute successfully. However, please note that while the script should run, we cannot guarantee its performance, as open-source LLMs generally lag behind OpenAI's GPT models.

chenweize1998 added a commit that referenced this issue Nov 13, 2023
chenweize1998 (Collaborator) commented Nov 13, 2023

@xymou The issue you're encountering might be due to the local model not adhering to the specific response format we've set. In the NLP classroom example, we enforce a strict format where the model's output should be structured as follows:

Action: [specific action]
Action Input: [related input]

OpenAI's GPT models usually comply with this format quite reliably. However, local LLMs might have difficulty consistently generating responses in this precise structure. Our system is designed to automatically retry if it doesn't detect the required pattern in the model's response. This automatic retry mechanism could explain why you're noticing the prompt being repeated.
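
To illustrate the mechanism, here is a minimal sketch of such a parse-and-retry loop. This is not AgentVerse's actual implementation; the endpoint and model name are the ones used earlier in this thread:

import re
import requests

# The agent's reply must follow the exact two-line structure shown above.
ACTION_PATTERN = re.compile(r"Action:\s*(.+?)\nAction Input:\s*(.+)", re.DOTALL)

def query_with_retry(prompt, max_retries=3):
    """Ask the local model, retrying until the reply matches the format."""
    for _ in range(max_retries):
        resp = requests.post(
            "http://127.0.0.1:5000/v1/chat/completions",  # FastChat's OpenAI-compatible API
            json={
                "model": "llama-2-7b-chat-hf",
                "messages": [{"role": "user", "content": prompt}],
            },
            timeout=120,
        )
        text = resp.json()["choices"][0]["message"]["content"]
        match = ACTION_PATTERN.search(text)
        if match:
            return match.group(1).strip(), match.group(2).strip()
        # No match: the same prompt is sent again, which is why it looks
        # like the prompt keeps repeating in the logs.
    raise ValueError("Model never produced the required Action format")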

xymou commented Nov 13, 2023

Thank you for your reply! I've noticed that the open-source LLMs do not follow the instructions to generate responses in the required structure. Do you have any suggestions to solve this, e.g., giving the models an in-context example? But I guess the input length may be a limitation. 😰

@chenweize1998 (Collaborator)

A workaround may be to use constrained generation, e.g., with outlines. But we don't support it yet, so you may need to investigate and make some code edits yourself.
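
For anyone who wants to try this route, here is a rough sketch of what constrained generation with outlines could look like. It is an assumption-heavy illustration: the outlines API has changed across versions (this follows the 0.1.x-style interface), and it is not integrated with AgentVerse:

import outlines

# Load the local model through outlines' transformers wrapper.
model = outlines.models.transformers("meta-llama/Llama-2-7b-chat-hf")

# Constrain decoding so the output can only match the required pattern.
pattern = r"Action: [^\n]+\nAction Input: [^\n]+"
generator = outlines.generate.regex(model, pattern)

result = generator("What should the agent do next?")
print(result)  # output is guaranteed to match the Action / Action Input format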

lzw-lzw (Author) commented Nov 14, 2023

It's working fine now, thanks for your patient reply.
