Closed
Describe the bug
- Install all the pip dependencies for the latest 254-llm-chatbot notebook.
- Follow the steps to convert the "llama-2-chat-7b" model to INT4 format with the default configuration.
- Select device as "CPU"
- Select model to run "INT4"
- Run step to Load and compile the model
- Set max_new_tokens=500 and run ov_model.generate with the prompt "Describe Intel in 100 words or less"
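The steps above can be sketched as a standalone script, which may make the problem easier to reproduce outside the notebook. This is a minimal sketch, not the notebook's actual code: it assumes the INT4 model and its tokenizer were both saved to one directory, that `optimum-intel` and `transformers` are installed, and that the raw prompt is passed without the chat template the notebook applies. The `reproduce` function name and the example path are hypothetical.

```python
PROMPT = "Describe Intel in 100 words or less"


def reproduce(model_dir: str, device: str = "CPU", max_new_tokens: int = 500) -> str:
    """Load a converted model from model_dir and generate a reply to PROMPT.

    Assumption: model_dir holds both the OpenVINO IR files and the tokenizer.
    """
    # Imported lazily so the module can be loaded without the heavy dependencies.
    from optimum.intel.openvino import OVModelForCausalLM
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_dir)
    ov_model = OVModelForCausalLM.from_pretrained(model_dir, device=device)

    # Tokenize the raw prompt; the notebook wraps it in a chat template instead.
    inputs = tokenizer(PROMPT, return_tensors="pt")
    output_ids = ov_model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


if __name__ == "__main__":
    # Hypothetical location of the INT4 model; adjust to the actual output folder.
    print(reproduce("llama-2-chat-7b/INT4_compressed_weights"))
```

Running the same function against the FP16 or INT8 directory would show whether the language switch is specific to the INT4 weights.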
Expected behavior
The output should be in English. However, the model produces output in German.
- Is there a known issue with converting the llama-2-chat-7b model to INT4 format with OpenVINO?
- Is the issue specific to the latest openvino==2023.3.0 or nncf==2.9.0.dev0+84b46f58?
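To help narrow down whether a particular package version is involved, the installed versions can be collected with the standard library. This is a small sketch; the package list is an assumption about which dependencies are relevant, and `installed_versions` is a hypothetical helper name.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages=("openvino", "nncf", "optimum-intel", "transformers")):
    """Return a dict mapping each package name to its installed version, or None."""
    result = {}
    for name in packages:
        try:
            result[name] = version(name)
        except PackageNotFoundError:
            # Package is not installed in the current environment.
            result[name] = None
    return result


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        print(f"{name}: {ver or 'not installed'}")
```

Comparing this output between an environment that answers in English and one that answers in German would show whether the version pins matter.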
Installation instructions (Please mark the checkbox)
- I followed the installation guide at https://github.com/openvinotoolkit/openvino_notebooks#-installation-guide to install the notebooks.
Additional context
I tried the following variations of model_compression_params, one at a time, but none of them resolved the issue:
"llama-2-chat-7b": {
    "mode": nncf.CompressWeightsMode.INT4_SYM,
    "group_size": 128,
    "ratio": 0.8,
},
"llama-2-chat-7b": {
    "mode": nncf.CompressWeightsMode.INT4_ASYM,
    "group_size": 128,
    "ratio": 0.8,
},
"llama-2-chat-7b": {
    "mode": nncf.CompressWeightsMode.INT4_SYM,
    "group_size": 64,
    "ratio": 0.8,
},
"llama-2-chat-7b": {
    "mode": nncf.CompressWeightsMode.INT4_SYM,
    "group_size": 64,
    "ratio": 0.6,
},
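The four variants above share the same dictionary key, so they are read here as separate attempts rather than one dictionary. A minimal sketch of expressing them as a list that can be applied in a loop follows. Mode names are kept as strings so the list can be inspected without NNCF installed; `TRIED_CONFIGS` and `to_nncf_kwargs` are hypothetical names, and the loop in the trailing comment assumes a freshly loaded `ov_model` for each attempt.

```python
# The four weight-compression settings listed above, one entry per attempt.
TRIED_CONFIGS = [
    {"mode": "INT4_SYM", "group_size": 128, "ratio": 0.8},
    {"mode": "INT4_ASYM", "group_size": 128, "ratio": 0.8},
    {"mode": "INT4_SYM", "group_size": 64, "ratio": 0.8},
    {"mode": "INT4_SYM", "group_size": 64, "ratio": 0.6},
]


def to_nncf_kwargs(cfg: dict) -> dict:
    """Replace the string mode with the matching nncf.CompressWeightsMode member."""
    import nncf  # imported lazily; only needed when actually compressing

    return {**cfg, "mode": getattr(nncf.CompressWeightsMode, cfg["mode"])}


# Example use (requires nncf and an uncompressed OpenVINO model per attempt):
#   for cfg in TRIED_CONFIGS:
#       compressed = nncf.compress_weights(ov_model, **to_nncf_kwargs(cfg))
```

Keeping the attempts in one list makes it easier to report exactly which combinations of mode, group size, and ratio were tested.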


