Help with llamacpp configuration #1396

ShinokuS · 2023-05-10T21:34:07Z

I use llama-cpp-python in llama-index as follows:

from langchain.llms import LlamaCpp
from langchain.callbacks.manager import CallbackManager
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
from llama_index import SimpleDirectoryReader, GPTListIndex, PromptHelper, load_index_from_storage, StorageContext
from llama_index import LLMPredictor, ServiceContext

# define prompt helper
max_input_size = 2048
# set number of output tokens
num_output = 256
# set maximum chunk overlap
max_chunk_overlap = 20
prompt_helper = PromptHelper(max_input_size, num_output, max_chunk_overlap)

# Callbacks support token-wise streaming
callback_manager = CallbackManager([StreamingStdOutCallbackHandler()])
# Verbose is required to pass to the callback manager

# Make sure the model path is correct for your system!
llama = LlamaCpp(
    model_path="./ggml-model-q4_0.bin", 
    callback_manager=callback_manager, 
    verbose=False,
    max_tokens=256,
    n_ctx=1024,
    n_batch=256,
)

llm_predictor = LLMPredictor(llm=llama)
service_context = ServiceContext.from_defaults(llm_predictor=llm_predictor, prompt_helper=prompt_helper)

# Load your data
documents = SimpleDirectoryReader('./docs').load_data()
index = GPTListIndex.from_documents(documents, service_context=service_context)
index.storage_context.persist(persist_dir="./index/")
# storage_context = StorageContext.from_defaults(persist_dir="./index/")
# index = load_index_from_storage(storage_context, service_context=service_context)

# Query and print response
query_engine = index.as_query_engine()
response = query_engine.query("<my_query>")
print(response)

But I have a problem with the llama.cpp and LlamaIndex settings. I'm new to NLP. Could you tell me which parameters are best for fast indexing of files of any size and for quick responses? Response generation takes too long (1.5 minutes) and does not finish after the message: Llama.generate: prefix-match hit
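
One thing that may be worth checking, offered as a sketch rather than a confirmed diagnosis: in the snippet above, PromptHelper is given max_input_size = 2048 while LlamaCpp is created with n_ctx=1024. Assuming max_input_size is meant to describe the model's context window, the two values probably should agree, and the prompt plus the generated tokens has to fit inside n_ctx. The helper below, check_context_budget, is hypothetical (it is not part of llama-index, LangChain, or llama-cpp-python) and only does arithmetic on the numbers from the post.

```python
# Values copied from the configuration in the post.
max_input_size = 2048  # passed to PromptHelper
num_output = 256       # passed to PromptHelper
n_ctx = 1024           # passed to LlamaCpp
max_tokens = 256       # passed to LlamaCpp


def check_context_budget(max_input_size, num_output, n_ctx, max_tokens):
    """Return a list of warnings about inconsistent token settings.

    Hypothetical helper for illustration; it does not call any library.
    """
    warnings = []
    # Assumption: max_input_size should not exceed the model's context window.
    if max_input_size > n_ctx:
        warnings.append(
            f"max_input_size ({max_input_size}) exceeds n_ctx ({n_ctx})"
        )
    # The index side should not plan for more output than the model may emit.
    if num_output > max_tokens:
        warnings.append(
            f"num_output ({num_output}) exceeds max_tokens ({max_tokens})"
        )
    # Generated tokens must leave room for at least some prompt tokens.
    if max_tokens >= n_ctx:
        warnings.append(
            f"max_tokens ({max_tokens}) leaves no room for a prompt in n_ctx ({n_ctx})"
        )
    return warnings


for message in check_context_budget(max_input_size, num_output, n_ctx, max_tokens):
    print("warning:", message)
```

With the values from the post, this reports only the max_input_size versus n_ctx mismatch. Whether that mismatch explains the slow generation is not established here. It is just the first inconsistency the numbers show.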

Replies: 0 comments
