Looking through the docs, it seems that, by default, NeMo Guardrails uses the all-MiniLM-L6-v2 model to embed both the text in the Colang scripts and the text in the markdown files in the knowledge base folder. It looks like the maximum input to this model is 256 tokens. I had two questions about this process.
Any insights or pointers to relevant documentation sections would be greatly appreciated!
Good questions @spehl-max!
When computing the embeddings, we send the text as is, so it will be truncated automatically. This could indeed be a problem, thanks for pointing it out. The user and bot messages defined in a Colang config are typically not that long, and flows are indexed line by line (https://github.com/NVIDIA/NeMo-Guardrails/blob/develop/nemoguardrails/actions/llm/generation.py#L179), so truncation is very unlikely to happen there. For the input coming from the user, however, it could be the case.
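To make the truncation behaviour concrete, here is a minimal sketch (not an excerpt from the Guardrails code) assuming the default all-MiniLM-L6-v2 model loaded via sentence-transformers:

```python
# Minimal sketch of the truncation behaviour, assuming the default
# all-MiniLM-L6-v2 model via sentence-transformers (not Guardrails code).
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")
print(model.max_seq_length)  # 256 word-piece tokens, not 256 words

short_text = "define user ask about pricing"
long_text = " ".join(["token"] * 1000)  # far beyond the 256-token window

# encode() does not raise on long inputs; anything past max_seq_length
# is silently dropped before the embedding is computed.
short_emb = model.encode(short_text)
long_emb = model.encode(long_text)
print(short_emb.shape, long_emb.shape)  # both (384,)
```

So a long user input will still produce an embedding, but only the first ~256 tokens contribute to it.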
The embeddings are computed when the configuration is initialized (https://github.com/NVIDIA/NeMo-Guardrails/blob/develop/nemoguardrails/actions/llm/generation.py#L105). In the prompt, typically …
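For context, from a user's perspective this indexing happens roughly at the point where the rails are created; a hedged sketch (the `./config` path is hypothetical):

```python
# Hedged sketch of where the embedding index gets built; "./config" is a
# hypothetical folder containing the Colang files and the kb markdown files.
from nemoguardrails import LLMRails, RailsConfig

config = RailsConfig.from_path("./config")
rails = LLMRails(config)  # embeddings for messages, flows, and the kb are
                          # indexed here, before any user request is handled
```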