You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
When I run doc index on the langchain repository, I receive the following error:
⠇ Processing 494 files...Error during traversal: The text contains a special token that is not allowed: <|endoftext|>
Failed to find `autodoc.config.json` file. Did you run `doc init`?
Error: The text contains a special token that is not allowed: <|endoftext|>
at module.exports.__wbindgen_error_new (/usr/local/Cellar/node/19.8.1/lib/node_modules/@context-labs/autodoc/node_modules/@dqbd/tiktoken/tiktoken_bg.cjs:398:17)
at wasm://wasm/00b63e2e:wasm-function[15]:0xebb8
at wasm://wasm/00b63e2e:wasm-function[154]:0x48af5
at Tiktoken.encode (/usr/local/Cellar/node/19.8.1/lib/node_modules/@context-labs/autodoc/node_modules/@dqbd/tiktoken/tiktoken_bg.cjs:257:18)
at processFile (file:///usr/local/Cellar/node/19.8.1/lib/node_modules/@context-labs/autodoc/dist/cli/commands/index/processRepository.js:24:40)
at async file:///usr/local/Cellar/node/19.8.1/lib/node_modules/@context-labs/autodoc/dist/cli/utils/traverseFileSystem.js:42:21
at async Promise.all (index 2)
at async dfs (file:///usr/local/Cellar/node/19.8.1/lib/node_modules/@context-labs/autodoc/dist/cli/utils/traverseFileSystem.js:38:13)
at async file:///usr/local/Cellar/node/19.8.1/lib/node_modules/@context-labs/autodoc/dist/cli/utils/traverseFileSystem.js:25:21
at async Promise.all (index 0)
I believe this is an issue with autodoc, rather than the langchain repository, as I have followed the instructions in the README file and run doc init in the langchain repository before running doc index.
Here is some information about my environment:
Operating system: macOS Monterey 12.6.3 (21G419)
Node.js version: v19.8.1
Please let me know if there is any additional information I can provide or steps I can take to resolve this issue.
The text was updated successfully, but these errors were encountered:
Get the same problem trying to process the microsoft/semantic-kernel repo. Managed to get things working by catching the error, but it's a hack as I don't understand what's throwing it. src/cli/commands/index/processRepository.ts
let summaryLength: number;
try {
summaryLength = encoding.encode(summaryPrompt).length;
} catch (error) {
console.error(
`Error during encoding of summary prompt: ${(error as Error).message}`,
);
// set summaryLength to a default value
summaryLength = 0;
}
let questionLength: number;
try {
questionLength = encoding.encode(questionsPrompt).length;
} catch (error) {
console.error(
`Error during encoding of question prompt: ${(error as Error).message}`,
);
// set questionLength to a default value
questionLength = 0;
}
For langchain, I resolved the issue by deleting docs/modules/agents/toolkits/examples/openai_openapi.yml.
For semantic-kernel, I resolved the issue by deleting dotnet/src/SemanticKernel/Connectors/OpenAI/Tokenizers/Settings/encoder.json.
This issue is related to <|endoftext|> which is used when interacting with OpenAI. Since lanchain and semantic-kernel contain this special character in their repo, the doc index command fails.
When I run
doc index
on the langchain repository, I receive the following error:I believe this is an issue with autodoc, rather than the langchain repository, as I have followed the instructions in the README file and run
doc init
in the langchain repository before runningdoc index
.Here is some information about my environment:
Please let me know if there is any additional information I can provide or steps I can take to resolve this issue.
The text was updated successfully, but these errors were encountered: