Error: Unknown node type: DOCUMENT #1179
Comments
what's your code snippet? I'm guessing there's a dual module in the output |
I cannot reproduce the issue, but I think this might be related to the bundler |
Here's my Next.js config:

/**
* @type {import("next").NextConfig}
*/
import { fileURLToPath } from "url";
import _jiti from "jiti";
import withLlamaIndex from "llamaindex/next";
const jiti = _jiti(fileURLToPath(import.meta.url));
// Import env files to validate at build time. Use jiti so we can load .ts files in here.
jiti("./src/env");
const isStaticExport = "false";
const nextConfig = {
basePath: process.env.NEXT_PUBLIC_BASE_PATH,
env: {
BUILD_STATIC_EXPORT: isStaticExport,
},
// Trailing slashes must be disabled for Next Auth callback endpoint to work
// https://stackoverflow.com/a/78348528
trailingSlash: false,
modularizeImports: {
"@mui/icons-material": {
transform: "@mui/icons-material/{{member}}",
},
"@mui/material": {
transform: "@mui/material/{{member}}",
},
"@mui/lab": {
transform: "@mui/lab/{{member}}",
},
},
webpack(config) {
config.module.rules.push({
test: /\.svg$/,
use: ["@svgr/webpack"],
});
// To allow chatbot to work
// Extracted from: https://github.com/neondatabase/examples/blob/main/ai/llamaindex/rag-nextjs/next.config.mjs
config.resolve.alias = {
...config.resolve.alias,
sharp$: false,
"onnxruntime-node$": false,
};
return config;
},
...(isStaticExport === "true" && {
output: "export",
}),
};
export default withLlamaIndex(nextConfig);

config.ts:

import { OpenAI, OpenAIEmbedding, Settings } from "llamaindex";
import { env } from "~/env";
const aiGlobalConfig = globalThis as unknown as {
llmModel?: OpenAI;
embedModel?: OpenAIEmbedding;
};
const llmModel =
aiGlobalConfig.llmModel ??
new OpenAI({
apiKey: env.OPENAI_API_KEY,
model: env.OPENAI_MODEL_NAME,
});
if (env.NODE_ENV !== "production") {
aiGlobalConfig.llmModel = llmModel;
}
const embedModel =
aiGlobalConfig.embedModel ??
new OpenAIEmbedding({
apiKey: env.OPENAI_API_KEY,
model: env.OPENAI_EMBED_MODEL_NAME,
});
if (env.NODE_ENV !== "production") {
aiGlobalConfig.embedModel = embedModel;
}
// LlamaIndex Settings
Settings.llm = llmModel;
Settings.embedModel = embedModel;
export { Settings };

vector-store.ts:

import { PGVectorStore } from "llamaindex/vector-store/PGVectorStore";
import { env } from "~/env";
const DIMMS = env.EMBED_DIM;
const CONN_STRING = env.VECTOR_STORE_PG_URL;
const globalForVectorStore = globalThis as unknown as {
vectorStore: PGVectorStore | undefined;
};
const storedVectorStore =
globalForVectorStore.vectorStore ??
new PGVectorStore({
dimensions: DIMMS,
connectionString: CONN_STRING,
});
if (env.NODE_ENV !== "production") {
globalForVectorStore.vectorStore = storedVectorStore;
}
const vectorStore = storedVectorStore;
export { vectorStore };

The chat route handler:

import type { NextRequest } from "next/server";
import { ContextChatEngine, serviceContextFromDefaults, VectorStoreIndex } from "llamaindex";
import type { ChatBotPayload } from "./types";
import { env } from "~/env";
import { Settings } from "./config";
import { vectorStore } from "./vector-store";
/**
* Key used to store the space ID in the metadata.
*/
const METADATA_SPACE_ID_KEY = "space_id";
export async function POST(request: NextRequest) {
try {
const { messages = [], spaces: confluenceSpaces = [] } =
(await request.json()) as ChatBotPayload;
if (confluenceSpaces.length === 0) {
throw new Error("No confluence spaces provided.");
}
const userMessages = messages.filter((i) => i.role === "user");
const query = userMessages[userMessages.length - 1]?.content;
if (!query) {
throw new Error("No query provided.");
}
const serviceContext = serviceContextFromDefaults({ embedModel: Settings.embedModel });
const index = await VectorStoreIndex.fromVectorStore(vectorStore, serviceContext);
const retriever = index.asRetriever({
topK: {
TEXT: env.TOP_K_SIMILARITY_TEXT,
IMAGE: env.TOP_K_SIMILARITY_IMAGE,
},
filters: {
// Limit the search to the allowed confluence spaces.
filters: confluenceSpaces.map((spaceId) => ({
key: METADATA_SPACE_ID_KEY,
value: spaceId,
operator: "==",
})),
condition: "or",
},
});
const chatEngine = new ContextChatEngine({
retriever,
});
const encoder = new TextEncoder();
const customReadable = new ReadableStream({
async start(controller) {
const stream = await chatEngine.chat({
message: query,
chatHistory: messages,
stream: true,
verbose: true,
});
for await (const chunk of stream) {
controller.enqueue(encoder.encode(chunk.response));
}
controller.close();
},
});
return new Response(customReadable, {
headers: {
Connection: "keep-alive",
"Content-Encoding": "none",
"Cache-Control": "no-cache, no-transform",
"Content-Type": "text/plain; charset=utf-8",
},
});
} catch (error) {
const errorMessage =
error instanceof Error ? error.message : "An error occurred while processing the request.";
return Response.json(
{ error: errorMessage },
{
headers: { "Content-Type": "application/json" },
status: 500,
},
);
}
} |
I tried this locally and it runs well. I'm guessing you are using a different version of llamaindex & |
Tested with the latest 0.6.0 and the error is still present. The error comes from here:

export function splitNodesByType(nodes: BaseNode[]): NodesByType {
const result: NodesByType = {};
for (const node of nodes) {
let type: ModalityType;
if (node instanceof ImageNode) {
type = ModalityType.IMAGE;
} else if (node instanceof TextNode) {
type = ModalityType.TEXT;
} else {
throw new Error(`Unknown node type: ${node.type}`);
}
if (type in result) {
result[type]?.push(node);
} else {
result[type] = [node];
}
}
return result;
}

Link to source: https://github.com/run-llama/LlamaIndexTS/blob/main/packages/core/src/schema/node.ts#L438-L446

Wouldn't it be safer to replace the |
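For context on the truncated suggestion above: the alternative to instanceof would be to dispatch on the node's declared type. A minimal sketch of that idea, assuming ObjectType and ModalityType are exported by llamaindex with the members used below; this is an illustration, not the library's actual code:

// Sketch only: dispatch on node.type instead of instanceof, so the check still
// works even if a bundler ships duplicate copies of the node classes.
import { type BaseNode, ModalityType, ObjectType } from "llamaindex";

function resolveModality(node: BaseNode): ModalityType {
  switch (node.type) {
    case ObjectType.TEXT:
    case ObjectType.DOCUMENT:
      return ModalityType.TEXT;
    case ObjectType.IMAGE:
      return ModalityType.IMAGE;
    default:
      throw new Error(`Unknown node type: ${node.type}`);
  }
}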
@AndreMaz If we make that change, we would also have to include subclasses of |
It’s weird to me that we didn’t change any code related to your error stack. It’s confusing to me |
but I think it might be caused by a dual package somewhere; I know bundlers can end up with two copies of ImageNode and TextNode in some cases |
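To illustrate the dual-package hazard being described here: once a bundler (or a CJS/ESM mismatch) produces two copies of the same module, each copy has its own class identity, so instanceof checks against one copy fail for objects built by the other. A self-contained sketch of the failure mode; the class names are stand-ins, not the real library classes:

// Simulate two copies of the "same" class, as a bundler produces when both a
// CJS and an ESM build of a package end up in the output.
class TextNodeFromCopyA {
  constructor(public text: string) {}
}
class TextNodeFromCopyB {
  constructor(public text: string) {}
}

// Code compiled against copy A receives a node constructed by copy B:
const node: unknown = new TextNodeFromCopyB("hello");

console.log(node instanceof TextNodeFromCopyA); // false: different prototype chain
console.log(node instanceof TextNodeFromCopyB); // true
// splitNodesByType falls through to its else branch in exactly this situation,
// throwing "Unknown node type: ..." even though the object is a valid text node.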
@himself65 with 0.6.2 the [...] However, now I'm seeing the [...] I also don't mix ESM and CJS imports, so I don't know what's causing this. Do you have any pointers on what I should be looking at to solve this issue?

Unrelated:

constructor(configOrClient?: PGVectorStoreConfig | pg.ClientBase) {
// We cannot import pg from top level, it might have side effects
// so we only check if the config.connect function exists
if (
configOrClient &&
"connect" in configOrClient &&
typeof configOrClient.connect === "function"
) {
const db = configOrClient as pg.ClientBase;
super();
this.db = db;
} else {
const config = configOrClient as PGVectorStoreConfig;
super(config?.embedModel);
this.schemaName = config?.schemaName ?? PGVECTOR_SCHEMA;
this.tableName = config?.tableName ?? PGVECTOR_TABLE;
this.database = config?.database;
this.connectionString = config?.connectionString;
this.dimensions = config?.dimensions ?? 1536;
}
} |
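For reference, the constructor quoted above accepts either a plain config object or an already-connected pg client. A rough usage sketch of both paths, with placeholder connection details and the import paths used earlier in this thread:

import pg from "pg";
import { PGVectorStore } from "llamaindex/vector-store/PGVectorStore";

// Path 1: pass a config object and let PGVectorStore open its own connection lazily.
const storeFromConfig = new PGVectorStore({
  connectionString: process.env.VECTOR_STORE_PG_URL, // placeholder env var
  dimensions: 1536,
});

// Path 2: pass an existing client; the "connect" duck-typing check above picks this branch.
const client = new pg.Client({ connectionString: process.env.VECTOR_STORE_PG_URL });
await client.connect();
const storeFromClient = new PGVectorStore(client);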
@himself65 what if we add something like:

export type PGVectorStoreConfig = {
schemaName?: string | undefined;
tableName?: string | undefined;
database?: string | undefined;
connectionString?: string | undefined;
dimensions?: number | undefined;
embedModel?: BaseEmbedding | undefined;
pgClientConfig?: pg.ClientConfig | undefined; // <=========== CUSTOM PG CLIENT CONFIGS
};
export declare class PGVectorStore extends VectorStoreBase implements VectorStoreNoEmbedModel {
// ... other vars //
private db?: pg.ClientBase;
private pgClientConfig?: pg.ClientConfig; // <=========== REF TO CUSTOM PG CLIENT CONFIGS
/**
* Constructs a new instance of the PGVectorStore
*
* If the `connectionString` is not provided the following env variables are
* used to connect to the DB:
* PGHOST=your database host
* PGUSER=your database user
* PGPASSWORD=your database password
* PGDATABASE=your database name
* PGPORT=your database port
*/
constructor(config?: PGVectorStoreConfig) {
super(config?.embedModel);
this.schemaName = config?.schemaName ?? PGVECTOR_SCHEMA;
this.tableName = config?.tableName ?? PGVECTOR_TABLE;
this.database = config?.database;
this.connectionString = config?.connectionString;
this.dimensions = config?.dimensions ?? 1536;
this.pgClientConfig = config?.pgClientConfig ?? {}; // <=========== REF TO CUSTOM PG CLIENT CONFIGS
}
// ... other fns //
private async getDb(): Promise<pg.ClientBase> {
if (!this.db) {
try {
const pg = await import("pg");
const { Client } = pg.default ? pg.default : pg;
const { registerType } = await import("pgvector/pg");
// Create DB connection
// Read connection params from env - see comment block above
const db = new Client({
...this.pgClientConfig, // <=========== INJECT CUSTOM PARAMS
database: this.database,
connectionString: this.connectionString,
});
await db.connect();
// Check vector extension
await db.query("CREATE EXTENSION IF NOT EXISTS vector");
await registerType(db);
// All good? Keep the connection reference
this.db = db;
} catch (err) {
console.error(err);
return Promise.reject(err instanceof Error ? err : new Error(`${err}`));
}
}
const db = this.db;
// Check schema, table(s), index(es)
await this.checkSchema(db);
return Promise.resolve(this.db);
}

This way we can easily configure the PGVectorStore and the pg client. This should allow the user to pass custom certs (#366). |
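If the proposed pgClientConfig option existed, the custom-cert case from #366 might look roughly like this; a sketch against the proposal above, not a published API, and the cert path is a placeholder:

import fs from "node:fs";
import { PGVectorStore } from "llamaindex/vector-store/PGVectorStore";

// pgClientConfig is the option proposed above, not an existing API.
const vectorStore = new PGVectorStore({
  connectionString: process.env.VECTOR_STORE_PG_URL, // placeholder
  dimensions: 1536,
  pgClientConfig: {
    ssl: {
      ca: fs.readFileSync("/path/to/ca.pem", "utf8"), // placeholder cert path
    },
  },
});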
@himself65 here are the versions that I'm currently using:

pnpm why -F @web llamaindex
Legend: production dependency, optional only, dev only
@web@6.0.1 /home/Dev/web/apps/web
dependencies:
llamaindex 0.6.2

pnpm why -F @web @llamaindex/core
Legend: production dependency, optional only, dev only
@web@6.0.1 /home/Dev/web/apps/web
dependencies:
llamaindex 0.6.2
├─┬ @llamaindex/cloud 0.2.6
│ └── @llamaindex/core 0.2.2 peer
├── @llamaindex/core 0.2.2
├─┬ @llamaindex/groq 0.0.3
│ └─┬ @llamaindex/openai 0.1.4
│ └── @llamaindex/core 0.2.2
└─┬ @llamaindex/openai 0.1.4
└── @llamaindex/core 0.2.2 |
@himself65 after googling around and checking the issue that you've linked in #1214: what's the chance that some change after 0.5.20 (the last one that was working without complaining) caused the CJS/ESM duplication described in this article https://www.codejam.info/2024/02/esm-cjs-dupe.html ?
Thanks for the feedback, I didn't consider that case. |
I'm guessing you are using a pnpm monorepo. It's very common to have dual module issues there; you need to check your pnpm-lock.yaml and run |
Or do something like this in your config https://github.com/toeverything/AFFiNE/pull/1276/files#diff-197cd8ca285a4abd2f21479e0bf6e36e90b08528fcd7f3bdbe8d1221897e377dR87, replacing yjs with llamaindex |
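Adapting that trick from yjs to llamaindex would mean pointing the bundler at one physical copy of the package via resolve.alias. A hedged sketch for the webpack hook already present in the next.config above; the alias paths are placeholders and must point at the single on-disk copy you want every import to resolve to (in a pnpm workspace that copy usually lives under node_modules/.pnpm):

// Sketch only: extra aliases for the existing webpack() hook in next.config.mjs.
import path from "node:path";

const webpack = (config) => {
  config.resolve.alias = {
    ...config.resolve.alias,
    // Placeholder paths: adjust them to the one physical copy of each package
    // so every import sees a single class identity.
    llamaindex: path.resolve(process.cwd(), "node_modules/llamaindex"),
    "@llamaindex/core": path.resolve(
      process.cwd(),
      "node_modules/.pnpm/node_modules/@llamaindex/core",
    ),
  };
  return config;
};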
@himself65 bumped to 0.5.24 to see if #1176 is fixed. It is, but now I'm getting the following error:

Note: This error was not present in v0.5.20. Something that I initially described in #1172.
I can repro the issue exactly in the same way as I've mentioned in the ticket above.
From the ticket: