-
Notifications
You must be signed in to change notification settings - Fork 64
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
(EAI-587): Clean up shared imports in mongodb-rag-core + migrate @azure/openai
to openai
#546
Conversation
export * from "mongodb"; | ||
export * from "@azure/openai"; | ||
export * as OpenAI from "openai"; | ||
export * from "./langchain"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@nlarew in future PR, i wanna do the hared langchain and mongodb exports w/ the * as Name
convention for continuing the better namespacing of modules in this package,
ie:
export * as MongoDB from "mongodb";
export * as Langchain from "langchain";
not doing in this PR b/c it'd dramatically increase the # of files touched.
relatedly, i think it'd probably be more correct to change these external dependencies that we re-export as peerDependencies
in the package.json, but also since our monorepo is the main consumer of the mongodb-rag-core
package, it probably would add more overhead and annoyance to us than value
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm directionally ok with that export * as Namespace
approach, though I want to make sure that it wouldn't make it harder to do tree shaking or otherwise bloat bundles.
If it is an issue, we might consider an alternative of two package entry points. i.e.
- In this file, we
export * from "./langchain"
so that there's no namespace wrapper object - We need to create that file we export from, i.e.
./langchain/index.ts
. It does theexport * from "openai";
- In
package.json
we add aexports
for./langchain
.
So basically I could do either of these as a consumer:
import { LangChainGithubRepoLoader } from "mongodb-rag-core";
import { LangChainGithubRepoLoader } from "mongodb-rag-core/langchain";
Maybe too complex?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting thought on the peer dependencies, I'd have to think through it a bit or see it in action
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i'm notionally in agreement w/ the idea of mongodb-rag-core/<namespace>
, however in practice i've repeatedly run into issues w/ any packages that we compose like this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that's why i didn't use that pattern. but let me use this as an opportunity to at least properly understand why this doesn't work, even if we cant switch to it..tho maybe we'll be able to switch..let's see.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
going to merge this PR, and play w/ this in separate PR. would require touching a lot of files b/c so many imports would be changed
@azure/openai
to openai
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Yeoman's work on this one. Just a few small nits & questions.
@@ -41,7 +40,7 @@ const systemPrompt = `You are an expert data labeler employed by MongoDB. | |||
You must label metadata about the user query based on its context in the conversation. | |||
Your pay is determined by the accuracy of your labels as judged against other expert labelers, so do excellent work to maximize your earnings to support your family.`; | |||
|
|||
const fewShotExamples: ChatCompletionMessageParam[] = [ | |||
const fewShotExamples: OpenAI.default.Chat.ChatCompletionMessageParam[] = [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Weird type is an unfortunate side effect of the namespace exports :/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, also the openai
package does some funky stuff w/ exports
packages/chatbot-server-mongodb-public/src/processors/makeStepBackUserQuery.test.ts
Outdated
Show resolved
Hide resolved
// This module is also required for the mock to work | ||
// in the tests | ||
const { updateFrontMatter } = jest.requireActual("mongodb-chatbot-server"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting - do you know why updateFrontMatter
but not any other exports from the server?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yea b/c that's needed in one of the methods. don't want it overwritten
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess the meat of the question is - when would we need to add other methods here? Like is this for a specific test or just in general we need the actual function?
if ( | ||
response.function_call === undefined || | ||
response.function_call === null | ||
) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can keep this as a one-liner if you want
if ( | |
response.function_call === undefined || | |
response.function_call === null | |
) { | |
if ((response.function_call ?? undefined) === undefined) { |
or if the typing makes sense could go even simpler
if ( | |
response.function_call === undefined || | |
response.function_call === null | |
) { | |
if (!response.function_call) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
keeping explicit in case of ""
@@ -458,16 +460,21 @@ export async function streamGenerateResponseMessage({ | |||
}); | |||
} | |||
|
|||
return { messages: newMessages }; | |||
return { messages: newMessages.map(convertMessageFromLlmToDb) }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why add the convertMessageFromLlmToDb()
call here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
b/c in our implementation it's still type functionCall
, whereas the new openai lib is function_call
...baseChatbotRepoConfig, | ||
name: "ingest_testData", | ||
metadata: { | ||
productName: "Ingest Test Data", | ||
}, | ||
filter: (path) => path.includes("ingest/testData"), | ||
filter: (path) => path.includes(Path.join("mongodb-rag-core", "testData")), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need the Path
import here? Can't you just do "mongodb-rag-core/testData"
since this is joining two static strings?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not if you're on windows!
…BackUserQuery.test.ts Co-authored-by: Nick Larew <nick.larew@mongodb.com>
Jira: https://jira.mongodb.org/browse/EAI-587
Changes
openai
module instead of@azure/openai
mongodb-rag-core
to better folow namespace convention ofCamelCase
(w/ first letter uppercase)Notes