
Bake the MiniLM model into the Docker container for deployment #12

Merged · 10 commits merged into main from karl/refactor/minilm-model on Dec 9, 2024

Conversation

@karlhigley (Collaborator) commented on Dec 2, 2024

This reworks the context generation code to:

  • Load the MiniLM model from a local copy baked into the Docker container (instead of downloading from HF)
  • Create the OpenAI client inside the ContextGenerator component
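The two refactors above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: the local model directory, the `MINILM_MODEL_PATH` variable, and the `ContextGenerator` internals are all assumptions.

```python
import os

# Hypothetical locations -- the baked-in directory and the env var name are
# assumptions; the PR does not show the actual paths.
LOCAL_MODEL_DIR = "/opt/models/all-MiniLM-L6-v2"
HF_MODEL_ID = "sentence-transformers/all-MiniLM-L6-v2"

def resolve_model_path() -> str:
    """Prefer the local copy baked into the image; fall back to the Hub id."""
    path = os.environ.get("MINILM_MODEL_PATH", LOCAL_MODEL_DIR)
    return path if os.path.isdir(path) else HF_MODEL_ID

class ContextGenerator:
    """Sketch: the component constructs its own OpenAI client rather than
    receiving one, so other pipelines never trigger a model load."""
    def __init__(self):
        # Imports deferred so the module can be imported without the heavy deps.
        from sentence_transformers import SentenceTransformer
        from openai import OpenAI
        self.encoder = SentenceTransformer(resolve_model_path())
        self.client = OpenAI()  # reads OPENAI_API_KEY from the environment
```

Keeping the client construction inside the component also means pipelines that drop `ContextGenerator` carry no OpenAI or MiniLM dependencies at all.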

And also addresses some issues with deployment:

  • Remove the ContextGenerator component from other pipelines to avoid loading the MiniLM model multiple times
  • Bump the OpenAI version to account for a breaking change in its underlying dependency, httpx
  • Increase the amount of memory and ephemeral storage allocated to the Lambda function
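Baking the model in usually amounts to downloading it once at image build time. A hedged sketch of what that Dockerfile step might look like, assuming a Lambda Python base image and the save path used above (the base image tag, paths, and install step are all assumptions, not taken from the PR):

```dockerfile
# Hypothetical sketch: fetch the model during `docker build` so the running
# container never contacts the Hugging Face Hub.
FROM public.ecr.aws/lambda/python:3.11
RUN pip install sentence-transformers
RUN python -c "from sentence_transformers import SentenceTransformer; \
    SentenceTransformer('sentence-transformers/all-MiniLM-L6-v2').save('/opt/models/all-MiniLM-L6-v2')"
```

A baked-in model grows the image and the unpacked filesystem footprint, which is presumably why the Lambda memory and ephemeral storage allocations needed to increase alongside this change.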

@karlhigley karlhigley self-assigned this Dec 2, 2024
@karlhigley karlhigley marked this pull request as draft December 2, 2024 19:47
@karlhigley karlhigley force-pushed the karl/refactor/minilm-model branch from 3ad0baa to 3f48adb Compare December 5, 2024 16:41
@karlhigley karlhigley requested a review from zentavious December 5, 2024 16:42
@karlhigley karlhigley marked this pull request as ready for review December 5, 2024 16:44
@zentavious (Owner) left a comment
Solved the merge! Thank you for making these changes.

@zentavious merged commit ca5bc4c into main on Dec 9, 2024
2 of 5 checks passed