
Bake the MiniLM model into the Docker container for deployment #12

Merged · 10 commits merged into main from karl/refactor/minilm-model on Dec 9, 2024

Conversation

@karlhigley (Collaborator) commented on Dec 2, 2024

This reworks the context generation code to:

  • Load the MiniLM model from a local copy baked into the Docker container (instead of downloading from HF)
  • Create the OpenAI client inside the ContextGenerator component
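The two refactors above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: the local model directory, the `MINILM_MODEL_PATH` variable, and the `ContextGenerator` internals are all assumptions.

```python
import os

# Hypothetical locations -- the baked-in directory and the env var name are
# assumptions; the PR does not show the actual paths.
LOCAL_MODEL_DIR = "/opt/models/all-MiniLM-L6-v2"
HF_MODEL_ID = "sentence-transformers/all-MiniLM-L6-v2"

def resolve_model_path() -> str:
    """Prefer the local copy baked into the image; fall back to the Hub id."""
    path = os.environ.get("MINILM_MODEL_PATH", LOCAL_MODEL_DIR)
    return path if os.path.isdir(path) else HF_MODEL_ID

class ContextGenerator:
    """Sketch: the component constructs its own OpenAI client rather than
    receiving one, so other pipelines never trigger a model load."""
    def __init__(self):
        # Imports deferred so the module can be imported without the heavy deps.
        from sentence_transformers import SentenceTransformer
        from openai import OpenAI
        self.encoder = SentenceTransformer(resolve_model_path())
        self.client = OpenAI()  # reads OPENAI_API_KEY from the environment
```

Keeping the client construction inside the component also means pipelines that drop `ContextGenerator` carry no OpenAI or MiniLM dependencies at all.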

And also addresses some issues with deployment:

  • Remove the ContextGenerator component from other pipelines to avoid loading the MiniLM model multiple times
  • Bump the OpenAI version to account for a breaking change in its underlying dependency, httpx
  • Increase the amount of memory and ephemeral storage allocated to the Lambda function
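Baking the model in usually amounts to downloading it once at image build time. A hedged sketch of what that Dockerfile step might look like, assuming a Lambda Python base image and the save path used above (the base image tag, paths, and install step are all assumptions, not taken from the PR):

```dockerfile
# Hypothetical sketch: fetch the model during `docker build` so the running
# container never contacts the Hugging Face Hub.
FROM public.ecr.aws/lambda/python:3.11
RUN pip install sentence-transformers
RUN python -c "from sentence_transformers import SentenceTransformer; \
    SentenceTransformer('sentence-transformers/all-MiniLM-L6-v2').save('/opt/models/all-MiniLM-L6-v2')"
```

A baked-in model grows the image and the unpacked filesystem footprint, which is presumably why the Lambda memory and ephemeral storage allocations needed to increase alongside this change.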

@karlhigley karlhigley self-assigned this Dec 2, 2024
@karlhigley karlhigley marked this pull request as draft December 2, 2024 19:47
@karlhigley karlhigley force-pushed the karl/refactor/minilm-model branch from 3ad0baa to 3f48adb Compare December 5, 2024 16:41
@karlhigley karlhigley requested a review from zentavious December 5, 2024 16:42
@karlhigley karlhigley marked this pull request as ready for review December 5, 2024 16:44
@zentavious (Owner) left a comment
Solved the merge! Thank you for making these changes.

@zentavious merged commit ca5bc4c into main on Dec 9, 2024
2 of 5 checks passed