
Conversation

@jcabrero (Member) commented on Sep 2, 2025

This PR adds the nilai f910 endpoint configuration for the models.

@jcabrero changed the title from "feat: added nilai-f910 config" to "feat: added nilai prod configs" on Sep 2, 2025
@jcabrero force-pushed the feat/nilai-f910-config branch from dd489ed to 6e16a93 on September 2, 2025, 14:26
@jcabrero requested a review from Copilot on September 2, 2025, 14:28
Copilot AI (Contributor) left a comment


Pull Request Overview

This PR adds production configuration for the nilai endpoint by introducing a new Docker Compose file that defines two GPU-enabled services for running LLM models. The configuration supports both a Llama 3.1-8B model and a GPT-OSS-20B model with specific GPU memory allocation and performance optimizations.

  • Adds GPU-enabled service configurations for the Llama 3.1-8B and GPT-OSS-20B models
  • Configures health checks and service dependencies for proper startup ordering
  • Sets up a shared volume for Hugging Face model caching (a hedged sketch of such a file follows this list)
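The compose file itself is not shown in this conversation, so the following is only a minimal sketch of the layout the overview describes. The service names, image (vLLM is assumed as the inference server), ports, model identifiers, GPU memory fractions, and health-check commands are all illustrative assumptions, not values taken from the PR.

```yaml
# Hypothetical sketch only: service names, images, model IDs, ports,
# GPU memory fractions and health-check commands are assumptions,
# not values taken from the actual PR.
services:
  llama-3-1-8b:
    image: vllm/vllm-openai:latest          # assumed inference server
    command:
      - --model
      - meta-llama/Llama-3.1-8B-Instruct
      - --gpu-memory-utilization
      - "0.45"                              # assumed share of GPU memory
    volumes:
      - hf-cache:/root/.cache/huggingface   # shared Hugging Face model cache
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
    healthcheck:                            # assumes curl is available in the image
      test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
      interval: 30s
      timeout: 5s
      retries: 5

  gpt-oss-20b:
    image: vllm/vllm-openai:latest
    command:
      - --model
      - openai/gpt-oss-20b
      - --gpu-memory-utilization
      - "0.45"
    volumes:
      - hf-cache:/root/.cache/huggingface
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
    depends_on:
      llama-3-1-8b:
        condition: service_healthy          # startup ordering via the health check
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
      interval: 30s
      timeout: 5s
      retries: 5

volumes:
  hf-cache:                                 # shared cache so models are downloaded once
```

The `depends_on` entry with `condition: service_healthy` is the Compose-native way to get the startup ordering the review mentions: the second service only starts once the first service's health check reports healthy.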


@jcabrero requested a review from blefo on September 2, 2025, 15:31
@jcabrero force-pushed the feat/nilai-f910-config branch from 6e16a93 to 9c8556b on September 2, 2025, 15:37
@jcabrero force-pushed the feat/nilai-f910-config branch from 9c8556b to aabee82 on September 4, 2025, 08:58
@jcabrero merged commit a9af0d2 into main on Sep 10, 2025
15 of 16 checks passed

3 participants