@dtrawins
Collaborator

🛠 Summary

Reduced default cache size to avoid OOM on hosts with limited RAM

🧪 Checklist

  • Unit tests added.
  • The documentation updated.
  • Change follows security best practices.

@dtrawins requested review from atobiszei and mzegla on October 28, 2025 15:48
Collaborator

@mzegla left a comment

We need a change in llm_calculator.proto as well.
Additionally, shouldn't our benchmarking demo be updated to mention that, with the default deployment, the cache size might be too small for throughput-oriented benchmarks?

3 participants