Add definition of new action input #123
Merged
Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>
mayabar approved these changes on Aug 6, 2025
smarunich pushed a commit to smarunich/llm-d-inference-sim that referenced this pull request on Aug 14, 2025
Signed-off-by: Shmuel Kallner <kallner@il.ibm.com> Signed-off-by: Sergey Marunich <marunich.s@gmail.com>
irar2 added a commit that referenced this pull request on Aug 28, 2025
* Add definition of new action input (#123)
  Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>
  Signed-off-by: Sergey Marunich <marunich.s@gmail.com>
* KV cache and tokenization related configuration (#125)
  Signed-off-by: Ira <IRAR@il.ibm.com>
  Signed-off-by: Sergey Marunich <marunich.s@gmail.com>
* Another attempt at adding a latest tag only on release builds (#124)
  Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>
  Signed-off-by: Sergey Marunich <marunich.s@gmail.com>
* Publish kv-cache events (#126)
  * Publish kv-cache events
    Signed-off-by: Ira <IRAR@il.ibm.com>
  * Fix lint errors
    Signed-off-by: Ira <IRAR@il.ibm.com>
  * Review fixes
    Signed-off-by: Ira <IRAR@il.ibm.com>
  * Sleep to allow previous sub to close
    Signed-off-by: Ira <IRAR@il.ibm.com>
  ---------
  Signed-off-by: Ira <IRAR@il.ibm.com>
  Signed-off-by: Sergey Marunich <marunich.s@gmail.com>
* Add failure injection mode to simulator
  Introduces a 'failure' mode to the simulator, allowing random injection of OpenAI API-compatible error responses for testing error handling. Adds configuration options for failure injection rate and specific failure types, implements error response logic, and updates documentation and tests to cover the new functionality.
  Signed-off-by: Sergey Marunich <marunich.s@gmail.com>
* Refactor failure injection and update simulator error handling
  Failure injection is now controlled by a dedicated 'failure-injection-rate' parameter instead of a separate 'failure' mode. Failure type constants are centralized, and error handling in the simulator is refactored to use a unified method for sending error responses. Documentation and tests are updated to reflect these changes, and the OpenAI error response format now includes an 'object' field.
  Signed-off-by: Sergey Marunich <marunich.s@gmail.com>
* Make tokenizer version configurable from Dockerfile
  Extracts TOKENIZER_VERSION from the Dockerfile and uses it in the download-tokenizer target. This allows the Makefile to automatically use the correct tokenizer version specified in the Dockerfile, improving maintainability and consistency.
  Signed-off-by: Sergey Marunich <marunich.s@gmail.com>
* Add failure injection mode to simulator
  Introduces a 'failure' mode to the simulator, allowing random injection of OpenAI API-compatible error responses for testing error handling. Adds configuration options for failure injection rate and specific failure types, implements error response logic, and updates documentation and tests to cover the new functionality.
  Signed-off-by: Sergey Marunich <marunich.s@gmail.com>
* Refactor failure injection and update simulator error handling
  Failure injection is now controlled by a dedicated 'failure-injection-rate' parameter instead of a separate 'failure' mode. Failure type constants are centralized, and error handling in the simulator is refactored to use a unified method for sending error responses. Documentation and tests are updated to reflect these changes, and the OpenAI error response format now includes an 'object' field.
  Signed-off-by: Sergey Marunich <marunich.s@gmail.com>
* KV cache and tokenization related configuration (#125)
  Signed-off-by: Ira <IRAR@il.ibm.com>
  Signed-off-by: Sergey Marunich <marunich.s@gmail.com>
* Publish kv-cache events (#126)
  * Publish kv-cache events
    Signed-off-by: Ira <IRAR@il.ibm.com>
  * Fix lint errors
    Signed-off-by: Ira <IRAR@il.ibm.com>
  * Review fixes
    Signed-off-by: Ira <IRAR@il.ibm.com>
  * Sleep to allow previous sub to close
    Signed-off-by: Ira <IRAR@il.ibm.com>
  ---------
  Signed-off-by: Ira <IRAR@il.ibm.com>
  Signed-off-by: Sergey Marunich <marunich.s@gmail.com>
* Use same version of tokenizer in both Dockerfile and Makefile (#132)
  * - Use same version of tokenizer in both Dockerfile and Makefile
    - Fixes in readme file
    Signed-off-by: Maya Barnea <mayab@il.ibm.com>
  * updates according to PR's review
    Signed-off-by: Maya Barnea <mayab@il.ibm.com>
  ---------
  Signed-off-by: Maya Barnea <mayab@il.ibm.com>
  Signed-off-by: Sergey Marunich <marunich.s@gmail.com>
* Clarify failure injection rate documentation
  Removed redundant lines and updated comments and help text to clarify that 'failure-injection-rate' is the probability of injecting failures, not specifically tied to failure mode.
  Signed-off-by: Sergey Marunich <marunich.s@gmail.com>
* Set default failure injection rate to 0
  Signed-off-by: Sergey Marunich <marunich.s@gmail.com>
* rebase duplicates
  Signed-off-by: Sergey Marunich <marunich.s@gmail.com>
* re-base the changes
  Signed-off-by: Sergey Marunich <marunich.s@gmail.com>

  KV cache and tokenization related configuration (#125)
  Signed-off-by: Ira <IRAR@il.ibm.com>

  Publish kv-cache events (#126)
  * Publish kv-cache events
    Signed-off-by: Ira <IRAR@il.ibm.com>
  * Fix lint errors
    Signed-off-by: Ira <IRAR@il.ibm.com>
  * Review fixes
    Signed-off-by: Ira <IRAR@il.ibm.com>
  * Sleep to allow previous sub to close
    Signed-off-by: Ira <IRAR@il.ibm.com>
  ---------
  Signed-off-by: Ira <IRAR@il.ibm.com>
  Signed-off-by: Sergey Marunich <marunich.s@gmail.com>

  Use same version of tokenizer in both Dockerfile and Makefile (#132)
  * - Use same version of tokenizer in both Dockerfile and Makefile
    - Fixes in readme file
    Signed-off-by: Maya Barnea <mayab@il.ibm.com>
  * updates according to PR's review
    Signed-off-by: Maya Barnea <mayab@il.ibm.com>
  ---------
  Signed-off-by: Maya Barnea <mayab@il.ibm.com>
  Signed-off-by: Sergey Marunich <marunich.s@gmail.com>

  Replaces usage of param.NewOpt with openai.Int for MaxTokens and openai.Bool with param.NewOpt for IncludeUsage in simulator_test.go to align with updated API usage.
  Signed-off-by: Sergey Marunich <marunich.s@gmail.com>
* Update option constructors in simulator tests
  Replaces usage of param.NewOpt with openai.Int for MaxTokens and openai.Bool with param.NewOpt for IncludeUsage in simulator_test.go to align with updated API usage.
  Signed-off-by: Sergey Marunich <marunich.s@gmail.com>
* Document failure injection options in README
  Added descriptions for `failure-injection-rate` and `failure-types` configuration options to clarify their usage and defaults.
  Signed-off-by: Sergey Marunich <marunich.s@gmail.com>
* Set FailureInjectionRate default to 0 in config
  Changed the default value of FailureInjectionRate from 10 to 0 in newConfig to disable failure injection, which had been enabled by default under the previous, now deprecated, mode.
  Signed-off-by: Sergey Marunich <marunich.s@gmail.com>
* Refactor failure type usage and error response format
  Signed-off-by: Sergey Marunich <marunich.s@gmail.com>
* Refactor failure type flag handling and code formatting
  Signed-off-by: Sergey Marunich <marunich.s@gmail.com>
* Fix config validation and simulator test argument handling
  Signed-off-by: Sergey Marunich <marunich.s@gmail.com>
* remove duplicate
  Signed-off-by: Sergey Marunich <marunich.s@gmail.com>
* Refactor failure handling to use CompletionError struct
  Failure handling in the simulator now uses the CompletionError struct from the openai-server-api package, replacing custom error fields with a unified structure. This improves consistency in error responses and simplifies error injection logic. Associated tests and error handling code have been updated to reflect this change.
  Signed-off-by: Sergey Marunich <marunich.s@gmail.com>
* Use one type for all errors. Map code to type
  Signed-off-by: Ira <IRAR@il.ibm.com>
* Review comments
  Signed-off-by: Ira <IRAR@il.ibm.com>
---------
Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>
Signed-off-by: Sergey Marunich <marunich.s@gmail.com>
Signed-off-by: Ira <IRAR@il.ibm.com>
Signed-off-by: Maya Barnea <mayab@il.ibm.com>
Signed-off-by: Ira Rosen <irar@il.ibm.com>
Co-authored-by: Shmuel Kallner <kallner@il.ibm.com>
Co-authored-by: Ira Rosen <irar@il.ibm.com>
Co-authored-by: Maya Barnea <mayab@il.ibm.com>
No description provided.