Collaborator

@shmuelk shmuelk commented Aug 6, 2025

No description provided.

Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>
@shmuelk shmuelk merged commit 0308c8f into llm-d:main Aug 6, 2025
2 checks passed
@shmuelk shmuelk deleted the ci-updates branch August 6, 2025 13:17
smarunich pushed a commit to smarunich/llm-d-inference-sim that referenced this pull request Aug 14, 2025
Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>
Signed-off-by: Sergey Marunich <marunich.s@gmail.com>
irar2 added a commit that referenced this pull request Aug 28, 2025
* Add definition of new action input (#123)

Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>
Signed-off-by: Sergey Marunich <marunich.s@gmail.com>

* KV cache and tokenization related configuration (#125)

Signed-off-by: Ira <IRAR@il.ibm.com>
Signed-off-by: Sergey Marunich <marunich.s@gmail.com>

* Another attempt at adding a latest tag only on release builds (#124)

Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>
Signed-off-by: Sergey Marunich <marunich.s@gmail.com>

* Publish kv-cache events (#126)

* Publish kv-cache events

Signed-off-by: Ira <IRAR@il.ibm.com>

* Fix lint errors

Signed-off-by: Ira <IRAR@il.ibm.com>

* Review fixes

Signed-off-by: Ira <IRAR@il.ibm.com>

* Sleep to allow previous sub to close

Signed-off-by: Ira <IRAR@il.ibm.com>

---------

Signed-off-by: Ira <IRAR@il.ibm.com>
Signed-off-by: Sergey Marunich <marunich.s@gmail.com>

* Add failure injection mode to simulator

Introduces a 'failure' mode to the simulator, allowing random injection of OpenAI API-compatible error responses for testing error handling. Adds configuration options for failure injection rate and specific failure types, implements error response logic, and updates documentation and tests to cover the new functionality.

Signed-off-by: Sergey Marunich <marunich.s@gmail.com>

* Refactor failure injection and update simulator error handling

Failure injection is now controlled by a dedicated 'failure-injection-rate' parameter instead of a separate 'failure' mode. Failure type constants are centralized, and error handling in the simulator is refactored to use a unified method for sending error responses. Documentation and tests are updated to reflect these changes, and the OpenAI error response format now includes an 'object' field.

Signed-off-by: Sergey Marunich <marunich.s@gmail.com>

* Make tokenizer version configurable from Dockerfile

Extracts TOKENIZER_VERSION from the Dockerfile and uses it in the download-tokenizer target. This allows the Makefile to automatically use the correct tokenizer version specified in the Dockerfile, improving maintainability and consistency.

Signed-off-by: Sergey Marunich <marunich.s@gmail.com>
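
The commit above does this extraction in the Makefile. The same lookup can be sketched in Go for illustration, assuming the Dockerfile declares the version as `ARG TOKENIZER_VERSION=<value>`; that declaration form, the helper name `tokenizerVersion`, and the sample version string are assumptions, not taken from the repository.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// tokenizerVersion scans Dockerfile text for a line of the assumed form
// "ARG TOKENIZER_VERSION=<value>" and returns the value if one is found.
func tokenizerVersion(dockerfile string) (string, bool) {
	const prefix = "ARG TOKENIZER_VERSION="
	sc := bufio.NewScanner(strings.NewReader(dockerfile))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if strings.HasPrefix(line, prefix) {
			return strings.TrimSpace(strings.TrimPrefix(line, prefix)), true
		}
	}
	return "", false
}

func main() {
	// Hypothetical Dockerfile fragment; the version value is made up.
	sample := "FROM scratch\nARG TOKENIZER_VERSION=v0.0.0-example\n"
	if v, ok := tokenizerVersion(sample); ok {
		fmt.Println("tokenizer version:", v)
	}
}
```

Keeping the version in one file and reading it from the other avoids the two drifting apart, which is the maintainability point the commit message makes.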

* Use same version of tokenizer in both Dockerfile and Makefile (#132)

* - Use same version of tokenizer in both Dockerfile and Makefile
- Fixes in readme file

Signed-off-by: Maya Barnea <mayab@il.ibm.com>

* Updates according to PR review

Signed-off-by: Maya Barnea <mayab@il.ibm.com>

---------

Signed-off-by: Maya Barnea <mayab@il.ibm.com>
Signed-off-by: Sergey Marunich <marunich.s@gmail.com>

* Clarify failure injection rate documentation

Removed redundant lines and updated comments and help text to clarify that 'failure-injection-rate' is the probability of injecting failures, not specifically tied to failure mode.

Signed-off-by: Sergey Marunich <marunich.s@gmail.com>

* Set default failure injection rate to 0

Signed-off-by: Sergey Marunich <marunich.s@gmail.com>

* rebase duplicates

Signed-off-by: Sergey Marunich <marunich.s@gmail.com>

* re-base the changes

Signed-off-by: Sergey Marunich <marunich.s@gmail.com>

* Update option constructors in simulator tests

Replaces param.NewOpt with openai.Int for MaxTokens, and openai.Bool with param.NewOpt for IncludeUsage, in simulator_test.go to align with updated API usage.

Signed-off-by: Sergey Marunich <marunich.s@gmail.com>

* Document failure injection options in README

Added descriptions for `failure-injection-rate` and `failure-types` configuration options to clarify their usage and defaults.

Signed-off-by: Sergey Marunich <marunich.s@gmail.com>

* Set FailureInjectionRate default to 0 in config

Changed the default value of FailureInjectionRate from 10 to 0 in newConfig, so failure injection is disabled by default; it had been enabled by default under the previous, now-deprecated failure mode.

Signed-off-by: Sergey Marunich <marunich.s@gmail.com>

* Refactor failure type usage and error response format

Signed-off-by: Sergey Marunich <marunich.s@gmail.com>

* Refactor failure type flag handling and code formatting

Signed-off-by: Sergey Marunich <marunich.s@gmail.com>

* Fix config validation and simulator test argument handling

Signed-off-by: Sergey Marunich <marunich.s@gmail.com>

* remove duplicate

Signed-off-by: Sergey Marunich <marunich.s@gmail.com>

* Refactor failure handling to use CompletionError struct

Failure handling in the simulator now uses the CompletionError struct from the openai-server-api package, replacing custom error fields with a unified structure. This improves consistency in error responses and simplifies error injection logic. Associated tests and error handling code have been updated to reflect this change.

Signed-off-by: Sergey Marunich <marunich.s@gmail.com>

* Use one type for all errors. Map code to type

Signed-off-by: Ira <IRAR@il.ibm.com>

* Review comments

Signed-off-by: Ira <IRAR@il.ibm.com>

---------

Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>
Signed-off-by: Sergey Marunich <marunich.s@gmail.com>
Signed-off-by: Ira <IRAR@il.ibm.com>
Signed-off-by: Maya Barnea <mayab@il.ibm.com>
Signed-off-by: Ira Rosen <irar@il.ibm.com>
Co-authored-by: Shmuel Kallner <kallner@il.ibm.com>
Co-authored-by: Ira Rosen <irar@il.ibm.com>
Co-authored-by: Maya Barnea <mayab@il.ibm.com>