
AI jobs getting senderNonce: too many values #3378

Open · Titan-Node opened this issue Feb 2, 2025 · 3 comments
Labels: status: triage (this issue has not been evaluated yet)

Comments

@Titan-Node

Describe the bug
When sending AI jobs for expensive models (such as DeepSeek), or, in the LLM pipeline, sending a large max_tokens parameter such as 163K tokens, a lot of payment tickets are sent at once.
The Orch will show this message:

Error receiving ticket sessionID=33_meta-llama/Meta-Llama-3.1-8B-Instruct recipientRandHash=7905016d8d201e4bb0d13f78234e107018b2effe42343325691a55844a1d54cf senderNonce=178: invalid ticket senderNonce: too many values sender=0x5bE44e23041E93CDF9bCd5A0968524e104e38ae1 nonce=178

There is currently a nonce cap of 150. We need the Orch to accept an unbounded nonce count, or some other way to manage this limit.
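For reference, a minimal sketch of the kind of per-session nonce cap that would produce this error; the names here (maxSenderNonces, senderNonceStore) are assumptions for illustration, not the actual go-livepeer implementation:

```go
// Minimal sketch (assumed names, not the actual go-livepeer code) of a
// per-session sender-nonce cap that yields "too many values" errors once a
// session has seen maxSenderNonces tickets.
package pm

import "fmt"

const maxSenderNonces = 150 // the cap referenced in this issue

type senderNonceStore struct {
	// nonces seen per recipientRandHash, i.e. per ticket-params session
	nonces map[[32]byte]map[uint32]bool
}

func newSenderNonceStore() *senderNonceStore {
	return &senderNonceStore{nonces: make(map[[32]byte]map[uint32]bool)}
}

func (s *senderNonceStore) validateNonce(recipientRandHash [32]byte, nonce uint32) error {
	seen, ok := s.nonces[recipientRandHash]
	if !ok {
		seen = make(map[uint32]bool)
		s.nonces[recipientRandHash] = seen
	}
	if seen[nonce] {
		return fmt.Errorf("invalid ticket senderNonce: already seen nonce=%d", nonce)
	}
	// Once maxSenderNonces distinct nonces are tracked for this session, every
	// further ticket is rejected, which is the failure mode reported above.
	if len(seen) >= maxSenderNonces {
		return fmt.Errorf("invalid ticket senderNonce: too many values nonce=%d", nonce)
	}
	seen[nonce] = true
	return nil
}
```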

For instance, if LLM context windows keep increasing or prices keep rising, which I believe they will, we need higher ticket redemption throughput.

This also shows up when multiple jobs are sent at the same time: the ticket nonce stacks up and reaches the limit quickly.

To Reproduce
Steps to reproduce the behavior:

  1. Start AI Gateway
  2. Start Orchestrator priced at 7 USD per 1 million tokens
  3. Send an LLM request with max_tokens set to 163K
  4. See error
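Back-of-the-envelope on the scale involved (figures are rough assumptions: ETH at about $3,000 and the 8 gwei default ticketEV mentioned in the comments below; one ticket covers roughly ticketEV worth of fees):

```
163,000 tokens × $7 per 1M tokens  ≈ $1.14 per request
$1.14 at ~$3,000 per ETH           ≈ 0.00038 ETH ≈ 380,000 gwei
380,000 gwei / 8 gwei per ticket   ≈ 47,500 tickets   (the cap is 150)
```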
@leszko (Contributor) commented Feb 3, 2025

I believe this is a long-standing bug: when many tickets are sent at once, we sometimes hit this error.

We can play with ticketEV to receive fewer tickets; this is what we've been doing so far.
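Purely illustrative; -ticketEV is the orchestrator flag being adjusted here, but the value below is a placeholder and its units should be checked against the node documentation:

```sh
# Raise the per-ticket expected value so each payment needs fewer tickets.
# Value and units are illustrative only, not a recommendation.
livepeer -orchestrator -ticketEV <higher-value> [other flags...]
```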

@ad-astra-video (Collaborator)

I think it would be nice for Gateways to be able to use a range of ticket EV instead of a single number set by the orchestrator. Gateway operators can set up a separate gateway per pipeline to adjust which ticketEV parameters are acceptable. Orchestrators, on the other hand, have to set one ticketEV that applies to all pipelines, which leads to significant overpayment on some pipelines and too many tickets on others.

With a ticketEV range, a quite large maxTicketEV (say 10000 gwei) could be set to cover the larger payments needed for things like text-to-video, while minTicketEV could be 8 gwei (the current default ticketEV) to support the smaller streamed payments used in transcoding and live video AI. This would help minimize overpayments if the Gateway can right-size the ticket EV to get within 10-20% of the payment amount needed.
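A rough sketch of what right-sizing within such a range could look like on the Gateway side; chooseTicketEV and the gwei values are hypothetical, not an existing API:

```go
// Hypothetical sketch of right-sizing a per-ticket EV inside an
// orchestrator-advertised [minTicketEV, maxTicketEV] range. Names are made
// up for illustration; this is not an existing go-livepeer API.
package main

import (
	"fmt"
	"math/big"
)

// ceilDiv returns ceil(a / b) for positive big.Ints.
func ceilDiv(a, b *big.Int) *big.Int {
	q, r := new(big.Int).QuoRem(a, b, new(big.Int))
	if r.Sign() != 0 {
		q.Add(q, big.NewInt(1))
	}
	return q
}

// chooseTicketEV picks an EV in [minEV, maxEV] so the payment is covered by
// the fewest tickets possible while keeping overpayment small.
func chooseTicketEV(payment, minEV, maxEV *big.Int) (ev, numTickets *big.Int) {
	numTickets = ceilDiv(payment, maxEV) // fewest tickets with EV <= maxEV
	ev = ceilDiv(payment, numTickets)    // spread the payment evenly, rounding up
	if ev.Cmp(minEV) < 0 {
		ev = new(big.Int).Set(minEV) // respect the orchestrator's floor
	}
	return ev, numTickets
}

func main() {
	// Values in gwei for readability: a ~380,000 gwei payment with the
	// 8 gwei / 10,000 gwei range discussed above.
	payment := big.NewInt(380_000)
	minEV := big.NewInt(8)
	maxEV := big.NewInt(10_000)

	ev, n := chooseTicketEV(payment, minEV, maxEV)
	overpay := new(big.Int).Sub(new(big.Int).Mul(ev, n), payment)
	fmt.Printf("ev=%v gwei tickets=%v overpayment=%v gwei\n", ev, n, overpay)
	// Output: ev=10000 gwei tickets=38 overpayment=0 gwei
}
```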

A next step that would help visualize this is metrics or log lines that indicate the balances remaining when payment/balance sessions are cleaned up.

@leszko (Contributor) commented Feb 3, 2025

Hmm, and what about the idea that the ticketEV would be set by the Gateway instead of the Orchestrator? Then we could optimize to "always" send only 1 ticket. Wdyt?
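(For reference, in the hypothetical chooseTicketEV sketch above this is the degenerate case maxEV >= payment: ceilDiv(payment, maxEV) = 1, so the whole fee fits in a single ticket whose EV equals the payment, clamped to minEV.)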
