
[Feature]: local-only load model cost #1434

Closed

timlrx opened this issue Jan 13, 2024 · 11 comments
Labels
enhancement New feature or request

Comments

@timlrx

timlrx commented Jan 13, 2024

The Feature

Currently, model_cost is initialised by default by fetching the model price JSON file from the repository.

Lazy-loading model_cost would allow a user to override it and would eliminate the additional network request when it is not required. Sketch of a suggested implementation:

model_cost = None

def load_model_cost():
    global model_cost
    if model_cost is None:
        # default fallback behaviour: fetch the map from the repository
        model_cost = get_model_cost_map(url=model_cost_map_url)
For every function that uses litellm.model_cost, add a load_model_cost() call. Affected functions include cost_per_token, register_model, get_max_tokens, get_model_info, and trim_messages.
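As an illustration, one affected function might apply the guard like this (a minimal sketch; the function body and pricing keys are assumptions for illustration, not litellm's actual implementation):

def cost_per_token(model: str, prompt_tokens: int, completion_tokens: int):
    load_model_cost()  # populates model_cost on first use only
    pricing = model_cost.get(model, {})
    prompt_cost = prompt_tokens * pricing.get("input_cost_per_token", 0.0)
    completion_cost = completion_tokens * pricing.get("output_cost_per_token", 0.0)
    return prompt_cost, completion_cost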

Thanks for considering!

Motivation, pitch

  • I would like to load model_cost locally, but there does not currently seem to be a way to do so, since it is set when litellm is imported
  • Deferring the load might also benefit other users who do not use the cost-related functions

Twitter / LinkedIn details

No response

@timlrx timlrx added the enhancement New feature or request label Jan 13, 2024
@krrishdholakia
Contributor

krrishdholakia commented Jan 13, 2024

Hey @timlrx, we already support local model cost: https://docs.litellm.ai/docs/completion/token_usage#8-register_model

Is this what you were looking for?

--
Loading the map is how we add support for new models without requiring users to upgrade versions each time.

@timlrx
Author

timlrx commented Jan 13, 2024

Hi @krrishdholakia, not quite. Given the following code:

import litellm

litellm.register_model(model_cost={"gpt-4": {"max_tokens": 8192}})

Because litellm's init runs first, the user has to pull the model cost map from GitHub before registering a new model or overriding the model_cost object.

Hence, I am proposing to load the model cost map only when it is needed, keeping the current logic (pull from GitHub, so the user does not need to upgrade manually). This would also allow a user to override model_cost directly, so that no network request needs to be made, as sketched below.
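A minimal sketch of the intended usage under the proposed scheme (the pricing keys and figures are illustrative only):

import litellm

# Set a fully local map before any cost function runs; under the
# proposed lazy loading, later calls would find model_cost already
# set and skip the GitHub fetch entirely.
litellm.model_cost = {
    "gpt-4": {
        "max_tokens": 8192,
        "input_cost_per_token": 0.00003,   # illustrative figure
        "output_cost_per_token": 0.00006,  # illustrative figure
    }
}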

@krrishdholakia
Contributor

@timlrx seeing this late - sorry! Can we do a quick call on this? I want to understand this better.

Attaching my calendly if that helps - https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat

@krrishdholakia
Contributor

Hi @timlrx, is your issue now solved with https://docs.litellm.ai/docs/proxy/custom_pricing?

If not, can you help me understand what the problem you're currently facing is?

@timlrx
Author

timlrx commented Jan 25, 2024

Sorry for the slow reply, but it would be pretty easy for me to show over a call. Let's chat tomorrow (might be later today for you).

@Manouchehri
Collaborator

Related side-comment: eventually (in a few months) we will be using LiteLLM in an "offline"/isolated environment. (To be more specific, we will have access to api.openai.com and to our VMs, but nothing else; i.e., downloading more stuff at runtime from GitHub.com will not work.)

I think the way to test this would be to run a CI/CD test on LiteLLM with firewall rules applied at runtime.

@krrishdholakia krrishdholakia changed the title [Feature]: Lazy load model cost [Feature]: local-only load model cost Feb 2, 2024
@krrishdholakia
Contributor

updating title based on conversation

@krrishdholakia
Contributor

This is now supported - ec427ae

If you are behind a firewall and just want to use the local copy of the model cost map, you can do so like this:

export LITELLM_LOCAL_MODEL_COST_MAP="True"

Note: this means you will need to upgrade litellm to get updated pricing and newer models.
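The same flag can also be set from Python, assuming it is read when litellm is imported (a sketch under that assumption, not verified against every version):

import os

# Assumption: LITELLM_LOCAL_MODEL_COST_MAP is read at import time,
# so it must be set before litellm is imported.
os.environ["LITELLM_LOCAL_MODEL_COST_MAP"] = "True"

import litellm  # uses the bundled local model cost map; no network fetch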

@krrishdholakia
Contributor

@timlrx feel free to close the issue if this solves your problem

@timlrx
Author

timlrx commented Feb 2, 2024

@krrishdholakia, yes, the changes look good to me, thanks!

@Manouchehri
Collaborator

Thanks! I'm going to set it on my Cloud Run deployment as well, since the extra fetch probably isn't helping cold start times (even if the impact is small). =P
