
Adds support for Vertex AI Unicorn #1277

Merged · 3 commits merged into BerriAI:main on Dec 30, 2023
Conversation

@AshGreh (Contributor) commented Dec 30, 2023

The module was crashing for Unicorn, and when I went through the stack trace it looked like the model cost map didn't have an entry for it. I calculated the price per token based on https://cloud.google.com/vertex-ai/pricing, assuming that 4 characters equal one token.

I also think the pricing for the other Vertex AI models is off. If you agree, I can correct them in another PR.


@krrishdholakia krrishdholakia merged commit 05795ea into BerriAI:main Dec 30, 2023
2 checks passed
@krrishdholakia (Contributor)

Hey @AshGreh, yes, a second PR to help fix the Vertex AI pricing would be great. Which pricing specifically was off?

@krrishdholakia (Contributor)

Curious: how are you using litellm today?

@AshGreh (Contributor, Author) commented Dec 30, 2023

Hello @krrishdholakia, thank you for the quick merge!

I was mainly looking at the Bison models last night. They cost $0.00025 per 1,000 characters, which is 250 tokens at 4 characters per token, so my estimate for the cost per token was 0.00025 / 250 = $0.000001/token.
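The conversion above can be sketched as a small helper. This is a hypothetical illustration of the arithmetic, not litellm's actual code; it assumes Vertex AI's published price of $0.00025 per 1,000 characters and the 4-characters-per-token heuristic from the PR.

```python
# Hypothetical sketch of the per-token price estimate discussed above.
# Assumptions: $0.00025 per 1,000 characters (Vertex AI Bison pricing)
# and 4 characters per token (the heuristic used in this PR).
PRICE_PER_1K_CHARS = 0.00025
CHARS_PER_TOKEN = 4

def price_per_token(price_per_1k_chars: float = PRICE_PER_1K_CHARS,
                    chars_per_token: int = CHARS_PER_TOKEN) -> float:
    """Convert a per-1,000-character price into a per-token price."""
    tokens_per_1k_chars = 1000 / chars_per_token  # 250 tokens per 1,000 chars
    return price_per_1k_chars / tokens_per_1k_chars

print(price_per_token())  # 1e-06, i.e. $0.000001 per token
```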

We have an internal ChatGPT whose backend is built on LangChain. LangChain promises a lot and under-delivers on most fronts (non-OpenAI LLMs). We always imagined LangChain as the ORM for LLMs, but that's far from the truth: each LLM has its own preferred query style, so what works for GPT-4 may not work for Unicorn or Claude, and vice versa.

Our primary goal was to build a RAG-based ChatGPT for questions about internal data, but LangChain's tools don't dynamically pick a search engine and instead always query our Kendra cluster. Writing custom queries for the tools has given us great results with Unicorn and Claude v2. From all my reading and conversations with folks on Reddit, LangChain only works well for the GPT family (even with format customization) and has a very difficult debugging flow, and since my company is an AWS shop, having just Azure OpenAI won't work for us. In turn, if we're going to write custom queries for each model anyway, it doesn't make sense for us to use a heavy framework like LangChain, at least for the time being.

Hence, we decided to try out LiteLLM to keep our LLM code as DRY as possible. :)

@krrishdholakia (Contributor)

Hey @AshGreh,

Can you chat for 10 minutes this week or next?
https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat

Want to make sure we're good for your scenario.

@ishaan-jaff (Contributor)

Hi @AshGreh, wanted to follow up. We'd love to learn how we can improve litellm for you; here is our cal link for your convenience: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat

3 participants