Adds support for Vertex AI Unicorn #1277
Conversation
Adds support for unicorn
Adds simplified name for Unicorn.
Corrected the math.
Hey @AshGreh yes - a 2nd PR to help fix the Vertex AI pricing would be great - which pricing specifically was off?
Curious - how're you using litellm today?
Hello @krrishdholakia, thank you for the quick merge! I was mainly looking at the Bison models last night: they cost $0.00025 per 1,000 characters (≈250 tokens), so my estimate for the cost per token was 0.00025/250 = $0.000001/token.

We have an internal ChatGPT whose backend is built on LangChain. LangChain promises a lot and under-delivers on most fronts (non-OpenAI LLMs). We always imagined LangChain would be the ORM for LLMs, but that's far from the truth: each LLM has its own set of "preferred" queries, so what works for GPT-4 may not work for Unicorn or Claude, and vice versa. Our primary goal was to create a RAG-based ChatGPT for questions about internal data, but LangChain's tools don't dynamically pick a search engine and instead always query our Kendra cluster. Writing custom queries for the tools has given us great results with Unicorn and Claude v2.

From all the reading and conversations I've had with folks on Reddit, LangChain only works well for the GPT family (even with format customization), and its debugging flow is very difficult. Since my company is an AWS shop, having just Azure OpenAI won't work for us. In turn, if we're going to write custom queries for each model anyway, it doesn't make sense to use a heavy framework like LangChain, at least for the time being. Hence, we decided to try out LiteLLM to keep our LLM code as DRY as possible. :)
Hey @AshGreh, can you chat for 10 minutes this or next week? Want to make sure we're good for your scenario.
Hi @AshGreh wanted to follow up - we'd love to learn about how we can improve litellm for you, this is our cal link for your convenience: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat |
The module was crashing for Unicorn, and the stack trace showed that the model cost map didn't have an entry for it. I calculated the price per token based on https://cloud.google.com/vertex-ai/pricing, assuming four characters per token.
I also think the prices for the other Vertex AI models are off. If you agree, I can correct them in another PR.
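The conversion described above (Vertex AI bills per 1,000 characters, while LiteLLM's cost map is per token) can be sketched as a small helper. The 4-characters-per-token ratio is the heuristic assumed in this PR, not an official Vertex AI figure, and `price_per_token` is a hypothetical name for illustration:

```python
# Approximate a per-token price from Vertex AI's per-1,000-character pricing.
# Assumption from this PR: ~4 characters per token (so 1,000 chars ≈ 250 tokens).
CHARS_PER_TOKEN = 4

def price_per_token(price_per_1000_chars: float,
                    chars_per_token: int = CHARS_PER_TOKEN) -> float:
    """Convert a per-1,000-character price to an approximate per-token price."""
    tokens_per_1000_chars = 1000 / chars_per_token  # e.g. 250 tokens
    return price_per_1000_chars / tokens_per_1000_chars

# Bison example from the thread: $0.00025 per 1,000 characters
print(price_per_token(0.00025))  # ≈ $0.000001 per token
```

This matches the 0.00025/250 = $0.000001/token estimate quoted above; the same conversion would apply to the other per-character Vertex AI prices mentioned for a follow-up PR.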