JFYI - Your code also works with Ultra and Embeddings! #14
Comments
See scripts using your gem here: https://github.com/palladius/genai-googlecloud-scripts/tree/main/09-langchainrb-playground
Nice! I was trying to get this working but had failed. Where are you setting these? Is Ultra even a thing still? I thought they renamed it to Advanced, but maybe the model name stayed the same. Are you in the beta program?
I work for Google. I also found that embeddings work:
I got the model name from these docs: https://cloud.google.com/vertex-ai/generative-ai/docs/embeddings/get-text-embeddings
Ouch! So changing to vertex-ai-api, though it uses the same models, wasn't free, and now I have a $120 bill lol. Google having two very similar AI APIs, with one being free, is super frustrating. Any idea how to see which embeddings are available in generative-language-api? I can't figure out how to get it to list them, and the old 'embedding-001' is the only one I can find a reference to online. The new ones in vertex-ai-api like text-embedding-preview-0409 aren't there. Kinda wish I'd found this before, but https://ollama.com/ is pretty amazing for running one of the free models locally.
@inspire22 Ouch, sorry to hear that! Yeah, Ollama is great; I support a gem for using it: ollama-ai
About embeddings: I added new methods for embedding in version 4.0.0 of the gem. I tested, and I can get embedding working with the model
result = client.embed_content(
  { content: { parts: [{ text: 'What is life?' }] } }
)
Or Vertex AI API:
result = client.predict(
  { instances: [{ content: 'What is life?' }],
    parameters: { autoTruncate: true } }
)
@palladius It was great to meet you! Thanks for reaching out. I generated a script to test which models I have access to. Here's my result:
@inspire22 This may help you with "what model works with what API" ☝️ I will do some benchmarks with
Brilliant list, thanks! How'd you get the list of models/embeddings to test? I've been using text-embedding-preview-0409 since it was (and still is quite high) at the top of the MTEB leaderboard: https://huggingface.co/spaces/mteb/leaderboard Ollama - oh nice, I hadn't noticed you were behind both, thanks for the great projects!
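For anyone switching between the two services, here is a minimal sketch of the two embedding request shapes quoted earlier in this thread, side by side. The helper names `embedding_payload` and `embedding_method` are hypothetical (they are not part of the gem); only the payload shapes and the `embed_content` / `predict` method names come from the comment above.

```ruby
# Hypothetical helper (not part of the gem): build the embedding request
# body for each service, mirroring the two payloads quoted in this thread.
def embedding_payload(service, text)
  case service
  when 'generative-language-api'
    # Shape passed to client.embed_content(...)
    { content: { parts: [{ text: text }] } }
  when 'vertex-ai-api'
    # Shape passed to client.predict(...)
    { instances: [{ content: text }], parameters: { autoTruncate: true } }
  else
    raise ArgumentError, "unknown service: #{service}"
  end
end

# Hypothetical helper: name of the client method that takes the payload.
def embedding_method(service)
  case service
  when 'generative-language-api' then :embed_content
  when 'vertex-ai-api' then :predict
  else raise ArgumentError, "unknown service: #{service}"
  end
end
```

With a `client` built via `Gemini.new(...)` for the matching service, the call would presumably look like `client.public_send(embedding_method(service), embedding_payload(service, 'What is life?'))`. That last step is untested here and assumes the client was configured with an embedding-capable model.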
Unfortunately, to my knowledge the Vertex AI API has no endpoint for listing models. With Generative Language API, this endpoint works for me:
In the gem, you can use it like this:
client = Gemini.new(
  credentials: {
    service: 'generative-language-api',
    api_key: ENV.fetch('GOOGLE_API_KEY', nil)
  },
  options: { model: 'text-embedding-004', server_sent_events: true }
)
models = client.models
{ 'models' =>
  [{ 'name' => 'models/gemini-1.0-pro',
     'version' => '001',
     'displayName' => 'Gemini 1.0 Pro',
     'description' => 'The best model for scaling across a wide range of tasks',
     'inputTokenLimit' => 30_720,
     'outputTokenLimit' => 2048,
     'supportedGenerationMethods' => %w[generateContent countTokens],
     'temperature' => 0.9,
     'topP' => 1 },
   { 'name' => 'models/gemini-1.0-pro-001',
     'version' => '001',
     'displayName' => 'Gemini 1.0 Pro 001 (Tuning)',
     'description' => 'The best model for scaling across a wide range of tasks. This is a stable model that supports tuning.',
     'inputTokenLimit' => 30_720,
     'outputTokenLimit' => 2048,
     'supportedGenerationMethods' => %w[generateContent countTokens createTunedModel],
     'temperature' => 0.9,
     'topP' => 1 }] }
The others I manually extracted from reading these pages:
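A small sketch for the "which embeddings are available" question raised earlier in this thread: assuming the `client.models` response has the shape shown above, the `supportedGenerationMethods` field can be used to filter for embedding-capable models. The helper name `models_supporting` is hypothetical, and the `models/embedding-001` entry in the sample is illustrative only (it is not part of the output quoted above); the `embedContent` method name is an assumption about how the Generative Language API labels embedding models.

```ruby
# Hypothetical helper: names of models whose supportedGenerationMethods
# include the given method (e.g. 'embedContent' for embeddings).
def models_supporting(response, method_name)
  response.fetch('models', [])
          .select { |m| m.fetch('supportedGenerationMethods', []).include?(method_name) }
          .map { |m| m['name'] }
end

# Trimmed-down sample in the same shape as the client.models result.
SAMPLE_RESPONSE = {
  'models' => [
    { 'name' => 'models/gemini-1.0-pro',
      'supportedGenerationMethods' => %w[generateContent countTokens] },
    { 'name' => 'models/gemini-1.0-pro-001',
      'supportedGenerationMethods' => %w[generateContent countTokens createTunedModel] },
    # Illustrative entry, NOT taken from the output quoted in this thread.
    { 'name' => 'models/embedding-001',
      'supportedGenerationMethods' => %w[embedContent] }
  ]
}.freeze

puts models_supporting(SAMPLE_RESPONSE, 'embedContent')
# prints: models/embedding-001
```

Passing the real `client.models` result instead of `SAMPLE_RESPONSE` should work the same way, as long as the response keeps this structure.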
I've tried to substitute 'gemini-pro' with other strings provided by the Gemini Vertex AI playground code, and JFYI they both work:
JFYI - Looking fwd to speaking to you in person (Laurencio is arranging a chat).
Riccardo