Amazon bedrock plugin for llm cli #785

Open · 1 task
irthomasthomas opened this issue Apr 3, 2024 · 1 comment

Labels

Anthropic-ai: Related to anthropic.ai and their Claude LLMs
CLI-UX: Command Line Interface user experience and best practices
code-generation: code generation models and tools like copilot and aider
data-validation: Validating data structures and formats
llm: Large Language Models
Models: LLM and ML model repos and links
Papers: Research papers

Comments

irthomasthomas (Owner) commented:
Amazon bedrock plugin for llm cli

Description: Plugin for LLM adding support for Anthropic's Claude models.

Installation

Install this plugin in the same environment as LLM:

llm install llm-bedrock-anthropic

Configuration

You will need to configure AWS credentials and a region the way boto3 normally expects, via its shared configuration files or environment variables.

For example, to use the region us-west-2 and AWS credentials under the personal profile, set the environment variables:

export AWS_DEFAULT_REGION=us-west-2
export AWS_PROFILE=personal
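
If you prefer boto3's shared configuration files to environment variables, the same settings can live in ~/.aws/config and ~/.aws/credentials; the personal profile shown here is just an illustration:

# ~/.aws/config
[profile personal]
region = us-west-2

# ~/.aws/credentials
[personal]
aws_access_key_id = ...
aws_secret_access_key = ...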

Usage

This plugin adds models called bedrock-claude and bedrock-claude-instant.

You can query them like this:

llm -m bedrock-claude-instant "Ten great names for a new space station"
llm -m bedrock-claude "Compare and contrast the leadership styles of Abraham Lincoln and Boris Johnson."
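
To confirm the plugin registered its models, you can list the models llm knows about; the two Bedrock entries should appear in the output (exact formatting varies by llm version):

llm models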

Options

  • max_tokens_to_sample, default 8_191: The maximum number of tokens to generate before stopping.

Use like this:

llm -m bedrock-claude -o max_tokens_to_sample 20 "Sing me the alphabet"

Here is the alphabet song:

A B C D E F G
H I J

URL: https://github.com/sblakey/llm-bedrock-anthropic

Suggested labels

irthomasthomas added the Anthropic-ai, CLI-UX, code-generation, data-validation, llm, Models, and Papers labels on Apr 3, 2024
irthomasthomas (Owner, Author) commented:

Related content

#361: gorilla-llm/gorilla-cli: LLMs for your CLI

Details: Similarity score 0.87

- [ ] [gorilla-llm/gorilla-cli: LLMs for your CLI](https://github.com/gorilla-llm/gorilla-cli)

Gorilla CLI

Gorilla CLI powers your command-line interactions with a user-centric tool. Simply state your objective, and Gorilla CLI will generate potential commands for execution. Gorilla today supports ~1500 APIs, including Kubernetes, AWS, GCP, Azure, GitHub, Conda, Curl, Sed, and many more. No more recalling intricate CLI arguments! 🦍

Developed by UC Berkeley as a research prototype, Gorilla-CLI prioritizes user control and confidentiality:

Commands are executed solely with your explicit approval.
While we utilize queries and error logs (stderr) for model enhancement, we NEVER collect output data (stdout).
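
As a rough illustration of the workflow (the query below is made up; gorilla is the command the project README documents), you phrase the goal in plain English:

gorilla "list all files modified in the last 24 hours"

Gorilla CLI then presents candidate commands and only runs the one you explicitly approve.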

Suggested labels

{ "key": "llm-evaluation", "value": "Evaluating the performance and behavior of Large Language Models through human-written evaluation sets" }
{ "key": "llm-serving-optimisations", "value": "Tips, tricks and tools to speed up the inference of Large Language Models" }

#396: astra-assistants-api: A backend implementation of the OpenAI beta Assistants API

Details: Similarity score 0.87

- [ ] [datastax/astra-assistants-api: A backend implementation of the OpenAI beta Assistants API](https://github.com/datastax/astra-assistants-api)

Astra Assistant API Service

A drop-in compatible service for the OpenAI beta Assistants API with support for persistent threads, files, assistants, messages, retrieval, function calling and more using AstraDB (DataStax's db as a service offering powered by Apache Cassandra and jvector).

Compatible with existing OpenAI apps via the OpenAI SDKs by changing a single line of code.

Getting Started

  1. Create an Astra DB Vector database
  2. Replace the following code:
client = OpenAI(
    api_key=OPENAI_API_KEY,
)

with:

client = OpenAI(
    base_url="https://open-assistant-ai.astra.datastax.com/v1", 
    api_key=OPENAI_API_KEY,
    default_headers={
        "astra-api-token": ASTRA_DB_APPLICATION_TOKEN,
    }
)

Or, if you have an existing astra db, you can pass your db_id in a second header:

client = OpenAI(
    base_url="https://open-assistant-ai.astra.datastax.com/v1", 
    api_key=OPENAI_API_KEY,
    default_headers={
        "astra-api-token": ASTRA_DB_APPLICATION_TOKEN,
        "astra-db-id": ASTRA_DB_ID
    }
)
  3. Create an assistant
assistant = client.beta.assistants.create(
  instructions="You are a personal math tutor. When asked a math question, write and run code to answer the question.",
  model="gpt-4-1106-preview",
  tools=[{"type": "retrieval"}]
)

By default, the service uses AstraDB as the database/vector store and OpenAI for embeddings and chat completion.

Third party LLM Support

We now support many third party models for both embeddings and completion thanks to litellm. Pass the API key of your service using the api-key and embedding-model headers.

For AWS Bedrock, you can pass additional custom headers:

client = OpenAI(
    base_url="https://open-assistant-ai.astra.datastax.com/v1", 
    api_key="NONE",
    default_headers={
        "astra-api-token": ASTRA_DB_APPLICATION_TOKEN,
        "embedding-model": "amazon.titan-embed-text-v1",
        "LLM-PARAM-aws-access-key-id": BEDROCK_AWS_ACCESS_KEY_ID,
        "LLM-PARAM-aws-secret-access-key": BEDROCK_AWS_SECRET_ACCESS_KEY,
        "LLM-PARAM-aws-region-name": BEDROCK_AWS_REGION,
    }
)

And, as before, specify the custom model for the assistant:

assistant = client.beta.assistants.create(
    name="Math Tutor",
    instructions="You are a personal math tutor. Answer questions briefly, in a sentence or less.",
    model="meta.llama2-13b-chat-v1",
)

Additional examples including third party LLMs (bedrock, cohere, perplexity, etc.) can be found under examples.

To run the examples using poetry:

  1. Create a .env file in this directory with your secrets.
  2. Run:
poetry install
poetry run python examples/completion/basic.py
poetry run python examples/retreival/basic.py
poetry run python examples/function-calling/basic.py

Coverage

See our coverage report here.

Roadmap

  • Support for other embedding models and LLMs
  • Function calling
  • Pluggable RAG strategies
  • Streaming support

Suggested labels

{ "key": "llm-function-calling", "value": "Integration of function calling with Large Language Models (LLMs)" }

#183: litellm: Call all LLM APIs using the OpenAI format. Use Bedrock, Azure, OpenAI, Cohere, Anthropic, Ollama, Sagemaker, HuggingFace, Replicate (100+ LLMs)

Details: Similarity score 0.87

- [ ] [BerriAI/litellm: Call all LLM APIs using the OpenAI format. Use Bedrock, Azure, OpenAI, Cohere, Anthropic, Ollama, Sagemaker, HuggingFace, Replicate (100+ LLMs)](https://github.com/BerriAI/litellm)

Call all LLM APIs using the OpenAI format. Use Bedrock, Azure, OpenAI, Cohere, Anthropic, Ollama, Sagemaker, HuggingFace, Replicate (100+ LLMs)

#678: chroma/README.md at main · chroma-core/chroma

Details: Similarity score 0.86

- [ ] [chroma/README.md at main · chroma-core/chroma](https://github.com/chroma-core/chroma/blob/main/README.md?plain=1)

chroma/README.md at main · chroma-core/chroma

Chroma logo

Chroma - the open-source embedding database.
The fastest way to build Python or JavaScript LLM apps with memory!

Discord | License | Docs | Homepage


pip install chromadb # python client
# for javascript, npm install chromadb!
# for client-server mode, chroma run --path /chroma_db_path

The core API is only 4 functions (run our 💡 Google Colab or Replit template):

import chromadb
# setup Chroma in-memory, for easy prototyping. Can add persistence easily!
client = chromadb.Client()

# Create collection. get_collection, get_or_create_collection, delete_collection also available!
collection = client.create_collection("all-my-documents")

# Add docs to the collection. Can also update and delete. Row-based API coming soon!
collection.add(
    documents=["This is document1", "This is document2"], # we handle tokenization, embedding, and indexing automatically. You can skip that and add your own embeddings as well
    metadatas=[{"source": "notion"}, {"source": "google-docs"}], # filter on these!
    ids=["doc1", "doc2"], # unique for each doc
)

# Query/search 2 most similar results. You can also .get by id
results = collection.query(
    query_texts=["This is a query document"],
    n_results=2,
    # where={"metadata_field": "is_equal_to_this"}, # optional filter
    # where_document={"$contains":"search_string"}  # optional filter
)
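
The query call returns a dict of parallel lists keyed by "ids", "documents", "metadatas" and "distances", with one inner list per query text. A minimal way to inspect the matches from the call above (the printing choices are illustrative):

for doc_id, doc, meta in zip(results["ids"][0], results["documents"][0], results["metadatas"][0]):
    print(doc_id, meta["source"], doc)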

Features

  • Simple: Fully-typed, fully-tested, fully-documented == happiness
  • Integrations: 🦜️🔗 LangChain (python and js), 🦙 LlamaIndex and more soon
  • Dev, Test, Prod: the same API that runs in your python notebook, scales to your cluster
  • Feature-rich: Queries, filtering, density estimation and more
  • Free & Open Source: Apache 2.0 Licensed

Use case: ChatGPT for ______

For example, the "Chat your data" use case:

  1. Add documents to your database. You can pass in your own embeddings, embedding function, or let Chroma embed them for you.
  2. Query relevant documents with natural language.
  3. Compose documents into the context window of an LLM like GPT3 for additional summarization or analysis.
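
A minimal sketch of those three steps in Python, assuming chromadb is installed; the collection name, documents, and the ask_llm() call at the end are hypothetical placeholders for your own data and chat model:

import chromadb

# In-memory client, as in the snippet above; persistence can be added later.
client = chromadb.Client()
collection = client.get_or_create_collection("my-notes")

# 1. Add documents; Chroma embeds them with its default embedding function.
collection.add(
    documents=["Meeting notes from March...", "Quarterly roadmap draft..."],
    ids=["note-1", "note-2"],
)

# 2. Query relevant documents with natural language.
question = "What did we decide about the roadmap?"
hits = collection.query(query_texts=[question], n_results=2)

# 3. Compose the retrieved documents into the LLM's context window.
context = "\n\n".join(hits["documents"][0])
prompt = f"Answer using only this context:\n\n{context}\n\nQuestion: {question}"
# answer = ask_llm(prompt)  # hypothetical call to your chat model of choice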

Embeddings?

What are embeddings?

  • Read the guide from OpenAI
  • Literal: Embedding something turns it from image/text/audio into a list of numbers. 🖼️ or 📄 => [1.2, 2.1, ....]. This process makes documents "understandable" to a machine learning model.
  • By analogy: An embedding represents the essence of a document. This enables documents and queries with the same essence to be "near" each other and therefore easy to find.
  • Technical: An embedding is the latent-space position of a document at a layer of a deep neural network. For models trained specifically to embed data, this is the last layer.
  • A small example: if you search your photos for "famous bridge in San Francisco", embedding the query and comparing it to the embeddings of your photos and their metadata should return photos of the Golden Gate Bridge.

Embeddings databases (also known as vector databases) store embeddings and allow you to search by nearest neighbors rather than by substrings like a traditional database. By default, Chroma uses Sentence Transformers to embed for you but you can also use OpenAI embeddings, Cohere (multilingual) embeddings, or your own.


Suggested labels

#62: Simonw's llm cli: Template usage.

Details: Similarity score 0.85

Here are the code blocks extracted from the readme file:
llm 'Summarize this: $input' --save summarize
llm --system 'Summarize this' --save summarize
llm --system 'Summarize this' --model gpt-4 --save summarize
llm --system 'Summarize this text in the voice of $voice' \
  --model gpt-4 -p voice GlaDOS --save summarize
curl -s https://example.com/ | llm -t summarize
curl -s https://llm.datasette.io/en/latest/ | \
  llm -t summarize -m gpt-3.5-turbo-16k
llm templates
llm templates edit summarize
prompt: 'Summarize this: $input'
prompt: >
    Summarize the following text.

    Insert frequent satirical steampunk-themed illustrative anecdotes.
    Really go wild with that.

    Text to summarize: $input
curl -s 'https://til.simonwillison.net/macos/imovie-slides-and-audio' | \
  strip-tags -m | llm -t steampunk -m 4
system: Summarize this
system: You speak like an excitable Victorian adventurer  
prompt: 'Summarize this: $input'
prompt: |
    Suggest a recipe using ingredients: $ingredients

    It should be based on cuisine from this country: $country
llm -t recipe -p ingredients 'sausages, milk' -p country Germany
system: Summarize this text in the voice of $voice
curl -s 'https://til.simonwillison.net/macos/imovie-slides-and-audio' | \
  strip-tags -m | llm -t summarize -p voice GlaDOS 
system: Summarize this text in the voice of $voice
defaults:
  voice: GlaDOS
model: gpt-4
system: roast the user at every possible opportunity, be succinct
llm -t roast 'How are you today?'

### #328: llama-cpp-python: OpenAI compatible web server - Local Copilot replacement - Function Calling support - Vision API support
<details><summary>### Details</summary>Similarity score: 0.85
> **Python Bindings for llama.cpp**
> 
> Simple Python bindings for @ggerganov's llama.cpp library. This package provides:
> 
> - Low-level access to C API via ctypes interface.
> - High-level Python API for text completion
> - OpenAI-like API
> - LangChain compatibility
> - OpenAI compatible web server
> - Local Copilot replacement
> - Function Calling support
> - Vision API support
> - Multiple Models
> 
> Documentation is available at [https://llama-cpp-python.readthedocs.io/en/latest](https://llama-cpp-python.readthedocs.io/en/latest).
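> 
> A minimal sketch of the high-level Python API, assuming the package is installed and a GGUF model file exists at the path shown (the model path and prompt are illustrative; the `Llama` class and call signature follow the project's documentation):
> 
> ```
> from llama_cpp import Llama
> 
> # Load a local GGUF model (hypothetical path).
> llm = Llama(model_path="./models/llama-2-7b.Q4_K_M.gguf")
> 
> # OpenAI-style text completion via the high-level API.
> output = llm(
>     "Q: Name the planets in the solar system? A: ",
>     max_tokens=64,
>     stop=["Q:", "\n"],
>     echo=True,
> )
> print(output["choices"][0]["text"])
> ```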
> 
> **Installation**
> 
> llama-cpp-python can be installed directly from PyPI as a source distribution by running:
> 
> ```
> pip install llama-cpp-python
> ```
> 
> This will build llama.cpp from source using cmake and your system's c compiler (required) and install the library alongside this python package.
> 
> If you run into issues during installation add the `--verbose` flag to the `pip install` command to see the full cmake build log.
> 
> **Installation with Specific Hardware Acceleration (BLAS, CUDA, Metal, etc)**
> 
> The default `pip install` behaviour is to build llama.cpp for CPU only on Linux and Windows and use Metal on MacOS.
> 
> llama.cpp supports a number of hardware acceleration backends, including OpenBLAS, cuBLAS, CLBlast, HIPBLAS, and Metal. See the llama.cpp README for a full list of supported backends.
> 
> All of these backends are supported by llama-cpp-python and can be enabled by setting the `CMAKE_ARGS` environment variable before installing.
> 
> On Linux and Mac you set the `CMAKE_ARGS` like this:
> 
> ```
> CMAKE_ARGS="-DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=OpenBLAS" pip install llama-cpp-python
> ```
> 
> On Windows you can set the `CMAKE_ARGS` like this:
> 
> ```
> $env:CMAKE_ARGS = "-DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=OpenBLAS"
> pip install llama-cpp-python
> ```
> 
> **OpenBLAS**
> 
> To install with OpenBLAS, set the `LLAMA_BLAS` and `LLAMA_BLAS_VENDOR` CMake flags via `CMAKE_ARGS` before installing:
> 
> ```
> CMAKE_ARGS="-DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=OpenBLAS" pip install llama-cpp-python
> ```
> 
> **cuBLAS**
> 
> To install with cuBLAS, set the `LLAMA_CUBLAS=on` CMake flag via `CMAKE_ARGS` before installing:
> 
> ```
> CMAKE_ARGS="-DLLAMA_CUBLAS=on" pip install llama-cpp-python
> ```
> 
> **Metal**
> 
> To install with Metal (MPS), set the `LLAMA_METAL=on` CMake flag via `CMAKE_ARGS` before installing:
> 
> ```
> CMAKE_ARGS="-DLLAMA_METAL=on" pip install llama-cpp-python
> ```
> 
> #### Suggested labels
> 
> { "key": "llm-python-bindings", "value": "Python bindings for llama.cpp library" }</details>
