Amazon bedrock plugin for llm cli #785
Related content

### #361: gorilla-llm/gorilla-cli: LLMs for your CLI

Similarity score: 0.87

- [ ] [gorilla-llm/gorilla-cli: LLMs for your CLI](https://github.com/gorilla-llm/gorilla-cli)

Gorilla CLI

Gorilla CLI powers your command-line interactions with a user-centric tool. Simply state your objective, and Gorilla CLI will generate potential commands for execution. Gorilla today supports ~1500 APIs, including Kubernetes, AWS, GCP, Azure, GitHub, Conda, Curl, Sed, and many more. No more recalling intricate CLI arguments! 🦍

Developed by UC Berkeley as a research prototype, Gorilla-CLI prioritizes user control and confidentiality: commands are executed solely with your explicit approval.

Suggested labels

{ "key": "llm-evaluation", "value": "Evaluating the performance and behavior of Large Language Models through human-written evaluation sets" }
{ "key": "llm-serving-optimisations", "value": "Tips, tricks and tools to speed up the inference of Large Language Models" }

### #396: astra-assistants-api: A backend implementation of the OpenAI beta Assistants API

Similarity score: 0.87

- [ ] [datastax/astra-assistants-api: A backend implementation of the OpenAI beta Assistants API](https://github.com/datastax/astra-assistants-api)

Astra Assistant API Service

A drop-in compatible service for the OpenAI beta Assistants API with support for persistent threads, files, assistants, messages, retrieval, function calling and more, using AstraDB (DataStax's db-as-a-service offering powered by Apache Cassandra and jvector). Compatible with existing OpenAI apps via the OpenAI SDKs by changing a single line of code.

Getting Started

Replace:

```
client = OpenAI(
    api_key=OPENAI_API_KEY,
)
```

with:

```
client = OpenAI(
    base_url="https://open-assistant-ai.astra.datastax.com/v1",
    api_key=OPENAI_API_KEY,
    default_headers={
        "astra-api-token": ASTRA_DB_APPLICATION_TOKEN,
    }
)
```

Or, if you have an existing Astra DB, you can pass your db_id in a second header:

```
client = OpenAI(
    base_url="https://open-assistant-ai.astra.datastax.com/v1",
    api_key=OPENAI_API_KEY,
    default_headers={
        "astra-api-token": ASTRA_DB_APPLICATION_TOKEN,
        "astra-db-id": ASTRA_DB_ID
    }
)
```

Then create an assistant:

```
assistant = client.beta.assistants.create(
    instructions="You are a personal math tutor. When asked a math question, write and run code to answer the question.",
    model="gpt-4-1106-preview",
    tools=[{"type": "retrieval"}]
)
```

By default, the service uses AstraDB as the database/vector store and OpenAI for embeddings and chat completion.

Third party LLM Support

We now support many third party models for both embeddings and completion thanks to litellm. Pass the API key of your service using custom headers. For AWS Bedrock, you can pass additional custom headers:

```
client = OpenAI(
    base_url="https://open-assistant-ai.astra.datastax.com/v1",
    api_key="NONE",
    default_headers={
        "astra-api-token": ASTRA_DB_APPLICATION_TOKEN,
        "embedding-model": "amazon.titan-embed-text-v1",
        "LLM-PARAM-aws-access-key-id": BEDROCK_AWS_ACCESS_KEY_ID,
        "LLM-PARAM-aws-secret-access-key": BEDROCK_AWS_SECRET_ACCESS_KEY,
        "LLM-PARAM-aws-region-name": BEDROCK_AWS_REGION,
    }
)
```

and again, specify the custom model for the assistant:

```
assistant = client.beta.assistants.create(
    name="Math Tutor",
    instructions="You are a personal math tutor. Answer questions briefly, in a sentence or less.",
    model="meta.llama2-13b-chat-v1",
)
```

Additional examples, including third party LLMs (bedrock, cohere, perplexity, etc.), can be found under the examples directory. To run the examples using poetry:

```
poetry install
poetry run python examples/completion/basic.py
poetry run python examples/retreival/basic.py
poetry run python examples/function-calling/basic.py
```

Coverage: see the coverage report in the repository.

Roadmap
Suggested labels{ "key": "llm-function-calling", "value": "Integration of function calling with Large Language Models (LLMs)" }#183: litellm: Call all LLM APIs using the OpenAI format. Use Bedrock, Azure, OpenAI, Cohere, Anthropic, Ollama, Sagemaker, HuggingFace, Replicate (100+ LLMs)### DetailsSimilarity score: 0.87 - [ ] [BerriAI/litellm: Call all LLM APIs using the OpenAI format. Use Bedrock, Azure, OpenAI, Cohere, Anthropic, Ollama, Sagemaker, HuggingFace, Replicate (100+ LLMs)](https://github.com/BerriAI/litellm)
### #678: chroma/README.md at main · chroma-core/chroma

Similarity score: 0.86

- [ ] [chroma/README.md at main · chroma-core/chroma](https://github.com/chroma-core/chroma/blob/main/README.md?plain=1)

Chroma - the open-source embedding database.

```
pip install chromadb # python client
# for javascript, npm install chromadb!
# for client-server mode, chroma run --path /chroma_db_path
```

The core API is only 4 functions (run our 💡 Google Colab or Replit template):

```
import chromadb

# setup Chroma in-memory, for easy prototyping. Can add persistence easily!
client = chromadb.Client()

# Create collection. get_collection, get_or_create_collection, delete_collection also available!
collection = client.create_collection("all-my-documents")

# Add docs to the collection. Can also update and delete. Row-based API coming soon!
collection.add(
    documents=["This is document1", "This is document2"], # we handle tokenization, embedding, and indexing automatically. You can skip that and add your own embeddings as well
    metadatas=[{"source": "notion"}, {"source": "google-docs"}], # filter on these!
    ids=["doc1", "doc2"], # unique for each doc
)

# Query/search 2 most similar results. You can also .get by id
results = collection.query(
    query_texts=["This is a query document"],
    n_results=2,
    # where={"metadata_field": "is_equal_to_this"}, # optional filter
    # where_document={"$contains":"search_string"} # optional filter
)
```

Features
Use case: ChatGPT for ______

Embeddings? What are embeddings?

Embeddings databases (also known as vector databases) store embeddings and allow you to search by nearest neighbors rather than by substrings like a traditional database. By default, Chroma uses Sentence Transformers to embed for you, but you can also use OpenAI embeddings, Cohere (multilingual) embeddings, or your own.

Suggested labels

### #62: Simonw's llm cli: Template usage.

Similarity score: 0.85

Here are the code blocks extracted from the readme file:

```
llm 'Summarize this: $input' --save summarize

llm --system 'Summarize this' --save summarize

llm --system 'Summarize this' --model gpt-4 --save summarize

llm --system 'Summarize this text in the voice of $voice' \
  --model gpt-4 -p voice GlaDOS --save summarize

curl -s https://example.com/ | llm -t summarize

curl -s https://llm.datasette.io/en/latest/ | \
  llm -t summarize -m gpt-3.5-turbo-16k

llm templates

llm templates edit summarize

prompt: 'Summarize this: $input'

prompt: >
  Summarize the following text.
  Insert frequent satirical steampunk-themed illustrative anecdotes.
  Really go wild with that.
  Text to summarize: $input

curl -s 'https://til.simonwillison.net/macos/imovie-slides-and-audio' | \
  strip-tags -m | llm -t steampunk -m 4

system: Summarize this

system: You speak like an excitable Victorian adventurer
prompt: 'Summarize this: $input'

prompt: |
  Suggest a recipe using ingredients: $ingredients
  It should be based on cuisine from this country: $country

llm -t recipe -p ingredients 'sausages, milk' -p country Germany

system: Summarize this text in the voice of $voice

curl -s 'https://til.simonwillison.net/macos/imovie-slides-and-audio' | \
  strip-tags -m | llm -t summarize -p voice GlaDOS

system: Summarize this text in the voice of $voice
defaults:
  voice: GlaDOS

model: gpt-4
system: roast the user at every possible opportunity, be succinct

llm -t roast 'How are you today?'
```
### #328: llama-cpp-python: OpenAI compatible web server - Local Copilot replacement - Function Calling support - Vision API support
<details><summary>### Details</summary>Similarity score: 0.85
> **Python Bindings for llama.cpp**
>
> Simple Python bindings for @ggerganov's llama.cpp library. This package provides:
>
> - Low-level access to C API via ctypes interface.
> - High-level Python API for text completion
> - OpenAI-like API
> - LangChain compatibility
> - OpenAI compatible web server
> - Local Copilot replacement
> - Function Calling support
> - Vision API support
> - Multiple Models
>
> Documentation is available at [https://llama-cpp-python.readthedocs.io/en/latest](https://llama-cpp-python.readthedocs.io/en/latest).
>
> **Installation**
>
> llama-cpp-python can be installed directly from PyPI as a source distribution by running:
>
> ```
> pip install llama-cpp-python
> ```
>
> This will build llama.cpp from source using CMake and your system's C compiler (required) and install the library alongside this Python package.
>
> If you run into issues during installation add the `--verbose` flag to the `pip install` command to see the full cmake build log.
>
> **Installation with Specific Hardware Acceleration (BLAS, CUDA, Metal, etc)**
>
> The default `pip install` behaviour is to build llama.cpp for CPU only on Linux and Windows, and to use Metal on macOS.
>
> llama.cpp supports a number of hardware acceleration backends, including OpenBLAS, cuBLAS, CLBlast, HIPBLAS, and Metal. See the llama.cpp README for a full list of supported backends.
>
> All of these backends are supported by llama-cpp-python and can be enabled by setting the `CMAKE_ARGS` environment variable before installing.
>
> On Linux and Mac you set the `CMAKE_ARGS` like this:
>
> ```
> CMAKE_ARGS="-DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=OpenBLAS" pip install llama-cpp-python
> ```
>
> On Windows you can set the `CMAKE_ARGS` like this:
>
> ```
> $env:CMAKE_ARGS = "-DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=OpenBLAS"
> pip install llama-cpp-python
> ```
>
> **OpenBLAS**
>
> To install with OpenBLAS, set the `LLAMA_BLAS` and `LLAMA_BLAS_VENDOR` environment variables before installing:
>
> ```
> CMAKE_ARGS="-DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=OpenBLAS" pip install llama-cpp-python
> ```
>
> **cuBLAS**
>
> To install with cuBLAS, set the `LLAMA_CUBLAS=on` environment variable before installing:
>
> ```
> CMAKE_ARGS="-DLLAMA_CUBLAS=on" pip install llama-cpp-python
> ```
>
> **Metal**
>
> To install with Metal (MPS), set the `LLAMA_METAL=on` environment variable before installing:
>
> ```
> CMAKE_ARGS="-DLLAMA_METAL=on" pip install llama-cpp-python
> ```
>
> #### Suggested labels
>
> { "key": "llm-python-bindings", "value": "Python bindings for llama.cpp library" }</details>
Amazon bedrock plugin for llm cli
Description: Plugin for LLM adding support for Anthropic's Claude models on Amazon Bedrock.
Installation
Install this plugin in the same environment as LLM. From the current directory:
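A minimal sketch of that install step, assuming llm's standard `llm install` plugin mechanism (the package name is taken from the repository URL below; check the plugin README for the exact invocation):

```
# Assumed commands for installing the plugin
llm install llm-bedrock-anthropic   # from PyPI
llm install -e .                    # or from the current directory (a local checkout)
```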
Configuration
You will need to specify AWS configuration through the normal boto3 mechanisms, such as configuration files and environment variables.
For example, to use the region us-west-2 and AWS credentials under the personal profile, set the environment variables:
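A sketch of that configuration, assuming the standard boto3 environment variable names; the values come from the example above:

```
# Standard boto3 environment variables
export AWS_DEFAULT_REGION=us-west-2
export AWS_PROFILE=personal
```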
Usage
This plugin adds models called bedrock-claude and bedrock-claude-instant.
You can query them like this:
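A sketch of a typical query using llm's -m flag with the model names above (the prompt text is illustrative, not taken from the plugin README):

```
# Illustrative prompts; any prompt text works
llm -m bedrock-claude-instant "Ten great names for a new space station"
llm -m bedrock-claude "Write a haiku about Amazon Bedrock"
```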
Options
max_tokens_to_sample, default 8_191: The maximum number of tokens to generate before stopping. Use it like this:
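A sketch using llm's standard -o option syntax; the prompt is an assumption chosen to match the example response below:

```
# Illustrative: cap the completion at 20 tokens
llm -m bedrock-claude "Sing me the alphabet song" -o max_tokens_to_sample 20
```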
The response is then cut off once the token limit is reached; the example output begins: "Here is the alphabet song:"
URL: https://github.com/sblakey/llm-bedrock-anthropic
Suggested labels