feat: track the cost associated with each user #198

changchiyou · 2024-04-10T05:39:46Z

changchiyou
Apr 10, 2024

Is your feature request related to a problem? Please describe.
Currently, it's challenging to track the cost associated with each user.

Describe the solution you'd like

Modify the current method of generating user API keys: implement a process where user keys are generated by invoking the <litellm>/key/generate API.
Users initiate conversations through LiteLLM models using their unique API keys.
Provide an admin link for accessing https://litellm.vercel.app/docs/proxy/ui from the admin panel.

Describe alternatives you've considered

Modify the current method of generating user API keys: implement a process where user keys are generated by invoking the <litellm>/key/generate API.

Disable registration function.
(The admin) Register an open-webui(hosted web) account for user and retrieves <litellm>/key/generate api using the email address
Manually updating the generated API into the API field.
After all, provide the account with email/password to user, notify user update the default password with their own password.

Additional context

Reference:

tjbck · 2024-04-10T05:48:34Z

tjbck
Apr 10, 2024
Maintainer

Related: #20, we should try to add this feature to the rest of our proxy servers as well.

0 replies

max00346 · 2024-05-15T22:44:48Z

max00346
May 15, 2024

This would also solve my Feature-Request: open-webui/open-webui#1320

0 replies

krrishdholakia · 2024-05-28T08:51:23Z

krrishdholakia
May 28, 2024

@changchiyou do you want to create a key per user?

LiteLLM also allows for tracking by the 'user' param in /chat/completions -

curl --location 'http://0.0.0.0:4000/chat/completions' \
        --header 'Content-Type: application/json' \
        --header 'Authorization: Bearer sk-zi5onDRdHGD24v0Zdn7VBA' \
        --data ' {
        "model": "azure-gpt-3.5",
        "user": "ishaan3", # 👈 TRACKING COST FOR THIS USER
        "messages": [
            {
            "role": "user",
            "content": "what time is it"
            }
        ]
        }'

https://docs.litellm.ai/docs/proxy/users

0 replies

changchiyou · 2024-05-28T09:29:47Z

changchiyou
May 28, 2024
Author

@krrishdholakia Does the method you provide, which uses user instead of an API key to track users, require additional manual adjustments/settings on the Open WebUI? I observed that Open WebUI does not send out the user parameter by default:

https://github.com/open-webui/open-webui/blob/d43ee0fc5b018cec183de99e8047472c454737ae/src/routes/(app)/%2Bpage.svelte#L602-L666

and the logs in LiteLLM show that the user is always default_user_id:

litellm-proxy  | Request to litellm:
litellm-proxy  | litellm.acompletion(model='azure/gpt-35-turbo_0125', api_key='xxx', api_base='xxx', messages=[{'role': 'user', 'content': '說「HI」'}], caching=False, client=<openai.lib.azure.AsyncAzureOpenAI object at 0x7fe75dce7310>, timeout=6000, stream=True, proxy_server_request={'url': 'http://litellm-proxy:8000/v1/chat/completions', 'method': 'POST', 'headers': {'host': 'litellm-proxy:8000', 'user-agent': 'python-requests/2.31.0', 'accept-encoding': 'gzip, deflate', 'accept': '*/*', 'connection': 'keep-alive', 'authorization': 'Bearer sk-1234', 'content-type': 'application/json', 'content-length': '109'}, 'body': {'model': 'gpt-3.5-turbo', 'stream': True, 'messages': [{'role': 'user', 'content': '說「HI」'}]}}, user='default_user_id', metadata={'user_api_key': 'sk-1234', 'user_api_key_alias': None, 'user_api_key_user_id': 'default_user_id', 'user_api_key_team_id': None, 'user_api_key_metadata': {}, 'headers': {'host': 'litellm-proxy:8000', 'user-agent': 'python-requests/2.31.0', 'accept-encoding': 'gzip, deflate', 'accept': '*/*', 'connection': 'keep-alive', 'content-type': 'application/json', 'content-length': '109'}, 'endpoint': 'http://litellm-proxy:8000/v1/chat/completions', 'model_group': 'gpt-3.5-turbo', 'deployment': 'azure/gpt-35-turbo_0125', 'model_info': {'base_model': 'azure/gpt-3.5-turbo-0125', 'id': '69f51c2e-d0fb-4b40-b722-7664a58366e4'}, 'caching_groups': None}, request_timeout=600, model_info={'base_model': 'azure/gpt-3.5-turbo-0125', 'id': '69f51c2e-d0fb-4b40-b722-7664a58366e4'}, max_retries=0)

0 replies

juancarlosm · 2024-05-30T08:02:42Z

juancarlosm
May 30, 2024

same problem here! We are unable to track costs per user:

0 replies

juancarlosm · 2024-05-30T10:15:58Z

juancarlosm
May 30, 2024

If it helps, a dirty hack in backend/apps/[openai|litellm]/main.py at proxy function:

_tmp = json.loads(body)
_tmp['user'] = user.email
body = json.dumps(_tmp)

And now I can see the right user at langfuse:

0 replies

kroonen · 2024-05-30T11:06:32Z

kroonen
May 30, 2024

We've added LiteLLM support to our pipelines examples (see : https://github.com/open-webui/pipelines/blob/main/pipelines/examples/litellm_manifold_pipeline.py and https://github.com/open-webui/pipelines/blob/main/pipelines/examples/litellm_subprocess_manifold_pipeline.py), enabling per-user cost tracking. This will be included with the v0.2.0 release, currently in the :dev branch. (We'll be leveraging OpenAI's endpoint to create plugins. Full docs will be available soon!)

1 reply

deigaard Feb 26, 2025

Did this ever get merged past dev?

juancarlosm · 2024-06-28T12:57:30Z

juancarlosm
Jun 28, 2024

open-webui/open-webui#3494

0 replies

changchiyou · 2024-08-15T14:16:30Z

changchiyou
Aug 15, 2024
Author

Hello everyone, I've successfully implemented this feature by integrating Open WebUI, LiteLLM, and Langfuse, utilizing the filter and pipe from open-webui.pipelines:

filter

"""
title: Chat Info Filter Pipeline
author: changchiyou
date: 2024-08-02
version: 1.0
license: MIT
description: A filter pipeline that preprocess form data before requesting chat completion with LiteLLM.
requirements:
"""

from typing import List, Optional
from pydantic import BaseModel


class Pipeline:
    class Valves(BaseModel):
        # List target pipeline ids (models) that this filter will be connected to.
        # If you want to connect this filter to all pipelines, you can set pipelines to ["*"]
        # e.g. ["llama3:latest", "gpt-3.5-turbo"]
        pipelines: List[str] = []

        # Assign a priority level to the filter pipeline.
        # The priority level determines the order in which the filter pipelines are executed.
        # The lower the number, the higher the priority.
        priority: int = 0

    def __init__(self):
        # Pipeline filters are only compatible with Open WebUI
        # You can think of filter pipeline as a middleware that can be used to edit the form data before it is sent to the OpenAI API.
        self.type = "filter"

        # Optionally, you can set the id and name of the pipeline.
        # Best practice is to not specify the id so that it can be automatically inferred from the filename, so that users can install multiple versions of the same pipeline.
        # The identifier must be unique across all pipelines.
        # The identifier must be an alphanumeric string that can include underscores or hyphens. It cannot contain spaces, special characters, slashes, or backslashes.
        self.name = "Chat Info Filter"

        # Initialize
        self.valves = self.Valves(
            **{
                "pipelines": ["*"],  # Connect to all pipelines
                "priority": 0
            }
        )

        self.chat_generations = {}
        pass

    async def on_startup(self):
        # This function is called when the server is started.
        print(f"on_startup:{__name__}")
        pass

    async def on_shutdown(self):
        # This function is called when the server is stopped.
        print(f"on_shutdown:{__name__}")
        pass

    async def on_valves_updated(self):
        # This function is called when the valves are updated.
        pass

    async def inlet(self, body: dict, user: Optional[dict] = None) -> dict:
        print(f"inlet:{__name__}")

        body["custom_metadata"] = {"session_id": body['chat_id']}

        if user := user if user else body.get("user"):
            body["custom_metadata"]["trace_user_id"] = f'{user["name"]} / {user["email"]}'
        else:
            print(f"Error: user & body[\"user\"] are both None")

        return body

pipe

"""
title: LiteLLM Manifold Pipeline
author: open-webui
date: 2024-05-30
version: 1.0.1
license: MIT
description: A manifold pipeline that uses LiteLLM.
"""

from typing import List, Union, Generator, Iterator
from schemas import OpenAIChatMessage
from pydantic import BaseModel
import requests
import os


class Pipeline:

    class Valves(BaseModel):
        LITELLM_BASE_URL: str = ""
        LITELLM_API_KEY: str = ""
        LITELLM_PIPELINE_DEBUG: bool = False

    def __init__(self):
        # You can also set the pipelines that are available in this pipeline.
        # Set manifold to True if you want to use this pipeline as a manifold.
        # Manifold pipelines can have multiple pipelines.
        self.type = "manifold"

        # Optionally, you can set the id and name of the pipeline.
        # Best practice is to not specify the id so that it can be automatically inferred from the filename, so that users can install multiple versions of the same pipeline.
        # The identifier must be unique across all pipelines.
        # The identifier must be an alphanumeric string that can include underscores or hyphens. It cannot contain spaces, special characters, slashes, or backslashes.
        # self.id = "litellm_manifold"

        # Optionally, you can set the name of the manifold pipeline.
        # self.name = "LiteLLM: "
        self.name = ""

        # Initialize rate limits
        self.valves = self.Valves(
            **{
                "LITELLM_BASE_URL": os.getenv(
                    "LITELLM_BASE_URL", "http://localhost:4001"
                ),
                "LITELLM_API_KEY": os.getenv("LITELLM_API_KEY", "your-api-key-here"),
                "LITELLM_PIPELINE_DEBUG": os.getenv("LITELLM_PIPELINE_DEBUG", True),
            }
        )
        # Get models on initialization
        self.pipelines = self.get_litellm_models()
        pass

    async def on_startup(self):
        # This function is called when the server is started.
        print(f"on_startup:{__name__}")
        # Get models on startup
        self.pipelines = self.get_litellm_models()
        pass

    async def on_shutdown(self):
        # This function is called when the server is stopped.
        print(f"on_shutdown:{__name__}")
        pass

    async def on_valves_updated(self):
        # This function is called when the valves are updated.

        self.pipelines = self.get_litellm_models()
        pass

    def get_litellm_models(self):

        headers = {}
        if self.valves.LITELLM_API_KEY:
            headers["Authorization"] = f"Bearer {self.valves.LITELLM_API_KEY}"

        if self.valves.LITELLM_BASE_URL:
            try:
                r = requests.get(
                    f"{self.valves.LITELLM_BASE_URL}/v1/models", headers=headers
                )
                models = r.json()
                return [
                    {
                        "id": model["id"],
                        "name": model["name"] if "name" in model else model["id"],
                    }
                    for model in models["data"]
                ]
            except Exception as e:
                print(f"Error fetching models from LiteLLM: {e}")
                return [
                    {
                        "id": "error",
                        "name": "Could not fetch models from LiteLLM, please update the URL in the valves.",
                    },
                ]
        else:
            print("LITELLM_BASE_URL not set. Please configure it in the valves.")
            return []

    def pipe(
        self, user_message: str, model_id: str, messages: List[dict], body: dict
    ) -> Union[str, Generator, Iterator]:
        if "user" in body:
            print("######################################")
            print(f'# User: {body["user"]["name"]} / {body["user"]["email"]}')
            print(f"# Message: {user_message}")
            print("######################################")

        headers = {}
        if self.valves.LITELLM_API_KEY:
            headers["Authorization"] = f"Bearer {self.valves.LITELLM_API_KEY}"

        try:
            payload = {**body, "model": model_id}

            payload.pop("chat_id", None)
            payload.pop("user", None)
            payload.pop("title", None)
            payload.pop("custom_metadata", None)

            if body.get('custom_metadata'):
                payload["metadata"] = body["custom_metadata"]

            r = requests.post(
                url=f"{self.valves.LITELLM_BASE_URL}/v1/chat/completions",
                json=payload,
                headers=headers,
                stream=True,
            )

            r.raise_for_status()

            if body["stream"]:
                return r.iter_lines()
            else:
                return r.json()
        except Exception as e:
            return f"Error: {e}"

graph TB
    subgraph My Solution
        direction TB
        subgraph "User 👤"
            Client[Client 🌐]
        end
        note["`2️⃣. Automaticly remove metadata by Open WebUI's OpenAI interface, insert **session_id** & **trace_user_id** into custom column **custom_metadata** for LiteLLM
            4️⃣. Insert **custom_metadat** into payload as **metadata** and remove from original body
            8️⃣. Success/Failure callback to Langfuse`"]
        subgraph FTC GPT
            direction TB
            OW[Open WebUI]
            AO[Azure OpenAI]
            Langfuse
            LiteLLM
            subgraph Pipelines
                direction LR
                filter
                pipe
            end
        end
    end

    Client -->|1| OW
    OW -->|2| filter
    filter -.->|3| OW
    OW -.->|4| pipe
    pipe -->|5| LiteLLM
    LiteLLM -->|6| AO
    AO -->|7| LiteLLM
    LiteLLM -->|8| Langfuse
    LiteLLM -->|9| pipe
    pipe -->|10| OW
    OW -.->|11| Client

    linkStyle 1 color:#6a83a8
    linkStyle 3 color:#6a83a8
    linkStyle 7 color:#6a83a8

    style note text-align: left

10 replies

thiswillbeyourgithub Aug 30, 2024

Oh interesting. Thanks a lot for the followup!

I saw this also:
https://github.com/open-webui/pipelines/blob/main/examples/filters/langfuse_filter_pipeline.py
Do you think it could be used to log to langfuse with only this filter? My issue with pipe (not pipelines) is that I cankt make all my models go through it. Actually I'm still struggling to understand wth is a manifold pipe and if it can magically proxy all my models. Do ypu know by any chance?

Also do you plan on making a PR to add this env variable? Thas sounds like a relevant fix.

changchiyou Aug 30, 2024
Author

Do you think it could be used to log to langfuse with only this filter?

It can be, but it can't be. Chaining LiteLLM and Langfuse is far better than using langfuse_filter_pipeline.py because LiteLLM handles success and failure results with Langfuse, rather than just logging everything indiscriminately.

Additionally, langfuse_filter_pipeline.py is useless because it only provides usage statistics without calculating costs based on the pricing tables of different LLMs. It merely gives the number of characters, which can mislead Langfuse's cost calculation.

            usage={
                "totalCost": (len(user_message) + len(generated_message)) / 1000,
                "unit": "CHARACTERS",
            },

changchiyou Aug 30, 2024
Author

Also do you plan on making a PR to add this env variable? Thas sounds like a relevant fix.

No, I definitely wouldn't do that. Cleaning up metadata is essential to avoid errors when interacting with LLM endpoints. For instance, if you accidentally include metadata in a request payload to Azure OpenAI, you'll likely encounter an error like Unknown parameter.

thiswillbeyourgithub Aug 30, 2024

I want to thank you very much for everything you've provided me, it's been very helpful and I now have a deeper understanding of how those are interlinked. I plan to use your code, fork it a bit, and then host it on my linked GitHub repository, if that's okay with you.

Also do you plan on making a PR to add this env variable? Thas sounds like a relevant fix.

No, I definitely wouldn't do that. Cleaning up metadata is essential to avoid errors when interacting with LLM endpoints. For instance, if you accidentally include metadata in a request payload to Azure OpenAI, you'll likely encounter an error like Unknown parameter.

I agree with you but I also think that it would be helpful to add a name parameter to each of the connections we can set up and this way we could add a environment variable that lists the backends for which we should not remove unexpected parameters. I think it would probably ease the development a bit for some unorthodox setups.

thiswillbeyourgithub Sep 1, 2024

FYI the ongoing code is at https://github.com/thiswillbeyourgithub/openwebui_custom_pipes_filters/tree/main

Not yet working but I have little time

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: track the cost associated with each user #198

{{title}}

Replies: 9 comments 11 replies

{{title}}

{{title}}

{{title}}

{{editor}}'s edit

{{editor}}'s edit

{{title}}

{{title}}

{{title}}

{{title}}

{{title}}

{{title}}

{{title}}

{{editor}}'s edit

{{editor}}'s edit

{{title}}

{{title}}

{{editor}}'s edit

{{editor}}'s edit

{{title}}

{{title}}

{{title}}

Select a reply

feat: track the cost associated with each user #198

Replies: 9 comments · 11 replies

tjbck Apr 10, 2024 Maintainer

changchiyou May 28, 2024 Author

changchiyou Aug 15, 2024 Author

changchiyou Aug 30, 2024 Author

changchiyou Aug 30, 2024 Author

Replies: 9 comments 11 replies

tjbck
Apr 10, 2024
Maintainer

changchiyou
May 28, 2024
Author

changchiyou
Aug 15, 2024
Author

changchiyou Aug 30, 2024
Author

changchiyou Aug 30, 2024
Author