How to use cvat-sdk with token #7439

Open
2 tasks done
liudaolunboluo opened this issue Feb 5, 2024 · 18 comments
Labels
enhancement New feature or request gsoc2024

Comments

@liudaolunboluo

Actions before raising this issue

  • I searched the existing issues and did not find anything similar.
  • I read/searched the docs

Is your feature request related to a problem? Please describe.

When creating a client in cvat-sdk, you need to enter a username and password. I have multiple users whose passwords I don't know, but I can get their tokens. Can I use a token directly with the cvat-sdk client?

Describe the solution you'd like

I'd like to use a token directly with the cvat-sdk client.

Describe alternatives you've considered

No response

Additional context

No response

@liudaolunboluo liudaolunboluo added the enhancement New feature or request label Feb 5, 2024
@zhiltsov-max
Contributor

Hi, there is no "official way", but please check this answer and this draft PR.

@liudaolunboluo
Author

Hi, there is no "official way", but please check this answer and this draft PR.

Are sessionid and csrftoken necessary? In my scenario, I deployed a private CVAT instance myself and integrated it into my system, and I want to automate dataset import in one of my system's features, so I won't be able to get the cross-domain cookie information from the browser.

@zhiltsov-max
Contributor

zhiltsov-max commented Feb 5, 2024

If you're working as a local admin, you should be able to download annotations from other users. Consider using this approach instead. Would a special service account for these purposes work for you?

@liudaolunboluo
Author

If you're working as a local admin, you should be able to download annotations from other users. Consider using this approach instead. Would a special service account for these purposes work for you?

I only want to upload a dataset to a project. I think I can have a normal user create the project and use the admin user to upload the dataset, because I have the admin user's password. The problem is that the high-level SDK doesn't seem to have an API that only imports a dataset; it only has an API that creates a project from a dataset. So what can I do?

@liudaolunboluo
Author

If you're working as a local admin, you should be able to download annotations from other users. Consider using this approach instead. Would a special service account for these purposes work for you?

Oh, sorry, I read this manual: https://opencv.github.io/cvat/docs/api_sdk/sdk/highlevel-api/. That's very helpful.
It worked:
project = client.projects.retrieve(project_id)

@liudaolunboluo
Author

I'm having a little problem again.
My code:

from cvat_sdk import make_client

with make_client(
    host="my local server url",
    credentials=("admin", "admin password"),
) as client:
    try:
        project = client.projects.retrieve(project_id)
        # Import only if the project was retrieved; otherwise "project" is undefined
        project.import_dataset(format_name="CVAT 1.1", filename="dataset file path", pbar=pbar)
    except Exception as e:
        print("throw exception:", e)

It throws an exception:
HTTP response body: b'{"detail":"CSRF Failed: CSRF token missing or incorrect."}'
I've noticed in some issues that uploading through the web page can hit this problem, and that logging out and back in fixes it. I'm now hitting the same problem in the SDK. How can I fix it? Thank you very much for your reply!

@liudaolunboluo
Author

liudaolunboluo commented Feb 6, 2024

I solved it! It looks like a code issue: the X-Csrftoken header isn't set anywhere in the dataset import code, so the server reported the error (see https://stackoverflow.com/questions/26639169/csrf-failed-csrf-token-missing-or-incorrect). I changed it this way and it no longer reports an error:

        # Parse the Cookie header into a name -> value mapping
        pairs = client.api_client.get_common_headers()['Cookie'].split('; ')
        # Split on the first '=' only, so values that contain '=' stay intact
        dictionary = dict(pair.split('=', 1) for pair in pairs)
        # Send the CSRF token back in the X-Csrftoken header
        client.api_client.set_default_header("X-Csrftoken", dictionary['csrftoken'])

The core idea is to get the csrftoken from the cookie and send it in the X-CSRFToken header.
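
The cookie-to-header step can also be sketched with the standard library's cookie parser instead of manual string splitting. This is only an illustration: `csrf_header_from_cookie` is a hypothetical helper, not part of cvat-sdk, and it assumes the raw `Cookie` header value is available as a string.

```python
from http.cookies import SimpleCookie


def csrf_header_from_cookie(cookie_header: str) -> dict:
    """Build the X-Csrftoken header from a raw Cookie header value.

    Hypothetical helper for illustration; not part of cvat-sdk.
    """
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    if "csrftoken" not in cookie:
        raise KeyError("no csrftoken cookie found")
    # The header value must match the csrftoken cookie value
    return {"X-Csrftoken": cookie["csrftoken"].value}


headers = csrf_header_from_cookie("sessionid=abc123; csrftoken=tok42")
print(headers)
```

The resulting name and value could then be passed to `client.api_client.set_default_header(...)`, as in the snippet above.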
But now a new error is reported:

  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/cvat_sdk/core/proxies/projects.py", line 57, in import_dataset
    DatasetUploader(self._client).upload_file_and_wait(
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/cvat_sdk/core/uploading.py", line 321, in upload_file_and_wait
    rq_id = json.loads(response.data).get("rq_id")
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

The failing line is: rq_id = json.loads(response.data).get("rq_id")
response.data doesn't look like a JSON string; it is a byte array.
The dataset was imported successfully, though.
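
The traceback does not show what the body actually contained. As a general Python note, `json.loads` accepts `bytes`, and "Expecting value: line 1 column 1 (char 0)" is the error it raises for an empty or non-JSON body. A tolerant way to read the field might look like the sketch below; `extract_rq_id` is a hypothetical helper, not an SDK function.

```python
import json
from typing import Optional, Union


def extract_rq_id(body: Union[bytes, str, None]) -> Optional[str]:
    """Return the "rq_id" field of a JSON response body, or None.

    Hypothetical helper for illustration; not part of cvat-sdk.
    """
    if not body:
        return None  # empty body: nothing to parse
    try:
        payload = json.loads(body)
    except (json.JSONDecodeError, UnicodeDecodeError):
        return None  # body is not valid JSON
    if isinstance(payload, dict):
        return payload.get("rq_id")
    return None


print(extract_rq_id(b'{"rq_id": "import:project-1"}'))
print(extract_rq_id(b""))
```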

@liudaolunboluo
Author

I solved it again, but the SDK API does have problems too. Hopefully they will be fixed soon!

from pathlib import Path
from cvat_sdk.core.uploading import DatasetUploader

project = client.projects.retrieve(project_id)
filename = Path(dataset_path)
params = {"format": 'CVAT 1.1', "filename": filename.name}
url = client.api_map.make_endpoint_url(project.api.create_dataset_endpoint.path, kwsub={"id": project_id})
# Upload the file directly instead of calling upload_file_and_wait, which parses rq_id
response = DatasetUploader(client).upload_file(
                url, filename, pbar=pbar, query_params=params, meta={"filename": params["filename"]})

@zhiltsov-max
Contributor

Thank you for reporting the issues.

But response.data doesn't look like a json string but a byte array

Probably, you're using an older version of CVAT, before the change was introduced. Please make sure SDK and server versions match. The change was introduced in #5909.
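
A quick sanity check for matching versions could look like the sketch below. It assumes that "matching" means equal major and minor numbers, which is an assumption made for illustration rather than CVAT's documented compatibility rule. `same_release` is a hypothetical helper, and how each version string is obtained is left out.

```python
def same_release(sdk_version: str, server_version: str) -> bool:
    """Compare the major.minor parts of two version strings.

    Hypothetical helper; the major.minor rule is an assumption.
    """

    def major_minor(version: str) -> tuple:
        # Drop an optional leading "v" and keep the first two components
        parts = version.lstrip("v").split(".")
        return tuple(int(part) for part in parts[:2])

    return major_minor(sdk_version) == major_minor(server_version)


print(same_release("2.3.0", "v2.3.1"))
print(same_release("2.3.0", "2.10.0"))
```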

@liudaolunboluo
Author

Thank you for reporting the issues.

But response.data doesn't look like a json string but a byte array

Probably, you're using an older version of CVAT, before the change was introduced. Please make sure SDK and server versions match. The change was introduced in #5909.

I use the v2.3.0 SDK and the import_dataset method, but it throws:
raise tus_uploader.TusCommunicationError(
tusclient.exceptions.TusCommunicationError: Attempt to retrieve offset failed with status 200

@Abo-Omar-74

I am looking forward to contributing to solving this issue through GSOC 24 and I have some questions regarding the scope of this project.

  • Scope Clarification:

    • According to the documentation, there is already a way to authenticate using tokens. Does this mean that the scope of this problem is to make the process of creating these tokens easier with the UI?
  • Integration with auth_api:

    • Do I need to add a new method to auth_api for token generation, or utilize the existing login method since it returns an access token?
  • Display of Access Tokens:

    • Should the access token be directly visible in the account settings, allowing users to copy and manually include it in the request headers using api_client.set_default_header("Authorization", "Token " + {generated token})?
    • Alternatively, should the UI indicate successful addition of the access token and automate its inclusion in environment variables for seamless integration into request headers?

I would greatly appreciate your guidance in clarifying this. Your expertise would be incredibly valuable. I truly appreciate your support.

@zhiltsov-max @SpecLad @azhavoro
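
For reference, the manual header approach mentioned in the questions above can be sketched as follows. `token_auth_header` is a hypothetical helper; only the `Token <key>` header format and the `set_default_header` call are taken from this thread.

```python
def token_auth_header(token: str) -> dict:
    """Build the Authorization header for a token (hypothetical helper)."""
    if not token or token != token.strip():
        raise ValueError("token must be non-empty and have no surrounding whitespace")
    return {"Authorization": f"Token {token}"}


# With a cvat_sdk client the header would be applied like this (not run here):
#     for name, value in token_auth_header(my_token).items():
#         client.api_client.set_default_header(name, value)
print(token_auth_header("0123abcd"))
```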

@zhiltsov-max
Contributor

zhiltsov-max commented Mar 17, 2024

@Abo-Omar-74,

Hi, thank you for reaching out to us about this topic. Overall, the scope for GSoC can vary, depending on your skills and interests. This is a complex task that can be split into several parts:

  1. SDK / CLI update to support persistent login. Can be done with any type of keys. Basically, it means integrating this code snippet into our current CLI:

This code records user credentials in the user profile directory. Credentials for each visited CVAT host are stored, so that future visits need no explicit authentication from the user. This is similar to what you would find in the AWS S3 CLI or Google Cloud Storage's CLI.

from __future__ import annotations

import json
import os
from argparse import ArgumentParser
from functools import partial
from pathlib import Path
from types import SimpleNamespace
from typing import Optional, Tuple

import attrs
import cvat_sdk

API_KEY_VAR = "CVAT_API_KEY"
API_SESSIONID_VAR = "CVAT_API_SESSIONID"
API_CSRFTOKEN_VAR = "CVAT_API_CSRFTOKEN"


def add_cli_parser_args(parser: ArgumentParser) -> ArgumentParser:
    parser.add_argument("--org")
    parser.add_argument("--host", default="https://app.cvat.ai")
    parser.add_argument("--port")
    parser.add_argument(
        "--login",
        help=f"A 'login:password' pair. "
        f"Default: use {API_KEY_VAR}, {API_SESSIONID_VAR}, {API_CSRFTOKEN_VAR} env vars",
    )
    parser.add_argument("--profile-dir", default=None, type=Path, help="User profile dir")

    return parser


DEFAULT_PROFILE_FILENAME = "profile.json"
DEFAULT_PROFILE_DIR = "~/cvat/"


@attrs.define
class UserCredentials:
    token: str
    sessionid: str
    csrftoken: str


@attrs.define
class UserProfile:
    _PROFILE_UMASK = 0o600

    @classmethod
    def get_default_path(cls) -> Path:
        return Path(DEFAULT_PROFILE_DIR).expanduser() / DEFAULT_PROFILE_FILENAME

    # The lambda defers the lookup until UserProfile is fully defined
    path: Optional[Path] = attrs.field(factory=lambda: UserProfile.get_default_path())

    credentials: dict[str, UserCredentials] = attrs.field(factory=dict)
    extra: dict[str, str] = attrs.field(factory=dict)

    @classmethod
    def parse(cls, path: Path) -> UserProfile:
        data = json.loads(path.read_text())

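        return UserProfile(
            path=path,  # remember where the profile was loaded from, so save() writes back there
            credentials={k: UserCredentials(**v) for k, v in data["credentials"].items()},
            extra=data.get("extra") or {},  # tolerate a missing or null "extra" section
        )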

    @classmethod
    def load(cls, path: Optional[Path] = None) -> UserProfile:
        path = path or cls.get_default_path()

        if path.is_file():
            if (mode := path.stat().st_mode) & cls._PROFILE_UMASK != cls._PROFILE_UMASK:
                raise Exception(f"Invalid profile mode. Expected 600 (rw--), got {oct(mode)}")

            profile = cls.parse(path)
        else:
            profile = UserProfile(path=path)
            profile.save(path)

        return profile

    def save(self, path: Optional[Path] = None):
        path = path or self.path or self.get_default_path()

        data = json.dumps(
            attrs.asdict(self, filter=lambda a, v: a.name != "path", recurse=True), indent=2
        )

        if path.absolute() == self.get_default_path().absolute() and not path.parent.is_dir():
            path.parent.mkdir(mode=0o700, parents=True, exist_ok=True)

        with path.open("w", encoding="utf-8") as f:
            f.write(data)

        path.chmod(self._PROFILE_UMASK)

    def update_credentials(self, host: str, *, token: str, sessionid: str, csrftoken: str):
        self.credentials[self._make_host_key(host)] = UserCredentials(
            token=token, sessionid=sessionid, csrftoken=csrftoken
        )

    def has_credentials(self, host: str) -> bool:
        return self._make_host_key(host) in self.credentials

    def get_credentials(self, host: str) -> UserCredentials:
        return self.credentials[self._make_host_key(host)]

    def _make_host_key(self, host: str) -> str:
        return host.split("://", maxsplit=1)[-1]


class UserClient(cvat_sdk.Client):
    def __init__(self, *args, profile: Optional[UserProfile] = None, **kwargs) -> None:
        super().__init__(*args, **kwargs)

        self.profile = profile or UserProfile.load()

    def login(self, credentials: Tuple[str, str]) -> None:
        super().login(credentials)
        self.save_current_credentials()

    def load_credentials(self):
        credentials = self.profile.get_credentials(self.api_map.host)

        self.api_client.set_default_header("Authorization", f"Token {credentials.token}")
        self.api_client.cookies["sessionid"] = credentials.sessionid
        self.api_client.cookies["csrftoken"] = credentials.csrftoken
        self.api_client.set_default_header("X-Csrftoken", credentials.csrftoken)

    def save_current_credentials(self):
        self.profile.update_credentials(
            host=self.api_map.host,
            token=self.api_client.default_headers["Authorization"].split("Token ")[-1],
            sessionid=self.api_client.cookies["sessionid"].value,
            csrftoken=self.api_client.cookies["csrftoken"].value,
        )
        self.profile.save()


def make_client_from_cli(parsed_args: SimpleNamespace) -> UserClient:
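    # --profile-dir names a directory; the profile file itself lives inside it
    profile_dir = parsed_args.profile_dir
    profile = UserProfile.load(profile_dir / DEFAULT_PROFILE_FILENAME if profile_dir else None)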

    host = parsed_args.host
    port = parsed_args.port
    url = host.rstrip("/")
    if port:
        url = f"{url}:{port}"

    # No "with" block here: leaving it would close the client before the caller
    # can use it. The caller is responsible for closing the returned client.
    client = UserClient(url, profile=profile)
    if parsed_args.org:
        client.organization_slug = parsed_args.org

    if parsed_args.login:
        client.login(parsed_args.login.split(":", maxsplit=1))
    elif api_key := os.getenv(API_KEY_VAR):
        client.api_client.set_default_header("Authorization", f"Token {api_key}")
        client.api_client.cookies["sessionid"] = os.getenv(API_SESSIONID_VAR)
        client.api_client.cookies["csrftoken"] = os.getenv(API_CSRFTOKEN_VAR)
        client.api_client.set_default_header("X-Csrftoken", os.getenv(API_CSRFTOKEN_VAR))
        client.save_current_credentials()
    elif profile.has_credentials(client.api_map.host):
        client.load_credentials()

    return client

I can see it being extended with an auth CLI command that just logs in on a host and stores tokens locally for further use. It could also be extended with a command to remove any of the recorded tokens from the local profile.

  2. Server updates to support manageable API token generation for a user. Currently, CVAT also uses tokens, but the difference is that these new tokens would be manageable, i.e. they can be created, revoked, have an expiration time, etc. Current tokens exist mostly for the UI to work, so they are obtained after login.
  3. UI updates to support API token management in the personal account page (this basically requires adding such a page first, as there is no such section in CVAT yet). It could probably work similarly to what GitHub offers.
  4. SDK / CLI updates to support API tokens for auth, similar to the existing login/password pair.
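
To illustrate what "manageable" could mean for such tokens, here is a minimal, framework-free sketch. It is not CVAT's data model: `ApiToken`, `create_token`, and all field names are hypothetical, and a real implementation would live in the server's database layer.

```python
import secrets
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class ApiToken:
    """Hypothetical manageable token: can be created, revoked, and can expire."""

    owner: str
    expires_at: Optional[datetime] = None  # None means "never expires"
    revoked: bool = False
    key: str = field(default_factory=lambda: secrets.token_hex(20))

    def is_valid(self, now: Optional[datetime] = None) -> bool:
        """A token is valid if it is not revoked and not past its expiry."""
        now = now or datetime.now(timezone.utc)
        if self.revoked:
            return False
        return self.expires_at is None or now < self.expires_at


def create_token(owner: str, ttl: Optional[timedelta] = None) -> ApiToken:
    """Create a token that expires after ttl, or never if ttl is None."""
    expires_at = datetime.now(timezone.utc) + ttl if ttl is not None else None
    return ApiToken(owner=owner, expires_at=expires_at)


token = create_token("alice", ttl=timedelta(days=30))
print(token.is_valid())
token.revoked = True  # revoking takes effect immediately
print(token.is_valid())
```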

Basically, I'd propose starting with the first task from this list. It already has a PoC implementation, doesn't require too many changes, and just needs to be productized, i.e. given tests and a convenient user interface.
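
The login and token-removal commands suggested for the first task could take roughly the following shape. This is only a sketch under stated assumptions: the command names, the `--profile` option, and the flat host-to-token JSON layout are hypothetical, and a real `login` would authenticate against the server instead of accepting a token on the command line (where it can end up in shell history).

```python
import argparse
import json
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical CLI: 'login' stores a token, 'logout' removes it."""
    parser = argparse.ArgumentParser(prog="cvat-auth-sketch")
    parser.add_argument("--profile", type=Path, required=True, help="Profile JSON file")
    commands = parser.add_subparsers(dest="command", required=True)
    login = commands.add_parser("login", help="Store a token for a host")
    login.add_argument("host")
    login.add_argument("token")
    logout = commands.add_parser("logout", help="Remove the stored token for a host")
    logout.add_argument("host")
    return parser


def run(argv: list) -> dict:
    """Apply one command to the profile file and return the stored tokens."""
    args = build_parser().parse_args(argv)
    tokens = json.loads(args.profile.read_text()) if args.profile.is_file() else {}
    if args.command == "login":
        tokens[args.host] = args.token
    else:
        tokens.pop(args.host, None)  # removing an unknown host is not an error
    args.profile.write_text(json.dumps(tokens, indent=2), encoding="utf-8")
    args.profile.chmod(0o600)  # keep the file readable by the owner only
    return tokens
```

For example, `run(["--profile", "profile.json", "login", "localhost:8080", "<token>"])` would record a token for that host, and the matching `logout` call would remove it.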

@ritikraj26
Contributor

Hi, I'm interested in contributing to this project. My understanding is that the goal is to implement a system within user profiles that allows for the generation of API access tokens. Users should be able to store these tokens locally for persistent authorization, similar to how GitHub handles personal access tokens. Can you confirm if this understanding is correct?

@zhiltsov-max @nmanovic

@zhiltsov-max
Contributor

zhiltsov-max commented Mar 28, 2024

@ritikraj26, hi! Yes, your understanding is correct.

@ritikraj26
Contributor

ritikraj26 commented Mar 29, 2024

Currently, the user obtains the auth token using the username and password. We just need to replace the current method with API access tokens generated by the user. The rest of the flow remains the same. Only the method to obtain the auth token is updated?
@zhiltsov-max

@ritikraj26
Contributor

@zhiltsov-max
Where can I connect with you? Would you be open to some discussion? I am keenly interested in contributing to this project.

@zhiltsov-max
Contributor

@ritikraj26, I think it's best to be discussed here for others to see and participate, unless you're going to send your GSoC proposal. Proposals need to be sent via the GSoC site.

The rest of the flow remains the same. Only the method to obtain the auth token is updated?

Basically, yes, but "The rest of the flow" needs clarification. If you meant point 1 from the comment above, then I think it's quite close to what's expected from the token usage point of view. But tokens can also be managed in CVAT, as discussed in that comment.
