
feat: add v1 endpoint to set Response status #2756

@@ -36,6 +36,7 @@ def upgrade() -> None:
sa.Column("values", sa.JSON, nullable=False),
sa.Column("record_id", sa.Uuid, sa.ForeignKey("records.id", ondelete="CASCADE"), nullable=False, index=True),
sa.Column("user_id", sa.Uuid, sa.ForeignKey("users.id", ondelete="SET NULL"), index=True),
sa.Column("status", sa.String, nullable=False, index=True),
Member

Can you explain your rationale for modifying this database migration instead of opting to create a new one?

Member Author

@gabrielmbmb gabrielmbmb Apr 24, 2023

So in the current latest version, 1.6.0, only three tables have been created and managed by alembic: workspaces, users, and workspaces_users.

In the branch feat/2615-instructions-task, migration files have been created for some other tables. I've opted to modify the existing migration file instead of generating a new one because these changes haven't been used in any production environment yet, so it seemed better to add the new column directly rather than create a new migration file just to add one column. It's only one more file, but a bit more noise.

Of course, if the migration files in feat/2615-instruction-task had already been applied in a production DB, then the logical thing to do would have been to generate a new migration script to add this new column, so alembic would detect that there is a revision that has not been applied yet and apply it to the DB.
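For context, if a separate migration had been required, it would roughly look like the sketch below (the revision identifiers are placeholders, and nullable=False on an existing table would likely need a server_default for rows that already exist):

```python
# Hypothetical standalone migration; only needed if the previous revision had already shipped
import sqlalchemy as sa
from alembic import op

# revision identifiers are placeholders
revision = "xxxxxxxxxxxx"
down_revision = "yyyyyyyyyyyy"


def upgrade() -> None:
    # server_default keeps the NOT NULL constraint valid for pre-existing rows
    op.add_column("responses", sa.Column("status", sa.String(), nullable=False, server_default="pending"))
    op.create_index(op.f("ix_responses_status"), "responses", ["status"])


def downgrade() -> None:
    op.drop_index(op.f("ix_responses_status"), table_name="responses")
    op.drop_column("responses", "status")
```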

That said, it was not entirely clear to me how you're planning to handle the migrations with alembic. Maybe it's a good idea to have one single alembic revision/script to update the DB per released version of the package.

Also, I've realized that the alembic revision --autogenerate command cannot be used to generate migrations automatically, because target_metadata = None in the alembic env.py: target_metadata has not been set to Base.metadata from class Base(DeclarativeBase).
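For reference, a minimal sketch of what wiring target_metadata in alembic's env.py could look like (assuming Base lives in argilla.server.models; the actual module path may differ):

```python
# migrations/env.py (sketch)
from argilla.server.models import Base  # assumed import path

# Let autogenerate diff the database schema against the SQLAlchemy metadata
target_metadata = Base.metadata
```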

Another doubt I have: I've seen that attributes like nullable, index, etc. are being set in the migration files, but not in the mapped_column function of SQLAlchemy.

Member

Also, I've realized that the alembic revision --autogenerate command cannot be used to generate migrations automatically

In your opinion, what are the benefits of utilizing automatic migrations?

Another doubt I have: I've seen that attributes like nullable, index, etc. are being set in the migration files, but not in the mapped_column function of SQLAlchemy.

In the same vein as the previous question, what are your views on the advantages of including these attributes in models?

Member Author

I think that I can answer both questions at the same time 😁

So if we include these attributes in the models, then alembic will be able to automatically pick them up, and it will generate statements in the migration script with all the column constraints, indexes, etc.

I think that it's really beneficial to use alembic automatic migrations, because alembic is able to automatically detect the updates you have made in your model classes. In other words, it generates a diff between the schema produced by all the previous revisions and the one you're about to generate, and it only creates statements for the things that need to be updated.

So in my opinion, it is better to let alembic generate the migration file automatically from the SQLAlchemy model metadata than to have a human write it manually and increase the chances of forgetting to include something in the migration script.
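As a rough sketch of the workflow I'm suggesting (the Response column below is only an illustration; the real model is in this PR's diff): declare the constraints on the model itself, point alembic at the metadata, and let autogenerate emit the migration.

```python
# models.py (sketch): constraints declared on the SQLAlchemy model
from sqlalchemy import String
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column


class Base(DeclarativeBase):
    pass


class Response(Base):
    __tablename__ = "responses"

    id: Mapped[int] = mapped_column(primary_key=True)
    # nullable/index live on the model, so autogenerate can pick them up
    status: Mapped[str] = mapped_column(String, nullable=False, index=True)
```

With target_metadata pointing at Base.metadata, running alembic revision --autogenerate -m "add response status" would then emit the corresponding op.add_column / op.create_index statements instead of a human writing them by hand.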

sa.Column("inserted_at", sa.DateTime, nullable=False),
sa.Column("updated_at", sa.DateTime, nullable=False),
sa.UniqueConstraint("record_id", "user_id", name="response_record_id_user_id_uq"),
21 changes: 20 additions & 1 deletion src/argilla/server/apis/v1/handlers/responses.py
@@ -21,7 +21,11 @@
from argilla.server.database import get_db
from argilla.server.models import User
from argilla.server.policies import ResponsePolicyV1, authorize
from argilla.server.schemas.v1.responses import Response, ResponseUpdate
from argilla.server.schemas.v1.responses import (
Response,
ResponseStatusUpdate,
ResponseUpdate,
)
from argilla.server.security import auth

router = APIRouter(tags=["responses"])
@@ -53,6 +57,21 @@ def update_response(
return datasets.update_response(db, response, response_update)


@router.put("/responses/{response_id}/status", response_model=Response)
def update_response_status(
*,
db: Session = Depends(get_db),
response_id: UUID,
response_status_update: ResponseStatusUpdate,
current_user: User = Security(auth.get_current_user),
):
response = _get_response(db, response_id)

authorize(current_user, ResponsePolicyV1.update(response))

return datasets.update_response_status(db, response, response_status_update)


@router.delete("/responses/{response_id}", response_model=Response)
def delete_response(
*,
11 changes: 10 additions & 1 deletion src/argilla/server/contexts/datasets.py
@@ -24,7 +24,7 @@
RecordsCreate,
)
from argilla.server.schemas.v1.records import ResponseCreate
from argilla.server.schemas.v1.responses import ResponseUpdate
from argilla.server.schemas.v1.responses import ResponseStatusUpdate, ResponseUpdate
from argilla.server.security.model import User
from sqlalchemy import and_, func
from sqlalchemy.orm import Session, contains_eager, joinedload
@@ -230,6 +230,15 @@ def update_response(db: Session, response: Response, response_update: ResponseUp
return response


def update_response_status(db: Session, response: Response, response_status_update: ResponseStatusUpdate):
response.status = response_status_update.status

db.commit()
db.refresh(response)

return response


def delete_response(db: Session, response: Response):
db.delete(response)
db.commit()
9 changes: 8 additions & 1 deletion src/argilla/server/models.py
@@ -53,6 +53,12 @@ class UserRole(str, Enum):
annotator = "annotator"


class ResponseStatus(str, Enum):
pending = "pending"
submitted = "submitted"
discarded = "discarded"


class Field(Base):
__tablename__ = "fields"

@@ -106,6 +112,7 @@ class Response(Base):
values: Mapped[dict] = mapped_column(JSON)
record_id: Mapped[UUID] = mapped_column(ForeignKey("records.id"))
user_id: Mapped[UUID] = mapped_column(ForeignKey("users.id"))
status: Mapped[ResponseStatus] = mapped_column(default=ResponseStatus.pending, index=True, nullable=False)

inserted_at: Mapped[datetime] = mapped_column(default=datetime.utcnow)
updated_at: Mapped[datetime] = mapped_column(default=default_inserted_at, onupdate=datetime.utcnow)
@@ -116,7 +123,7 @@ class Response(Base):
def __repr__(self):
return (
f"Response(id={str(self.id)!r}, record_id={str(self.record_id)!r}, user_id={str(self.user_id)!r}, "
f"inserted_at={str(self.inserted_at)!r}, updated_at={str(self.updated_at)!r})"
f"status={self.status!r}, inserted_at={str(self.inserted_at)!r}, updated_at={str(self.updated_at)!r})"
)


7 changes: 7 additions & 0 deletions src/argilla/server/schemas/v1/responses.py
@@ -18,12 +18,15 @@

from pydantic import BaseModel

from argilla.server.models import ResponseStatus


class Response(BaseModel):
id: UUID
values: Dict[str, Any]
record_id: UUID
user_id: UUID
status: ResponseStatus
Member

In src/argilla/server/schemas/v1/datasets.py, there exists a distinct Response schema. Would you suggest including the status field in this schema as well?

Member Author

@gabrielmbmb gabrielmbmb Apr 24, 2023

Oh, I didn't realize there was this Response schema. Yes, we should include the status field here too; otherwise, when calling the list records endpoint /api/v1/datasets/{dataset_id}/records, the status of each response won't be there, and the frontend would have to send an additional request just to get the response statuses, which is not ideal.

I think it makes sense to have a ResponseBase class in src/argilla/server/schemas/v1/responses.py; that way we avoid code duplication (see the sketch after the list below).

  • We can create a Response class inheriting from the ResponseBase class with additional attributes that have to be exposed when listing responses (like the record_id) in src/argilla/server/schemas/v1/responses.py.
  • We can create a Response class inheriting from ResponseBase with no additional attributes that can be used when listing records in src/argilla/server/schemas/v1/datasets.py.
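A rough sketch of that refactor (the field lists are illustrative, not the final schemas):

```python
# src/argilla/server/schemas/v1/responses.py (sketch)
from datetime import datetime
from typing import Any, Dict
from uuid import UUID

from pydantic import BaseModel

from argilla.server.models import ResponseStatus


class ResponseBase(BaseModel):
    id: UUID
    values: Dict[str, Any]
    status: ResponseStatus
    user_id: UUID
    inserted_at: datetime
    updated_at: datetime


class Response(ResponseBase):
    # extra attribute exposed when listing responses directly
    record_id: UUID


# src/argilla/server/schemas/v1/datasets.py (sketch)
# from argilla.server.schemas.v1.responses import ResponseBase
#
# class Response(ResponseBase):
#     """Used when listing records; no additional attributes."""
```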

inserted_at: datetime
updated_at: datetime

@@ -33,3 +36,7 @@ class Config:

class ResponseUpdate(BaseModel):
values: Dict[str, Any]


class ResponseStatusUpdate(BaseModel):
status: ResponseStatus
56 changes: 55 additions & 1 deletion tests/server/api/v1/test_responses.py
@@ -16,7 +16,7 @@
from uuid import uuid4

from argilla._constants import API_KEY_HEADER_NAME
from argilla.server.models import Response
from argilla.server.models import Response, ResponseStatus
from fastapi.testclient import TestClient
from sqlalchemy.orm import Session

@@ -54,6 +54,7 @@ def test_update_response(client: TestClient, db: Session, admin_auth_header: dic
},
"record_id": str(response.record_id),
"user_id": str(response.user_id),
"status": ResponseStatus.pending,
"inserted_at": response.inserted_at.isoformat(),
"updated_at": datetime.fromisoformat(resp_body["updated_at"]).isoformat(),
}
@@ -117,6 +118,7 @@ def test_update_response_as_annotator(client: TestClient, db: Session):
},
"record_id": str(response.record_id),
"user_id": str(response.user_id),
"status": ResponseStatus.pending,
"inserted_at": response.inserted_at.isoformat(),
"updated_at": datetime.fromisoformat(resp_body["updated_at"]).isoformat(),
}
@@ -171,6 +173,58 @@ def test_update_response_with_nonexistent_response_id(client: TestClient, db: Se
}


def test_update_response_status(client: TestClient, db: Session, admin_auth_header: dict):
response = ResponseFactory.create()
response_json = {"status": "submitted"}

resp = client.put(f"/api/v1/responses/{response.id}/status", headers=admin_auth_header, json=response_json)

assert resp.status_code == 200
assert db.get(Response, response.id).status == "submitted"


def test_update_response_status_without_authentication(client: TestClient, db: Session):
response = ResponseFactory.create()
response_json = {"status": "submitted"}

resp = client.put(f"/api/v1/responses/{response.id}/status", json=response_json)

assert resp.status_code == 401
assert db.get(Response, response.id).status == "pending"


def test_update_response_status_as_annotator(client: TestClient, db: Session):
annotator = AnnotatorFactory.create()
response = ResponseFactory.create(user=annotator)
response_json = {"status": "submitted"}

resp = client.put(
f"/api/v1/responses/{response.id}/status", headers={API_KEY_HEADER_NAME: annotator.api_key}, json=response_json
)

assert resp.status_code == 200
assert db.get(Response, response.id).status == "submitted"


def test_update_response_status_as_annotator_for_different_user_response(client: TestClient, db: Session):
annotator = AnnotatorFactory.create()
response = ResponseFactory.create()
response_json = {"status": "submitted"}

resp = client.put(
f"/api/v1/responses/{response.id}/status", headers={API_KEY_HEADER_NAME: annotator.api_key}, json=response_json
)

assert resp.status_code == 403
assert db.get(Response, response.id).status == "pending"


def test_update_response_status_with_nonexistent_response_id(client: TestClient, db: Session, admin_auth_header: dict):
resp = client.put(f"/api/v1/responses/{uuid4()}/status", headers=admin_auth_header, json={"status": "submitted"})

assert resp.status_code == 404


def test_delete_response(client: TestClient, db: Session, admin_auth_header: dict):
response = ResponseFactory.create()
