Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: replace is_() with == in dataset name filter #3983

Merged
merged 1 commit into from
Jul 24, 2024

Conversation

joelbarmettlerUZH
Copy link
Contributor

This pull request addresses a syntax error occurring in the list_datasets function when filtering datasets by name, specifically when using PostgreSQL 16.

Issue

When querying the /v1/datasets endpoint with PostgreSQL 16, the function encounters a syntax error. The error occurs due to the use of SQLAlchemy's is_() method for comparing the dataset name, which is typically used for NULL comparisons. This resulted in a PostgreSQL syntax error near "$1" when the query was executed.

Error logs revealed:

sqlalchemy.exc.ProgrammingError: (sqlalchemy.dialects.postgresql.asyncpg.ProgrammingError) <class 'asyncpg.exceptions.PostgresSyntaxError'>: syntax error at or near "$1"
[SQL: SELECT datasets.id, datasets.name, datasets.description, datasets.metadata, datasets.created_at, datasets.updated_at 
FROM datasets 
WHERE datasets.name IS $1::VARCHAR ORDER BY datasets.id DESC 
 LIMIT $2::INTEGER]
[parameters: ('test', 11)]
(Background on this error at: https://sqlalche.me/e/20/f405)

This error indicates that PostgreSQL 16 is not interpreting the IS operator correctly with the parameter placeholder $1.

Fix

Changed the dataset name filter from:

query = query.filter(models.Dataset.name.is_(name))

to:

query = query.filter(models.Dataset.name == name)

Rationale

According to the SQLAlchemy documentation, the is_() method is meant for NULL comparisons and generates SQL using the IS operator. For value equality comparisons, the == operator should be used instead. This change ensures compatibility with PostgreSQL 16 and correctly generates the SQL for name comparison.

Impact

This change resolves the SQL syntax error and allows proper filtering of datasets by name when using PostgreSQL 16. It ensures that the /v1/datasets endpoint functions correctly with parameter-based filtering.

Related documentation:

SQLAlchemy ColumnOperators.is_(): https://docs.sqlalchemy.org/en/20/core/sqlelement.html#sqlalchemy.sql.expression.ColumnOperators.is_

@dosubot dosubot bot added the size:XS This PR changes 0-9 lines, ignoring generated files. label Jul 24, 2024
Copy link
Contributor

github-actions bot commented Jul 24, 2024

CLA Assistant Lite bot All contributors have signed the CLA ✍️ ✅

@joelbarmettlerUZH joelbarmettlerUZH changed the title Fix: replace is_() with == in dataset name filter fix: replace is_() with == in dataset name filter Jul 24, 2024
@joelbarmettlerUZH
Copy link
Contributor Author

I have read the CLA Document and I hereby sign the CLA

github-actions bot added a commit that referenced this pull request Jul 24, 2024
@RogerHYang
Copy link
Contributor

Thank you for your contribution, @joelbarmettlerUZH!

@RogerHYang RogerHYang merged commit 3f77759 into Arize-ai:main Jul 24, 2024
9 checks passed
@mikeldking
Copy link
Contributor

Thank you @joelbarmettlerUZH ! Amazing.

@axiomofjoy
Copy link
Contributor

Good catch @joelbarmettlerUZH thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
size:XS This PR changes 0-9 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants