Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MSQ window functions: Reject MVDs during window processing #17036

Merged
merged 3 commits into from
Sep 23, 2024

Conversation

Akshat-Jain
Copy link
Contributor

Description

This PR aims to reject MVDs when used as partition by columns during window function processing for MSQ engine.

Prior to this change, a MSQ query like the following was failing with ClassCastException:

select cityName, countryName, array_to_mv(array[1,length(cityName)]),
row_number() over (partition by  array_to_mv(array[1,length(cityName)]) order by countryName, cityName)
from wikipedia
where countryName in ('Austria', 'Republic of Korea') and cityName is not null
order by 1, 2, 3

This is a follow-up on #17002, where a similar thing was done for sql-native engine.


This PR has:

  • been self-reviewed.
  • added documentation for new or modified features or behaviors.
  • a release note entry in the PR description.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added or updated version, license, or notice information in licenses.yaml
  • added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • added integration tests.
  • been tested in a test Druid cluster.

@github-actions github-actions bot added Area - Batch Ingestion Area - MSQ For multi stage queries - https://github.com/apache/druid/issues/12262 labels Sep 12, 2024
@Akshat-Jain
Copy link
Contributor Author

Based on offline discussion, we can wait for GlueingPartitioningOperator to be used, which would make it return a proper error message. So I'll close this PR.

@Akshat-Jain Akshat-Jain reopened this Sep 19, 2024
@Akshat-Jain
Copy link
Contributor Author

We had some queries like the following giving incorrect results, because the MVDs (List object) were getting converted to string:

image

As per offline discussion with @sreemanamala, have opened this PR and have added another check in RearrangedRowsAndColumns.

This is a temporary measure and would be removed once we have the GlueingPartitioningOperator approach as mentioned previously.

@abhishekagarwal87 abhishekagarwal87 modified the milestones: 30.0.1, 31.0.0 Sep 19, 2024
Copy link
Member

@kgyrtkirk kgyrtkirk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure how frequently people will run into this - but the updates which supposed to be made to this part will make this check as well go away.
Let's have it for now

@@ -128,7 +130,16 @@ public boolean isNull(int rowNum)
@Override
public Object getObject(int rowNum)
{
return accessor.getObject(pointers[start + rowNum]);
Object value = accessor.getObject(pointers[start + rowNum]);
if (ColumnType.STRING.equals(getType()) && value instanceof List) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we require this check for druid 31, we can live with this "hackery". Once #17038 gets merged, please remove this check.
cc @Akshat-Jain

@cryptoe cryptoe merged commit 40414cf into apache:master Sep 23, 2024
90 checks passed
Akshat-Jain added a commit to Akshat-Jain/druid that referenced this pull request Sep 23, 2024
)

* MSQ window functions: Reject MVDs during window processing

* MSQ window functions: Reject MVDs during window processing

* Remove parameterization from MSQWindowTest
abhishekagarwal87 pushed a commit that referenced this pull request Sep 25, 2024
…17127)

* MSQ window functions: Reject MVDs during window processing

* MSQ window functions: Reject MVDs during window processing

* Remove parameterization from MSQWindowTest
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area - Batch Ingestion Area - MSQ For multi stage queries - https://github.com/apache/druid/issues/12262
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants