Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change result type of concat varchar function to bounded varchar #12922

Closed
wants to merge 3 commits into from

Conversation

ebyhr
Copy link
Member

@ebyhr ebyhr commented Jun 22, 2022

Description

Change result type of concat varchar function to bounded varchar
Fixes #12827

Documentation

( ) No documentation is needed.
( ) Sufficient documentation is included in this PR.
( ) Documentation PR is available with #prnumber.
( ) Documentation issue #issuenumber is filed, and can be handled later.

Release notes

( ) No release notes entries required.
( ) Release notes entries required with the following suggested text:

# Section
* Fix some things. ({issue}`issuenumber`)

@cla-bot cla-bot bot added the cla-signed label Jun 22, 2022
@ebyhr ebyhr force-pushed the ebi/concat branch 3 times, most recently from fb4a2de to e15651c Compare June 22, 2022 08:15
@@ -1627,7 +1627,7 @@ public void testInvalidTypeInfixOperator()
// Comment on why error message references varchar(214783647) instead of varchar(2) which seems expected result type for concatenation in expression.
Copy link
Member

@findepi findepi Jun 22, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this comment is outdated (apparently since long)

@findepi findepi requested a review from martint June 22, 2022 10:16
@ebyhr ebyhr requested a review from kasiafi June 24, 2022 02:34
Comment on lines 61 to 62
.longVariable("x", "min(2147483647, TOTAL_LONG_LITERAL_SIZE)")
.returnType(parseTypeSignature("varchar(x)", ImmutableSet.of("x")))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would it be possible to populate the return type in a functional manner?
like

.returnType(argumentTypes -> { .... })

I know this would harm equivalence on Signature objects (Signature defines equals and there is no strict equality for functions). However, Signature seems to be used as a map key only for non-calculated signatures (at least here

if (implementation.getSignature().getTypeVariableConstraints().isEmpty()
&& implementation.getSignature().getArgumentTypes().stream().noneMatch(TypeSignature::isCalculated)
&& !implementation.getSignature().getReturnType().isCalculated()) {
exactImplementations.put(implementation.getSignature(), implementation);
}
else if (implementation.hasSpecializedTypeParameters()) {
specializedImplementations.add(implementation);
}
else {
genericImplementations.add(implementation);
). So maybe the equivalence-related problem is solvable.

@dain @martint @losipiuk @sopel39 thoughts?

@ebyhr ebyhr force-pushed the ebi/concat branch 2 times, most recently from b8eb947 to 63ed2ef Compare July 6, 2022 02:23
@ebyhr ebyhr marked this pull request as draft July 6, 2022 03:02
@ebyhr ebyhr marked this pull request as ready for review July 6, 2022 06:10
@ebyhr ebyhr requested a review from martint July 6, 2022 06:11
@ebyhr
Copy link
Member Author

ebyhr commented Jul 19, 2022

Updated to functional style. A downside seems we have to provide dummy return type because such information is required for SHOW FUNCTIONS result.

Comment on lines 100 to 101
@JsonProperty
public Optional<Function<List<TypeSignature>, TypeSignature>> getFunctionalReturnType()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a Function can unlikely be json-serializable?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ebyhr i think this is a blocker for me, unless Signature is not actually getting serialized.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think so too. I'm wondering when json (de)serialization happens for Signature class.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know that, but you should be able to find out easily, by removing the @Json* annotations from the class and running tests with DistributedQueryRunner

@findepi
Copy link
Member

findepi commented Aug 3, 2022

@martint can you please take a look?

@findepi
Copy link
Member

findepi commented Sep 7, 2022

It looks like this PR conflicted with @dain 's #12588 that was merged recently.

@ebyhr
Copy link
Member Author

ebyhr commented Sep 8, 2022

Just rebased on upstream to resolve conflicts.

@findepi
Copy link
Member

findepi commented Sep 9, 2022

Looks like the only test failing after Remove json annotations from Signature is TestSignature.testSerializationRoundTrip.
So it looks like it doesn't need to be json-serializable.

cc @dain

public TypeSignature getReturnType()
{
return returnType;
}

@JsonProperty
public Optional<Function<List<TypeSignature>, TypeSignature>> getFunctionalReturnType()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's a "functional return type"?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a method to allow building "return type" (e.g. a + b) from "function arguments" (e.g. concat(a, b)). The name might be confusing. Do you have any suggestion?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getReturnTypeDerivation ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your suggestion, changed to getReturnTypeDerivation.

@ebyhr ebyhr force-pushed the ebi/concat branch 2 times, most recently from 9976c86 to 2659241 Compare September 10, 2022 00:47
@ebyhr ebyhr requested a review from findepi September 12, 2022 02:11
*/
@Deprecated
@JsonCreator
public static Signature fromJson(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc @dain

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method is here on purpose to make it clear that this method is only here for Jasckson json parsing because we disable access to private constructors when we use JSON. If the parsing works when this is removed, it implies that the code deserializing the JSON has a misconfigured parser.

@@ -241,6 +254,12 @@ public Builder returnType(TypeSignature returnType)
return this;
}

public Builder returnTypeDerivation(Optional<Function<List<TypeSignature>, TypeSignature>> returnTypeDerivation)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could accept without Optional.

// still required for SHOW FUNCTIONS result
.longVariable("x", "1")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what does the SHOW FUNCTIONS print now?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's varchar(x) after this change


// still required for SHOW FUNCTIONS result
.longVariable("x", "1")
.returnType(new TypeSignature("varchar", typeVariable("x")))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use unbounded varchar as a "fake" return type?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably, it requires additional change. Below exception happens after changing to returnType(VARCHAR) without longVariable.

Bound signature concat(varchar, varchar):varchar(3) does not match declared signature concat(varchar):varchar
java.lang.IllegalArgumentException: Bound signature concat(varchar, varchar):varchar(3) does not match declared signature concat(varchar):varchar
	at com.google.common.base.Preconditions.checkArgument(Preconditions.java:435)
	at io.trino.metadata.SignatureBinder.verifyBoundSignature(SignatureBinder.java:332)
	at io.trino.metadata.SignatureBinder.extractBoundVariables(SignatureBinder.java:306)
	at io.trino.metadata.SignatureBinder.bindFunction(SignatureBinder.java:251)
	at io.trino.metadata.FunctionResolver.toFunctionBinding(FunctionResolver.java:127)
	at io.trino.metadata.FunctionResolver.matchFunction(FunctionResolver.java:258)
	at io.trino.metadata.FunctionResolver.matchFunctionWithCoercion(FunctionResolver.java:233)
	at io.trino.metadata.FunctionResolver.resolveFunction(FunctionResolver.java:174)
	at io.trino.metadata.MetadataManager.resolvedFunctionInternal(MetadataManager.java:2115)
	at io.trino.metadata.MetadataManager.lambda$resolvedFunctionInternal$49(MetadataManager.java:2110)
	at java.base/java.util.Optional.orElseGet(Optional.java:364)
	at io.trino.metadata.MetadataManager.resolvedFunctionInternal(MetadataManager.java:2110)
	at io.trino.metadata.MetadataManager.resolveFunction(MetadataManager.java:2081)
	at io.trino.sql.analyzer.ExpressionAnalyzer$Visitor.visitFunctionCall(ExpressionAnalyzer.java:1258)
	at io.trino.sql.analyzer.ExpressionAnalyzer$Visitor.visitFunctionCall(ExpressionAnalyzer.java:581)
	at io.trino.sql.tree.FunctionCall.accept(FunctionCall.java:121)
	at io.trino.sql.tree.StackableAstVisitor.process(StackableAstVisitor.java:27)
	at io.trino.sql.analyzer.ExpressionAnalyzer$Visitor.process(ExpressionAnalyzer.java:604)
	at io.trino.sql.analyzer.ExpressionAnalyzer.analyze(ExpressionAnalyzer.java:485)
	at io.trino.sql.analyzer.ExpressionAnalyzer.analyzeExpression(ExpressionAnalyzer.java:3441)
	at io.trino.sql.analyzer.StatementAnalyzer$Visitor.analyzeExpression(StatementAnalyzer.java:4075)
	at io.trino.sql.analyzer.StatementAnalyzer$Visitor.analyzeSelectSingleColumn(StatementAnalyzer.java:3887)
	at io.trino.sql.analyzer.StatementAnalyzer$Visitor.analyzeSelect(StatementAnalyzer.java:3673)
	at io.trino.sql.analyzer.StatementAnalyzer$Visitor.visitQuerySpecification(StatementAnalyzer.java:2436)
	at io.trino.sql.analyzer.StatementAnalyzer$Visitor.visitQuerySpecification(StatementAnalyzer.java:457)
	at io.trino.sql.tree.QuerySpecification.accept(QuerySpecification.java:155)
	at io.trino.sql.tree.AstVisitor.process(AstVisitor.java:27)
	at io.trino.sql.analyzer.StatementAnalyzer$Visitor.process(StatementAnalyzer.java:474)
	at io.trino.sql.analyzer.StatementAnalyzer$Visitor.process(StatementAnalyzer.java:482)
	at io.trino.sql.analyzer.StatementAnalyzer$Visitor.visitQuery(StatementAnalyzer.java:1394)
	at io.trino.sql.analyzer.StatementAnalyzer$Visitor.visitQuery(StatementAnalyzer.java:457)
	at io.trino.sql.tree.Query.accept(Query.java:107)
	at io.trino.sql.tree.AstVisitor.process(AstVisitor.java:27)
	at io.trino.sql.analyzer.StatementAnalyzer$Visitor.process(StatementAnalyzer.java:474)
	at io.trino.sql.analyzer.StatementAnalyzer.analyze(StatementAnalyzer.java:436)
	at io.trino.sql.analyzer.Analyzer.analyze(Analyzer.java:79)
	at io.trino.sql.analyzer.Analyzer.analyze(Analyzer.java:71)
	at io.trino.execution.SqlQueryExecution.analyze(SqlQueryExecution.java:261)
	at io.trino.execution.SqlQueryExecution.<init>(SqlQueryExecution.java:199)
	at io.trino.execution.SqlQueryExecution$SqlQueryExecutionFactory.createQueryExecution(SqlQueryExecution.java:825)
	at io.trino.dispatcher.LocalDispatchQueryFactory.lambda$createDispatchQuery$0(LocalDispatchQueryFactory.java:135)
	at io.trino.$gen.Trino_testversion____20220914_081840_71.call(Unknown Source)
	at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:131)
	at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:74)
	at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:82)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
	at java.base/java.lang.Thread.run(Thread.java:833)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't make sense to allow a function definition to declare a return type and a returnTypeDerivation at the same time:

  • If they are not the same value, which one matters?
  • If they are the same value, why have the returnTypeDerivation at all?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What should be the return type printed by SHOW FUNCTIONS?

throw new IllegalArgumentException("Unsupported type: " + type);
}

private static TypeSignature calculateReturnType(List<TypeSignature> typeSignatures)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

calculateVarcharConcatReturnType

@ebyhr
Copy link
Member Author

ebyhr commented Sep 25, 2022

@dain Gentle reminder.

@findepi
Copy link
Member

findepi commented Sep 26, 2022

@ebyhr there is a CI failure

@ebyhr
Copy link
Member Author

ebyhr commented Sep 27, 2022

I will merge this PR tomorrow if there's no objection.
cc: @dain @martint

@martint
Copy link
Member

martint commented Sep 27, 2022

The concern I have is that this lets functions define how the result type is derived and completely bypass the type system. It's possible that this will make it harder to improve the type inferencer in the future without breaking backward compatibility, especially if anyone uses this feature to derive a return type that is not what the inference engine would naturally produce if it was capable of doing so.

@ebyhr
Copy link
Member Author

ebyhr commented Nov 8, 2022

Just rebased on upstream to resolve conflicts.

@ebyhr
Copy link
Member Author

ebyhr commented Nov 8, 2022

@martint Do you have any alternative solution for improving concat result type?

@ebyhr
Copy link
Member Author

ebyhr commented Dec 13, 2022

@martint Gentle reminder.

@ebyhr
Copy link
Member Author

ebyhr commented Jan 10, 2023

@martint Gentle reminder.

ebyhr added 3 commits January 26, 2023 08:12
All tests passed except for TestSignature when removed
json annotations from Signature class. TestSignature
was a unit test verifying json serialization and
didn't contain the actual usage. Removing the annotation
and test should be fine.
@ebyhr
Copy link
Member Author

ebyhr commented Jan 25, 2023

Rebased on upstream to resolve conflicts.

Copy link
Member

@dain dain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand why this change. If we want concat(varchar(3), varchar(5) to return varchar(8) and concat(varchar, varchar(5) to be unbounded, we should work on the type calculation system to system to support that exact calculation instead of giving functions the ability to opt out of the type system.

*/
@Deprecated
@JsonCreator
public static Signature fromJson(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method is here on purpose to make it clear that this method is only here for Jasckson json parsing because we disable access to private constructors when we use JSON. If the parsing works when this is removed, it implies that the code deserializing the JSON has a misconfigured parser.

import static io.trino.type.InternalTypeManager.TESTING_TYPE_MANAGER;
import static org.testng.Assert.assertEquals;

public class TestSignature
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why would we remove the test that verifies the signature round trips?

@ebyhr ebyhr closed this Feb 20, 2023
@ebyhr ebyhr deleted the ebi/concat branch February 20, 2023 00:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

Calculate varchar bounds when concatenating two bounded varchars
4 participants