[VL] Support Decimal type in Gluten #1176

Merged
merged 3 commits into from
Mar 28, 2023

Conversation

JkSelf
Contributor

@JkSelf JkSelf commented Mar 22, 2023

What changes were proposed in this pull request?

Joint work by @JkSelf, @liujiayi771, @jinchengchenghh, and @rui-mo.

How was this patch tested?

Verified TPC-DS 103 queries locally.

test by CH[[378]]

@github-actions

Thanks for opening a pull request!

Could you open an issue for this pull request on Github Issues?

https://github.com/oap-project/gluten/issues

Then could you also rename commit message and pull request title in the following format?

[Gluten-${ISSUES_ID}] ${detailed message}

See also:


import org.apache.spark.sql.catalyst.expressions.Expression

class UnscaledValueTransformer(
Contributor

Please move this transformer to UnaryExpressionTransformer.scala

Contributor

It can reuse UnaryExpressionTransformer, so there is no need to create a new one, right?

Contributor

Yes

case class UnscaledValue(child: Expression) extends UnaryExpression

Contributor Author

@liujiayi771 Can you help resolve this comment? Thanks.

Contributor

I have removed this code.
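
For readers unfamiliar with the expression discussed in this thread, here is a minimal sketch of what UnscaledValue and its counterpart MakeDecimal compute, written against java.math.BigDecimal rather than Spark's Decimal. The object and method names are hypothetical and are not part of this PR.

```scala
import java.math.{BigDecimal => JBigDecimal}

// Hypothetical illustration only; not code from this PR.
object UnscaledValueSketch {
  // Idea behind UnscaledValue: drop the scale and keep the raw integer digits.
  // longValueExact throws if the digits do not fit in a Long.
  def unscaledValue(d: JBigDecimal): Long =
    d.unscaledValue().longValueExact()

  // Idea behind MakeDecimal: reattach a scale to a raw long value.
  def makeDecimal(unscaled: Long, scale: Int): JBigDecimal =
    JBigDecimal.valueOf(unscaled, scale)
}
```

The two functions are inverses of each other as long as the same scale is used on both sides.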

@zhouyuan
Contributor

@zzcclp This patch enables several decimal-related expressions, and the CH backend ran into seg faults in some tests. Can you take a look?

23/03/23 01:10:40 ERROR Executor: Exception in task 1.0 in stage 56.0 (TID 113)
java.lang.RuntimeException: check_overflow function requires two args.
0. Poco::Exception::Exception(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, int) @ 0x16758bfa in /tmp/libch.so
1. DB::Exception::Exception(DB::Exception::MessageMasked&&, int, bool) @ 0x11c06a95 in /tmp/libch.so
2. DB::Exception::Exception<char const (&) [43], void>(int, char const (&) [43]) @ 0xac96e16 in /tmp/libch.so
3. local_engine::SerializedPlanParser::getFunctionName(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, substrait::Expression_ScalarFunction const&) @ 0xaca0a11 in /tmp/libch.so
4. local_engine::SerializedPlanParser::parseFunctionWithDAG(substrait::Expression const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>&, std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>>&, std::__1::shared_ptr<DB::ActionsDAG>, bool) @ 0xaca9f08 in /tmp/libch.so
5. local_engine::SerializedPlanParser::parseFunctionArgument(std::__1::shared_ptr<DB::ActionsDAG>&, std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>>&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, substrait::FunctionArgument const&) @ 0xacbba4d in /tmp/libch.so
6. local_engine::SerializedPlanParser::parseFunctionArguments(std::__1::shared_ptr<DB::ActionsDAG>&, std::__1::vector<DB::ActionsDAG::Node const*, std::__1::allocator<DB::ActionsDAG::Node const*>>&, std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>>&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, substrait::Expression_ScalarFunction const&) @ 0xacb87ef in /tmp/libch.so
7. local_engine::SerializedPlanParser::parseFunctionWithDAG(substrait::Expression const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>&, std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>>&, std::__1::shared_ptr<DB::ActionsDAG>, bool) @ 0xaca9f46 in /tmp/libch.so
8. local_engine::SerializedPlanParser::parseFunction(DB::Block const&, substrait::Expression const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>&, std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>>&, std::__1::shared_ptr<DB::ActionsDAG>, bool) @ 0xaca1047 in /tmp/libch.so
9. local_engine::SerializedPlanParser::expressionsToActionsDAG(std::__1::vector<substrait::Expression, std::__1::allocator<substrait::Expression>> const&, DB::Block const&, DB::Block const&) @ 0xac9e81d in /tmp/libch.so
10. local_engine::SerializedPlanParser::parseOp(substrait::Rel const&, std::__1::list<substrait::Rel const*, std::__1::allocator<substrait::Rel const*>>&) @ 0xacadf8e in /tmp/libch.so
11. local_engine::SerializedPlanParser::parse(std::__1::unique_ptr<substrait::Plan, std::__1::default_delete<substrait::Plan>>) @ 0xacac73e in /tmp/libch.so
12. local_engine::SerializedPlanParser::parse(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&) @ 0xacbc89c in /tmp/libch.so

@zzcclp
Contributor

zzcclp commented Mar 23, 2023

@taiyang-li Please help review this PR; it's about decimal type support.

val decimalType = datatype.asInstanceOf[DecimalType]
val precision = decimalType.precision
val scale = decimalType.scale
typedFuncName.concat("dec<" + precision + "," + scale + ">")
Contributor

It seems the CH backend needs to support this name as well. @taiyang-li

Contributor

ok, let's do it
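
The naming scheme in the snippet under review can be sketched in isolation as follows. DecimalTypeInfo is a hypothetical stand-in for Spark's DecimalType so the example has no Spark dependency, and the "check_overflow:" prefix in the usage note is only an assumed example of what typedFuncName might hold.

```scala
// Hypothetical illustration only; not code from this PR.
object DecimalSignatureSketch {
  // Stand-in for Spark's DecimalType (assumption, to keep the sketch self-contained).
  final case class DecimalTypeInfo(precision: Int, scale: Int)

  // Appends a token of the form dec<precision,scale> to the typed function name.
  def appendDecimal(typedFuncName: String, t: DecimalTypeInfo): String =
    typedFuncName + s"dec<${t.precision},${t.scale}>"
}
```

With these definitions, appendDecimal("check_overflow:", DecimalTypeInfo(10, 2)) yields "check_overflow:dec<10,2>", which is the kind of name a backend would have to recognize.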

@taiyang-li
Contributor

@JkSelf there are two issues related to the ClickHouse backend, AFAIK:

  • The newly added functions make_decimal/unscaled_value are not supported by ClickHouse yet; we can insert them into CH_EXPR_BLACKLIST_TYPE_EXISTS in CHExpressionUtil.scala.
  • CheckOverflowTransformer is not actually working in ExpressionConverter; maybe we can extract it out of UnaryExpressionTransformer.apply and add it directly in ExpressionConverter. Could you confirm whether Velox needs it? cc @loneylee
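
The first suggestion could look roughly like the sketch below. It assumes the blacklist is a map from function name to a set of type tags, as the CHExpressionUtil diff later in this thread suggests; the value of EMPTY_TYPE, the object name, and the lookup helper are assumptions made for illustration.

```scala
// Hypothetical illustration only; the real CHExpressionUtil may differ.
object BlacklistSketch {
  // Assumed placeholder tag; the real constant's value is not shown in this thread.
  val EMPTY_TYPE = ""

  // Functions the backend cannot evaluate yet, keyed by function name.
  val CH_EXPR_BLACKLIST_TYPE_EXISTS: Map[String, Set[String]] = Map(
    "make_decimal" -> Set(EMPTY_TYPE),
    "unscaled_value" -> Set(EMPTY_TYPE)
  )

  // Hypothetical helper: true if the function should not be offloaded.
  def isBlacklisted(funcName: String): Boolean =
    CH_EXPR_BLACKLIST_TYPE_EXISTS.contains(funcName)
}
```

A caller would check isBlacklisted before converting an expression, and leave the expression to vanilla Spark when it returns true.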

val expressionNodes =
- Lists.newArrayList(childNode, new BooleanLiteralNode(original.nullOnOverflow))
+ Lists.newArrayList(childNode, toTypeNodes, new BooleanLiteralNode(original.nullOnOverflow))
Member

Why do you need to pass the toTypeNodes parameter? There is already a return type in the Substrait plan.

Contributor Author

@jinchengchenghh Can you help check whether we can use the return type in the Substrait plan directly?

Contributor

This comes from the CheckOverflow constructor. Velox uses it to match the function signature and to decide whether the type is ShortDecimal or LongDecimal; the same applies to MakeDecimal. This is a special case where the input argument is the same as the return type, but I think we should not add exceptional handling when converting the Substrait plan to a Velox plan.
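
As a rough illustration of the ShortDecimal/LongDecimal decision mentioned above, the sketch below picks a kind from the precision alone. It assumes the usual boundary of 18 digits for a 64-bit representation and a maximum precision of 38; the type and function names are hypothetical and are not Velox or Gluten APIs.

```scala
// Hypothetical illustration only; not Velox or Gluten code.
object DecimalWidthSketch {
  sealed trait DecimalKind
  case object ShortDecimal extends DecimalKind
  case object LongDecimal extends DecimalKind

  // Assumed limits: up to 18 digits fit in 64 bits, up to 38 digits in 128 bits.
  val MaxShortPrecision = 18
  val MaxPrecision = 38

  // Chooses the representation from the target type's precision.
  def kindOf(precision: Int): DecimalKind = {
    require(precision >= 1 && precision <= MaxPrecision, s"unsupported precision: $precision")
    if (precision <= MaxShortPrecision) ShortDecimal else LongDecimal
  }
}
```

This is why the target type matters for signature matching: the same input can resolve to a different kernel depending on the precision carried by the extra argument.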

Contributor

@JkSelf @jinchengchenghh Maybe we can apply this special logic only when the backend is Velox, or just append toTypeNodes after nullOnOverflow instead of before it. Currently CH doesn't need toTypeNodes, and these changes would cause an incompatibility issue, as seen in the CH UTs.

Contributor

OK, I will move it after nullOnOverflow
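
The agreed ordering can be sketched with plain case classes standing in for the Substrait expression nodes. All names here are hypothetical; the point is only that appending the target type last leaves the first two argument positions as they were before this PR.

```scala
// Hypothetical illustration only; these are not Gluten's node classes.
object CheckOverflowArgsSketch {
  sealed trait ArgNode
  final case class ChildNode(name: String) extends ArgNode
  final case class BooleanLiteral(value: Boolean) extends ArgNode
  final case class DecimalTypeArg(precision: Int, scale: Int) extends ArgNode

  // Order: child, nullOnOverflow, then the target type appended at the end.
  def buildArgs(child: ArgNode, nullOnOverflow: Boolean, toType: DecimalTypeArg): List[ArgNode] =
    List(child, BooleanLiteral(nullOnOverflow), toType)
}
```

A consumer that reads only the first two positions sees the same child and nullOnOverflow values as before, while a consumer that needs the target type reads the third.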

Member

Please add comment 'test by CH[[378]]'.
Kyligence/ClickHouse#378

Contributor

Can you merge Kyligence/ClickHouse#378 so that this PR can pass CH CI?

@JkSelf
Contributor Author

JkSelf commented Mar 23, 2023

CheckOverflowTransformer is not actually working in ExpressionConverter; maybe we can extract it out of UnaryExpressionTransformer.apply and add it directly in ExpressionConverter. Could you confirm whether Velox needs it? cc @loneylee

@jinchengchenghh Can you help look at this suggestion about CheckOverflowTransformer? Thanks.

@@ -41,7 +41,9 @@ object CHExpressionUtil {
SPLIT_PART -> Set(EMPTY_TYPE),
TO_UNIX_TIMESTAMP -> Set(DATE_TYPE),
UNIX_TIMESTAMP -> Set(DATE_TYPE),
- MIGHT_CONTAIN -> Set(EMPTY_TYPE)
+ MIGHT_CONTAIN -> Set(EMPTY_TYPE),
Contributor Author

@taiyang-li Please help to review. Thanks.

Contributor

LGTM

@jinchengchenghh
Contributor

@JkSelf there are two issues related to the ClickHouse backend, AFAIK:

  • The newly added functions make_decimal/unscaled_value are not supported by ClickHouse yet; we can insert them into CH_EXPR_BLACKLIST_TYPE_EXISTS in CHExpressionUtil.scala.
  • CheckOverflowTransformer is not actually working in ExpressionConverter; maybe we can extract it out of UnaryExpressionTransformer.apply and add it directly in ExpressionConverter. Could you confirm whether Velox needs it? cc @loneylee

For the second point:

case class CheckOverflow(
    child: Expression,
    dataType: DecimalType,
    nullOnOverflow: Boolean) extends UnaryExpression

It extends UnaryExpression, so it belongs here, and it is very commonly used in Velox.
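
To make the role of the three constructor fields concrete, here is a minimal sketch of the check that CheckOverflow expresses, using java.math.BigDecimal instead of Spark's Decimal. It assumes half-up rounding when adjusting the scale and models a null result as None; the object and method names are hypothetical.

```scala
import java.math.{BigDecimal => JBigDecimal, RoundingMode}

// Hypothetical illustration only; not Spark's or Gluten's implementation.
object CheckOverflowSketch {
  // Rescales the value to the target scale, then verifies it fits the target precision.
  // On overflow, returns None when nullOnOverflow is true and throws otherwise.
  def checkOverflow(
      value: JBigDecimal,
      precision: Int,
      scale: Int,
      nullOnOverflow: Boolean): Option[JBigDecimal] = {
    val rescaled = value.setScale(scale, RoundingMode.HALF_UP)
    if (rescaled.precision() <= precision) Some(rescaled)
    else if (nullOnOverflow) None
    else throw new ArithmeticException(
      s"$value cannot be represented as Decimal($precision, $scale)")
  }
}
```

The child expression supplies the value, dataType supplies the precision and scale, and nullOnOverflow selects between a null result and an error.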

@zhouyuan zhouyuan added the velox backend works for Velox backend label Mar 23, 2023
@JkSelf JkSelf force-pushed the decimal-test-latest-velox branch 2 times, most recently from bde4e51 to 14e89e0 Compare March 27, 2023 13:58
@JkSelf
Contributor Author

JkSelf commented Mar 28, 2023

@zzcclp @taiyang-li
There are three failed UTs in the CH backend. Can you help take a look? Thanks.

@loneylee
Member

@zzcclp @taiyang-li There are three failed UTs in the CH backend. Can you help take a look? Thanks.

Add comment 'test by CH[[378]]'

@JkSelf
Contributor Author

JkSelf commented Mar 28, 2023

test by CH[[378]]

@loneylee
Member

@JkSelf
The errors are fixed on the CH backend side. Please try the test again.

zhouyuan
zhouyuan previously approved these changes Mar 28, 2023
Contributor

@zhouyuan zhouyuan left a comment

👍

Contributor

@zhouyuan zhouyuan left a comment

👍 if CI is green

@JkSelf JkSelf merged commit 59a6f3f into apache:main Mar 28, 2023