-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Properly process Final Selection in Reduce(). #2408
Conversation
✅ Deploy Preview for meta-velox canceled.
|
This pull request was exported from Phabricator. Differential Revision: D39121620 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@spershin Thank you for the fix. Some comments and questions below.
velox/functions/prestosql/Reduce.cpp
Outdated
@@ -104,6 +104,8 @@ class ReduceFunction : public exec::VectorFunction { | |||
}); | |||
|
|||
// Fix finalSelection at "rows" unless already fixed. | |||
const SelectivityVector& finalSelectionRows = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code can be simplified if it is moved after the finalSelection and isFinalSelection.
const auto& finalSelectionRows = *context->finalSelection();
std::unique_ptr<exec::ExprSet> compileExpression( | ||
const std::string& expr, | ||
const RowTypePtr& rowType) { | ||
std::vector<std::shared_ptr<const core::ITypedExpr>> expressions = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: use TypedExprPtr
@@ -23,14 +23,6 @@ using namespace facebook::velox::test; | |||
|
|||
class MapFilterTest : public functions::test::FunctionBaseTest { | |||
protected: | |||
std::unique_ptr<exec::ExprSet> compileExpression( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for refactoring.
@@ -123,3 +124,48 @@ TEST_F(ReduceTest, conditional) { | |||
nullEvery(11)); | |||
assertEqualVectors(expectedResult, result); | |||
} | |||
|
|||
TEST_F(ReduceTest, lambdaSelectivityVector) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this test fail before the fix?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question. Unfortunately - not.
I only made sure that we hit the new code path.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To reproduce the problem we need to
- make it so that context->finalSelection() is larger than 'rows'.
- use outputFunction that relies of EvalCtx::moveOrCopyResult
To satisfy (1), we can wrap reduce call in an IF statement. To satisfy (2), we can use row_constructor.
Here is an example of a test that fails without a fix and succeeds with the fix.
TEST_F(ReduceTest, finalSelection) {
vector_size_t size = 1'000;
auto inputArray = makeArrayVector<int64_t>(
size,
modN(5),
[](auto row, auto index) { return row + index; },
nullEvery(11));
auto input = makeRowVector({
inputArray,
makeFlatVector<int64_t>(
size, [](auto row) { return row; }, nullEvery(11)),
});
registerLambda(
"sum_input",
rowType("s", BIGINT(), "x", BIGINT()),
input->type(),
"s + x");
registerLambda(
"row_output",
rowType("s", BIGINT()),
input->type(),
"row_constructor(s)");
auto result = evaluate<RowVector>(
"if (c1 < 100, row_constructor(c1), reduce(c0, 10, function('sum_input'), function('row_output')))",
input);
auto expectedResult = makeRowVector({makeFlatVector<int64_t>(
size,
[](auto row) -> int64_t {
if (row < 100) {
return row;
} else {
int64_t sum = 10;
for (auto i = 0; i < row % 5; i++) {
sum += row + i;
}
return sum;
}
},
nullEvery(11))});
assertEqualVectors(expectedResult, result);
}
The failure before the fix is:
/Users/mbasmanova/cpp/velox-1/velox/vector/tests/VectorTestBase.cpp:98: Failure
Value of: expected->equalValueAt(actual.get(), i, i)
Actual: false
Expected: true
at 1: expected {1}, but got {-6799976246779207263}
Notice that the fix is needed only for the "Apply output function" code. The code that applies input function doesn't need the fix.
Final selection is used for two purposes: (1) ensure lazy vectors are loaded for all the necessary rows; (2) ensure that pre-computed results are preserved. (1) doesn't apply when evaluating lambda functions since all inputs have already been loaded. (2) doesn't apply to "apply input function" here because the partialResult is created from scratch in this function and therefore doesn't have any useful data outside of 'rows'.
Hope this helps.
CC: @bikramSingh91
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, @mbasmanova, this is very helpful.
No way any time soon I would try to use row_constructor() or anything like that to reproduce the issue.
I was thinking about if(), but that alone wouldn't reproduce.
ad2b059
to
bf3d632
Compare
Summary: Pull Request resolved: facebookincubator#2408 In Reduce() we ignored the fact that Final Selection might have been false before we set it to false ourselves. In that case we need to use different SelectivityVector. Differential Revision: D39121620 fbshipit-source-id: 8a1a715a2d7e4efe9cdc9980c58aeae3b172ab5d
This pull request was exported from Phabricator. Differential Revision: D39121620 |
Summary: Pull Request resolved: facebookincubator#2408 In Reduce() we ignored the fact that Final Selection might have been false before we set it to false ourselves. In that case we need to use different SelectivityVector. Reviewed By: mbasmanova Differential Revision: D39121620 fbshipit-source-id: 2a996f8d00f06e8741fef10564f792b2ea3b6e66
bf3d632
to
657734d
Compare
This pull request was exported from Phabricator. Differential Revision: D39121620 |
Summary:
In Reduce() we ignored the fact that Final Selection might have been false before we set it to false ourselves.
In that case we need to use different SelectivityVector.
Differential Revision: D39121620