Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Properly process Final Selection in Reduce(). #2408

Closed
wants to merge 1 commit into from

Conversation

spershin
Copy link
Contributor

Summary:
In Reduce() we ignored the fact that Final Selection might have been false before we set it to false ourselves.
In that case we need to use different SelectivityVector.

Differential Revision: D39121620

@netlify
Copy link

netlify bot commented Aug 30, 2022

Deploy Preview for meta-velox canceled.

Name Link
🔨 Latest commit 657734d
🔍 Latest deploy log https://app.netlify.com/sites/meta-velox/deploys/6317ba7a0872310008ccd416

@facebook-github-bot facebook-github-bot added CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. fb-exported labels Aug 30, 2022
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D39121620

Copy link
Contributor

@mbasmanova mbasmanova left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@spershin Thank you for the fix. Some comments and questions below.

@@ -104,6 +104,8 @@ class ReduceFunction : public exec::VectorFunction {
});

// Fix finalSelection at "rows" unless already fixed.
const SelectivityVector& finalSelectionRows =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code can be simplified if it is moved after the finalSelection and isFinalSelection.

const auto& finalSelectionRows = *context->finalSelection();

std::unique_ptr<exec::ExprSet> compileExpression(
const std::string& expr,
const RowTypePtr& rowType) {
std::vector<std::shared_ptr<const core::ITypedExpr>> expressions = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: use TypedExprPtr

@@ -23,14 +23,6 @@ using namespace facebook::velox::test;

class MapFilterTest : public functions::test::FunctionBaseTest {
protected:
std::unique_ptr<exec::ExprSet> compileExpression(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for refactoring.

@@ -123,3 +124,48 @@ TEST_F(ReduceTest, conditional) {
nullEvery(11));
assertEqualVectors(expectedResult, result);
}

TEST_F(ReduceTest, lambdaSelectivityVector) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this test fail before the fix?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question. Unfortunately - not.
I only made sure that we hit the new code path.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To reproduce the problem we need to

  1. make it so that context->finalSelection() is larger than 'rows'.
  2. use outputFunction that relies of EvalCtx::moveOrCopyResult

To satisfy (1), we can wrap reduce call in an IF statement. To satisfy (2), we can use row_constructor.

Here is an example of a test that fails without a fix and succeeds with the fix.

TEST_F(ReduceTest, finalSelection) {
  vector_size_t size = 1'000;
  auto inputArray = makeArrayVector<int64_t>(
      size,
      modN(5),
      [](auto row, auto index) { return row + index; },
      nullEvery(11));
  auto input = makeRowVector({
      inputArray,
      makeFlatVector<int64_t>(
          size, [](auto row) { return row; }, nullEvery(11)),
  });
  registerLambda(
      "sum_input",
      rowType("s", BIGINT(), "x", BIGINT()),
      input->type(),
      "s + x");
  registerLambda(
      "row_output",
      rowType("s", BIGINT()),
      input->type(),
      "row_constructor(s)");

  auto result = evaluate<RowVector>(
      "if (c1 < 100, row_constructor(c1), reduce(c0, 10, function('sum_input'), function('row_output')))",
      input);

  auto expectedResult = makeRowVector({makeFlatVector<int64_t>(
      size,
      [](auto row) -> int64_t {
        if (row < 100) {
          return row;
        } else {
          int64_t sum = 10;
          for (auto i = 0; i < row % 5; i++) {
            sum += row + i;
          }
          return sum;
        }
      },
      nullEvery(11))});
  assertEqualVectors(expectedResult, result);
}

The failure before the fix is:

/Users/mbasmanova/cpp/velox-1/velox/vector/tests/VectorTestBase.cpp:98: Failure
Value of: expected->equalValueAt(actual.get(), i, i)
  Actual: false
Expected: true
at 1: expected {1}, but got {-6799976246779207263}

Notice that the fix is needed only for the "Apply output function" code. The code that applies input function doesn't need the fix.

Final selection is used for two purposes: (1) ensure lazy vectors are loaded for all the necessary rows; (2) ensure that pre-computed results are preserved. (1) doesn't apply when evaluating lambda functions since all inputs have already been loaded. (2) doesn't apply to "apply input function" here because the partialResult is created from scratch in this function and therefore doesn't have any useful data outside of 'rows'.

Hope this helps.

CC: @bikramSingh91

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, @mbasmanova, this is very helpful.
No way any time soon I would try to use row_constructor() or anything like that to reproduce the issue.
I was thinking about if(), but that alone wouldn't reproduce.

spershin pushed a commit to spershin/velox-1 that referenced this pull request Aug 31, 2022
Summary:
Pull Request resolved: facebookincubator#2408

In Reduce() we ignored the fact that Final Selection might have been false before we set it to false ourselves.
In that case we need to use different SelectivityVector.

Differential Revision: D39121620

fbshipit-source-id: 8a1a715a2d7e4efe9cdc9980c58aeae3b172ab5d
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D39121620

Summary:
Pull Request resolved: facebookincubator#2408

In Reduce() we ignored the fact that Final Selection might have been false before we set it to false ourselves.
In that case we need to use different SelectivityVector.

Reviewed By: mbasmanova

Differential Revision: D39121620

fbshipit-source-id: 2a996f8d00f06e8741fef10564f792b2ea3b6e66
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D39121620

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. fb-exported
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants