[Experiment] what if reorder compare of columns in Merge #63780

UnamedRus · 2024-05-14T15:36:25Z

Changelog category (leave one):

Performance Improvement

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

When comparing rows during a merge, start with the columns that are most likely to be unequal.
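A minimal sketch of why the comparison order can matter, assuming rows sorted by a multi-column key, so that neighbouring rows usually share the leading key columns and differ only in a later one. This is an illustration, not the code from this PR; `Row` and `comparisonsUntilDecision` are hypothetical names.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical row representation: one integer per key column.
using Row = std::vector<int>;

// Counts how many column comparisons are needed to decide whether two rows
// are equal. Columns are visited front-to-back, or back-to-front when
// `reverse` is true, and the loop stops at the first mismatch.
// Assumes lhs and rhs have the same number of columns.
std::size_t comparisonsUntilDecision(const Row & lhs, const Row & rhs, bool reverse)
{
    const std::size_t size = lhs.size();
    std::size_t comparisons = 0;
    for (std::size_t i = 0; i < size; ++i)
    {
        const std::size_t col = reverse ? size - 1 - i : i;
        ++comparisons;
        if (lhs[col] != rhs[col])
            break;
    }
    return comparisons;
}
```

For the rows `{1, 1, 1, 7}` and `{1, 1, 1, 8}`, the front-to-back order needs 4 comparisons to find the difference, while the back-to-front order needs 1. For fully equal rows both orders need all 4, so the reordering only pays off when unequal rows are common.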


robot-ch-test-poll4 · 2024-05-14T22:24:03Z

This is an automated comment for commit c8af4a0 with description of existing statuses. It's updated for the latest CI running

❌ Click here to open a full report in a separate page

Check name	Description	Status
Integration tests	The integration tests report. In parenthesis the package type is given, and in square brackets are the optional part/total tests	❌ failure
Stateless tests	Runs stateless functional tests for ClickHouse binaries built in various configurations -- release, debug, with sanitizers, etc	❌ failure

Successful checks

Check name	Description	Status
AST fuzzer	Runs randomly generated queries to catch program errors. The build type is optionally given in parenthesis. If it fails, ask a maintainer for help	✅ success
Builds	There's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS	✅ success
ClickBench	Runs [ClickBench](https://github.com/ClickHouse/ClickBench/) with instant-attach table	✅ success
Compatibility check	Checks that clickhouse binary runs on distributions with old libc versions. If it fails, ask a maintainer for help	✅ success
Docker keeper image	The check to build and optionally push the mentioned image to docker hub	✅ success
Docker server image	The check to build and optionally push the mentioned image to docker hub	✅ success
Docs check	Builds and tests the documentation	✅ success
Fast test	Normally this is the first check that is ran for a PR. It builds ClickHouse and runs most of stateless functional tests, omitting some. If it fails, further checks are not started until it is fixed. Look at the report to see which tests fail, then reproduce the failure locally as described here	✅ success
Flaky tests	Checks if new added or modified tests are flaky by running them repeatedly, in parallel, with more randomization. Functional tests are run 100 times with address sanitizer, and additional randomization of thread scheduling. Integration tests are run up to 10 times. If at least once a new test has failed, or was too long, this check will be red. We don't allow flaky tests, read the doc	✅ success
Install packages	Checks that the built packages are installable in a clear environment	✅ success
Performance Comparison	Measure changes in query performance. The performance test report is described in detail here. In square brackets are the optional part/total tests	✅ success
Stateful tests	Runs stateful functional tests for ClickHouse binaries built in various configurations -- release, debug, with sanitizers, etc	✅ success
Stress test	Runs stateless functional tests concurrently from several clients to detect concurrency-related errors	✅ success
Style check	Runs a set of checks to keep the code style clean. If some of tests failed, see the related log from the report	✅ success
Unit tests	Runs the unit tests for different release types	✅ success
Upgrade check	Runs stress tests on server version from last release and then tries to upgrade it to the version from the PR. It checks if the new server can successfully startup without any errors, crashes or sanitizer asserts	✅ success

alexey-milovidov · 2024-09-26T05:29:23Z

Did it help?

UnamedRus · 2024-10-23T23:03:08Z

Did it help?

It doesn't seem so.
But maybe I just did a poor job of building a good test for it.

alexey-milovidov · 2024-10-23T23:30:41Z

We need at least one proof; even a synthetic one will be OK.

UnamedRus · 2024-12-14T14:10:00Z

Test with LC(String), i.e. LowCardinality(String):

New 
SELECT count()
FROM test_reorder_columns_in_merge
FINAL
SETTINGS max_threads = 1

   ┌──count()─┐
1. │ 60000000 │ -- 60.00 million
   └──────────┘

1 row in set. Elapsed: 3.120 sec. Processed 59.91 million rows, 479.32 MB (19.20 million rows/s., 153.63 MB/s.)
Peak memory usage: 4.06 MiB.

Old
SELECT count()
FROM test_reorder_columns_in_merge
FINAL
SETTINGS max_threads = 1

   ┌──count()─┐
1. │ 60000000 │ -- 60.00 million
   └──────────┘

1 row in set. Elapsed: 4.134 sec. Processed 59.91 million rows, 479.32 MB (14.49 million rows/s., 115.94 MB/s.)
Peak memory usage: 4.06 MiB.
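As a quick check of the two runs above, using only the reported elapsed times:

```latex
\[
\frac{t_{\text{old}}}{t_{\text{new}}} = \frac{4.134\ \text{s}}{3.120\ \text{s}} \approx 1.325,
\qquad
\frac{t_{\text{old}} - t_{\text{new}}}{t_{\text{old}}} = \frac{1.014\ \text{s}}{4.134\ \text{s}} \approx 24.5\%
\]
```

On this single-threaded FINAL query the new build is therefore about 1.3 times faster, which matches the ratio of the reported throughputs (19.20 / 14.49 ≈ 1.325 million rows/s).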

alexey-milovidov · 2024-12-15T05:33:28Z

src/Processors/Merges/Algorithms/RowRef.h

+    {
+        for (size_t i = 0; i < size; ++i)
+        {
+            auto col_number = (offset + i) % size;


I'm afraid of this division: change it to something like an if with a reset to zero.

UnamedRus · 2024-12-15

I'm not sure that a simple reset to 0 will work.

Something like the following could, but I'm not sure that it's better (and size_t cannot be negative, so it would need a different type):

if ((offset + i) == size) offset -= size;
auto col_number = (offset + i);
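One way to avoid the division discussed in this thread is to keep a running column index and wrap it with a comparison instead of computing `(offset + i) % size` on every iteration. The sketch below is a hypothetical illustration, not the code that was merged; `Column`, `rotatedOrder`, and `rowsEqualRotated` are made-up names, and it assumes `offset < size` whenever there is at least one column.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical column representation: one integer value per row.
using Column = std::vector<int>;

// Returns the order in which columns are visited:
// offset, offset + 1, ..., size - 1, 0, ..., offset - 1.
// Precondition: offset < size (or size == 0).
std::vector<std::size_t> rotatedOrder(std::size_t size, std::size_t offset)
{
    std::vector<std::size_t> order;
    order.reserve(size);
    std::size_t col = offset;
    for (std::size_t i = 0; i < size; ++i)
    {
        order.push_back(col);
        ++col;
        if (col == size) // wrap around with a compare instead of a modulo
            col = 0;
    }
    return order;
}

// Returns true if the two rows are equal in every column, visiting the
// columns in the rotated order above and stopping at the first mismatch.
// With zero columns the loop body never runs and the rows count as equal.
bool rowsEqualRotated(const std::vector<Column> & columns,
                      std::size_t lhs_row, std::size_t rhs_row, std::size_t offset)
{
    const std::size_t size = columns.size();
    std::size_t col = offset;
    for (std::size_t i = 0; i < size; ++i)
    {
        if (columns[col][lhs_row] != columns[col][rhs_row])
            return false;
        ++col;
        if (col == size)
            col = 0;
    }
    return true;
}
```

With `size = 4` and `offset = 2`, the columns are visited in the order 2, 3, 0, 1, the same sequence that `(offset + i) % size` would produce. Because the index stays within `[0, size)` and is only ever reset to zero, `size_t` is sufficient and no signed type is needed.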

UnamedRus added 2 commits May 14, 2024 15:32

little experiment

3987ab5

fix bug

bb5fde6

nikitamikhaylov added the "can be tested" label (Allows running workflows for external contributors) May 14, 2024

robot-ch-test-poll4 added the "pr-performance" label (Pull request with some performance improvements) May 14, 2024

UnamedRus added 4 commits May 15, 2024 17:56

Merge branch 'ClickHouse:master' into reorder-compare-columns-merge

1a7cbb2

newlines

ed0ad71

simplify

7f5d821

fix copypaste

1382b4e

alexey-milovidov added the "close in a month if not active" label (This will be closed in case of no information) Sep 26, 2024

UnamedRus added 6 commits December 12, 2024 17:23

commit test

22f8a57

Merge branch 'master' into reorder-compare-columns-merge

c53f426

fix style

f6d9eb3

Fix test

08a9392

fix test 2

125aa67

better test

657e25e

alexey-milovidov reviewed Dec 15, 2024


UnamedRus added 3 commits December 15, 2024 16:36

get rid of div

4221411

other approach

eee09d3

fix zero key columns tables

c8af4a0

UnamedRus requested a review from alexey-milovidov December 16, 2024 10:32

alexey-milovidov self-assigned this Dec 29, 2024

alexey-milovidov added this pull request to the merge queue Dec 29, 2024

Merged via the queue into ClickHouse:master with commit b0d8549 Dec 29, 2024
220 of 225 checks passed

robot-ch-test-poll2 added the "pr-synced-to-cloud" label (The PR is synced to the cloud repo) Dec 29, 2024
