Mixed shard repair reproducer #8435

Draft
wants to merge 3 commits into base: master

Deexie commented Aug 26, 2024

Reproducer for mixed-shard repair, used to choose the best solution for scylladb/scylladb#18269.

Sets up a 3-node cluster on AWS with 1 TB of data and runs repair.

It will be run through Jenkins with the following configurations:

Deexie requested review from asias and denesb on August 26, 2024 14:54
Deexie force-pushed the mixed-shard-repair branch 6 times, most recently from 05d9461 to 3060c15, on August 27, 2024 14:52

denesb commented Aug 29, 2024

I am not familiar with the SCT code, but the description looks good to me.
Did you get a chance to run the test? How do the numbers look?

Deexie force-pushed the mixed-shard-repair branch from 3060c15 to 7f38a55 on September 2, 2024 15:42
Deexie (Author) commented Sep 2, 2024

  • change instance type to i3.16xlarge

Deexie force-pushed the mixed-shard-repair branch from 7f38a55 to 9502ed2 on September 3, 2024 12:50
Deexie (Author) commented Sep 3, 2024

  • change shards count

Deexie force-pushed the mixed-shard-repair branch from 9502ed2 to d09d884 on September 4, 2024 07:21
Deexie (Author) commented Sep 4, 2024

  • change loaders instance
  • split data population

Deexie force-pushed the mixed-shard-repair branch 4 times, most recently from 4b62220 to 13c631d, on September 5, 2024 10:51
Deexie (Author) commented Sep 5, 2024

master-60-59-58
test duration: 1h 51min
repair time: 936.2321102619171 s (~15.6 min)
argus: https://argus.scylladb.com/test_runs?state=WyI4YTI3MGEyYS05OWE2LTQwYzYtODkxNS1lMzhiOGFiOGQ2OGMiXQ
non-LSA memory: [image]

Deexie (Author) commented Sep 6, 2024

master-60-60-60
test duration: 1h 24min
repair time: 553.5129013061523 s (~9.2 min)
[image]

Deexie (Author) commented Sep 6, 2024

poc1-60-59-58
test duration: 3h 2min
repair time: 5473.398208618164 s (~1.5 h)
[image]
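To put the three reported repair times side by side, the slowdown relative to the 60-60-60 run can be computed directly. This is a minimal sketch; the numbers are copied from the run reports in this thread, and treating the 60-60-60 run as the baseline is an assumption made only for illustration.

```python
# Repair times in seconds, copied from the run reports in this thread.
repair_seconds = {
    "master-60-60-60": 553.5129013061523,
    "master-60-59-58": 936.2321102619171,
    "poc1-60-59-58": 5473.398208618164,
}

# Assumption: the 60-60-60 run serves as the baseline for comparison.
baseline = repair_seconds["master-60-60-60"]


def slowdown(run: str) -> float:
    """Return how many times longer a run's repair took than the baseline."""
    return repair_seconds[run] / baseline


for run, seconds in repair_seconds.items():
    print(f"{run}: {seconds / 60:.1f} min, {slowdown(run):.2f}x baseline")
```

On these numbers, the 60-59-58 run on master takes roughly 1.7 times as long as the baseline, and the poc1 run takes close to 10 times as long.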

Deexie (Author) commented Sep 6, 2024

poc2-60-59-58
failed after: 7h 45min

02:09:38  error running operation: std::system_error (error system:104, recv: Connection reset by peer)
02:09:38  ----- LAST WARNING EVENT -----------------------------------------------------
02:09:38  2024-09-05 19:38:43.928 <2024-09-05 19:38:43.699>: (DatabaseLogEvent Severity.WARNING) period_type=one-time event_id=c571cbf1-5131-4da8-8563-3d5ed86ec7bb: type=WARNING regex=(^WARNING|!\s*?WARNING).*\[shard.*\] line_number=109442 node=ubuntu-mixed-sh-db-node-a7786369-2
02:09:38  2024-09-05T19:38:43.699+00:00 ubuntu-mixed-sh-db-node-a7786369-2  !WARNING | scylla[15842]:  [shard  0: gms] seastar_memory - oversized allocation: 1069056 bytes. This is non-fatal, but could lead to latency and/or fragmentation issues. Please report: at 0x6102d3e 0x6103350 0x6103658 0x5bafde2 0x5bb2585 0x45e9158 0x45e255a 0x5c0591f 0x5c06e9a 0x5c08077 0x5c07428 0x5b97593 0x5b968f3 0x13cf2f5 0x13d0cb0 0x13cd713 /opt/scylladb/libreloc/libc.so.6+0x2a087 /opt/scylladb/libreloc/libc.so.6+0x2a14a 0x13cad94
02:09:38  ----- LAST NORMAL EVENT ------------------------------------------------------
02:09:38  2024-09-05 19:38:10.473: (PrometheusAlertManagerEvent Severity.NORMAL) period_type=end event_id=d69143fc-2503-4eb1-b248-61a7a9171077 duration=1h35m59s: alert_name=InstanceDown node=10.4.1.235 start=2024-09-05T18:02:07.408Z end=2024-09-05T18:06:07.408Z description=10.4.1.235 has been down for more than 30 seconds. updated=2024-09-05T18:02:07.412Z state=active fingerprint=45469a7e312b47e8 labels={'alertname': 'InstanceDown', 'cluster': 'my-cluster', 'dc': 'eu-west-1', 'instance': '10.4.1.235', 'job': 'scylla', 'monitor': 'scylla-monitor', 'severity': '3'}
02:09:38  ================================================================================

decoded:

[Backtrace #0]
void seastar::backtrace<seastar::current_backtrace_tasklocal()::$_0>(seastar::current_backtrace_tasklocal()::$_0&&) at ./build/release/seastar/./seastar/include/seastar/util/backtrace.hh:68
 (inlined by) seastar::current_backtrace_tasklocal() at ./build/release/seastar/./build/release/seastar/./seastar/src/util/backtrace.cc:97
seastar::current_tasktrace() at ./build/release/seastar/./build/release/seastar/./seastar/src/util/backtrace.cc:148
seastar::current_backtrace() at ./build/release/seastar/./build/release/seastar/./seastar/src/util/backtrace.cc:181
seastar::memory::cpu_pages::warn_large_allocation(unsigned long) at ./build/release/seastar/./build/release/seastar/./seastar/src/core/memory.cc:849
 (inlined by) seastar::memory::cpu_pages::check_large_allocation(unsigned long) at ./build/release/seastar/./build/release/seastar/./seastar/src/core/memory.cc:912
 (inlined by) seastar::memory::cpu_pages::allocate_large(unsigned int, bool) at ./build/release/seastar/./build/release/seastar/./seastar/src/core/memory.cc:919
 (inlined by) seastar::memory::allocate_large(unsigned long, bool) at ./build/release/seastar/./build/release/seastar/./seastar/src/core/memory.cc:1542
 (inlined by) seastar::memory::allocate_slowpath(unsigned long) at ./build/release/seastar/./build/release/seastar/./seastar/src/core/memory.cc:1688
malloc at ./build/release/seastar/./build/release/seastar/./seastar/src/core/memory.cc:1707
service::raft_sys_table_storage::load_log() at ././seastar/include/seastar/core/sstring.hh:167
std::__n4861::coroutine_handle<seastar::internal::coroutine_traits_base<boost::container::deque<seastar::lw_shared_ptr<raft::log_entry const>, void, void> >::promise_type>::resume() const at /usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/coroutine:242
 (inlined by) seastar::internal::coroutine_traits_base<boost::container::deque<seastar::lw_shared_ptr<raft::log_entry const>, void, void> >::promise_type::run_and_dispose() at ././seastar/include/seastar/core/coroutine.hh:80
seastar::reactor::run_tasks(seastar::reactor::task_queue&) at ./build/release/seastar/./build/release/seastar/./seastar/src/core/reactor.cc:2577
seastar::reactor::run_some_tasks() at ./build/release/seastar/./build/release/seastar/./seastar/src/core/reactor.cc:3043
seastar::reactor::do_run() at ./build/release/seastar/./build/release/seastar/./seastar/src/core/reactor.cc:3211
seastar::reactor::run() at ./build/release/seastar/./build/release/seastar/./seastar/src/core/reactor.cc:3101
seastar::app_template::run_deprecated(int, char**, std::function<void ()>&&) at ./build/release/seastar/./build/release/seastar/./seastar/src/core/app-template.cc:276
seastar::app_template::run(int, char**, std::function<seastar::future<int> ()>&&) at ./build/release/seastar/./build/release/seastar/./seastar/src/core/app-template.cc:167
scylla_main(int, char**) at ././main.cc:700
std::function<int (int, char**)>::operator()(int, char**) const at /usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/bits/std_function.h:591
main at ././main.cc:2246
/data/scylla-s3-reloc.cache/by-build-id/f8ada775ee7b1210127d4237f218442ce59c3ae3/extracted/scylla/libreloc/libc.so.6: ELF 64-bit LSB shared object, x86-64, version 1 (GNU/Linux), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=8f53abaad945a669f2bdcd25f471d80e077568ef, for GNU/Linux 3.2.0, not stripped

__libc_start_call_main at ??:?
__libc_start_main_alias_2 at :?
_start at ??:?

Deexie force-pushed the mixed-shard-repair branch 2 times, most recently from 3f3986d to b3929c0, on September 13, 2024 16:28
Deexie force-pushed the mixed-shard-repair branch 2 times, most recently from 7fca2ff to a948bb9, on September 19, 2024 13:05
asias (Contributor) commented Sep 20, 2024

@Deexie How did you execute the new SCT test introduced in this PR? Did you run it through Jenkins? Could you share the details?

denesb commented Dec 12, 2024

@pehala please review.

pehala (Contributor) left a comment

I see we have the db_nodes_shards_selection option, which we use in asymmetrical test cases. Would this work here? Or could it be extended to work here, without so many changes in the core?

Deexie (Author) commented Dec 12, 2024

> I see we have the db_nodes_shards_selection option, which we use in asymmetrical test cases. Would this work here? Or could it be extended to work here, without so many changes in the core?

With this option, we can have different numbers of shards, but they are chosen randomly. Here, we can specify the exact number of shards per node. If we reuse db_nodes_shards_selection, then I think we still need to propagate the number of shards to each node.

Originally, this was implemented for AWS only. Maybe that is a good way to go.
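The difference between the two approaches can be sketched as follows. This is a hypothetical helper, not SCT code: the function name, its parameters, and the way the random counts are drawn are all assumptions made for illustration.

```python
import random
from typing import List, Optional, Sequence


def pick_shards(n_nodes: int,
                max_shards: int,
                explicit: Optional[Sequence[int]] = None,
                rng: Optional[random.Random] = None) -> List[int]:
    """Return one shard count per node (hypothetical helper, not SCT code).

    With `explicit`, the counts are used verbatim, one per node.
    Without it, each node gets a random count in [1, max_shards].
    """
    if explicit is not None:
        if len(explicit) != n_nodes:
            raise ValueError(
                f"expected {n_nodes} shard counts, got {len(explicit)}")
        return list(explicit)
    if rng is None:
        rng = random.Random()
    return [rng.randint(1, max_shards) for _ in range(n_nodes)]
```

With random selection the reproducer could not pin a layout such as `pick_shards(3, 60, explicit=[60, 59, 58])`, which is why the exact per-node counts have to reach each node in some form.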

roydahan (Contributor) commented

Please note that "features" tests are not triggered regularly (and especially not automatically).
They are a good way to test a specific feature quickly, but they are not part of the regression testing of releases.

Until now we have had the asymmetrical longevities to exercise this path, which is obviously not enough, since they did not detect the issue we had in the field.
I recommend either refactoring the current longevity or adding a new longevity that exercises this "feature".

denesb commented Dec 20, 2024

@Deexie what is the status of this PR?

denesb commented Jan 20, 2025

@Deexie what is the status here?

Deexie (Author) commented Jan 20, 2025

> @Deexie what is the status here?

I'm getting back to it. Currently, the PR contains a feature that enables setting the shard number for each server and the test that was used in the mixed shard issue. I do not see how to achieve what's tested here without the custom shard num feature, nor how to make it a regression test that runs periodically.

> I see we have the db_nodes_shards_selection option, which we use in asymmetrical test cases. Would this work here? Or could it be extended to work here, without so many changes in the core?

@pehala please see my response above (#8435 (comment)). Do you think that the change may get in as is? Does it need additional testing? Should I run it with each backend and check whether the number of cores is as specified?

I don't think we can go with db_nodes_shards_selection.

> Please note that "features" tests are not triggered regularly (and especially not automatically). They are a good way to test a specific feature quickly, but they are not part of the regression testing of releases.

> Until now we have had the asymmetrical longevities to exercise this path, which is obviously not enough, since they did not detect the issue we had in the field. I recommend either refactoring the current longevity or adding a new longevity that exercises this "feature".

@roydahan This test wasn't meant to run periodically. The bug was examined based on metrics and logs. I don't know how to convert this into a longevity test.

roydahan (Contributor) commented

> @roydahan This test wasn't meant to run periodically. The bug was examined based on metrics and logs. I don't know how to convert this into a longevity test.

Maybe one simple way is to change the configuration of the current "asymmetric" longevities to use "nodes_smp: [X, Y, Z]" instead of the current random selection, with the shard counts that we think will stress this feature the most.
You can do that by adding another configuration file like the one here https://github.com/scylladb/scylla-cluster-tests/blob/master/configurations/db-nodes-shards-random.yaml and switching some of the longevities that use it to your new config file.

roydahan requested a review from fruch on January 20, 2025 18:30
sdcm/cluster_aws.py (outdated, resolved)
sdcm/cluster.py (outdated, resolved)
sdcm/cluster_aws.py (outdated, resolved)
sdcm/sct_config.py (outdated, resolved)
@@ -499,6 +499,9 @@ class SCTConfiguration(dict):
In case of random option - Scylla will start with different (random) shards on every node of the cluster
"""),

dict(name="nodes_smp", env="SCT_NODES_SMP", type=list,
help="List of shard numbers of nodes in Scylla cluster; list of int, like [4, 5, 3]"),
Review comment (Contributor):

Suggested change
- help="List of shard numbers of nodes in Scylla cluster; list of int, like [4, 5, 3]"),
+ help="List of shard number to set per node in Scylla cluster; list of int, like [4, 5, 3]"),

I wonder how it would work with multi-dc cases:

region_name: 'eu-west-1 us-east-1'
n_db_nodes: '2 1'
nodes_smp: [12, 12, 15]

Deexie (Author) replied:

The number is based on node_index, and I think it does not depend on the DC.
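One way to picture an index-based lookup for the multi-DC example above is the sketch below. It is not the PR's implementation: the function name is hypothetical, and it assumes that node_index is a single cluster-wide counter that runs through the regions in the order they are listed.

```python
from typing import List, Tuple


def smp_by_node(region_name: str,
                n_db_nodes: str,
                nodes_smp: List[int]) -> List[Tuple[int, str, int]]:
    """Return (node_index, region, smp) for every DB node.

    Hypothetical helper. Assumes node_index counts nodes across all
    regions, in the order the regions appear in `region_name`.
    """
    regions = region_name.split()
    counts = [int(count) for count in n_db_nodes.split()]
    if len(regions) != len(counts):
        raise ValueError("need one node count per region")
    if sum(counts) != len(nodes_smp):
        raise ValueError("need one shard count per DB node")

    result = []
    node_index = 0
    for region, count in zip(regions, counts):
        for _ in range(count):
            # The shard count is picked by the global index only,
            # so the region does not influence the lookup.
            result.append((node_index, region, nodes_smp[node_index]))
            node_index += 1
    return result


for row in smp_by_node("eu-west-1 us-east-1", "2 1", [12, 12, 15]):
    print(row)
```

Under that assumption, the two eu-west-1 nodes would get 12 shards each and the single us-east-1 node would get 15.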

fruch added the labels backport/none (Backport is not required), test-provision-aws (Run provision test on AWS), test-provision-gce (Run provision test on GCE), and test-provision-docker on Jan 20, 2025
fruch previously approved these changes on Jan 20, 2025

fruch (Contributor) left a comment

LGTM

  • we might be able to give the configuration option a better name
  • default argument values shouldn't be mutable
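The second point presumably refers to Python's mutable default argument pitfall. A minimal sketch of the problem and the usual fix follows; the function and parameter names are made up for illustration and do not come from the PR.

```python
from typing import List, Optional


def add_node_buggy(smp: int, nodes_smp: List[int] = []) -> List[int]:
    # The default list is created once, when the function is defined,
    # so every call that omits `nodes_smp` appends to the same object.
    nodes_smp.append(smp)
    return nodes_smp


def add_node(smp: int, nodes_smp: Optional[List[int]] = None) -> List[int]:
    # A None sentinel gives each call its own fresh list.
    if nodes_smp is None:
        nodes_smp = []
    nodes_smp.append(smp)
    return nodes_smp
```

Calling `add_node_buggy(4)` and then `add_node_buggy(5)` returns a list that still contains the 4 from the first call, while `add_node` starts from an empty list each time.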

Deexie (Author) commented Jan 24, 2025

  • use None as a default param value
  • rename nodes_smp to smp_per_db_node_mapping
  • use str_or_list_or_eval type for smp_per_db_node_mapping
  • add pipelines with custom shard number for some tests that run with random shard num
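An option that accepts either a list or a string has to normalize both forms, since a value passed through an environment variable arrives as text. The sketch below shows one way to do that with `ast.literal_eval`; it is not SCT's `str_or_list_or_eval` implementation, and the function name and validation rules are assumptions.

```python
import ast
from typing import List, Union


def parse_smp_mapping(value: Union[str, List[int]]) -> List[int]:
    """Normalize a per-node shard mapping given as a list or as a string.

    Hypothetical parser, not SCT's str_or_list_or_eval.
    """
    if isinstance(value, str):
        # literal_eval only accepts Python literals, so a string such as
        # "[4, 5, 3]" is parsed without executing arbitrary code.
        value = ast.literal_eval(value)
    if not isinstance(value, (list, tuple)):
        raise ValueError(f"expected a list of ints, got {value!r}")
    smp = [int(item) for item in value]
    if any(count <= 0 for count in smp):
        raise ValueError("shard counts must be positive")
    return smp
```

Both `parse_smp_mapping("[4, 5, 3]")` and `parse_smp_mapping([4, 5, 3])` yield the same list, so the rest of the code only has to deal with one shape.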

Deexie force-pushed the mixed-shard-repair branch from 3ba3fec to 3d55f23 on January 24, 2025 13:33
Deexie (Author) commented Jan 24, 2025

  • modify smp_per_db_node_mapping description

scylladbbot commented
@Deexie new branch branch-2025.1 was added, please add backport label if needed

Add custom shard number config for Scylla clusters.
…es with custom shard number

Copy asymmetric Jenkins longevity pipelines and set a custom shard number for them.
Deexie force-pushed the mixed-shard-repair branch from 3d55f23 to 8482007 on January 28, 2025 15:17
Deexie (Author) commented Jan 28, 2025

  • drop excessive self arg

Labels
backport/none Backport is not required test-provision-aws Run provision test on AWS test-provision-docker test-provision-gce Run provision test on GCE
7 participants