
Encountering Transaction Deadline Exceeded Errors while benchmarking TPC-C #26529

Closed
tim-o opened this issue Jun 7, 2018 · 39 comments
Labels
A-kv-client Relating to the KV client and the KV interface. C-investigation Further steps needed to qualify. C-label will change. O-support Would prevent or help troubleshoot a customer escalation - bugs, missing observability/tooling, docs S-2-temp-unavailability Temp crashes or other availability problems. Can be worked around or resolved by restarting.

Comments

@tim-o
Contributor

tim-o commented Jun 7, 2018

Reported by @HeikoOnnebrink. Related to #18684, though it might not be identical, since that error is expected to be infrequent and this one can be reproduced.

Heiko is attempting to reproduce our TPC-C benchmark using the 1000 warehouse test on a 3 node cluster. Each node has 16 cores, and the cluster is behind HAProxy.

After roughly 1 hour, the cluster becomes unstable and queries begin failing with transaction deadline exceeded errors.

Heiko, could you provide us with logs from your most recent test? Did you notice anything unusual in the admin UI prior to the start of the errors (e.g., steadily increasing SQL execution latency), or is the onset sudden?

@tim-o tim-o added C-investigation Further steps needed to qualify. C-label will change. S-2-temp-unavailability Temp crashes or other availability problems. Can be worked around or resolved by restarting. labels Jun 7, 2018
@HeikoOnnebrink

I executed 2 test runs; before each run I restarted the cluster to keep the logs short.
The logs from all three nodes have been uploaded to
https://bitbucket.org/honnebrink/public-file-share/src
and can be found in the two subfolders run1 ... and run2 ...

@HeikoOnnebrink

run1

@HeikoOnnebrink

UI from 1st run

@HeikoOnnebrink

run2

@HeikoOnnebrink

UI from 2nd run

@HeikoOnnebrink

The cluster itself is stable and keeps running... the problem just manifests as termination of the workload program with a deadline exceeded error. Let me know if you want a run with the --ignore option and fresh logs.

@jordanlewis
Member

Hi @HeikoOnnebrink,

Are you using our workload TPC-C load generator, or something else? What's your HAProxy config? What are your machine specs, specifically the disks and CPUs? Did you use any particular mount options?

Thanks,
Jordan

@HeikoOnnebrink

Hi @jordanlewis

I am using your latest workload generator.

The haproxy.cfg is generated by Cockroach. I just had to increase the client and server timeouts, as the 1 min default caused HAProxy to close sockets before Cockroach completed requests.
I also added a stats section to the config for UI access.
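For reference, a minimal sketch of the kind of adjustments described above (the 10m timeout value and the :8404 stats port are illustrative assumptions, not the exact settings used here):

```
# Bump the generated 1m client/server timeouts so HAProxy does not close
# sockets on long-running requests (10m is an arbitrary example value).
sed -i -E 's/timeout (client|server)[[:space:]]+1m/timeout \1      10m/' haproxy.cfg

# Append a stats listener for UI access (port and URI are placeholders).
cat >> haproxy.cfg <<'EOF'

listen stats
    bind :8404
    mode http
    stats enable
    stats uri /stats
    stats refresh 10s
EOF
```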

I do not know the exact hardware vendor; I assume Huawei with Intel CPUs.
Machines are provisioned on an on-premise OpenStack cloud.

The flavour of each machine is 16 cores, 60 GB RAM, and 640 GB of local SSD storage accessed via a paravirtualised driver. The OS is the latest CoreOS stable, and Cockroach runs inside a Docker container.
Docker network mode is host networking. (Just to be safe, I already reproduced the reported problems without Docker, running the binary straight on CoreOS; I had to use 2.0.2, as 2.0.0 does not work on CoreOS due to shared library problems.)

@tim-o
Contributor Author

tim-o commented Jun 11, 2018

Hey @HeikoOnnebrink - thanks for your patience here. I ran a couple tests to see if I could reproduce your issues:

  1. Latest stable build, v2.0.2, without HAproxy, following our whitepaper.
  2. Latest alpha, with HAproxy, balancing round robin (the default configuration).

In each case, I ran the workload for 90 minutes. The results were:

Case 1:

```
_elapsed___errors_____ops(total)___ops/sec(cum)__avg(ms)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)__total
 5400.1s        0         115582           21.4    122.5    113.2    209.7    285.2   1275.1  delivery
 5400.1s        0        1154917          213.9     94.6     88.1    176.2    243.3    838.9  newOrder
 5400.1s        0         115689           21.4     16.4     14.7     31.5     48.2    285.2  orderStatus
 5400.1s        0        1156504          214.2     51.8     44.0    104.9    159.4    906.0  payment
 5400.1s        0         115563           21.4     70.5     67.1    125.8    167.8    402.7  stockLevel

_elapsed___errors_____ops(total)___ops/sec(cum)__avg(ms)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)__result
 5400.1s        0        2658255          492.3     72.7     65.0    159.4    218.1   1275.1
```

Case 2:

```
_elapsed___errors_____ops(total)___ops/sec(cum)__avg(ms)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)__total
 5400.1s        0         115444           21.4    153.7    151.0    234.9    302.0    973.1  delivery
 5400.1s        0        1153656          213.6    114.6    109.1    192.9    243.3    838.9  newOrder
 5400.1s        0         115525           21.4     18.7     16.8     37.7     52.4    226.5  orderStatus
 5400.1s        0        1155435          214.0     65.6     60.8    121.6    159.4    671.1  payment
 5400.1s        0         115445           21.4     89.4     88.1    151.0    192.9    436.2  stockLevel

_elapsed___errors_____ops(total)___ops/sec(cum)__avg(ms)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)__result
 5400.1s        0        2655505          491.8     89.7     83.9    176.2    226.5    973.1
```

What jumps out at me is that the p99 latency in your tests is so high and so consistent across nodes, starting from the beginning of the workload - it's 40x higher than the p99 in my tests, and more than 12x the max. You have more RAM available per box and the same amount of CPU, so I doubt those are the issue.

Typically, high latency comes down to CPU, network, or (rarely) storage speed; there might be some other gotchas that I'm not yet aware of. Let me talk to some folks and see if anyone else has run into similar problems.
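For a quick first pass along those axes, something like the following can be run on each node (a sketch; tool availability varies, and other-node.internal is a placeholder hostname):

```
# CPU: look for sustained saturation or high iowait.
top -b -n 1 | head -20

# Disk: per-device latency and utilization (requires sysstat).
iostat -dx 5 3

# Network: round-trip latency between nodes.
ping -c 10 other-node.internal
```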

@tim-o
Contributor Author

tim-o commented Jun 11, 2018

One final thing I should mention: I was running HAproxy and workload on a separate machine, not on the cluster itself. Here's the haproxy.cfg:


```
global
  maxconn 4096

defaults
    mode                tcp
    # Timeout values should be configured for your specific use.
    # See: https://cbonte.github.io/haproxy-dconv/1.8/configuration.html#4-timeout%20connect
    timeout connect     10s
    timeout client      1m
    timeout server      1m
    # TCP keep-alive on client side. Server already enables them.
    option              clitcpka

listen psql
    bind :26257
    mode tcp
    balance roundrobin
    option httpchk GET /health?ready=1
    server cockroach1 tim-o-tpcc-0001:26257 check port 26258
    server cockroach2 tim-o-tpcc-0002:26257 check port 26258
    server cockroach3 tim-o-tpcc-0003:26257 check port 26258
```

Here are the options I used to execute workload:
`./workload.LATEST run tpcc --ramp=30s --warehouses=1000 --duration=300s "postgresql://root@localhost:26257/tpcc?sslmode=disable"`

@a-robinson
Contributor

I've also seen transaction deadline exceeded errors from our workload tpcc implementation. They happen when the cockroach cluster isn't even close to being able to keep up with the load, which you can also tell from the very high query latencies.

Given that you're only running with 1000 warehouses on 3 nodes, I'm with @tim-o on being suspicious about the hardware or system configuration. I don't have any particular suggestions, though.

@HeikoOnnebrink

HAProxy and workload run on separate dedicated machines. The HAProxy config is the same as yours (both seem to be generated by Cockroach). The machines are on a production OpenStack cloud next to tons of other machines where we do not get complaints .. maybe our workload is a bit closer to the edge ..

Right now I am building up a fresh 30 node cluster with the same 16 core node type and local SSD on our latest OpenStack cloud. It's not yet open for user access, so I am alone on it and can rule out the impact of (overprovisioned) concurrent machines. I will update once I have some stats .. first I need to populate 2.2 TB of data for the 10,000 warehouse test.

@HeikoOnnebrink

Here is the next feedback from the 30 node cluster test.

Again I used 16 core nodes with the same OS/spec as above.

I kicked off a workload --init last Friday to load the 10,000 warehouse data
(intentionally, I used init to stress-test bulk inserts).

./workload run tpcc "postgres://root@10.98.144.72:26257/tpcc?sslmode=verify-full&sslrootcert=/home/core/cert/ca.crt&sslcert=/home/core/cert/client.root.crt&sslkey=/home/core/cert/client.root.key" --warehouses=10000 --drop --init

What I observed is that the logs filled up with hundreds of thousands of these errors (several errors per second):

W180612 07:04:38.575076 604 sql/schema_changer.go:1146 [n1] Error executing schema change: not hit TTL deadline: -19h3m43.114472342s remaining
I180612 07:04:38.677904 46737055 storage/replica_raftstorage.go:520 [raftsnapshot,n1,s1,r30184/4:/Table/112/1/653{8/78…-9/75…}] generated Raft snapshot 365770e9 at index 1263
E180612 07:04:38.678792 46737055 storage/queue.go:778 [raftsnapshot,n1,s1,r30184/4:/Table/112/1/653{8/78…-9/75…}] snapshot failed: (n2,s2):2: remote couldn't accept snapshot with error: [n2,s2],r30184: cannot apply snapshot: snapshot intersects existing range [n2,s2,r29994/2:/Table/112/1/6{529/1…-875/1…}]
I180612 07:04:38.882016 46737097 storage/replica_raftstorage.go:520 [raftsnapshot,n1,s1,r30064/4:/Table/112/1/145{7/27…-8/27…}] generated Raft snapshot d2523f84 at index 1265
E180612 07:04:38.882921 46737097 storage/queue.go:778 [raftsnapshot,n1,s1,r30064/4:/Table/112/1/145{7/27…-8/27…}] snapshot failed: (n2,s2):2: remote couldn't accept snapshot with error: [n2,s2],r30064: cannot apply snapshot: snapshot intersects existing range [n2,s2,r25755/2:/Table/112/1/1{305/8…-875/3…}]
I180612 07:04:38.999171 46736896 storage/replica_raftstorage.go:520 [raftsnapshot,n1,s1,r33380/4:/Table/112/1/802{5/31…-6/31…}] generated Raft snapshot f19145d7 at index 1229
E180612 07:04:39.000062 46736896 storage/queue.go:778 [raftsnapshot,n1,s1,r33380/4:/Table/112/1/802{5/31…-6/31…}] snapshot failed: (n6,s6):3: remote couldn't accept snapshot with error: [n6,s6],r33380: cannot apply snapshot: snapshot intersects existing range [n6,s6,r27762/3:/Table/112/1/{7693/5…-8125/5…}]
I180612 07:04:39.678102 46737069 storage/replica_raftstorage.go:520 [raftsnapshot,n1,s1,r30184/4:/Table/112/1/653{8/78…-9/75…}] generated Raft snapshot 373525ae at index 1263
E180612 07:04:39.679128 46737069 storage/queue.go:778 [raftsnapshot,n1,s1,r30184/4:/Table/112/1/653{8/78…-9/75…}] snapshot failed: (n2,s2):2: remote couldn't accept snapshot with error: [n2,s2],r30184: cannot apply snapshot: snapshot intersects existing range [n2,s2,r29994/2:/Table/112/1/6{529/1…-875/1…}]
I180612 07:04:39.878882 46737181 storage/replica_raftstorage.go:520 [raftsnapshot,n1,s1,r30064/4:/Table/112/1/145{7/27…-8/27…}] generated Raft snapshot 505f1393 at index 1265
E180612 07:04:39.880011 46737181 storage/queue.go:778 [raftsnapshot,n1,s1,r30064/4:/Table/112/1/145{7/27…-8/27…}] snapshot failed: (n2,s2):2: remote couldn't accept snapshot with error: [n2,s2],r30064: cannot apply snapshot: snapshot intersects existing range [n2,s2,r25755/2:/Table/112/1/1{305/8…-875/3…}]
I180612 07:04:39.998666 46737079 storage/replica_raftstorage.go:520 [raftsnapshot,n1,s1,r33380/4:/Table/112/1/802{5/31…-6/31…}] generated Raft snapshot 76222b68 at index 1229
E180612 07:04:39.999607 46737079 storage/queue.go:778 [raftsnapshot,n1,s1,r33380/4:/Table/112/1/802{5/31…-6/31…}] snapshot failed: (n6,s6):3: remote couldn't accept snapshot with error: [n6,s6],r33380: cannot apply snapshot: snapshot intersects existing range [n6,s6,r27762/3:/Table/112/1/{7693/5…-8125/5…}]
W180612 07:04:40.507916 604 sql/schema_changer.go:1146 [n1] Error executing schema change: not hit TTL deadline: -19h3m41.03485009s remaining
I180612 07:04:40.678128 46737118 storage/replica_raftstorage.go:520 [raftsnapshot,n1,s1,r30184/4:/Table/112/1/653{8/78…-9/75…}] generated Raft snapshot 33a4d19e at index 1263
E180612 07:04:40.679056 46737118 storage/queue.go:778 [raftsnapshot,n1,s1,r30184/4:/Table/112/1/653{8/78…-9/75…}] snapshot failed: (n2,s2):2: remote couldn't accept snapshot with error: [n2,s2],r30184: cannot apply snapshot: snapshot intersects existing range [n2,s2,r29994/2:/Table/112/1/6{529/1…-875/1…}]
I180612 07:04:40.877842 46737270 storage/replica_raftstorage.go:520 [raftsnapshot,n1,s1,r30064/4:/Table/112/1/145{7/27…-8/27…}] generated Raft snapshot 6d94f66a at index 1265
E180612 07:04:40.878907 46737270 storage/queue.go:778 [raftsnapshot,n1,s1,r30064/4:/Table/112/1/145{7/27…-8/27…}] snapshot failed: (n2,s2):2: remote couldn't accept snapshot with error: [n2,s2],r30064: cannot apply snapshot: snapshot intersects existing range [n2,s2,r25755/2:/Table/112/1/1{305/8…-875/3…}]
I180612 07:04:41.001118 46737283 storage/replica_raftstorage.go:520 [raftsnapshot,n1,s1,r33380/4:/Table/112/1/802{5/31…-6/31…}] generated Raft snapshot 243a5dfb at index 1229
E180612 07:04:41.002066 46737283 storage/queue.go:778 [raftsnapshot,n1,s1,r33380/4:/Table/112/1/802{5/31…-6/31…}] snapshot failed: (n6,s6):3: remote couldn't accept snapshot with error: [n6,s6],r33380: cannot apply snapshot: snapshot intersects existing range [n6,s6,r27762/3:/Table/112/1/{7693/5…-8125/5…}]
I180612 07:04:41.678194 46737212 storage/replica_raftstorage.go:520 [raftsnapshot,n1,s1,r30184/4:/Table/112/1/653{8/78…-9/75…}] generated Raft snapshot cb81281d at index 1263
E180612 07:04:41.678996 46737212 storage/queue.go:778 [raftsnapshot,n1,s1,r30184/4:/Table/112/1/653{8/78…-9/75…}] snapshot failed: (n2,s2):2: remote couldn't accept snapshot with error: [n2,s2],r30184: cannot apply snapshot: snapshot intersects existing range [n2,s2,r29994/2:/Table/112/1/6{529/1…-875/1…}]
I180612 07:04:41.878265 46737323 storage/replica_raftstorage.go:520 [raftsnapshot,n1,s1,r30064/4:/Table/112/1/145{7/27…-8/27…}] generated Raft snapshot 7adc78a1 at index 1265
E180612 07:04:41.879534 46737323 storage/queue.go:778 [raftsnapshot,n1,s1,r30064/4:/Table/112/1/145{7/27…-8/27…}] snapshot failed: (n2,s2):2: remote couldn't accept snapshot with error: [n2,s2],r30064: cannot apply snapshot: snapshot intersects existing range [n2,s2,r25755/2:/Table/112/1/1{305/8…-875/3…}]
I180612 07:04:42.099890 46737364 storage/replica_raftstorage.go:520 [raftsnapshot,n1,s1,r33380/4:/Table/112/1/802{5/31…-6/31…}] generated Raft snapshot a5ffc4b6 at index 1229
E180612 07:04:42.101134 46737364 storage/queue.go:778 [raftsnapshot,n1,s1,r33380/4:/Table/112/1/802{5/31…-6/31…}] snapshot failed: (n6,s6):3: remote couldn't accept snapshot with error: [n6,s6],r33380: cannot apply snapshot: snapshot intersects existing range [n6,s6,r27762/3:/Table/112/1/{7693/5…-8125/5…}]
I180612 07:04:42.678129 46737492 storage/replica_raftstorage.go:520 [raftsnapshot,n1,s1,r30184/4:/Table/112/1/653{8/78…-9/75…}] generated Raft snapshot a06b092e at index 1263
E180612 07:04:42.679049 46737492 storage/queue.go:778 [raftsnapshot,n1,s1,r30184/4:/Table/112/1/653{8/78…-9/75…}] snapshot failed: (n2,s2):2: remote couldn't accept snapshot with error: [n2,s2],r30184: cannot apply snapshot: snapshot intersects existing range [n2,s2,r29994/2:/Table/112/1/6{529/1…-875/1…}]
I180612 07:04:42.858063 601 server/status/runtime.go:219 [n1] runtime stats: 21 GiB RSS, 592 goroutines, 285 MiB/84 MiB/485 MiB GO alloc/idle/total, 15 GiB/19 GiB CGO alloc/total, 809.40cgo/sec, 0.45/0.07 %(u/s)time, 0.00 %gc (4x)
I180612 07:04:42.878040 46737293 storage/replica_raftstorage.go:520 [raftsnapshot,n1,s1,r30064/4:/Table/112/1/145{7/27…-8/27…}] generated Raft snapshot dee5fa03 at index 1265
E180612 07:04:42.879094 46737293 storage/queue.go:778 [raftsnapshot,n1,s1,r30064/4:/Table/112/1/145{7/27…-8/27…}] snapshot failed: (n2,s2):2: remote couldn't accept snapshot with error: [n2,s2],r30064: cannot apply snapshot: snapshot intersects existing range [n2,s2,r25755/2:/Table/112/1/1{305/8…-875/3…}]
I180612 07:04:42.999418 46737824 storage/replica_raftstorage.go:520 [raftsnapshot,n1,s1,r33380/4:/Table/112/1/802{5/31…-6/31…}] generated Raft snapshot 54e17bb3 at index 1229
E180612 07:04:43.003105 46737824 storage/queue.go:778 [raftsnapshot,n1,s1,r33380/4:/Table/112/1/802{5/31…-6/31…}] snapshot failed: (n6,s6):3: remote couldn't accept snapshot with error: [n6,s6],r33380: cannot apply snapshot: snapshot intersects existing range [n6,s6,r27762/3:/Table/112/1/{7693/5…-8125/5…}]
I180612 07:04:43.678313 46737457 storage/replica_raftstorage.go:520 [raftsnapshot,n1,s1,r30184/4:/Table/112/1/653{8/78…-9/75…}] generated Raft snapshot 0063a116 at index 1263
E180612 07:04:43.679275 46737457 storage/queue.go:778 [raftsnapshot,n1,s1,r30184/4:/Table/112/1/653{8/78…-9/75…}] snapshot failed: (n2,s2):2: remote couldn't accept snapshot with error: [n2,s2],r30184: cannot apply snapshot: snapshot intersects existing range [n2,s2,r29994/2:/Table/112/1/6{529/1…-875/1…}]
I180612 07:04:43.878478 46738134 storage/replica_raftstorage.go:520 [raftsnapshot,n1,s1,r30064/4:/Table/112/1/145{7/27…-8/27…}] generated Raft snapshot 30194968 at index 1265
E180612 07:04:43.879776 46738134 storage/queue.go:778 [raftsnapshot,n1,s1,r30064/4:/Table/112/1/145{7/27…-8/27…}] snapshot failed: (n2,s2):2: remote couldn't accept snapshot with error: [n2,s2],r30064: cannot apply snapshot: snapshot intersects existing range [n2,s2,r25755/2:/Table/112/1/1{305/8…-875/3…}]
I180612 07:04:43.999838 46737712 storage/replica_raftstorage.go:520 [raftsnapshot,n1,s1,r33380/4:/Table/112/1/802{5/31…-6/31…}] generated Raft snapshot 6528d9ca at index 1229
E180612 07:04:44.002453 46737712 storage/queue.go:778 [raftsnapshot,n1,s1,r33380/4:/Table/112/1/802{5/31…-6/31…}] snapshot failed: (n6,s6):3: remote couldn't accept snapshot with error: [n6,s6],r33380: cannot apply snapshot: snapshot intersects existing range [n6,s6,r27762/3:/Table/112/1/{7693/5…-8125/5…}]
I180612 07:04:44.683127 46738166 storage/replica_raftstorage.go:520 [raftsnapshot,n1,s1,r30184/4:/Table/112/1/653{8/78…-9/75…}] generated Raft snapshot 780a8ba0 at index 1263
E180612 07:04:44.685558 46738166 storage/queue.go:778 [raftsnapshot,n1,s1,r30184/4:/Table/112/1/653{8/78…-9/75…}] snapshot failed: (n2,s2):2: remote couldn't accept snapshot with error: [n2,s2],r30184: cannot apply snapshot: snapshot intersects existing range [n2,s2,r29994/2:/Table/112/1/6{529/1…-875/1…}]
I180612 07:04:44.878357 46738234 storage/replica_raftstorage.go:520 [raftsnapshot,n1,s1,r30064/4:/Table/112/1/145{7/27…-8/27…}] generated Raft snapshot fb432aae at index 1265
E180612 07:04:44.879472 46738234 storage/queue.go:778 [raftsnapshot,n1,s1,r30064/4:/Table/112/1/145{7/27…-8/27…}] snapshot failed: (n2,s2):2: remote couldn't accept snapshot with error: [n2,s2],r30064: cannot apply snapshot: snapshot intersects existing range [n2,s2,r25755/2:/Table/112/1/1{305/8…-875/3…}]
I180612 07:04:44.998818 46738094 storage/replica_raftstorage.go:520 [raftsnapshot,n1,s1,r33380/4:/Table/112/1/802{5/31…-6/31…}] generated Raft snapshot 78f2636b at index 1229
E180612 07:04:44.999835 46738094 storage/queue.go:778 [raftsnapshot,n1,s1,r33380/4:/Table/112/1/802{5/31…-6/31…}] snapshot failed: (n6,s6):3: remote couldn't accept snapshot with error: [n6,s6],r33380: cannot apply snapshot: snapshot intersects existing range [n6,s6,r27762/3:/Table/112/1/{7693/5…-8125/5…}]
I180612 07:04:45.679404 46738201 storage/replica_raftstorage.go:520 [raftsnapshot,n1,s1,r30184/4:/Table/112/1/653{8/78…-9/75…}] generated Raft snapshot 46bb2285 at index 1263
E180612 07:04:45.680529 46738201 storage/queue.go:778 [raftsnapshot,n1,s1,r30184/4:/Table/112/1/653{8/78…-9/75…}] snapshot failed: (n2,s2):2: remote couldn't accept snapshot with error: [n2,s2],r30184: cannot apply snapshot: snapshot intersects existing range [n2,s2,r29994/2:/Table/112/1/6{529/1…-875/1…}]
I180612 07:04:45.879971 46738139 storage/replica_raftstorage.go:520 [raftsnapshot,n1,s1,r30064/4:/Table/112/1/145{7/27…-8/27…}] generated Raft snapshot 975dd660 at index 1265
E180612 07:04:45.880978 46738139 storage/queue.go:778 [raftsnapshot,n1,s1,r30064/4:/Table/112/1/145{7/27…-8/27…}] snapshot failed: (n2,s2):2: remote couldn't accept snapshot with error: [n2,s2],r30064: cannot apply snapshot: snapshot intersects existing range [n2,s2,r25755/2:/Table/112/1/1{305/8…-875/3…}]
I180612 07:04:46.002847 46738238 storage/replica_raftstorage.go:520 [raftsnapshot,n1,s1,r33380/4:/Table/112/1/802{5/31…-6/31…}] generated Raft snapshot b5294da5 at index 1229
E180612 07:04:46.004706 46738238 storage/queue.go:778 [raftsnapshot,n1,s1,r33380/4:/Table/112/1/802{5/31…-6/31…}] snapshot failed: (n6,s6):3: remote couldn't accept snapshot with error: [n6,s6],r33380: cannot apply snapshot: snapshot intersects existing range [n6,s6,r27762/3:/Table/112/1/{7693/5…-8125/5…}]
I180612 07:04:46.679994 46738342 storage/replica_raftstorage.go:520 [raftsnapshot,n1,s1,r30184/4:/Table/112/1/653{8/78…-9/75…}] generated Raft snapshot 9503181d at index 1263
E180612 07:04:46.681433 46738342 storage/queue.go:778 [raftsnapshot,n1,s1,r30184/4:/Table/112/1/653{8/78…-9/75…}] snapshot failed: (n2,s2):2: remote couldn't accept snapshot with error: [n2,s2],r30184: cannot apply snapshot: snapshot intersects existing range [n2,s2,r29994/2:/Table/112/1/6{529/1…-875/1…}]
I180612 07:04:46.878489 46738310 storage/replica_raftstorage.go:520 [raftsnapshot,n1,s1,r30064/4:/Table/112/1/145{7/27…-8/27…}] generated Raft snapshot ee489563 at index 1265
E180612 07:04:46.879329 46738310 storage/queue.go:778 [raftsnapshot,n1,s1,r30064/4:/Table/112/1/145{7/27…-8/27…}] snapshot failed: (n2,s2):2: remote couldn't accept snapshot with error: [n2,s2],r30064: cannot apply snapshot: snapshot intersects existing range [n2,s2,r25755/2:/Table/112/1/1{305/8…-875/3…}]
W180612 07:04:47.078500 604 sql/schema_changer.go:1146 [n1] Error executing schema change: not hit TTL deadline: -19h3m34.627098837s remaining
I180612 07:04:47.101192 46738450 storage/replica_raftstorage.go:520 [raftsnapshot,n1,s1,r33380/4:/Table/112/1/802{5/31…-6/31…}] generated Raft snapshot c20f905c at index 1229
E180612 07:04:47.102267 46738450 storage/queue.go:778 [raftsnapshot,n1,s1,r33380/4:/Table/112/1/802{5/31…-6/31…}] snapshot failed: (n6,s6):3: remote couldn't accept snapshot with error: [n6,s6],r33380: cannot apply snapshot: snapshot intersects existing range [n6,s6,r27762/3:/Table/112/1/{7693/5…-8125/5…}]
I180612 07:04:47.678066 46738078 storage/replica_raftstorage.go:520 [raftsnapshot,n1,s1,r30184/4:/Table/112/1/653{8/78…-9/75…}] generated Raft snapshot a402a340 at index 1263
E180612 07:04:47.679073 46738078 storage/queue.go:778 [raftsnapshot,n1,s1,r30184/4:/Table/112/1/653{8/78…-9/75…}] snapshot failed: (n2,s2):2: remote couldn't accept snapshot with error: [n2,s2],r30184: cannot apply snapshot: snapshot intersects existing range [n2,s2,r29994/2:/Table/112/1/6{529/1…-875/1…}]
I180612 07:04:47.879069 46738477 storage/replica_raftstorage.go:520 [raftsnapshot,n1,s1,r30064/4:/Table/112/1/145{7/27…-8/27…}] generated Raft snapshot fd6db3fa at index 1265
E180612 07:04:47.880028 46738477 storage/queue.go:778 [raftsnapshot,n1,s1,r30064/4:/Table/112/1/145{7/27…-8/27…}] snapshot failed: (n2,s2):2: remote couldn't accept snapshot with error: [n2,s2],r30064: cannot apply snapshot: snapshot intersects existing range [n2,s2,r25755/2:/Table/112/1/1{305/8…-875/3…}]
I180612 07:04:47.998801 46738080 storage/replica_raftstorage.go:520 [raftsnapshot,n1,s1,r33380/4:/Table/112/1/802{5/31…-6/31…}] generated Raft snapshot c92a38c8 at index 1229
E180612 07:04:47.999920 46738080 storage/queue.go:778 [raftsnapshot,n1,s1,r33380/4:/Table/112/1/802{5/31…-6/31…}] snapshot failed: (n6,s6):3: remote couldn't accept snapshot with error: [n6,s6],r33380: cannot apply snapshot: snapshot intersects existing range [n6,s6,r27762/3:/Table/112/1/{7693/5…-8125/5…}]
W180612 07:04:48.519955 604 sql/schema_changer.go:1146 [n1] Error executing schema change: not hit TTL deadline: -19h3m33.162942659s remaining
I180612 07:04:48.678004 46738303 storage/replica_raftstorage.go:520 [raftsnapshot,n1,s1,r30184/4:/Table/112/1/653{8/78…-9/75…}] generated Raft snapshot e9c9a0de at index 1263
E180612 07:04:48.678930 46738303 storage/queue.go:778 [raftsnapshot,n1,s1,r30184/4:/Table/112/1/653{8/78…-9/75…}] snapshot failed: (n2,s2):2: remote couldn't accept snapshot with error: [n2,s2],r30184: cannot apply snapshot: snapshot intersects existing range [n2,s2,r29994/2:/Table/112/1/6{529/1…-875/1…}]
I180612 07:04:48.878897 46738443 storage/replica_raftstorage.go:520 [raftsnapshot,n1,s1,r30064/4:/Table/112/1/145{7/27…-8/27…}] generated Raft snapshot 3b9a8a55 at index 1265
E180612 07:04:48.880093 46738443 storage/queue.go:778 [raftsnapshot,n1,s1,r30064/4:/Table/112/1/145{7/27…-8/27…}] snapshot failed: (n2,s2):2: remote couldn't accept snapshot with error: [n2,s2],r30064: cannot apply snapshot: snapshot intersects existing range [n2,s2,r25755/2:/Table/112/1/1{305/8…-875/3…}]
I180612 07:04:48.998959 46737935 storage/replica_raftstorage.go:520 [raftsnapshot,n1,s1,r33380/4:/Table/112/1/802{5/31…-6/31…}] generated Raft snapshot 74ec0d94 at index 1229
E180612 07:04:48.999946 46737935 storage/queue.go:778 [raftsnapshot,n1,s1,r33380/4:/Table/112/1/802{5/31…-6/31…}] snapshot failed: (n6,s6):3: remote couldn't accept snapshot with error: [n6,s6],r33380: cannot apply snapshot: snapshot intersects existing range [n6,s6,r27762/3:/Table/112/1/{7693/5…-8125/5…}]
I180612 07:04:49.679904 46738584 storage/replica_raftstorage.go:520 [raftsnapshot,n1,s1,r30184/4:/Table/112/1/653{8/78…-9/75…}] generated Raft snapshot c150b475 at index 1263
E180612 07:04:49.680843 46738584 storage/queue.go:778 [raftsnapshot,n1,s1,r30184/4:/Table/112/1/653{8/78…-9/75…}] snapshot failed: (n2,s2):2: remote couldn't accept snapshot with error: [n2,s2],r30184: cannot apply snapshot: snapshot intersects existing range [n2,s2,r29994/2:/Table/112/1/6{529/1…-875/1…}]
I180612 07:04:49.888001 46738447 storage/replica_raftstorage.go:520 [raftsnapshot,n1,s1,r30064/4:/Table/112/1/145{7/27…-8/27…}] generated Raft snapshot 420d3111 at index 1265
E180612 07:04:49.889186 46738447 storage/queue.go:778 [raftsnapshot,n1,s1,r30064/4:/Table/112/1/145{7/27…-8/27…}] snapshot failed: (n2,s2):2: remote couldn't accept snapshot with error: [n2,s2],r30064: cannot apply snapshot: snapshot intersects existing range [n2,s2,r25755/2:/Table/112/1/1{305/8…-875/3…}]
I180612 07:04:49.999077 46738603 storage/replica_raftstorage.go:520 [raftsnapshot,n1,s1,r33380/4:/Table/112/1/802{5/31…-6/31…}] generated Raft snapshot c620a191 at index 1229
E180612 07:04:50.002186 46738603 storage/queue.go:778 [raftsnapshot,n1,s1,r33380/4:/Table/112/1/802{5/31…-6/31…}] snapshot failed: (n6,s6):3: remote couldn't accept snapshot with error: [n6,s6],r33380: cannot apply snapshot: snapshot intersects existing range [n6,s6,r27762/3:/Table/112/1/{7693/5…-8125/5…}]
I180612 07:04:50.679221 46738396 storage/replica_raftstorage.go:520 [raftsnapshot,n1,s1,r30184/4:/Table/112/1/653{8/78…-9/75…}] generated Raft snapshot 1746a193 at index 1263
E180612 07:04:50.680207 46738396 storage/queue.go:778 [raftsnapshot,n1,s1,r30184/4:/Table/112/1/653{8/78…-9/75…}] snapshot failed: (n2,s2):2: remote couldn't accept snapshot with error: [n2,s2],r30184: cannot apply snapshot: snapshot intersects existing range [n2,s2,r29994/2:/Table/112/1/6{529/1…-875/1…}]
I180612 07:04:50.877773 46738159 storage/replica_raftstorage.go:520 [raftsnapshot,n1,s1,r30064/4:/Table/112/1/145{7/27…-8/27…}] generated Raft snapshot 9752e3d4 at index 1265

The logs on each node have grown to 500 MB, full of the above errors.
Just let me know and I can upload one or more of these logs to the file share.

Here is a screenshot of metrics during the workload run:

image

The first hours of metrics can be ignored .. they are from a first run where I had some issues because NTP settings were not properly applied to all nodes.

@HeikoOnnebrink

HeikoOnnebrink commented Jun 12, 2018

I also found this morning that, from the HAProxy side, one of the 30 nodes failed its health check, while in the Cockroach UI all nodes were listed as up and running.

Here is the output of the health check of the failing node 19:

image

image

@tbg tbg added the A-kv-client Relating to the KV client and the KV interface. label Jun 12, 2018
@HeikoOnnebrink

Next update:
I used our brand new cloud environment, without concurrent load from other VMs.
I created a 3 node cluster (this time version 2.0.2), loaded the 1000 warehouse data and kicked off the workload test.

Performance looked good for the first 20 minutes, then P99 latency started to grow and SQL throughput dropped by 50%.

After 1 h 6 min the test failed with deadline exceeded.

image

P50 latency stayed between 10-20 ms most of the time.

Server logs will be uploaded tomorrow ..

Still no clue why performance only stays stable for 20 minutes ..

@HeikoOnnebrink

I repeated the above test and found that further runs started immediately with high 99th percentile latency.
This latency appears to stay constant for long stretches of the test at exactly 10.2 s on all nodes...

image

@jordanlewis
Member

Hmm, that doesn't look good. What mount options do you have on your disks, if any? Have you tried the nobarrier mount option, if your disks have battery-backed write caches?

Can you attach the log commit and command commit latency graphs from the Storage dashboard?
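For reference, one way to check the current mount options on Linux (a sketch; /mnt/cockroach-data is a placeholder for the actual store path):

```
# Show the filesystem, source device, and mount options backing the store.
findmnt -T /mnt/cockroach-data -o TARGET,SOURCE,FSTYPE,OPTIONS

# nobarrier should only be considered on devices with battery-backed or
# otherwise non-volatile write caches, per the question above, e.g.:
# mount -o remount,nobarrier /mnt/cockroach-data
```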

@HeikoOnnebrink

I let the workload test run overnight with the --tolerate-errors option.

image

Here are the charts:

image

image

image

For the disk/mount related questions, I have to check with the OpenStack team ..

@HeikoOnnebrink

I repeated the tests on the same cockroach cluster, this time with 800 warehouses (instead of 1000).
All other parameters stayed unchanged.

image

image

This time things look much better (even though we still have higher latency than we should) .. and the response was stable.

I see 2 topics now:

  1. On my end, I need to investigate the I/O stack to check and improve performance.
  2. On the Cockroach side, it looks to me like the cluster does not degrade gracefully when we overwhelm the DB with load. I also see that identical tests, when repeated, show different performance behavior.
Will continue with more testing scenarios ..

@tim-o
Contributor Author

tim-o commented Jun 13, 2018

Thanks Heiko! I've made the product team aware of the performance degradation; we're putting an issue together for tracking. I'll let you know once that's been created.

One thought: it might help us resolve this faster if we could get access to your test cluster to pull stats and observe the system while it's under load. Would that be possible?

@HeikoOnnebrink

Tobias and Nathan have had access to our previous test cluster in the past via a WebEx session. You can certainly have access to the new test systems as well. The only problem is that I am leaving tomorrow for holiday. After the holiday I am at VMware HQ in London for 2 days, and then back in the office from the 28th of June. I will reserve time from the 28th onwards to continue testing and to give you access to the systems.
As we are looking at production deployment scenarios, the performance topic has high priority for us ..

@HeikoOnnebrink

A first piece of information about IO: the SSD storage used by the VMs is configured via a RAID controller as RAID-6. The OpenStack team has been asked to check the IO stack from their end; I will work with them on this once I am back..

@HeikoOnnebrink

From your performance white paper I gather that you run on Google n1-16 nodes with local SSD (SSD scratch disks). From what I read at Google, those seem to be plain, non-mirrored SSD modules, which would explain the much better write performance / lower latency.. and from what I have learned, latency is poison for RocksDB..

@tim-o
Contributor Author

tim-o commented Jun 14, 2018

Thanks @HeikoOnnebrink - have a great holiday. We'll keep an eye out for similar reports, and can catch up on the 28th. It sounds like you have a good lead with OpenStack's disk performance. Hopefully their support can make some recommendations about how to improve IOPS and latency; if there's a configuration that works, we'll happily update docs so the next user has an easier go of it.

If there's anything else we can do in the meantime, let me know.

@tim-o
Contributor Author

tim-o commented Jul 16, 2018

Hey @HeikoOnnebrink - if you have some cycles before our meeting tomorrow, it'd be helpful to get sysstat up and logging values on the nodes we'll be testing. You can set up a crontab to log values once per minute (just substitute yum for apt-get if running CentOS):

$ sudo bash
$ apt-get install sysstat
$ sed -ibak -e 's/ENABLED="false"/ENABLED="true"/g' /etc/default/sysstat    # enable data collection
$ sed -ibak -e 's|5-55/10 \* \* \* \*|\* \* \* \* \*|g' /etc/cron.d/sysstat  # sample every minute instead of every 10

Once that's done, `sar -d -p` will give you a snapshot of disk IO, `sar -u` will pull CPU utilization, and `sar -r` will pull RAM utilization.
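To make these easy to share, the reports can be dumped to text files after a run (a sketch; the output file names are arbitrary):

```
# Dump today's collected sysstat data into shareable text files.
sar -d -p > disk_io.txt   # per-device IO: tps, throughput, await, %util
sar -u    > cpu.txt       # CPU utilization
sar -r    > memory.txt    # memory utilization
```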

Here are the results for disk IO, memory, and CPU utilization on a cluster that was running a 1000 warehouse test on three GCE n1-highcpu-16 nodes for 20 minutes. If we can get similar reports from your cluster, it'll give us a better idea of where the bottleneck lies.

TPC-C Test 1000 Warehouses.xlsx

@robert-s-lee
Contributor

@tim-o Can you also pull the network data with `sar -n ALL`? `sar -A -p` will provide a text version of all the data points collected.

@tim-o
Contributor Author

tim-o commented Jul 16, 2018

Sure - here's the (raw) output of -n ALL.
sar -n ALL output.txt

@HeikoOnnebrink

I am running inside Docker on CoreOS .. so things are a bit different / restricted.. I spent some time and managed to get the stats you need inside the Docker toolbox container (Fedora based). See https://coreos.com/os/docs/latest/install-debugging-tools.html
The crontab cannot be activated from within the toolbox, as it is spawned via systemd-nspawn and no services run there.. but a simple shell script sent to the background with & will run "sa1 1 1" every minute.
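Roughly, the background sampling script described above could look like this (a sketch, not the exact script; collect-stats.sh is a placeholder name and the sa1 path is the Fedora default, which may differ):

```
#!/bin/sh
# Poor man's cron inside the toolbox: take one sysstat sample per minute.
# Start it in the background, e.g.: ./collect-stats.sh &
while true; do
  /usr/lib64/sa/sa1 1 1
  sleep 60
done
```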
One open topic I have is mapping from disk device identifiers to device names.

sar -d -p (pretty print) ends up with 
`Cannot open /dev/mapper: No such file or directory`

But a `sar -d | grep DEVICE-ID` will filter the SSD-related rows.
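On the device-name mapping: when sar cannot resolve names it reports devices as devMAJOR-MINOR, so listing the major:minor numbers is usually enough to match them up (a sketch, assuming lsblk is available inside the toolbox):

```
# Map sar's devM-N identifiers to kernel device names via major:minor numbers.
lsblk -o NAME,MAJ:MIN,SIZE,TYPE,MOUNTPOINT
```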

Hope I have prepared all you will need .. see you later at 7pm CET

@tim-o
Contributor Author

tim-o commented Jul 17, 2018

Thanks Heiko. If you get a chance, it'd be good to review the sar output before the meeting - otherwise we'll talk at 7.

@robert-s-lee
Contributor

Tim's 3 Google n1-16 nodes show very good network and IO performance. @HeikoOnnebrink, we should check the same on your side and compare the latency and queue sizes.

  • ~50 MB/s of network bandwidth
  • an average of 0.61 ms response time from disk, even at 89% utilization
Time Node DEV tps rd_sec/s wr_sec/s avgrq-sz avgqu-sz await svctm %util
4:27:01 3 sdb 6000.95 218785.42 229424.93 74.69 3.69 0.61 0.15 89.21

@HeikoOnnebrink

HeikoOnnebrink commented Jul 17, 2018

This is all extremely valuable information.. What I need to know is how to benchmark my (OpenStack) VMs against your Google tests. Then I can adjust my expectations of which tpmC values are realistic targets for my tests. I also need to understand which options I can use to optimise the whole setup, as was done for the SSD RAID config. And in the end I am trying to learn which resources most impact Cockroach performance and where it is tolerant like a cockroach.. ;-)
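One way to get a like-for-like disk baseline between the OpenStack VMs and the GCE local-SSD nodes is a small synchronous-write test with fio, since RocksDB's write path is sensitive to fsync latency (a sketch; the directory, size, and runtime are placeholders):

```
# Measure small synchronous write latency on the cockroach data disk; run the
# same command in both environments and compare the completion latency
# percentiles that fio reports.
fio --name=sync-write-latency \
    --directory=/mnt/cockroach-data \
    --rw=write --bs=4k --size=1G \
    --ioengine=sync --fdatasync=1 \
    --runtime=60 --time_based
```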

@tbg tbg added this to the 2.0.x milestone Jul 22, 2018
@knz knz modified the milestones: 2.0.x, 2.1 Jul 23, 2018
@jordanlewis jordanlewis modified the milestones: 2.1, Later Oct 11, 2018
@jordanlewis jordanlewis added the O-support Would prevent or help troubleshoot a customer escalation - bugs, missing observability/tooling, docs label Oct 11, 2018
@jordanlewis jordanlewis removed their assignment Oct 11, 2018
@tbg
Member

tbg commented Jan 28, 2019

@nvanbenschoten, is this closable in favor of one of the other issues referenced here?

@nvanbenschoten
Member

Yes, closing in favor of #18684 now that #35284 is merged. I'm also going to revive #31081 now.
