server: grpc unimplemented panic while running workload import #49842
Comments
I restarted the dead node, ran the same workload run command from above, and it was fine. I canceled the in-progress changefeed and started a new one (s3 sink). Logs from nodes 1 and 2 here
The first crash is odd: it's an "unimplemented" panic during a protobuf decode deep inside grpc:
The other crashes show the same panic. @tbg I think you updated grpc recently? Could this be related? (Does the version run by Michael contain the upgraded grpc?)
Yes, this almost certainly is caused by the protobuf bump. The grpc-gateway proto generated code uses I don't know why it does that. However, they have since backpedaled and made this functional again: Picking up that fix should give us temporary relief. I'll write up a unit test that repros the panic and make sure it is fixed by a dep bump. But this is certainly going to come back and haunt us in a future upgrade, so I will also file something against grpc-gateway.
https://github.com/grpc-ecosystem/grpc-gateway/blob/master/Makefile#L150 This invocation looks vanilla. I wonder why it results in deprecated code. Perhaps it just hasn't been updated in a long time; I will take a look at that.
grpc-gateway has proto 1.3.2 pinned:
It is evidently creating that code that was then deprecated in 1.4.x. I think we can sit this one out then. This problem will affect everyone mixing proto-1.3 generated code with proto 1.4, which is why proto added the PR that makes it work again. The legacy behavior will stick around, and once grpc-gateway decides to use proto 1.4.x their generated code will not have that problem any more. I'll send a PR to bump golang/protobuf to pick up the PR mentioned above.
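To illustrate the failure mode discussed in this thread (one library release deprecates an entry point by making it panic, a later release restores it), here is a minimal Go sketch of how a regression test could tell the two behaviors apart. Every type and function name below is hypothetical and made up for illustration; none of it is the golang/protobuf, gogoproto, or grpc-gateway API, and it is not the actual test added to the repository.

```go
package main

import "fmt"

// legacyDecoder stands in for an entry point that older generated code
// still calls. Hypothetical: not an interface from any real library.
type legacyDecoder interface {
	DecodeLegacy(b []byte) (int, error)
}

// panickingDecoder models a release that "deprecates" the entry point
// by replacing its body with a panic.
type panickingDecoder struct{}

func (panickingDecoder) DecodeLegacy([]byte) (int, error) {
	panic("unimplemented")
}

// restoredDecoder models a later release that makes the entry point
// functional again. Here it simply reports the input length.
type restoredDecoder struct{}

func (restoredDecoder) DecodeLegacy(b []byte) (int, error) {
	return len(b), nil
}

// safeDecode calls d.DecodeLegacy and converts a panic into an error,
// so a test can assert on the outcome instead of crashing the process.
func safeDecode(d legacyDecoder, b []byte) (n int, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("decode panicked: %v", r)
		}
	}()
	return d.DecodeLegacy(b)
}

func main() {
	payload := []byte("abc")
	for _, d := range []legacyDecoder{panickingDecoder{}, restoredDecoder{}} {
		n, err := safeDecode(d, payload)
		fmt.Printf("%T: n=%d err=%v\n", d, n, err)
	}
}
```

A test built this way fails against the panicking variant and passes against the restored one, which is the property a dependency bump would need to demonstrate.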
49827: geo/geomfn: Implements ST_Segmentize for geometry r=otan a=abhishek20123g

Fixes #49029

This PR implements ST_Segmentize({geometry, float8}) builtin function, which allows modify given geometry such that no segment longer than the given max_segment_length. Also this PR refactors and add extra test cases for ST_Segmentize for geography.

Release note (sql change): This PR implements ST_Segmentize({geometry, float8}) builtin function.

49833: geo/geomfn: implement Intersection, PointOnSurface, Union r=sumeerbhola a=otan

The last of the topology operators up to Chapter 20.

Resolves #48951
Resolves #49832
Resolves #49064

Release note (sql change): Implements the ST_Intersection, ST_PointOnSurface and ST_Union builtin functions.

49869: vendor: bump golang/protobuf to 1.4.2 r=knz a=tbg

v1.4.1 aggressively deprecated something (by inserting panics) that was reachable via gogoproto's marshaler. Luckily, v1.4.2 has this "fixed"; it caused enough trouble for others as well.

Closes #49842.

Release note: None

49870: schemachange: unskip TestDropWhileBackfill r=spaskob a=spaskob

Disabling the GC job was preventing this test from completing. Tested with `test stress`: 1000 successful runs.

Fixes #44944.

Release note: none.

49871: kvserver: fixup test failure message r=andreimatei a=andreimatei

Expected and real err were reversed.

Release note: None

Co-authored-by: abhishek20123g <abhishek20123g@gmail.com>
Co-authored-by: Oliver Tan <otan@cockroachlabs.com>
Co-authored-by: Tobias Schottdorf <tobias.schottdorf@gmail.com>
Co-authored-by: Spas Bojanov <spas@cockroachlabs.com>
Co-authored-by: Andrei Matei <andrei@cockroachlabs.com>
Describe the problem
I had a roachprod cluster running
CockroachDB CCL v20.2.0-alpha.00000000-1972-g69bc0192b3 (x86_64-unknown-linux-gnu, built 2020/06/03 13:38:23, go1.13.9)
on 3 c5d.4xlarge nodes. I kicked off a workload run, which caused a fatal error, and the node shut down (logs attached below).
To Reproduce
Not sure exactly how to repro. These are the high-level steps I took:
./cockroach workload init tpcc --warehouses=1000 --db=tpcc --data-loader=IMPORT
orders table. Let backfill complete.
./cockroach workload run tpcc --warehouses=1000 --duration=15m
Additional data / screenshots
If the problem is SQL-related, include a copy of the SQL query and the schema
of the supporting tables.
If a node in your cluster encountered a fatal error, supply the contents of the
log directories (at minimum of the affected node(s), but preferably all nodes).
Note that log files can contain confidential information. Please continue
creating this issue, but contact support@cockroachlabs.com to submit the log
files in private.
If applicable, add screenshots to help explain your problem.
Environment:
CockroachDB CCL v20.2.0-alpha.00000000-1972-g69bc0192b3 (x86_64-unknown-linux-gnu, built 2020/06/03 13:38:23, go1.13.9)