-
Notifications
You must be signed in to change notification settings - Fork 15.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ruby: Seg fault with google-protobuf >= 3.15 #8938
Comments
I stripped some unnecessary code from |
Thanks so much for the brilliant repro. I was immediately able to reproduce on my machine. I'll look into this right away, hopefully we can get a fix into 3.18.0 which is in -rc right now. |
I further reduced the repro: require 'google/protobuf'
Google::Protobuf::DescriptorPool.generated_pool.build do
add_file("test.proto", :syntax => :proto3) do
add_message "Test.Node" do
end
add_message "Test.BoolExpr" do
repeated :args, :message, 3, "Test.Node"
end
end
end
module Test
Node = ::Google::Protobuf::DescriptorPool.generated_pool.lookup("Test.Node").msgclass
BoolExpr = ::Google::Protobuf::DescriptorPool.generated_pool.lookup("Test.BoolExpr").msgclass
end
GC.stress = true
subselect_items = [nil]
expr = Test::BoolExpr.new(:args => [{}, {}, {}])
subselect_items.concat(expr.args) |
Ah, great! I should mention I did see this stack trace before:
The Ruby garbage collector is failing while trying to iterate through an array and mark each object. Somehow the object is |
Yes that mirrors my observations too. When I add debug statements into creation/deletion of message objects, I see: require 'google/protobuf'
Google::Protobuf::DescriptorPool.generated_pool.build do
add_file("test.proto", :syntax => :proto3) do
add_message "Test.Node" do
end
add_message "Test.BoolExpr" do
repeated :args, :message, 3, "Test.Node"
end
end
end
module Test
Node = ::Google::Protobuf::DescriptorPool.generated_pool.lookup("Test.Node").msgclass
BoolExpr = ::Google::Protobuf::DescriptorPool.generated_pool.lookup("Test.BoolExpr").msgclass
end
GC.stress = true
STDERR.puts("Starting test")
expr = Test::BoolExpr.decode_json('{"args": [{}, {}, {}]}')
args = expr.args
items = [nil]
STDERR.puts("This will go boom")
items.concat(args) Output:
So the three message objects are indeed being freed before the What's even weirder is that if I explicitly call require 'google/protobuf'
Google::Protobuf::DescriptorPool.generated_pool.build do
add_file("test.proto", :syntax => :proto3) do
add_message "Test.Node" do
end
add_message "Test.BoolExpr" do
repeated :args, :message, 3, "Test.Node"
end
end
end
module Test
Node = ::Google::Protobuf::DescriptorPool.generated_pool.lookup("Test.Node").msgclass
BoolExpr = ::Google::Protobuf::DescriptorPool.generated_pool.lookup("Test.BoolExpr").msgclass
end
GC.stress = true
STDERR.puts("Starting test")
expr = Test::BoolExpr.decode_json('{"args": [{}, {}, {}]}')
args = expr.args
items = [nil]
STDERR.puts("This will work fine")
items.concat(args.to_ary)
STDERR.puts("Succeeded!") Output:
This is starting to smell like a bug in Ruby. |
I have a minimal repro that doesn't use protobuf at all! C extension: #include "ruby.h"
VALUE cFoo;
// Foo
typedef struct {
int dummy;
} Foo;
static void Foo_free(void* _self) {
fprintf(stderr, "Foo_free(%p)\n", _self);
xfree(_self);
}
static rb_data_type_t Foo_type = {
"Foo",
{NULL, Foo_free, NULL },
.flags = RUBY_TYPED_FREE_IMMEDIATELY,
};
static VALUE Foo_alloc(VALUE klass) {
Foo* _self = ALLOC(Foo);
fprintf(stderr, "Foo_alloc(%p)\n", _self);
return TypedData_Wrap_Struct(klass, &Foo_type, _self);
}
// Bar
typedef struct {
int dummy;
} Bar;
static void Bar_free(void* _self) {
fprintf(stderr, "Bar_free(%p)\n", _self);
xfree(_self);
}
static rb_data_type_t Bar_type = {
"Bar",
{NULL, Bar_free, NULL },
.flags = RUBY_TYPED_FREE_IMMEDIATELY,
};
static VALUE Bar_alloc(VALUE klass) {
Bar* bar = ALLOC(Bar);
return TypedData_Wrap_Struct(klass, &Bar_type, bar);
}
VALUE Bar_to_ary(VALUE _self) {
fprintf(stderr, "Bar_to_ary() begin\n");
VALUE ary = rb_ary_new2(3);
rb_ary_push(ary, Foo_alloc(cFoo));
rb_ary_push(ary, Foo_alloc(cFoo));
rb_ary_push(ary, Foo_alloc(cFoo));
fprintf(stderr, "Bar_to_ary() end\n");
return ary;
}
void Init_test_ext() {
cFoo = rb_define_class("Foo", rb_cObject);
rb_gc_register_address(&cFoo);
rb_define_alloc_func(cFoo, Foo_alloc);
VALUE bar = rb_define_class("Bar", rb_cObject);
rb_define_alloc_func(bar, Bar_alloc);
rb_define_method(bar, "to_ary", Bar_to_ary, 0);
} extconf.rb: #!/usr/bin/ruby
require 'mkmf'
$objs = ["test_ext.o"]
create_makefile("test_ext") Test script: require "test_ext"
GC.stress = true
arr = [nil]
STDERR.puts("Bar.new")
bar = Bar.new
STDERR.puts("concat")
arr.concat(bar)
STDERR.puts("concat done") Output:
|
For me this crash reproduces in every version of Ruby from 2.5.0 - 3.0.2. I'll report something in the Ruby bug tracker. Ideally there would be a workaround I could recommend, but nothing immediately comes to mind. |
Filed upstream: https://bugs.ruby-lang.org/issues/18140 |
@haberman Thanks for doing this investigation and filing the bug! |
You're welcome, the great repro really helped! Amazingly, the bug already has a proposed fix upstream: https://bugs.ruby-lang.org/issues/18140#note-2 |
Awesome, thanks! Closing this issue. |
What version of protobuf and what language are you using?
Version: master/v3.17.3/v13.5.8
Language: Ruby
What operating system (Linux, Windows, ...) and version?
Linux and macOS
What runtime / compiler are you using (e.g., python version or gcc version)
What did you do?
Steps to reproduce the behavior:
Attached is a reproduction script. Extract and run
ruby test_parse.rb
.repro2.zip
@haberman This bug was originally filed under pganalyze/pg_query#226. I took the C extension out of the equation and loaded the serialized bytes directly from a file, and this seems more like a memory issue with google-protobuf. Note there is some recursive
repeated
andoneof
fields that may be causing some issues.With google-protobuf v3.14.x, there is no seg fault. With versions >= 3.15, there is a seg fault. This sounds similar to a memory issue identified in #8559. I wasn't able to get
valgrind
or Address Sanitizer to flag an issue.What did you expect to see
Empty stdout
What did you see instead?
Seg fault
The text was updated successfully, but these errors were encountered: