Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rb_str_resize(Qfalse, 0) with NKF and without C extensions lock #3625

Open
eregon opened this issue Jul 22, 2024 · 3 comments
Open

rb_str_resize(Qfalse, 0) with NKF and without C extensions lock #3625

eregon opened this issue Jul 22, 2024 · 3 comments
Assignees
Labels

Comments

@eregon
Copy link
Member

eregon commented Jul 22, 2024

Found by @mohamedhafez and reported on Slack:

class: TypeError
TruffleRuby doesn't have a case for the org.truffleruby.cext.CExtNodesFactory$RbStrResizeNodeFactory$RbStrResizeNodeGen node with values of type java.lang.Boolean=false java.lang.Integer=0
	from org.truffleruby.cext.CExtNodesFactory$RbStrResizeNodeFactory$RbStrResizeNodeGen.executeAndSpecialize(CExtNodesFactory.java:4234)
	from org.truffleruby.cext.CExtNodesFactory$RbStrResizeNodeFactory$RbStrResizeNodeGen.execute(CExtNodesFactory.java:4212)
	from org.truffleruby.language.RubyCoreMethodRootNode.execute(RubyCoreMethodRootNode.java:58)
string.c:197:in `rb_str_resize': TruffleRuby doesn't have a case for the org.truffleruby.cext.CExtNodesFactory$RbStrResizeNodeFactory$RbStrResizeNodeGen node with values of type java.lang.Boolean=false java.lang.Integer=0 (TypeError)
	from org.truffleruby.cext.CExtNodesFactory$RbStrResizeNodeFactory$RbStrResizeNodeGen.executeAndSpecialize(CExtNodesFactory.java:4234)
	from org.truffleruby.cext.CExtNodesFactory$RbStrResizeNodeFactory$RbStrResizeNodeGen.execute(CExtNodesFactory.java:4212)
	from org.truffleruby.language.RubyCoreMethodRootNode.execute(RubyCoreMethodRootNode.java:58)
	from string.c:197:in `rb_str_resize'
	from /Users/mohamed/.rbenv/versions/truffleruby+graalvm-24.0.1/lib/truffle/truffle/cext_ruby.rb:40:in `guess'
	from /Users/mohamed/rubyprojects/sa-mechanize/lib/mechanize/util.rb:58:in `detect_charset'
	from /Users/mohamed/rubyprojects/substitutealert/lib/ext/mechanize_additions.rb:22:in `initialize'

So this must be a call to rb_str_resize(str, 0) with str=0 but of course it shouldn't be 0.

Looking at NKF sources, there is a single call to rb_str_resize:

rb_str_resize(result, o_len);

result is a global variable 😨

static VALUE result;

/* use _result_ begin*/
result = tmp;
kanji_convert(NULL);
result = Qnil;
/* use _result_ end */

So that is very likely the issue.
One solution is to use a native thread-local variable instead.
Or pass it as an extra parameter but that would require touching more of the upstream nkf code.

Or, always use the C extensions lock/a lock for NKF.

@eregon eregon added the cexts label Jul 22, 2024
@eregon eregon assigned andrykonchin and unassigned andrykonchin Jul 22, 2024
@andrykonchin
Copy link
Member

It seems to me all the global variables could lead to such issues:

static unsigned char *output;
static unsigned char *input;
static int input_ctr;
static int i_len;
static int output_ctr;
static int o_len;
static int incsize;

So all these variables probably should be passed everywhere (as a struct?). They are used only in this file so it doesn't seem a big effort to fix. Does it makes sense to do it?

@eregon
Copy link
Member Author

eregon commented Jul 22, 2024

The problem is e.g. rb_nkf_putchar() is a callback called as putchar() from src/main/c/nkf/nkf-utf8/nkf.c.
So I think it's too messy to pass a struct there.
Let's try using thread_local (https://en.cppreference.com/w/c/thread/thread_local) or __thread on these variables, that should require minimal changes.

@andrykonchin
Copy link
Member

Created a PR ruby/nkf#19

@andrykonchin andrykonchin self-assigned this Jul 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants