- 
                Notifications
    You must be signed in to change notification settings 
- Fork 1
Merge from upstream #28
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
          
     Merged
      
        
      
    
                
     Merged
            
            
          Conversation
  
    
      This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
      Learn more about bidirectional Unicode characters
    
  
  
    
    This commit does a bunch of stuff to try to eliminate as many unnecessary mov instructions as possible. First, it introduces the Insn::LoadInto instruction. Previously when we needed a value to go into a specific register (like in Insn::CCall when we're putting values into the argument registers or in Insn::CRet when we're putting a value into the return register) we would first load the value and then mov it into the correct register. This resulted in a lot of duplicated work with short live ranges since they basically immediately we unnecessary. The new instruction accepts a destination and does not interact with the register allocator at all, making it much more efficient. We then use the new instruction when we're loading values into argument registers for AArch64 or X86_64, and when we're returning a value from AArch64. Notably we don't do it when we're returning a value from X86_64 because everything can be accomplished with a single mov anyway. A couple of unnecessary movs were also present because when we called the split_load_opnd function in a lot of split passes we were loading all registers and instruction outputs. We no longer do that. This commit also makes it so that UImm(0) passes through the Insn::Store split without attempting to be loaded, which allows it can take advantage of the zero register. So now instead of mov-ing 0 into a register and then calling store, it just stores XZR.
…tern matching [Bug #18990]
I want to use more complicated macros with MJIT. For example: ``` # define SHAPE_MASK (((unsigned int)1 << SHAPE_BITS) - 1) ``` This commit adds a simple recursive descent parser that produces an AST and a small visitor that converts the AST to Ruby.
Previously, it was supported in prelude.c, but has not followed up the builtin-loader system.
Remove the superfluous str_modify_keep_cr() call from rb_str_update(). It ends up calling either rb_str_drop_bytes() or rb_str_splice_0(), which already does checks if necessary. The extra call makes the string "independent". This is not always wanted, in other words, it can keep the same shared root when merely removing the leading part of a shared string.
Gems without specific platform were being preferred over matching platform specific gems. ruby/rubygems@37b95b9159
* Introduce InstructionOffset for AArch64 There are a lot of instructions on AArch64 where we take an offset from PC in terms of the number of instructions. This is for loading a value relative to the PC or for jumping. We were usually accepting an A64Opnd or an i32. It can get confusing and inconsistent though because sometimes you would divide by 4 to get the number of instructions or multiply by 4 to get the number of bytes. This commit adds a struct that wraps an i32 in order to keep all of that logic in one place. It makes it much easier to read and reason about how these offsets are getting used. * Use b instruction when the offset fits on AArch64
* Eliminate redundant mov in csel/cmov. Translate mov reg,0 into xor * Fix x86 asm test * Remove dbg!() * xor optimization unsound because it resets flags
When the build is running with a base ruby then generating `x64-ucrt-ruby320.rc` could fail due to a missing dependency to `x64-mingw-ucrt-fake.rb`. This commit adds this dependency. A failing build looks like so: ``` generating x64-mingw-ucrt-fake.rb generating x64-ucrt-ruby320.rc ../snapshot-master/win32/resource.rb:in `require': cannot load such file -- ./x64-mingw-ucrt-fake (LoadError) make: *** [GNUmakefile:57: x64-ucrt-ruby320.rc] Error 1 make: *** Waiting for unfinished jobs.... linking miniruby.exe x64-mingw-ucrt-fake.rb updated ```
* So deprecated methods/constants/functions are dealt with early, instead of many tests breaking suddenly when removing a deprecated method/constant/function. * Follows https://bugs.ruby-lang.org/issues/17591
GnuWin32 sed strips only double quotes, but not single quotes, and dies: ``` sed: -e expression #1, char 1: unknown command: `'' ```
* See [Feature #18949].
First, rb_mjit_fork should call rb_thread_atfork to stop threads after fork in the child process. Unfortunately, we cannot use rb_fork_ruby to prevent this kind of mistakes because MJIT needs special handling of waiting_pid and mjit_pause/resume. Second, mjit_waitpid_finished should be checked regardless of trap_interrupt. It doesn't seem like the flag is not set when SIGCHLD is handled for an MJIT child process.
MinGW CI has been crashing too often. Now that we don't have slow test_mjit in MinGW, I'd like to see if not using parallel test workers fixes the problem.
1067441 made this possible.
Raise `ArgumentError` in `IO#sysread` on Windows when given a negative length. Fixes [Bug #18880]
This reverts commit b8c376c, as it seems no longer needed probably.
Leave the new coderange unknown if the original encoding is not ASCII-compatible. Non-ASCII-compatible encoding strings with valid or broken coderange can end up as ascii-only. Fixes 9a8f6e3 ("Cheaply derive code range for String#b return value", 2022-07-25).
str_new_shared already has all the necessary logic to do this and is also smart enough to skip this step if the source string is already a shared string itself. This saves a useless String allocation on each call.
Object Shapes is used for accessing instance variables and representing the
"frozenness" of objects.  Object instances have a "shape" and the shape
represents some attributes of the object (currently which instance variables are
set and the "frozenness").  Shapes form a tree data structure, and when a new
instance variable is set on an object, that object "transitions" to a new shape
in the shape tree.  Each shape has an ID that is used for caching. The shape
structure is independent of class, so objects of different types can have the
same shape.
For example:
```ruby
class Foo
  def initialize
    # Starts with shape id 0
    @A = 1 # transitions to shape id 1
    @b = 1 # transitions to shape id 2
  end
end
class Bar
  def initialize
    # Starts with shape id 0
    @A = 1 # transitions to shape id 1
    @b = 1 # transitions to shape id 2
  end
end
foo = Foo.new # `foo` has shape id 2
bar = Bar.new # `bar` has shape id 2
```
Both `foo` and `bar` instances have the same shape because they both set
instance variables of the same name in the same order.
This technique can help to improve inline cache hits as well as generate more
efficient machine code in JIT compilers.
This commit also adds some methods for debugging shapes on objects.  See
`RubyVM::Shape` for more details.
For more context on Object Shapes, see [Feature: #18776]
Co-Authored-By: Aaron Patterson <tenderlove@ruby-lang.org>
Co-Authored-By: Eileen M. Uchitelle <eileencodes@gmail.com>
Co-Authored-By: John Hawthorn <john@hawthorn.email>
    Tabs were expanded because the file did not have any tab indentation in unedited lines. Please update your editor config, and use misc/expand_tabs.rb in the pre-commit hook.
This commit extracts common code between str_substr and rb_str_subseq into a function called str_subseq. This commit also applies optimizations in commit 2e88bca to rb_str_subseq.
Kernel#y is defined by psych/yaml, which causes occasional nondeterministic problems with this test when doing parallel testing.
Text is reorganized so that most of the previous text is now in these newly-created sections:
    Basic IO
    Line IO
New text is added to form new sections:
    Character IO
    Byte IO
    Codepoint IO
This gives the page a functional orientation, so that a reader can quickly find pertinent sections.
The page retains its original mission: to provide good link targets for the doc for related classes.
    Otherwise the timeout thread would be added to the ThreadGroup of the thread that makes the first call to Timeout.timeout . Fixes bug 19020: https://bugs.ruby-lang.org/issues/19020 Add a test case to make sure the common thread doesn't leak to another ThreadGroup ruby/timeout@c4f1385c9a
This was disabled by b7577b4, while properly fixed upstream by: ruby/spec#939
Adds remarks about .new and .open. Uses ..open where convenient (not convenient where output would be in a block). Fixed examples for #ungetc.
…properly Previously 9eead86 introduced non-commutativity of platforms, and later commit 1b9f7f50 changed the behavior of `Gem::Platform.match` to ensure the callee of `#=~` was the gem platform. However, when the platform argument is a String, then the callee and argument of `#=~` are flipped (see docs for `String#=~`), which works against the fix from 1b9f7f50. Closes ruby#5938 ruby/rubygems@3b1fb562e8
* Change IncrCounter lowering on AArch64 Previously we were using LDADDAL which is not available on Graviton 1 chips. Instead, we're going to use an exclusive load/store group through the LDAXR/STLXR instructions. * Update yjit/src/backend/arm64/mod.rs Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>
…viton1 (ruby#6457) Reverse configure.ac changes that disable YJIT stats on Graviton1
Add assertion wrt label names
This list is out of date. At least OpenBSD since 2013 does not allow one user to read the environment variables of a process run by another user. While we could try to keep the list updated, I think it's a bad idea to not use the user/password from the environment, even if another user on the system could read it. If http_proxy exists in the environment, and other users can read it, it doesn't make it more secure for Ruby to ignore it. You could argue that it encourages poor security practices, but net/http should provide mechanism, not policy. Fixes [Bug #18908] ruby/net-http@1e4585153d
Too big parts of fractional hour time zone offset can cause assertion failures. ruby/date@06bcfb2729
It is no longer single lldb_cruby.py only.
Just like commit 1c16645 for arrays, this commit changes string slices to be a view rather than a copy even if it can be allocated through VWA.
Commit aa2a428 introduced a bug where non-embedded string slices copied the encoding of the original string. If the original string had a broken encoding but the slice has valid encoding, then the slice would be incorrectly marked as broken encoding.
            
                  wks
  
            
            approved these changes
            
                
                  Sep 29, 2022 
                
            
            
          
          
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
    
  wks 
      pushed a commit
      that referenced
      this pull request
    
      Aug 4, 2023 
    
    
      
  
    
      
    
  
[Bug #19793]
Dummy frames are created at the top level when requiring another file.
While requiring a file, it will try to convert using encodings. Some of
these encodings will not respond to to_str. If method_missing is
redefined on Object, then it will call method_missing and attempt raise
an error. However, the iseq is invalid as it's a dummy frame so it will
write an invalid iseq to the created NoMethodError.
The following script crashes:
```
GC.stress = true
class Object
  public :method_missing
end
File.write("/tmp/empty.rb", "")
require "/tmp/empty.rb"
```
With the following backtrace:
```
frame #0: 0x00000001000fa8b8 miniruby`RVALUE_MARKED(obj=4308637824) at gc.c:1638:12
frame #1: 0x00000001000fb440 miniruby`RVALUE_BLACK_P(obj=4308637824) at gc.c:1763:12
frame #2: 0x00000001000facdc miniruby`gc_writebarrier_incremental(a=4308637824, b=4308332208, objspace=0x000000010180b000) at gc.c:8822:9
frame #3: 0x00000001000faad8 miniruby`rb_gc_writebarrier(a=4308637824, b=4308332208) at gc.c:8864:17
frame #4: 0x000000010016aff0 miniruby`rb_obj_written(a=4308637824, oldv=36, b=4308332208, filename="../iseq.c", line=1279) at gc.h:804:9
frame #5: 0x0000000100162a60 miniruby`rb_obj_write(a=4308637824, slot=0x0000000100d09888, b=4308332208, filename="../iseq.c", line=1279) at gc.h:837:5
frame #6: 0x0000000100165b0c miniruby`iseqw_new(iseq=0x0000000100d09880) at iseq.c:1279:9
frame #7: 0x0000000100165a64 miniruby`rb_iseqw_new(iseq=0x0000000100d09880) at iseq.c:1289:12
frame #8: 0x00000001000d8324 miniruby`name_err_init_attr(exc=4309777920, recv=4304780496, method=827660) at error.c:1830:35
frame #9: 0x00000001000d1b80 miniruby`name_err_init(exc=4309777920, mesg=4308332496, recv=4304780496, method=827660) at error.c:1869:12
frame #10: 0x00000001000d1bd4 miniruby`rb_nomethod_err_new(mesg=4308332496, recv=4304780496, method=827660, args=4308332448, priv=0) at error.c:1957:5
frame #11: 0x000000010039049c miniruby`rb_make_no_method_exception(exc=4304914512, format=4308332496, obj=4304780496, argc=1, argv=0x000000016fdfab00, priv=0) at vm_eval.c:959:16
frame #12: 0x00000001003b3274 miniruby`raise_method_missing(ec=0x0000000100b06f40, argc=1, argv=0x000000016fdfab00, obj=4304780496, last_call_status=MISSING_NOENTRY) at vm_eval.c:999:15
frame #13: 0x00000001003945d4 miniruby`rb_method_missing(argc=1, argv=0x000000016fdfab00, obj=4304780496) at vm_eval.c:944:5
...
frame #23: 0x000000010038f5e4 miniruby`rb_vm_call_kw(ec=0x0000000100b06f40, recv=4304780496, id=2865, argc=1, argv=0x000000016fdfab00, me=0x0000000100cbfcf0, kw_splat=0) at vm_eval.c:326:12
frame #24: 0x00000001003c18e4 miniruby`call_method_entry(ec=0x0000000100b06f40, defined_class=4304927952, obj=4304780496, id=2865, cme=0x0000000100cbfcf0, argc=1, argv=0x000000016fdfab00, kw_splat=0) at vm_method.c:2720:20
frame #25: 0x00000001003c440c miniruby`check_funcall_exec(v=6171896792) at vm_eval.c:589:12
frame #26: 0x00000001000dec00 miniruby`rb_vrescue2(b_proc=(miniruby`check_funcall_exec at vm_eval.c:587), data1=6171896792, r_proc=(miniruby`check_funcall_failed at vm_eval.c:596), data2=6171896792, args="Pȗ") at eval.c:919:18
frame #27: 0x00000001000deab0 miniruby`rb_rescue2(b_proc=(miniruby`check_funcall_exec at vm_eval.c:587), data1=6171896792, r_proc=(miniruby`check_funcall_failed at vm_eval.c:596), data2=6171896792) at eval.c:900:17
frame #28: 0x000000010039008c miniruby`check_funcall_missing(ec=0x0000000100b06f40, klass=4304923536, recv=4304780496, mid=3233, argc=0, argv=0x0000000000000000, respond=-1, def=36, kw_splat=0) at vm_eval.c:666:15
frame #29: 0x000000010038fa60 miniruby`rb_check_funcall_default_kw(recv=4304780496, mid=3233, argc=0, argv=0x0000000000000000, def=36, kw_splat=0) at vm_eval.c:703:21
frame #30: 0x000000010038fb04 miniruby`rb_check_funcall(recv=4304780496, mid=3233, argc=0, argv=0x0000000000000000) at vm_eval.c:685:12
frame #31: 0x00000001001c469c miniruby`convert_type_with_id(val=4304780496, tname="String", method=3233, raise=0, index=-1) at object.c:3061:15
frame #32: 0x00000001001c4a4c miniruby`rb_check_convert_type_with_id(val=4304780496, type=5, tname="String", method=3233) at object.c:3153:9
frame #33: 0x00000001002d59f8 miniruby`rb_check_string_type(str=4304780496) at string.c:2571:11
frame #34: 0x000000010014b7b0 miniruby`io_encoding_set(fptr=0x0000000100d09ca0, v1=4304780496, v2=4, opt=4) at io.c:11655:19
frame #35: 0x0000000100139a58 miniruby`rb_io_set_encoding(argc=1, argv=0x000000016fdfb450, io=4308334032) at io.c:13497:5
frame #36: 0x00000001003c0004 miniruby`ractor_safe_call_cfunc_m1(recv=4308334032, argc=1, argv=0x000000016fdfb450, func=(miniruby`rb_io_set_encoding at io.c:13487)) at vm_insnhelper.c:3271:12
...
frame #43: 0x0000000100390b08 miniruby`rb_funcall(recv=4308334032, mid=16593, n=1) at vm_eval.c:1137:12
frame #44: 0x00000001002a43d8 miniruby`load_file_internal(argp_v=6171899936) at ruby.c:2500:5
...
```
    
    
  wks 
      pushed a commit
      that referenced
      this pull request
    
      Dec 5, 2024 
    
    
      
  
    
      
    
  
[Bug #20921]
When we create a cache entry for a constant, the following sequence of
events could happen:
- vm_track_constant_cache is called to insert a constant cache.
- In vm_track_constant_cache, we first look up the ST table for the ID
  of the constant. Assume the ST table exists because another iseq also
  holds a cache entry for this ID.
- We then insert into this ST table with the iseq_inline_constant_cache.
- However, while inserting into this ST table, it allocates memory, which
  could trigger a GC. Assume that it does trigger a GC.
- The GC frees the one and only other iseq that holds a cache entry for
  this ID.
- In remove_from_constant_cache, it will appear that the ST table is now
  empty because there are no more iseq with cache entries for this ID, so
  we free the ST table.
- We complete GC and continue our st_insert. However, this ST table has
  been freed so we now have a use-after-free.
This issue is very hard to reproduce, because it requires that the GC runs
at a very specific time. However, we can make it show up by applying this
patch which runs GC right before the st_insert to mimic the st_insert
triggering a GC:
    diff --git a/vm_insnhelper.c b/vm_insnhelper.c
    index 3cb23f0..a93998136a 100644
    --- a/vm_insnhelper.c
    +++ b/vm_insnhelper.c
    @@ -6338,6 +6338,10 @@ vm_track_constant_cache(ID id, void *ic)
            rb_id_table_insert(const_cache, id, (VALUE)ics);
        }
    +    if (id == rb_intern("MyConstant")) rb_gc();
    +
        st_insert(ics, (st_data_t) ic, (st_data_t) Qtrue);
    }
And if we run this script:
    Object.const_set("MyConstant", "Hello!")
    my_proc = eval("-> { MyConstant }")
    my_proc.call
    my_proc = eval("-> { MyConstant }")
    my_proc.call
We can see that ASAN outputs a use-after-free error:
    ==36540==ERROR: AddressSanitizer: heap-use-after-free on address 0x606000049528 at pc 0x000102f3ceac bp 0x00016d607a70 sp 0x00016d607a68
    READ of size 8 at 0x606000049528 thread T0
        #0 0x102f3cea8 in do_hash st.c:321
        #1 0x102f3ddd0 in rb_st_insert st.c:1132
        #2 0x103140700 in vm_track_constant_cache vm_insnhelper.c:6345
        #3 0x1030b91d8 in vm_ic_track_const_chain vm_insnhelper.c:6356
        #4 0x1030b8cf8 in rb_vm_opt_getconstant_path vm_insnhelper.c:6424
        #5 0x1030bc1e0 in vm_exec_core insns.def:263
        #6 0x1030b55fc in rb_vm_exec vm.c:2585
        #7 0x1030fe0ac in rb_iseq_eval_main vm.c:2851
        #8 0x102a82588 in rb_ec_exec_node eval.c:281
        #9 0x102a81fe0 in ruby_run_node eval.c:319
        #10 0x1027f3db4 in rb_main main.c:43
        #11 0x1027f3bd4 in main main.c:68
        #12 0x183900270  (<unknown module>)
    0x606000049528 is located 8 bytes inside of 56-byte region [0x606000049520,0x606000049558)
    freed by thread T0 here:
        #0 0x104174d40 in free+0x98 (libclang_rt.asan_osx_dynamic.dylib:arm64e+0x54d40)
        #1 0x102ada89c in rb_gc_impl_free default.c:8183
        #2 0x102ada7dc in ruby_sized_xfree gc.c:4507
        #3 0x102ac4d34 in ruby_xfree gc.c:4518
        #4 0x102f3cb34 in rb_st_free_table st.c:663
        #5 0x102bd52d8 in remove_from_constant_cache iseq.c:119
        #6 0x102bbe2cc in iseq_clear_ic_references iseq.c:153
        #7 0x102bbd2a0 in rb_iseq_free iseq.c:166
        #8 0x102b32ed0 in rb_imemo_free imemo.c:564
        #9 0x102ac4b44 in rb_gc_obj_free gc.c:1407
        #10 0x102af4290 in gc_sweep_plane default.c:3546
        #11 0x102af3bdc in gc_sweep_page default.c:3634
        #12 0x102aeb140 in gc_sweep_step default.c:3906
        #13 0x102aeadf0 in gc_sweep_rest default.c:3978
        #14 0x102ae4714 in gc_sweep default.c:4155
        #15 0x102af8474 in gc_start default.c:6484
        #16 0x102afbe30 in garbage_collect default.c:6363
        #17 0x102ad37f0 in rb_gc_impl_start default.c:6816
        #18 0x102ad3634 in rb_gc gc.c:3624
        #19 0x1031406ec in vm_track_constant_cache vm_insnhelper.c:6342
        #20 0x1030b91d8 in vm_ic_track_const_chain vm_insnhelper.c:6356
        #21 0x1030b8cf8 in rb_vm_opt_getconstant_path vm_insnhelper.c:6424
        #22 0x1030bc1e0 in vm_exec_core insns.def:263
        #23 0x1030b55fc in rb_vm_exec vm.c:2585
        #24 0x1030fe0ac in rb_iseq_eval_main vm.c:2851
        #25 0x102a82588 in rb_ec_exec_node eval.c:281
        #26 0x102a81fe0 in ruby_run_node eval.c:319
        #27 0x1027f3db4 in rb_main main.c:43
        #28 0x1027f3bd4 in main main.c:68
        #29 0x183900270  (<unknown module>)
    previously allocated by thread T0 here:
        #0 0x104174c04 in malloc+0x94 (libclang_rt.asan_osx_dynamic.dylib:arm64e+0x54c04)
        #1 0x102ada0ec in rb_gc_impl_malloc default.c:8198
        #2 0x102acee44 in ruby_xmalloc gc.c:4438
        #3 0x102f3c85c in rb_st_init_table_with_size st.c:571
        #4 0x102f3c900 in rb_st_init_table st.c:600
        #5 0x102f3c920 in rb_st_init_numtable st.c:608
        #6 0x103140698 in vm_track_constant_cache vm_insnhelper.c:6337
        #7 0x1030b91d8 in vm_ic_track_const_chain vm_insnhelper.c:6356
        #8 0x1030b8cf8 in rb_vm_opt_getconstant_path vm_insnhelper.c:6424
        #9 0x1030bc1e0 in vm_exec_core insns.def:263
        #10 0x1030b55fc in rb_vm_exec vm.c:2585
        #11 0x1030fe0ac in rb_iseq_eval_main vm.c:2851
        #12 0x102a82588 in rb_ec_exec_node eval.c:281
        #13 0x102a81fe0 in ruby_run_node eval.c:319
        #14 0x1027f3db4 in rb_main main.c:43
        #15 0x1027f3bd4 in main main.c:68
        #16 0x183900270  (<unknown module>)
This commit fixes this bug by adding a inserting_constant_cache_id field
to the VM, which stores the ID that is currently being inserted and, in
remove_from_constant_cache, we don't free the ST table for ID equal to
this one.
Co-Authored-By: Alan Wu <alanwu@ruby-lang.org>
    
  
    Sign up for free
    to join this conversation on GitHub.
    Already have an account?
    Sign in to comment
  
      
  Add this suggestion to a batch that can be applied as a single commit.
  This suggestion is invalid because no changes were made to the code.
  Suggestions cannot be applied while the pull request is closed.
  Suggestions cannot be applied while viewing a subset of changes.
  Only one suggestion per line can be applied in a batch.
  Add this suggestion to a batch that can be applied as a single commit.
  Applying suggestions on deleted lines is not supported.
  You must change the existing code in this line in order to create a valid suggestion.
  Outdated suggestions cannot be applied.
  This suggestion has been applied or marked resolved.
  Suggestions cannot be applied from pending reviews.
  Suggestions cannot be applied on multi-line comments.
  Suggestions cannot be applied while the pull request is queued to merge.
  Suggestion cannot be applied right now. Please check back later.
  
    
  
    
We did have some merge conflicts, but they seem to have gone away. I'll push a merge anyway.