-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add MemoryCopy IL instruction and take advantage of it in various places #42072
Labels
area-vm
Use area-vm for VM related issues, including code coverage, and the AOT and JIT backends.
Comments
mkustermann
added
the
area-vm
Use area-vm for VM related issues, including code coverage, and the AOT and JIT backends.
label
May 27, 2020
dart-bot
pushed a commit
that referenced
this issue
Jun 10, 2020
Used for copying the bytes from the Uint8List to the _OneByteString in String.fromCharCodes and the pure-ASCII case of UTF-8 decoding. Issue #42072 Closes #41703 Change-Id: I1ae300222877d1c6e64e32c2f40b8fb187a584c0 Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/149500 Commit-Queue: Aske Simon Christensen <askesc@google.com> Reviewed-by: Martin Kustermann <kustermann@google.com>
dart-bot
pushed a commit
that referenced
this issue
Jun 10, 2020
…d strings." This reverts commit 6ecd8a1. Reason for revert: Breaks ABI Original change's description: > [vm] MemoryCopy instruction for copying between typed data and strings. > > Used for copying the bytes from the Uint8List to the _OneByteString in > String.fromCharCodes and the pure-ASCII case of UTF-8 decoding. > > Issue #42072 > Closes #41703 > > Change-Id: I1ae300222877d1c6e64e32c2f40b8fb187a584c0 > Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/149500 > Commit-Queue: Aske Simon Christensen <askesc@google.com> > Reviewed-by: Martin Kustermann <kustermann@google.com> TBR=kustermann@google.com,askesc@google.com Change-Id: I5fc0b58da80dca23c91b57ec2833492e9d0885a8 No-Presubmit: true No-Tree-Checks: true No-Try: true Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/150802 Reviewed-by: Ryan Macnak <rmacnak@google.com> Commit-Queue: Ryan Macnak <rmacnak@google.com>
dart-bot
pushed a commit
that referenced
this issue
Jun 11, 2020
…d strings." This is a reland of 6ecd8a1 Original change's description: > [vm] MemoryCopy instruction for copying between typed data and strings. > > Used for copying the bytes from the Uint8List to the _OneByteString in > String.fromCharCodes and the pure-ASCII case of UTF-8 decoding. > > Issue #42072 > Closes #41703 > > Change-Id: I1ae300222877d1c6e64e32c2f40b8fb187a584c0 > Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/149500 > Commit-Queue: Aske Simon Christensen <askesc@google.com> > Reviewed-by: Martin Kustermann <kustermann@google.com> Change-Id: Ia231c521e5f2db168cfc6094dfc7322327dedc6d Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/150925 Reviewed-by: Martin Kustermann <kustermann@google.com> Commit-Queue: Aske Simon Christensen <askesc@google.com>
tekknolagi
pushed a commit
to tekknolagi/dart-assembler
that referenced
this issue
Nov 3, 2020
Used for copying the bytes from the Uint8List to the _OneByteString in String.fromCharCodes and the pure-ASCII case of UTF-8 decoding. Issue dart-lang#42072 Closes dart-lang#41703 Change-Id: I1ae300222877d1c6e64e32c2f40b8fb187a584c0 Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/149500 Commit-Queue: Aske Simon Christensen <askesc@google.com> Reviewed-by: Martin Kustermann <kustermann@google.com>
tekknolagi
pushed a commit
to tekknolagi/dart-assembler
that referenced
this issue
Nov 3, 2020
…d strings." This reverts commit 6ecd8a1. Reason for revert: Breaks ABI Original change's description: > [vm] MemoryCopy instruction for copying between typed data and strings. > > Used for copying the bytes from the Uint8List to the _OneByteString in > String.fromCharCodes and the pure-ASCII case of UTF-8 decoding. > > Issue dart-lang#42072 > Closes dart-lang#41703 > > Change-Id: I1ae300222877d1c6e64e32c2f40b8fb187a584c0 > Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/149500 > Commit-Queue: Aske Simon Christensen <askesc@google.com> > Reviewed-by: Martin Kustermann <kustermann@google.com> TBR=kustermann@google.com,askesc@google.com Change-Id: I5fc0b58da80dca23c91b57ec2833492e9d0885a8 No-Presubmit: true No-Tree-Checks: true No-Try: true Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/150802 Reviewed-by: Ryan Macnak <rmacnak@google.com> Commit-Queue: Ryan Macnak <rmacnak@google.com>
tekknolagi
pushed a commit
to tekknolagi/dart-assembler
that referenced
this issue
Nov 3, 2020
…d strings." This is a reland of 6ecd8a1 Original change's description: > [vm] MemoryCopy instruction for copying between typed data and strings. > > Used for copying the bytes from the Uint8List to the _OneByteString in > String.fromCharCodes and the pure-ASCII case of UTF-8 decoding. > > Issue dart-lang#42072 > Closes dart-lang#41703 > > Change-Id: I1ae300222877d1c6e64e32c2f40b8fb187a584c0 > Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/149500 > Commit-Queue: Aske Simon Christensen <askesc@google.com> > Reviewed-by: Martin Kustermann <kustermann@google.com> Change-Id: Ia231c521e5f2db168cfc6094dfc7322327dedc6d Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/150925 Reviewed-by: Martin Kustermann <kustermann@google.com> Commit-Queue: Aske Simon Christensen <askesc@google.com>
@sstrickl is going to look into making |
copybara-service bot
pushed a commit
that referenced
this issue
Aug 9, 2023
The MemoryCopy benchmark suite measures the overhead of copying data between compatible TypedData or Pointer values. Change-Id: Iaf5ea27b7f9177f4800880da36234afd2b908db2 Bug: #42072 Bug: b/294114694 Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/318661 Reviewed-by: Martin Kustermann <kustermann@google.com> Commit-Queue: Tess Strickland <sstrickl@google.com>
Copying comments from Alex's last review of https://dart-review.googlesource.com/c/sdk/+/319521 here so I don't forget to address them later in followup work:
Also should look at when time allows:
|
copybara-service bot
pushed a commit
that referenced
this issue
Sep 4, 2023
When setRange is called on a TypedData receiver and the source is also a TypedData object with the same element size and clamping is not required, the VM implementation now calls _boundsCheckAndMemcpyN for element size N. The generated IL for these methods performs the copy using the MemoryCopy instruction (mostly, see the note below). Since the two TypedData objects might have the same underlying buffer, the CL adds a can_overlap flag to the MemoryCopy instruction which checks for overlapping regions. If can_overlap is set, then the copy is performed backwards instead of forwards when needed to ensure that elements of the source region are read before they are overwritten. The existing uses of the MemoryCopy instruction are adjusted as follows: * The IL generated for copyRangeFromUint8ListToOneByteString passes false for can_overlap, as all uses currently ensure that the OneByteString is non-external and thus cannot overlap. * The IL generated for _memCopy, used by the FFI library, passes true for can_overlap, as there is no guarantee that the regions pointed at by the Pointer objects do not overlap. The MemoryCopy instruction has also been adjusted so that all numeric inputs (the two start offsets and the length) are either boxed or unboxed instead of just the length. This exposed an issue in the inliner, where unboxed constants in the callee graph were replaced with boxed constants when inlining into the caller graph, since withList calls setRange with constant starting offsets of 0. Now the representation of constants in the callee graph are preserved when inlining the callee graph into the caller graph. Fixes #51237 by using TMP and TMP2 for the LDP/STP calls in the 16-byte element size case, so no temporaries need to be allocated for the instruction. On ARM when not unrolling the memory copy loop, uses TMP and a single additional temporary for LDM/STM calls in the 8-byte and 16-byte element cases, with the latter just using two LDM/STM calls within the loop, a different approach than the one described in #51229 . Note: Once the number of elements being copied reaches a certain threshold (1048576 on X86, 256 otherwise), _boundsCheckAndMemcpyN instead calls _nativeSetRange, which is a native call that uses memmove from the standard C library for non-clamped inputs. It does this because the code currently emitted for MemoryCopy performs poorly compared to the more optimized memmove implementation when copying larger regions of memory. Notable benchmark changes for dart-aot: * X64 * TypedDataDuplicate.*.fromList improvement from ~13%-~250% * Uf8Encode.*.10 improvement from ~50%-~75% * MapCopy.Map.*.of.Map.* improvement from ~13%-~65% * MemoryCopy.*.setRange.* improvement from ~13%-~500% * ARM7 * Uf8Encode.*.10 improvement from ~35%-~70% * MapCopy.Map.*.of.Map.* improvement from ~6%-~75% * MemoryCopy.*.setRange.{8,64} improvement from ~22%-~500% * Improvement of ~100%-~200% for MemoryCopy.512.setRange.*.Double * Regression of ~40% for MemoryCopy.512.setRange.*.Uint8 * Regression of ~85% for MemoryCopy.4096.setRange.*.Uint8 * ARM8 * Uf8Encode.*.10 improvement from ~35%-~70% * MapCopy.Map.*.of.Map.* improvement from ~7%-~75% * MemoryCopy.*.setRange.{8,64} improvement from ~22%-~500% * Improvement of ~75%-~160% for MemoryCopy.512.setRange.*.Double * Regression of ~40% for MemoryCopy.512.setRange.*.Uint8 * Regression of ~85% for MemoryCopy.4096.setRange.*.Uint8 TEST=vm/cc/IRTest_Memory, co19{,_2}/LibTest/typed_data, lib{,_2}/typed_data, corelib{,_2}/list_test Issue: #42072 Issue: b/294114694 Issue: b/259315681 Change-Id: Ic75521c5fe10b952b5b9ce5f2020c7e3f03672a9 Cq-Include-Trybots: luci.dart.try:vm-aot-linux-debug-simarm_x64-try,vm-aot-linux-debug-simriscv64-try,vm-aot-linux-debug-x64-try,vm-aot-linux-debug-x64c-try,vm-kernel-linux-debug-x64-try,vm-kernel-precomp-linux-debug-x64-try,vm-linux-debug-ia32-try,vm-linux-debug-simriscv64-try,vm-linux-debug-x64-try,vm-linux-debug-x64c-try,vm-mac-debug-arm64-try,vm-mac-debug-x64-try,vm-aot-linux-release-simarm64-try,vm-aot-linux-release-simarm_x64-try,vm-aot-linux-release-x64-try,vm-aot-mac-release-arm64-try,vm-aot-mac-release-x64-try,vm-ffi-qemu-linux-release-riscv64-try,vm-ffi-qemu-linux-release-arm-try,vm-aot-msan-linux-release-x64-try,vm-msan-linux-release-x64-try,vm-aot-tsan-linux-release-x64-try,vm-tsan-linux-release-x64-try,vm-linux-release-ia32-try,vm-linux-release-simarm-try,vm-linux-release-simarm64-try,vm-linux-release-x64-try,vm-mac-release-arm64-try,vm-mac-release-x64-try,vm-kernel-precomp-linux-release-x64-try,vm-aot-android-release-arm64c-try,vm-ffi-android-debug-arm64c-try Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/319521 Reviewed-by: Daco Harkes <dacoharkes@google.com> Reviewed-by: Alexander Markov <alexmarkov@google.com> Commit-Queue: Tess Strickland <sstrickl@google.com>
copybara-service bot
pushed a commit
that referenced
this issue
Sep 5, 2023
The bounds checking was implemented in Dart previously, but this removes _checkSetRangeArguments, inlining it into _TypedListBase.setRange, renames _checkBoundsAndMemcpyN to _memMoveN since it no longer performs bounds checking, and also removes the now unneeded bounds checking from the native function TypedData_setRange. TEST=co19{,_2}/LibTest/typed_data lib{,_2}/typed_data corelib{,_2}/list_test Issue: #42072 Cq-Include-Trybots: luci.dart.try:vm-aot-linux-debug-simarm_x64-try,vm-aot-linux-debug-x64-try,vm-aot-linux-debug-x64c-try,vm-kernel-linux-debug-x64-try,vm-kernel-precomp-linux-debug-x64-try,vm-linux-debug-x64-try,vm-linux-debug-x64c-try,vm-mac-debug-arm64-try,vm-aot-linux-release-simarm_x64-try,vm-aot-linux-release-x64-try,vm-aot-mac-release-arm64-try,vm-linux-release-x64-try,vm-mac-release-arm64-try,vm-kernel-precomp-linux-release-x64-try Change-Id: I85ec751708f603f68729f4109d7339dd8407ae77 Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/324102 Reviewed-by: Alexander Markov <alexmarkov@google.com> Commit-Queue: Tess Strickland <sstrickl@google.com>
copybara-service bot
pushed a commit
that referenced
this issue
Sep 7, 2023
When copying elements that are smaller than words on non-X86 architectures, only copy element by element until the remaining elements can be copied in word-sized chunks, and then do so. TEST=vm/cc/IRTest_Memory, co19{,_2}/LibTest/typed_data, lib{,_2}/typed_data, corelib{,_2}/list_test Issue: #42072 Cq-Include-Trybots: luci.dart.try:vm-aot-linux-debug-simarm_x64-try,vm-aot-linux-debug-simriscv64-try,vm-linux-debug-simriscv64-try,vm-mac-debug-arm64-try,vm-aot-linux-release-simarm64-try,vm-aot-linux-release-simarm_x64-try,vm-aot-mac-release-arm64-try,vm-ffi-qemu-linux-release-riscv64-try,vm-ffi-qemu-linux-release-arm-try,vm-linux-release-simarm-try,vm-linux-release-simarm64-try,vm-mac-release-arm64-try,vm-aot-android-release-arm64c-try,vm-ffi-android-debug-arm64c-try Change-Id: I61eab310b92a6bc5ebd88fa63d562103d887cb74 Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/324280 Reviewed-by: Alexander Markov <alexmarkov@google.com> Commit-Queue: Tess Strickland <sstrickl@google.com> Reviewed-by: Ryan Macnak <rmacnak@google.com>
copybara-service bot
pushed a commit
that referenced
this issue
Sep 11, 2023
TEST=vm/cc/IRTest_Memory, co19{,_2}/LibTest/typed_data, lib{,_2}/typed_data, corelib{,_2}/list_test Issue: #42072 Cq-Include-Trybots: luci.dart.try:vm-mac-debug-arm64-try,vm-aot-linux-release-simarm64-try,vm-aot-mac-release-arm64-try,vm-linux-release-simarm64-try,vm-mac-release-arm64-try,vm-aot-android-release-arm64c-try,vm-ffi-android-debug-arm64c-try,vm-ffi-qemu-linux-release-arm-try,vm-ffi-qemu-linux-release-riscv64-try,vm-linux-debug-simriscv64-try,vm-linux-release-simarm-try Change-Id: Ife645a1d09be862d74e198162b124e657878280a Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/324683 Commit-Queue: Tess Strickland <sstrickl@google.com> Reviewed-by: Alexander Markov <alexmarkov@google.com>
copybara-service bot
pushed a commit
that referenced
this issue
Sep 11, 2023
Instead of making a StaticCall to _TypedListBase.nativeSetRange inside _memMoveN, make a CCall to the memmove leaf runtime entry. Rename _TypedListBase._nativeSetRange to _setClampedRange, since it's now only used when per-element clamping is necessary. Fix the load optimizer so that loads of unboxed fields from freshly allocated objects do not have the tagged null value forwarded as their initial post-allocation value. TEST=co19{,_2}/LibTest/typed_data lib{,_2}/typed_data corelib{,_2}/list_test vm/cc/LoadOptimizer_LoadDataFieldOfNewTypedData Issue: #42072 Change-Id: Ib82e24a5b3287fa53099fffd3b563a27d777507e Cq-Include-Trybots: luci.dart.try:vm-aot-linux-debug-simarm_x64-try,vm-aot-linux-debug-x64-try,vm-aot-linux-debug-x64c-try,vm-kernel-linux-debug-x64-try,vm-kernel-precomp-linux-debug-x64-try,vm-linux-debug-x64-try,vm-linux-debug-x64c-try,vm-mac-debug-arm64-try,vm-aot-linux-release-simarm_x64-try,vm-aot-linux-release-x64-try,vm-aot-mac-release-arm64-try,vm-aot-msan-linux-release-x64-try,vm-msan-linux-release-x64-try,vm-aot-tsan-linux-release-x64-try,vm-tsan-linux-release-x64-try,vm-linux-release-x64-try,vm-mac-release-arm64-try,vm-kernel-precomp-linux-release-x64-try Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/324080 Reviewed-by: Alexander Markov <alexmarkov@google.com> Commit-Queue: Tess Strickland <sstrickl@google.com>
copybara-service bot
pushed a commit
that referenced
this issue
Oct 13, 2023
This CL adds the ability to pass the payload address of the source and destination directly to the MemoryCopy instruction as an untagged value. The new translation of the _TypedListBase._memMoveN methods use the new MemoryCopy constructor, retrieving the untagged value of the data field of both the source and destination. This way, if inlining exposes the allocation of the object from which the data field is being retrieved, then allocation sinking can remove the intermediate allocation if there are no escaping uses of the object. Since Pointer.asTypedList allocates such ExternalTypedData objects, this CL makes that method inlined if at all possible, which removes the intermediate allocation if the only use of the TypedData object is to call setRange for memory copying purposes. This CL also separates unboxed native slots into two groups: those that contain untagged addresses and those that do not. The former group now have the kUntagged representation, which mimics the old use of LoadUntagged for the PointerBase data field and also ensures that any arithmetic operations on untagged addresses must first be explicitly converted to an unboxed integer and then explicitly converted back to untagged before being stored in a slot that contains untagged addresses. When a unboxed native slot that contains untagged addresses is defined, the definition also includes a boolean which represents whether addresses that may be moved by the GC can be stored in this slot or not. The redundancy eliminator uses this to decide whether it is safe to eliminate a duplicate load, replace a load with the value originally stored in the slot, or lift a load out of a loop. In particular, the PointerBase data field may contain GC-moveable addresses, but only for internal TypedData objects and views, not for external TypedData objects or Pointers. To allow load optimizations involving the latter, the LoadField and StoreField instructions now take boolean flags for whether loads or stores from the slot are guaranteed to not be GC-moveable, to override the information from the slot argument. Notable benchmark changes on x64 (similar for other archs unless noted): JIT: * FfiMemory.PointerPointer: 250.7% * FfiStructCopy.Copy1Bytes: -26.73% (only x64) * FfiStructCopy.Copy32Bytes: -25.18% (only x64) * MemoryCopy.64.setRange.Pointer.Uint8: 19.36% * MemoryCopy.64.setRange.Pointer.Double: 18.96% * MemoryCopy.8.setRange.Pointer.Double: 17.59% * MemoryCopy.8.setRange.Pointer.Uint8: 19.46% AOT: * FfiMemory.PointerPointer: 323.5% * FfiStruct.FieldLoadStore: 483.3% * FileIO_readwrite_64kb: 15.39% * FileIO_readwrite_512kb (Intel Xeon): 46.22% * MemoryCopy.512.setRange.Pointer.Uint8: 35.20% * MemoryCopy.64.setRange.Pointer.Uint8: 55.40% * MemoryCopy.512.setRange.Pointer.Double: 29.45% * MemoryCopy.64.setRange.Pointer.Double: 60.37% * MemoryCopy.8.setRange.Pointer.Double: 59.54% * MemoryCopy.8.setRange.Pointer.Uint8: 55.40% * FfiStructCopy.Copy32Bytes: 398.3% * FfiStructCopy.Copy1Bytes: 1233% TEST=vm/dart/address_local_pointer, vm/dart/pointer_as_typed_list Issue: #42072 Fixes: #53124 Cq-Include-Trybots: luci.dart.try:vm-ffi-qemu-linux-release-arm-try,vm-eager-optimization-linux-release-x64-try,vm-linux-release-x64-try,vm-linux-debug-x64-try,vm-aot-linux-release-x64-try,vm-aot-linux-debug-x64-try Change-Id: I563e0bfac5b1ac6cf1111649934067c12891b631 Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/324820 Reviewed-by: Alexander Markov <alexmarkov@google.com> Commit-Queue: Tess Strickland <sstrickl@google.com> Reviewed-by: Martin Kustermann <kustermann@google.com>
copybara-service bot
pushed a commit
that referenced
this issue
Oct 13, 2023
…es." This reverts commit 06d7a23. Reason for revert: everything crashes on vm-aot-linux-debug-simarm_x64 Original change's description: > [vm/compiler] Change MemoryCopy to also take untagged addresses. > > This CL adds the ability to pass the payload address of the source > and destination directly to the MemoryCopy instruction as an untagged > value. > > The new translation of the _TypedListBase._memMoveN methods use the new > MemoryCopy constructor, retrieving the untagged value of the data field > of both the source and destination. This way, if inlining exposes the > allocation of the object from which the data field is being retrieved, > then allocation sinking can remove the intermediate allocation if there > are no escaping uses of the object. > > Since Pointer.asTypedList allocates such ExternalTypedData objects, > this CL makes that method inlined if at all possible, which removes > the intermediate allocation if the only use of the TypedData object > is to call setRange for memory copying purposes. > > This CL also separates unboxed native slots into two groups: those > that contain untagged addresses and those that do not. The former > group now have the kUntagged representation, which mimics the old > use of LoadUntagged for the PointerBase data field and also ensures > that any arithmetic operations on untagged addresses must first be > explicitly converted to an unboxed integer and then explicitly converted > back to untagged before being stored in a slot that contains untagged > addresses. > > When a unboxed native slot that contains untagged addresses is defined, > the definition also includes a boolean which represents whether > addresses that may be moved by the GC can be stored in this slot or not. > The redundancy eliminator uses this to decide whether it is safe to > eliminate a duplicate load, replace a load with the value originally > stored in the slot, or lift a load out of a loop. > > In particular, the PointerBase data field may contain GC-moveable > addresses, but only for internal TypedData objects and views, not > for external TypedData objects or Pointers. To allow load optimizations > involving the latter, the LoadField and StoreField instructions now > take boolean flags for whether loads or stores from the slot are > guaranteed to not be GC-moveable, to override the information from > the slot argument. > > Notable benchmark changes on x64 (similar for other archs unless noted): > > JIT: > * FfiMemory.PointerPointer: 250.7% > * FfiStructCopy.Copy1Bytes: -26.73% (only x64) > * FfiStructCopy.Copy32Bytes: -25.18% (only x64) > * MemoryCopy.64.setRange.Pointer.Uint8: 19.36% > * MemoryCopy.64.setRange.Pointer.Double: 18.96% > * MemoryCopy.8.setRange.Pointer.Double: 17.59% > * MemoryCopy.8.setRange.Pointer.Uint8: 19.46% > > AOT: > * FfiMemory.PointerPointer: 323.5% > * FfiStruct.FieldLoadStore: 483.3% > * FileIO_readwrite_64kb: 15.39% > * FileIO_readwrite_512kb (Intel Xeon): 46.22% > * MemoryCopy.512.setRange.Pointer.Uint8: 35.20% > * MemoryCopy.64.setRange.Pointer.Uint8: 55.40% > * MemoryCopy.512.setRange.Pointer.Double: 29.45% > * MemoryCopy.64.setRange.Pointer.Double: 60.37% > * MemoryCopy.8.setRange.Pointer.Double: 59.54% > * MemoryCopy.8.setRange.Pointer.Uint8: 55.40% > * FfiStructCopy.Copy32Bytes: 398.3% > * FfiStructCopy.Copy1Bytes: 1233% > > TEST=vm/dart/address_local_pointer, vm/dart/pointer_as_typed_list > > Issue: #42072 > Fixes: #53124 > > Cq-Include-Trybots: luci.dart.try:vm-ffi-qemu-linux-release-arm-try,vm-eager-optimization-linux-release-x64-try,vm-linux-release-x64-try,vm-linux-debug-x64-try,vm-aot-linux-release-x64-try,vm-aot-linux-debug-x64-try > Change-Id: I563e0bfac5b1ac6cf1111649934067c12891b631 > Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/324820 > Reviewed-by: Alexander Markov <alexmarkov@google.com> > Commit-Queue: Tess Strickland <sstrickl@google.com> > Reviewed-by: Martin Kustermann <kustermann@google.com> Issue: #42072 Change-Id: I7c31434e01108487de69a32154bbefd1538c6f0f Cq-Include-Trybots: luci.dart.try:vm-ffi-qemu-linux-release-arm-try,vm-eager-optimization-linux-release-x64-try,vm-linux-release-x64-try,vm-linux-debug-x64-try,vm-aot-linux-release-x64-try,vm-aot-linux-debug-x64-try No-Presubmit: true No-Tree-Checks: true No-Try: true Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/330523 Bot-Commit: Rubber Stamper <rubber-stamper@appspot.gserviceaccount.com> Commit-Queue: Alexander Aprelev <aam@google.com> Auto-Submit: Alexander Markov <alexmarkov@google.com> Reviewed-by: Alexander Aprelev <aam@google.com>
copybara-service bot
pushed a commit
that referenced
this issue
Oct 17, 2023
…es." This is a reland of commit 06d7a23 This version fixes an issue when a phi node has multiple inputs with different unboxed integer representations. The original CL made a change where only the representations were considered, not the range of values for the Phi calculated by range analysis. The reland goes back to the old behavior for this case. Also fixes the new tests on 32-bit architectures. Original change's description: > [vm/compiler] Change MemoryCopy to also take untagged addresses. > > This CL adds the ability to pass the payload address of the source > and destination directly to the MemoryCopy instruction as an untagged > value. > > The new translation of the _TypedListBase._memMoveN methods use the new > MemoryCopy constructor, retrieving the untagged value of the data field > of both the source and destination. This way, if inlining exposes the > allocation of the object from which the data field is being retrieved, > then allocation sinking can remove the intermediate allocation if there > are no escaping uses of the object. > > Since Pointer.asTypedList allocates such ExternalTypedData objects, > this CL makes that method inlined if at all possible, which removes > the intermediate allocation if the only use of the TypedData object > is to call setRange for memory copying purposes. > > This CL also separates unboxed native slots into two groups: those > that contain untagged addresses and those that do not. The former > group now have the kUntagged representation, which mimics the old > use of LoadUntagged for the PointerBase data field and also ensures > that any arithmetic operations on untagged addresses must first be > explicitly converted to an unboxed integer and then explicitly converted > back to untagged before being stored in a slot that contains untagged > addresses. > > When a unboxed native slot that contains untagged addresses is defined, > the definition also includes a boolean which represents whether > addresses that may be moved by the GC can be stored in this slot or not. > The redundancy eliminator uses this to decide whether it is safe to > eliminate a duplicate load, replace a load with the value originally > stored in the slot, or lift a load out of a loop. > > In particular, the PointerBase data field may contain GC-moveable > addresses, but only for internal TypedData objects and views, not > for external TypedData objects or Pointers. To allow load optimizations > involving the latter, the LoadField and StoreField instructions now > take boolean flags for whether loads or stores from the slot are > guaranteed to not be GC-moveable, to override the information from > the slot argument. > > Notable benchmark changes on x64 (similar for other archs unless noted): > > JIT: > * FfiMemory.PointerPointer: 250.7% > * FfiStructCopy.Copy1Bytes: -26.73% (only x64) > * FfiStructCopy.Copy32Bytes: -25.18% (only x64) > * MemoryCopy.64.setRange.Pointer.Uint8: 19.36% > * MemoryCopy.64.setRange.Pointer.Double: 18.96% > * MemoryCopy.8.setRange.Pointer.Double: 17.59% > * MemoryCopy.8.setRange.Pointer.Uint8: 19.46% > > AOT: > * FfiMemory.PointerPointer: 323.5% > * FfiStruct.FieldLoadStore: 483.3% > * FileIO_readwrite_64kb: 15.39% > * FileIO_readwrite_512kb (Intel Xeon): 46.22% > * MemoryCopy.512.setRange.Pointer.Uint8: 35.20% > * MemoryCopy.64.setRange.Pointer.Uint8: 55.40% > * MemoryCopy.512.setRange.Pointer.Double: 29.45% > * MemoryCopy.64.setRange.Pointer.Double: 60.37% > * MemoryCopy.8.setRange.Pointer.Double: 59.54% > * MemoryCopy.8.setRange.Pointer.Uint8: 55.40% > * FfiStructCopy.Copy32Bytes: 398.3% > * FfiStructCopy.Copy1Bytes: 1233% > > TEST=vm/dart/address_local_pointer, vm/dart/pointer_as_typed_list > > Issue: #42072 > Fixes: #53124 > > Cq-Include-Trybots: luci.dart.try:vm-ffi-qemu-linux-release-arm-try,vm-eager-optimization-linux-release-x64-try,vm-linux-release-x64-try,vm-linux-debug-x64-try,vm-aot-linux-release-x64-try,vm-aot-linux-debug-x64-try > Change-Id: I563e0bfac5b1ac6cf1111649934067c12891b631 > Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/324820 > Reviewed-by: Alexander Markov <alexmarkov@google.com> > Commit-Queue: Tess Strickland <sstrickl@google.com> > Reviewed-by: Martin Kustermann <kustermann@google.com> TEST=vm/dart/address_local_pointer, vm/dart/pointer_as_typed_list Issue: #42072 Fixes: #53124 Change-Id: Iabb0e910f12636d0ff51e711c8c9c98ad40e5811 Cq-Include-Trybots: luci.dart.try:vm-ffi-qemu-linux-release-arm-try,vm-eager-optimization-linux-release-x64-try,vm-linux-release-x64-try,vm-linux-debug-x64-try,vm-aot-linux-release-x64-try,vm-aot-linux-debug-x64-try,vm-aot-linux-release-simarm_x64-try,vm-aot-linux-debug-simarm_x64-try Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/330600 Reviewed-by: Alexander Markov <alexmarkov@google.com> Commit-Queue: Tess Strickland <sstrickl@google.com> Reviewed-by: Martin Kustermann <kustermann@google.com>
copybara-service bot
pushed a commit
that referenced
this issue
Oct 17, 2023
Missed a couple of LoadUntaggeds for PointerBase.data in the graph intrinsifier during 4d1bdaa. Creates slots for ExternalOneByteString.external_data and ExternalTwoByteString.external_data and uses LoadField with those slots instead of LoadUntagged. TEST=ci Issue: #42072 Fixes: #53124 Change-Id: I900281644c4c42ad303cc7a6121e4c8bb7852cfe Cq-Include-Trybots: luci.dart.try:vm-aot-linux-release-simarm_x64-try,vm-aot-linux-release-x64-try Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/330787 Reviewed-by: Alexander Markov <alexmarkov@google.com> Commit-Queue: Tess Strickland <sstrickl@google.com>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
area-vm
Use area-vm for VM related issues, including code coverage, and the AOT and JIT backends.
We should add a generic
MemoryCopy
IL instruction, capable of copying ranges of bytes where source and destination can be_OneByteString
/_TwoByteString
as well as various typed data as long as their element sizes agree.This can be used for
List.setRange()
implementation for typed dataString.fromCharCodes()
implementation forUint8List
inputString.codeUnits
implementationThe text was updated successfully, but these errors were encountered: