jdk19 OpenJDK Invalid JIT return address ASSERTION FAILED swalk.c:1601 #16249

Closed
pshipton opened this issue Nov 2, 2022 · 28 comments
Labels: comp:jit, comp:vm, jdk19, project:loom, test failure

@pshipton
Member

pshipton commented Nov 2, 2022

https://openj9-jenkins.osuosl.org/job/Test_openjdk19_j9_sanity.openjdk_s390x_linux_Nightly/41/
jdk_lang_0 / jdk_lang_1
java/lang/Thread/virtual/stress/SleepALot.java#id0

https://openj9-artifactory.osuosl.org/artifactory/ci-openj9/Test/Test_openjdk19_j9_sanity.openjdk_s390x_linux_Nightly/41/openjdk_test_output.tar.gz

22:24:50  *** Invalid JIT return address 000003FF449B1474 in 000003FF8B0FE1B8
22:24:50  
22:24:50  02:24:13.958 0x3fef0002200    j9vm.249    *   ** ASSERTION FAILED ** at /home/jenkins/workspace/Build_JDK19_s390x_linux_Nightly/openj9/runtime/vm/swalk.c:1601: ((0 ))
@pshipton
Member Author

pshipton commented Nov 2, 2022

@0xdaryl @tajila fyi

@tajila
Contributor

tajila commented Nov 2, 2022

@fengxue-IS is this a dup of the one you were looking at?

@fengxue-IS
Contributor

Yes, this is the same error as in the PingPong test, though I have not seen it from this test during grinder testing.

@fengxue-IS added the project:loom label on Nov 2, 2022
@pshipton
Member Author

pshipton commented Nov 3, 2022

https://openj9-jenkins.osuosl.org/job/Test_openjdk19_j9_sanity.openjdk_ppc64_aix_Nightly/41
jdk_lang_1
java/lang/Thread/virtual/stress/GetStackTraceALot.java#id0

00:41:36  *** Invalid JIT return address 00000100106262C8 in 00000100236EDD60
00:41:36  
00:41:36  04:38:38.482 0x1002323ed00    j9vm.249    *   ** ASSERTION FAILED ** at /home/jenkins/workspace/Build_JDK19_ppc64_aix_Nightly/openj9/runtime/vm/swalk.c:1601: ((0 ))

@pshipton changed the title from "jdk19 OpenJDK java/lang/Thread/virtual/stress/SleepALot.java#id0" to "jdk19 OpenJDK Invalid JIT return address ASSERTION FAILED swalk.c:1601" on Nov 3, 2022
@JasonFengJ9
Member

An internal build (ubu22s390x-rt4-1):

openjdk version "19.0.1-beta" 2022-10-18
IBM Semeru Runtime Open Edition 19.0.1+10-202211030547 (build 19.0.1-beta+10-202211030547)
Eclipse OpenJ9 VM 19.0.1+10-202211030547 (build master-a31c79403, JRE 19 Linux s390x-64-Bit Compressed References 20221102_70 (JIT enabled, AOT enabled)
OpenJ9   - a31c79403
OMR      - fc60df565
JCL      - 165f4c2690 based on jdk-19.0.1+10)

[2022-11-03T07:04:14.743Z] variation: -Xdump:system:none -Xdump:heap:none -Xdump:system:events=gpf+abort+traceassert+corruptcache -XX:-JITServerTechPreviewMessage Mode650
[2022-11-03T07:04:14.743Z] JVM_OPTIONS:  -Xdump:system:none -Xdump:heap:none -Xdump:system:events=gpf+abort+traceassert+corruptcache -XX:-JITServerTechPreviewMessage -XX:-UseCompressedOops 

[2022-11-03T07:25:42.862Z] TEST: java/lang/Thread/virtual/stress/SleepALot.java#id0


[2022-11-03T07:25:42.862Z] STDERR:
[2022-11-03T07:25:42.862Z] 
[2022-11-03T07:25:42.862Z] 
[2022-11-03T07:25:42.862Z] *** Invalid JIT return address 000003FF4672E874 in 000003FF841FBD88
[2022-11-03T07:25:42.862Z] 
[2022-11-03T07:25:42.862Z] 07:25:31.099 0x3fed8012f00    j9vm.249    *   ** ASSERTION FAILED ** at ../../../../../openj9/runtime/vm/swalk.c:1601: ((0 ))
[2022-11-03T07:25:42.862Z] 
[2022-11-03T07:25:42.862Z] 
[2022-11-03T07:25:42.862Z] *** Invalid JIT return address 000003FF4672E874 in 000003FF863FE258
[2022-11-03T07:25:42.862Z] 
[2022-11-03T07:25:42.862Z] 07:25:31.099 0x3fef0002400    j9vm.249    *   ** ASSERTION FAILED ** at ../../../../../openj9/runtime/vm/swalk.c:1601: ((0 ))

[2022-11-03T07:25:42.863Z] TEST RESULT: Failed. Unexpected exit from test [exit code: 255]
[2022-11-03T07:25:42.863Z] --------------------------------------------------
[2022-11-03T07:31:12.532Z] Test results: passed: 846; failed: 1
[2022-11-03T07:31:12.532Z] Report written to /home/jenkins/workspace/Test_openjdk19_j9_sanity.openjdk_s390x_linux_testList_1/aqa-tests/TKG/output_16674590536541/jdk_lang_1/report/html/report.html
[2022-11-03T07:31:12.532Z] Results written to /home/jenkins/workspace/Test_openjdk19_j9_sanity.openjdk_s390x_linux_testList_1/aqa-tests/TKG/output_16674590536541/jdk_lang_1/work
[2022-11-03T07:31:12.532Z] Error: Some tests failed or other problems occurred.
[2022-11-03T07:31:12.532Z] 
[2022-11-03T07:31:12.532Z] jdk_lang_1_FAILED

@pshipton
Member Author

pshipton commented Nov 4, 2022

See also #16259 (comment)

@pshipton
Member Author

pshipton commented Nov 7, 2022

https://openj9-jenkins.osuosl.org/job/Test_openjdk19_j9_sanity.openjdk_s390x_linux_Nightly/44
jdk_lang_0
java/lang/Thread/virtual/stress/GetStackTraceALot.java#id0

https://openj9-artifactory.osuosl.org/artifactory/ci-openj9/Test/Test_openjdk19_j9_sanity.openjdk_s390x_linux_Nightly/44/openjdk_test_output.tar.gz

20:51:50  STDERR:
20:51:50  Unhandled exception
20:51:50  Type=Segmentation error vmState=0x0002000f
20:51:50  J9Generic_Signal_Number=00000018 Signal_Number=0000000b Error_Value=000000a9 Signal_Code=00000001
20:51:50  Handler1=000003FF82544858 Handler2=000003FF82431CB8 InaccessibleAddress=0000000000000000
20:51:50  gpr0=0000000000000000 gpr1=0000000000000000 gpr2=000003FF7C0241E0 gpr3=000003FF04001AC8
20:51:50  gpr4=000003FF820C56E0 gpr5=000003FF820C5498 gpr6=000003FF820C54A0 gpr7=000003FF7C081380
20:51:50  gpr8=000000002C81CE78 gpr9=000003FF04001AC8 gpr10=000003FF820C56E0 gpr11=0000000000000000
20:51:50  gpr12=000003FF82E19000 gpr13=000003FF80F05720 gpr14=000003FF80E2F58A gpr15=000003FF820C5300
20:51:50  psw=000003FF80E59A9A mask=0705000180000000 fpc=0008fe00 bea=000003FF80E2F584
20:51:50  fpr0 4297efc500000000 (f: 0.000000, d: 6.579642e+12)
20:51:50  fpr1 3e638e2900000000 (f: 0.000000, d: 3.642461e-08)
20:51:50  fpr2 3e3ab28300000000 (f: 0.000000, d: 6.215952e-09)
20:51:50  fpr3 381e2cc900000000 (f: 0.000000, d: 2.216905e-38)
20:51:50  fpr4 bc48697400000000 (f: 0.000000, d: -2.646746e-18)
20:51:50  fpr5 3e92492500000000 (f: 0.000000, d: 2.724785e-07)
20:51:50  fpr6 3ecccccd00000000 (f: 0.000000, d: 3.433228e-06)
20:51:50  fpr7 3e3a332500000000 (f: 0.000000, d: 6.100112e-09)
20:51:50  fpr8 000003ff80dcf4b0 (f: 2161964288.000000, d: 2.171870e-311)
20:51:50  fpr9 0000000000529548 (f: 5412168.000000, d: 2.673966e-317)
20:51:50  fpr10 000003ff830731c0 (f: 2198286848.000000, d: 2.171888e-311)
20:51:50  fpr11 0000000000c02240 (f: 12591680.000000, d: 6.221117e-317)
20:51:50  fpr12 0005ecaf6e636fe5 (f: 1852010496.000000, d: 8.239103e-309)
20:51:50  fpr13 000003ff0c02aeb8 (f: 201502400.000000, d: 2.170901e-311)
20:51:50  fpr14 0000000000529550 (f: 5412176.000000, d: 2.673970e-317)
20:51:50  fpr15 000003ff0c02e828 (f: 201517088.000000, d: 2.170901e-311)
20:51:50  Module=/home/jenkins/workspace/Test_openjdk19_j9_sanity.openjdk_s390x_linux_Nightly/openjdkbinary/j2sdk-image/lib/default/libj9gc29.so
20:51:50  Module_base_address=000003FF80C80000
20:51:50  Target=2_90_20221104_109 (Linux 3.10.0-1160.76.1.el7.s390x)
20:51:50  CPU=s390x (4 logical CPUs) (0x1ec1b1000 RAM)
20:51:50  ----------- Stack Backtrace -----------
20:51:50  _ZN22GC_ObjectModelDelegate29calculateObjectDetailsForCopyEP18MM_EnvironmentBaseP18MM_ForwardedHeaderPmS4_S4_+0x32 (0x000003FF80E59A9A [libj9gc29.so+0x1d9a9a])
20:51:50  _ZN12MM_Scavenger14copyForVariantILb0EEEP8J9ObjectP22MM_EnvironmentStandardP18MM_ForwardedHeader+0x62 (0x000003FF80E2F58A [libj9gc29.so+0x1af58a])
20:51:50  _ZN12MM_Scavenger26incrementalScanCacheBySlotEP22MM_EnvironmentStandardP24MM_CopyScanCacheStandard+0xae2 (0x000003FF80E2C692 [libj9gc29.so+0x1ac692])
20:51:50  _ZN12MM_Scavenger12completeScanEP22MM_EnvironmentStandard+0xf4 (0x000003FF80E2CCD4 [libj9gc29.so+0x1accd4])
20:51:50  _ZN12MM_Scavenger24workThreadGarbageCollectEP22MM_EnvironmentStandard+0x30e (0x000003FF80E2D1EE [libj9gc29.so+0x1ad1ee])
20:51:50  _ZN21MM_ParallelDispatcher16workerEntryPointEP18MM_EnvironmentBase+0x290 (0x000003FF80DD05B0 [libj9gc29.so+0x1505b0])
20:51:50  _Z23dispatcher_thread_proc2P14OMRPortLibraryPv+0x138 (0x000003FF80DCFBF8 [libj9gc29.so+0x14fbf8])
20:51:50  omrsig_protect+0x3e8 (0x000003FF82432E10 [libj9prt29.so+0x32e10])
20:51:50  dispatcher_thread_proc+0x5c (0x000003FF80DCF50C [libj9gc29.so+0x14f50c])
20:51:50  thread_wrapper+0xf6 (0x000003FF82388A5E [libj9thr29.so+0x8a5e])
20:51:50  start_thread+0xea (0x000003FF82E08312 [libpthread.so.0+0x8312])
20:51:50   (0x000003FF82C8E232 [libc.so.6+0x10e232])
20:51:50  ---------------------------------------

@dmitripivkine
Contributor

dmitripivkine commented Nov 7, 2022

The crash in Scavenger is related to Virtual Thread scanning. I am trying to work out the exact failure mechanism, from scanning an object in the scan cache to the attempt to read the object class through a stale pointer.

@pshipton
Member Author

pshipton commented Nov 8, 2022

https://openj9-jenkins.osuosl.org/job/Test_openjdk19_j9_sanity.openjdk_s390x_linux_Nightly/45
jdk_lang_1
java/lang/Thread/virtual/stress/GetStackTraceALot.java#id0

https://openj9-artifactory.osuosl.org/artifactory/ci-openj9/Test/Test_openjdk19_j9_sanity.openjdk_s390x_linux_Nightly/45/openjdk_test_output.tar.gz

22:12:57  Type=Segmentation error vmState=0x0002000f
22:12:57  J9Generic_Signal_Number=00000018 Signal_Number=0000000b Error_Value=000000a9 Signal_Code=00000001
22:12:57  Handler1=000003FF99D44858 Handler2=000003FF99C31CB8 InaccessibleAddress=0000000000000000
22:12:57  gpr0=0000000000000000 gpr1=000003FF940241E0 gpr2=000003FF9404A048 gpr3=000003FEFC004098
22:12:57  gpr4=000003FF99A7E6E0 gpr5=000003FF99A7E498 gpr6=000003FF99A7E4A0 gpr7=000003FF94084D10
22:12:57  gpr8=000003FF99A7E6F8 gpr9=000003FEFC004098 gpr10=000003FF94084D10 gpr11=0000000000000000
22:12:57  gpr12=000003FF9A619000 gpr13=000003FF99A7E6E0 gpr14=000003FF9862C138 gpr15=000003FF99A7E308
22:12:57  psw=000003FF98655FC4 mask=0705000180000000 fpc=0008fe00 bea=000003FF9862C132
22:12:57  fpr0 42943a0500000000 (f: 0.000000, d: 5.559856e+12)
22:12:57  fpr1 3e63af6f00000000 (f: 0.000000, d: 3.666671e-08)
22:12:57  fpr2 3e3ab28300000000 (f: 0.000000, d: 6.215952e-09)
22:12:57  fpr3 3cebd73300000000 (f: 0.000000, d: 3.090930e-15)
22:12:57  fpr4 3dd3099100000000 (f: 0.000000, d: 6.925754e-11)
22:12:57  fpr5 3e925ce900000000 (f: 0.000000, d: 2.736290e-07)
22:12:57  fpr6 3c90a3c700000000 (f: 0.000000, d: 5.773075e-17)
22:12:57  fpr7 3e3a534c00000000 (f: 0.000000, d: 6.129355e-09)
22:12:57  fpr8 000003ff985ccb48 (f: 2556218112.000000, d: 2.172065e-311)
22:12:57  fpr9 0000000000524070 (f: 5390448.000000, d: 2.663235e-317)
22:12:57  fpr10 000003ff9a8731c0 (f: 2592551424.000000, d: 2.172083e-311)
22:12:57  fpr11 0000000000c2e820 (f: 12773408.000000, d: 6.310902e-317)
22:12:57  fpr12 0005ecece99a2421 (f: 3919193088.000000, d: 8.240408e-309)
22:12:57  fpr13 000003ff100097e8 (f: 268474336.000000, d: 2.170934e-311)
22:12:57  fpr14 0000000000524078 (f: 5390456.000000, d: 2.663239e-317)
22:12:57  fpr15 000003ff10040aa8 (f: 268700320.000000, d: 2.170934e-311)
22:12:57  Module=/home/jenkins/workspace/Test_openjdk19_j9_sanity.openjdk_s390x_linux_Nightly/openjdkbinary/j2sdk-image/lib/default/libj9gc_full29.so
22:12:57  Module_base_address=000003FF98480000
22:12:57  Target=2_90_20221107_110 (Linux 3.10.0-1160.76.1.el7.s390x)
22:12:57  CPU=s390x (4 logical CPUs) (0x1ec1b1000 RAM)
22:12:57  ----------- Stack Backtrace -----------
22:12:57  _ZN22GC_ObjectModelDelegate29calculateObjectDetailsForCopyEP18MM_EnvironmentBaseP18MM_ForwardedHeaderPmS4_S4_+0x2c (0x000003FF98655FC4 [libj9gc_full29.so+0x1d5fc4])
22:12:57  _ZN12MM_Scavenger14copyForVariantILb0EEEP8J9ObjectP22MM_EnvironmentStandardP18MM_ForwardedHeader+0x60 (0x000003FF9862C138 [libj9gc_full29.so+0x1ac138])
22:12:57  _ZN12MM_Scavenger26incrementalScanCacheBySlotEP22MM_EnvironmentStandardP24MM_CopyScanCacheStandard+0xac2 (0x000003FF9862923A [libj9gc_full29.so+0x1a923a])
22:12:57  _ZN12MM_Scavenger12completeScanEP22MM_EnvironmentStandard+0xf4 (0x000003FF98629884 [libj9gc_full29.so+0x1a9884])
22:12:57  _ZN12MM_Scavenger24workThreadGarbageCollectEP22MM_EnvironmentStandard+0x30e (0x000003FF98629D9E [libj9gc_full29.so+0x1a9d9e])
22:12:57  _ZN21MM_ParallelDispatcher16workerEntryPointEP18MM_EnvironmentBase+0x290 (0x000003FF985CDC48 [libj9gc_full29.so+0x14dc48])
22:12:57  _Z23dispatcher_thread_proc2P14OMRPortLibraryPv+0x138 (0x000003FF985CD290 [libj9gc_full29.so+0x14d290])
22:12:57  omrsig_protect+0x3e8 (0x000003FF99C32E10 [libj9prt29.so+0x32e10])
22:12:57  dispatcher_thread_proc+0x5c (0x000003FF985CCBA4 [libj9gc_full29.so+0x14cba4])
22:12:57  thread_wrapper+0xf6 (0x000003FF99B88A5E [libj9thr29.so+0x8a5e])
22:12:57  start_thread+0xea (0x000003FF9A608312 [libpthread.so.0+0x8312])
22:12:57   (0x000003FF9A48E232 [libc.so.6+0x10e232])
22:12:57  ---------------------------------------

@dmitripivkine
Contributor

dmitripivkine commented Nov 9, 2022

The comment in #16259 (comment) solves the puzzle. I see the reason for the crash: for !j9object 0x2C65A710, java/lang/VirtualThread->linkNext and java/lang/VirtualThread->linkPrevious are set to the same (and bogus, mid-object) value 0x2c81ce78.
Most likely the cause of the data corruption is the same as in #16259.

@pshipton
Member Author

https://openj9-jenkins.osuosl.org/job/Test_openjdk19_j9_sanity.openjdk_x86-64_linux_Nightly/46
jdk_lang_1
java/lang/Thread/virtual/stress/GetStackTraceALot.java#id0

https://openj9-artifactory.osuosl.org/artifactory/ci-openj9/Test/Test_openjdk19_j9_sanity.openjdk_x86-64_linux_Nightly/46/openjdk_test_output.tar.gz

22:56:30  Type=Segmentation error vmState=0x0002000f
22:56:30  J9Generic_Signal_Number=00000018 Signal_Number=0000000b Error_Value=00000000 Signal_Code=00000001
22:56:30  Handler1=00007FF80B612250 Handler2=00007FF810C82B10 InaccessibleAddress=0000005A0000001A
22:56:30  RDI=00007FF80C04A258 RSI=00007FF76C0056F8 RAX=00007FF80C0223F0 RBX=00007FF7AAFF47C8
22:56:30  RCX=00007FF7AAFF47B8 RDX=00007FF7AAFF4980 R8=00007FF7AAFF47C0 R9=00007FF7AAFF47C8
22:56:30  R10=00007FF7AAFF4980 R11=00007FF810355D50 R12=0000005A00000000 R13=00007FF7AAFF47B8
22:56:30  R14=00007FF7AAFF4980 R15=00007FF80C04A1C0
22:56:30  RIP=00007FF803EA6029 GS=0000 FS=0000 RSP=00007FF7AAFF4710
22:56:30  EFlags=0000000000010246 CS=0033 RBP=00007FF7AAFF47C0 ERR=0000000000000004
22:56:30  TRAPNO=000000000000000E OLDMASK=0000000000000000 CR2=0000005A0000001A
22:56:30  xmm0 000000007f0fffc1 (f: 2131755008.000000, d: 1.053227e-314)
22:56:30  xmm1 0000000000000143 (f: 323.000000, d: 1.595832e-321)
22:56:30  xmm2 0000000000000000 (f: 0.000000, d: 0.000000e+00)
22:56:30  xmm3 0000000000000000 (f: 0.000000, d: 0.000000e+00)
22:56:30  xmm4 00007ff803f33949 (f: 66271560.000000, d: 6.951661e-310)
22:56:30  xmm5 00007ff803897718 (f: 59340568.000000, d: 6.951661e-310)
22:56:30  xmm6 00007ff803f33949 (f: 66271560.000000, d: 6.951661e-310)
22:56:30  xmm7 0000000000000000 (f: 0.000000, d: 0.000000e+00)
22:56:30  xmm8 0000000000000000 (f: 0.000000, d: 0.000000e+00)
22:56:30  xmm9 0000000000000000 (f: 0.000000, d: 0.000000e+00)
22:56:30  xmm10 0000000000000000 (f: 0.000000, d: 0.000000e+00)
22:56:30  xmm11 0000ff0000000000 (f: 0.000000, d: 1.385239e-309)
22:56:30  xmm12 0000000000000000 (f: 0.000000, d: 0.000000e+00)
22:56:30  xmm13 0000000000000000 (f: 0.000000, d: 0.000000e+00)
22:56:30  xmm14 0000000000000000 (f: 0.000000, d: 0.000000e+00)
22:56:30  xmm15 0000000000000000 (f: 0.000000, d: 0.000000e+00)
22:56:30  Module=/home/jenkins/workspace/Test_openjdk19_j9_sanity.openjdk_x86-64_linux_Nightly/openjdkbinary/j2sdk-image/lib/default/libj9gc_full29.so
22:56:30  Module_base_address=00007FF803CFE000
22:56:30  Target=2_90_20221110_113 (Linux 3.10.0-1160.76.1.el7.x86_64)
22:56:30  CPU=amd64 (4 logical CPUs) (0x1e8cea000 RAM)
22:56:30  ----------- Stack Backtrace -----------
22:56:30  _ZN22GC_ObjectModelDelegate29calculateObjectDetailsForCopyEP18MM_EnvironmentBaseP18MM_ForwardedHeaderPmS4_S4_+0x29 (0x00007FF803EA6029 [libj9gc_full29.so+0x1a8029])
22:56:30  _ZN12MM_Scavenger14copyForVariantILb0EEEP8J9ObjectP22MM_EnvironmentStandardP18MM_ForwardedHeader+0x60 (0x00007FF803E80500 [libj9gc_full29.so+0x182500])
22:56:30  _ZN12MM_Scavenger26incrementalScanCacheBySlotEP22MM_EnvironmentStandardP24MM_CopyScanCacheStandard+0x94f (0x00007FF803E7D9CF [libj9gc_full29.so+0x17f9cf])
22:56:30  _ZN12MM_Scavenger12completeScanEP22MM_EnvironmentStandard+0xf4 (0x00007FF803E7DF84 [libj9gc_full29.so+0x17ff84])
22:56:30  _ZN12MM_Scavenger24workThreadGarbageCollectEP22MM_EnvironmentStandard+0x397 (0x00007FF803E7E4F7 [libj9gc_full29.so+0x1804f7])
22:56:30  _ZN21MM_ParallelDispatcher16workerEntryPointEP18MM_EnvironmentBase+0x228 (0x00007FF803E2B1C8 [libj9gc_full29.so+0x12d1c8])
22:56:30  _Z23dispatcher_thread_proc2P14OMRPortLibraryPv+0x109 (0x00007FF803E2A9B9 [libj9gc_full29.so+0x12c9b9])
22:56:30  omrsig_protect+0x2b1 (0x00007FF810C83841 [libj9prt29.so+0x2a841])
22:56:30  dispatcher_thread_proc+0x3f (0x00007FF803E2A3FF [libj9gc_full29.so+0x12c3ff])
22:56:30  thread_wrapper+0x163 (0x00007FF810C4B333 [libj9thr29.so+0xb333])
22:56:30  start_thread+0xc5 (0x00007FF8107A1EA5 [libpthread.so.0+0x7ea5])
22:56:30  clone+0x6d (0x00007FF8102C6B0D [libc.so.6+0xfeb0d])
22:56:30  ---------------------------------------

@dmitripivkine
Contributor

This failure is the same:

> !MM_ForwardedHeader 0x00007FF7AAFF4980
MM_ForwardedHeader at 0x7ff7aaff4980 {
  Fields for MM_ForwardedHeader:
	0x0: struct J9Object* _objectPtr = !j9object 0x00007FF803943260 <--- stale pointer mid object
	0x8: U64 _preserved = 0x0000005A00000002 (386547056642)
	0x10: const bool _compressObjectReferences = false
}

0x7FF803943250 :  0000000000000001 0000000300000000 [ ................ ] <---
0x7FF803943260 :  0000005a00000002 0000000000000000 [ ....Z........... ] <---
0x7FF803943270 :  00007ff80c248f00 000000000000005c [ ..$.....\....... ]

This bogus address was found in the linkPrevious and linkNext slots of a java/lang/VirtualThread object.
That object had just been copied from Nursery to Tenure, from 0x7ff803895128:

> !j9object 0x7FF7E3CB72C0
!J9Object 0x00007FF7E3CB72C0 {
	struct J9Class* clazz = !j9class 0x7FF80C29AC00 // java/lang/VirtualThread
	Object flags = 0x0000000A;
	J lockword = 0x0000000000000000 (offset = 0) (java/lang/Object) <hidden>
	J eetop = 0x0000000000000000 (offset = 8) (java/lang/Thread)
	J tid = 0x0000000000000000 (offset = 16) (java/lang/Thread)
	Ljava/lang/String; name = !fj9object 0x0 (offset = 40) (java/lang/Thread)
	Z interrupted = 0x00000000 (offset = 152) (java/lang/Thread)
	Ljava/lang/ClassLoader; contextClassLoader = !fj9object 0x0 (offset = 48) (java/lang/Thread)
	Ljava/security/AccessControlContext; inheritedAccessControlContext = !fj9object 0x0 (offset = 56) (java/lang/Thread)
	Ljava/lang/Thread$FieldHolder; holder = !fj9object 0x0 (offset = 64) (java/lang/Thread)
	Ljava/lang/ThreadLocal$ThreadLocalMap; threadLocals = !fj9object 0x0 (offset = 72) (java/lang/Thread)
	Ljava/lang/ThreadLocal$ThreadLocalMap; inheritableThreadLocals = !fj9object 0x0 (offset = 80) (java/lang/Thread)
	Ljava/lang/Object; extentLocalBindings = !fj9object 0x0 (offset = 88) (java/lang/Thread)
	Ljava/lang/Object; interruptLock = !fj9object 0x0 (offset = 96) (java/lang/Thread)
	Ljava/lang/Object; parkBlocker = !fj9object 0x0 (offset = 104) (java/lang/Thread)
	Lsun/nio/ch/Interruptible; nioBlocker = !fj9object 0x0 (offset = 112) (java/lang/Thread)
	Ljdk/internal/vm/Continuation; cont = !fj9object 0x0 (offset = 120) (java/lang/Thread)
	Ljava/lang/Thread$UncaughtExceptionHandler; uncaughtExceptionHandler = !fj9object 0x0 (offset = 128) (java/lang/Thread)
	J threadLocalRandomSeed = 0x0000000000000000 (offset = 24) (java/lang/Thread)
	I threadLocalRandomProbe = 0x00000000 (offset = 156) (java/lang/Thread)
	I threadLocalRandomSecondarySeed = 0x00000000 (offset = 160) (java/lang/Thread)
	Ljdk/internal/vm/ThreadContainer; container = !fj9object 0x0 (offset = 136) (java/lang/Thread)
	Ljdk/internal/vm/StackableScope; headStackableScopes = !fj9object 0x0 (offset = 144) (java/lang/Thread)
	Z started = 0x00000000 (offset = 164) (java/lang/Thread)
	Z stopCalled = 0x00000000 (offset = 168) (java/lang/Thread)
	J tls = 0x0000000000000000 (offset = 32) (java/lang/Thread) <hidden>
	Ljava/util/concurrent/Executor; scheduler = !fj9object 0x0 (offset = 184) (java/lang/VirtualThread)
	Ljdk/internal/vm/Continuation; cont = !fj9object 0x0 (offset = 192) (java/lang/VirtualThread)
	Ljava/lang/Runnable; runContinuation = !fj9object 0x0 (offset = 200) (java/lang/VirtualThread)
	I state = 0x00000000 (offset = 172) (java/lang/VirtualThread)
	Z parkPermit = 0x00000000 (offset = 240) (java/lang/VirtualThread)
	Ljava/lang/Thread; carrierThread = !fj9object 0x0 (offset = 208) (java/lang/VirtualThread)
	Ljava/util/concurrent/CountDownLatch; termination = !fj9object 0x0 (offset = 216) (java/lang/VirtualThread)
	Ljava/lang/VirtualThread; linkNext = !fj9object 0x7ff803943260 (offset = 232) (java/lang/VirtualThread) <hidden> <---------
	Ljava/lang/VirtualThread; linkPrevious = !fj9object 0x7ff803943260 (offset = 224) (java/lang/VirtualThread) <hidden> <---------
	J inspectorCount = 0x0000000000000000 (offset = 176) (java/lang/VirtualThread) <hidden>
	I isSuspendedByJVMTI = 0x00000000 (offset = 244) (java/lang/VirtualThread) <hidden>
}

So, again, linkPrevious == linkNext and the value is stale. As before, none of the other slots are set to anything other than NULL.
@babsingh Where do we set linkPrevious == linkNext? I can see we do it for the list root. Is there a chance that a GC can occur and make an address (stored, for example, in a native register) obsolete? The GC does not treat this object specially, so the stale address was written by VM helper code.

@babsingh
Contributor

"Where we set linkPrevious == linkNext?"

We set them in JVM_VirtualThreadMountEnd.

	j9object_t rootVirtualThread = mmFuncs->J9AllocateObject(currentThread, virtualThreadClass, J9_GC_ALLOCATE_OBJECT_NON_INSTRUMENTABLE);
	if (NULL != rootVirtualThread) {
		/* The global ref will be freed at vm death. */
		jobject globalRef = vmFuncs->j9jni_createGlobalRef((JNIEnv *)currentThread, rootVirtualThread, JNI_FALSE);
		if (NULL != globalRef) {
			vm->liveVirtualThreadList = (j9object_t *)globalRef;
			/* Set linkNext/linkPrevious to itself. */
			J9OBJECT_OBJECT_STORE(currentThread, rootVirtualThread, vm->virtualThreadLinkNextOffset, rootVirtualThread);
			J9OBJECT_OBJECT_STORE(currentThread, rootVirtualThread, vm->virtualThreadLinkPreviousOffset, rootVirtualThread);
		} else {
			vmFuncs->setNativeOutOfMemoryError(currentThread, 0, 0);
		}
	} else {
		vmFuncs->setHeapOutOfMemoryError(currentThread);
	}
}
if (NULL != vm->liveVirtualThreadList) {
	j9object_t root = *(vm->liveVirtualThreadList);
	j9object_t rootPrev = J9OBJECT_OBJECT_LOAD(currentThread, root, vm->virtualThreadLinkPreviousOffset);
	/* Add thread to the end of the list. */
	J9OBJECT_OBJECT_STORE(currentThread, threadObj, vm->virtualThreadLinkNextOffset, root);
	J9OBJECT_OBJECT_STORE(currentThread, threadObj, vm->virtualThreadLinkPreviousOffset, rootPrev);
	J9OBJECT_OBJECT_STORE(currentThread, rootPrev, vm->virtualThreadLinkNextOffset, threadObj);
	J9OBJECT_OBJECT_STORE(currentThread, root, vm->virtualThreadLinkPreviousOffset, threadObj);
}

We unset them in JVM_VirtualThreadUnmountBegin:

if (lastUnmount) {
	if (NULL != vm->liveVirtualThreadList) {
		j9object_t threadPrev = J9OBJECT_OBJECT_LOAD(currentThread, threadObj, vm->virtualThreadLinkPreviousOffset);
		j9object_t threadNext = J9OBJECT_OBJECT_LOAD(currentThread, threadObj, vm->virtualThreadLinkNextOffset);
		/* Remove thread from list. The root will never be removed. */
		J9OBJECT_OBJECT_STORE(currentThread, threadPrev, vm->virtualThreadLinkNextOffset, threadNext);
		J9OBJECT_OBJECT_STORE(currentThread, threadNext, vm->virtualThreadLinkPreviousOffset, threadPrev);
	}
	TRIGGER_J9HOOK_VM_VIRTUAL_THREAD_END(vm->hookInterface, currentThread);
}

"So, again linkPrevious == linkNext"

This should be TRUE only if there is one element in the list.

/* Set linkNext/linkPrevious to itself. */
J9OBJECT_OBJECT_STORE(currentThread, rootVirtualThread, vm->virtualThreadLinkNextOffset, rootVirtualThread);
J9OBJECT_OBJECT_STORE(currentThread, rootVirtualThread, vm->virtualThreadLinkPreviousOffset, rootVirtualThread);

"Is there chance GC can occur and make address (stored for example in native register) to be obsolete?"

The above operations are done between internalEnterVMFromJNI and internalExitVMToJNI while holding a global lock. This should prevent the object addresses from becoming stale. The global lock should also prevent synchronization issues.

@dmitripivkine
Contributor

Scavenger discovers this Virtual Thread object through the JNI Global Ref. That means it must be copied and fixed up properly (like any other live object), or Scavenger crashes on the bogus pointer. The only way for a stale pointer to get here is for VM code to write a bogus value.

@pshipton
Member Author

https://openj9-jenkins.osuosl.org/job/Test_openjdk19_j9_sanity.openjdk_ppc64le_linux_Nightly/47
jdk_lang_1
java/lang/Thread/virtual/ParkWithFixedThreadPool.java

https://openj9-artifactory.osuosl.org/artifactory/ci-openj9/Test/Test_openjdk19_j9_sanity.openjdk_ppc64le_linux_Nightly/47/openjdk_test_output.tar.gz

23:53:47  Type=Segmentation error vmState=0x0002000f
23:53:47  J9Generic_Signal_Number=00000018 Signal_Number=0000000b Error_Value=00000000 Signal_Code=00000001
23:53:47  Handler1=00003FFF8FDC0140 Handler2=00003FFF8FCE8AA0
23:53:47  R0=00003FFF8E632408 R1=00003FFF4D53B9A0 R2=00003FFF8E7B5600 R3=00003FFF9005DC88
23:53:47  R4=00003FFEF8003FE8 R5=00003FFF4D53BCA0 R6=00003FFF4D53BA78 R7=00003FFF4D53BA80
23:53:47  R8=00003FFF4D53BA88 R9=00003FFF90035508 R10=0000000000000000 R11=00003FFF8E66C0A0
23:53:47  R12=0000000000004400 R13=00003FFF4D546900 R14=0000000000000000 R15=00003FFF8E07B208
23:53:47  R16=00003FFF8E07B208 R17=0000000000000000 R18=00003FFF4D53BCD0 R19=0000000000000000
23:53:47  R20=00003FFF900A7D50 R21=00003FFF4D53BCA0 R22=00003FFF8DE61E30 R23=0000000000000007
23:53:47  R24=4456240056E4BB55 R25=00003FFF6D620CF8 R26=00003FFF9005DBF0 R27=00003FFF4D53BA80
23:53:47  R28=00003FFF4D53BA88 R29=00003FFF4D53BA78 R30=00003FFF4D53BCA0 R31=4456240056E4BB00
23:53:47  NIP=00003FFF8E6670A8 MSR=800000000280F033 ORIG_GPR3=C000000000008774 CTR=00003FFF8E66C230
23:53:47  LINK=00003FFF8E63591C XER=0000000000000000 CCR=0000000022844242 SOFTE=0000000000000001
23:53:47  TRAP=0000000000000300 DAR=4456240056E4BB18 dsisr=0000000040000000 RESULT=0000000000000000
23:53:47  FPR0 00003fff8de599f8 (f: 2380634624.000000, d: 3.476583e-310)
23:53:47  FPR1 4054645060000000 (f: 1610612736.000000, d: 8.156741e+01)
23:53:47  FPR2 3fe8000000000000 (f: 0.000000, d: 7.500000e-01)
23:53:47  FPR3 3fee666660000000 (f: 1610612736.000000, d: 9.500000e-01)
23:53:47  FPR4 400cbe2600000000 (f: 0.000000, d: 3.592846e+00)
23:53:47  FPR5 0000000000000000 (f: 0.000000, d: 0.000000e+00)
23:53:47  FPR6 302e37373936322e (f: 959853120.000000, d: 1.304739e-76)
23:53:47  FPR7 616c5f6b646a2f36 (f: 1684680448.000000, d: 1.994476e+161)
23:53:47  FPR8 6374617263732f6b (f: 1668493184.000000, d: 1.230653e+171)
23:53:47  FPR9 313131323230322e (f: 842019392.000000, d: 9.730425e-72)
23:53:47  FPR10 3fd7bb8000000000 (f: 0.000000, d: 3.708191e-01)
23:53:47  FPR11 0000000000000000 (f: 0.000000, d: 0.000000e+00)
23:53:47  FPR12 0000000000000000 (f: 0.000000, d: 0.000000e+00)
23:53:47  FPR13 00003fff90030100 (f: 2416115968.000000, d: 3.476585e-310)
23:53:47  FPR14 0000000000000000 (f: 0.000000, d: 0.000000e+00)
23:53:47  FPR15 0000000000000000 (f: 0.000000, d: 0.000000e+00)
23:53:47  FPR16 0000000000000000 (f: 0.000000, d: 0.000000e+00)
23:53:47  FPR17 0000000000000000 (f: 0.000000, d: 0.000000e+00)
23:53:47  FPR18 0000000000000000 (f: 0.000000, d: 0.000000e+00)
23:53:47  FPR19 0000000000000000 (f: 0.000000, d: 0.000000e+00)
23:53:47  FPR20 0000000000000000 (f: 0.000000, d: 0.000000e+00)
23:53:47  FPR21 0000000000000000 (f: 0.000000, d: 0.000000e+00)
23:53:47  FPR22 0000000000000000 (f: 0.000000, d: 0.000000e+00)
23:53:47  FPR23 0000000000000000 (f: 0.000000, d: 0.000000e+00)
23:53:47  FPR24 0000000000000000 (f: 0.000000, d: 0.000000e+00)
23:53:47  FPR25 0000000000000000 (f: 0.000000, d: 0.000000e+00)
23:53:47  FPR26 0000000000000000 (f: 0.000000, d: 0.000000e+00)
23:53:47  FPR27 0000000000000000 (f: 0.000000, d: 0.000000e+00)
23:53:47  FPR28 0000000000000000 (f: 0.000000, d: 0.000000e+00)
23:53:47  FPR29 0000000000000000 (f: 0.000000, d: 0.000000e+00)
23:53:47  FPR30 0000000000000000 (f: 0.000000, d: 0.000000e+00)
23:53:47  FPR31 0000000000000000 (f: 0.000000, d: 0.000000e+00)
23:53:47  Module=/home/jenkins/jenkins-agent/workspace/Test_openjdk19_j9_sanity.openjdk_ppc64le_linux_Nightly/openjdkbinary/j2sdk-image/lib/default/libj9gc_full29.so
23:53:47  Module_base_address=00003FFF8E450000
23:53:47  Target=2_90_20221111_114 (Linux 4.4.0-210-generic)
23:53:47  CPU=ppc64le (4 logical CPUs) (0x1fe380000 RAM)
23:53:47  ----------- Stack Backtrace -----------
23:53:47  Unhandled exception
23:53:47  Type=Segmentation error vmState=0x0002000f

Also
java/lang/Thread/virtual/stress/GetStackTraceALot.java

23:34:54  Type=Segmentation error vmState=0x0002000f
23:34:54  J9Generic_Signal_Number=00000018 Signal_Number=0000000b Error_Value=00000000 Signal_Code=00000001
23:34:54  Handler1=00003FFF92C60140 Handler2=00003FFF92B88AA0
23:34:54  R0=00003FFF914C2408 R1=00003FFF503BB9A0 R2=00003FFF91645600 R3=00003FFF8C05D8E8
23:34:54  R4=00003FFF00003FE8 R5=00003FFF503BBCA0 R6=00003FFF503BBA78 R7=00003FFF503BBA80
23:34:54  R8=00003FFF503BBA88 R9=00003FFF8C035308 R10=0000000000000000 R11=00003FFF914FC0A0
23:34:54  R12=0000000000000000 R13=00003FFF503C6900 R14=0000000000000000 R15=00003FFF8BE70118
23:34:54  R16=00003FFF8BE70118 R17=00003FFF904B01B0 R18=00003FFF503BBCD0 R19=0000000000000000
23:34:54  R20=00003FFF8C0A79B0 R21=00003FFF503BBCA0 R22=00003FFF6C1D2038 R23=0000000000000000
23:34:54  R24=02018A0101018901 R25=00003FFF904B01F8 R26=00003FFF8C05D850 R27=00003FFF503BBA80
23:34:54  R28=00003FFF503BBA88 R29=00003FFF503BBA78 R30=00003FFF503BBCA0 R31=02018A0101018900
23:34:54  NIP=00003FFF914F70A8 MSR=800000000280F033 ORIG_GPR3=C000000000008774 CTR=00003FFF914FC230
23:34:54  LINK=00003FFF914C591C XER=0000000000000000 CCR=0000000024844222 SOFTE=0000000000000001
23:34:54  TRAP=0000000000000300 DAR=02018A0101018918 dsisr=0000000040000000 RESULT=0000000000000000
23:34:54  FPR0 00003fff8bf13690 (f: 2347841280.000000, d: 3.476582e-310)
23:34:54  FPR1 4054d839c0000000 (f: 3221225472.000000, d: 8.337852e+01)
23:34:54  FPR2 3feec70000000000 (f: 0.000000, d: 9.617920e-01)
23:34:54  FPR3 bf3d4a1000000000 (f: 0.000000, d: -4.469194e-04)
23:34:54  FPR4 3f499a0000000000 (f: 0.000000, d: 7.812977e-04)
23:34:54  FPR5 bfa01f0600000000 (f: 0.000000, d: -3.148669e-02)
23:34:54  FPR6 0000000000000002 (f: 2.000000, d: 9.881313e-324)
23:34:54  FPR7 00003fff8bdefbe0 (f: 2346646528.000000, d: 3.476582e-310)
23:34:54  FPR8 0000000000000000 (f: 0.000000, d: 0.000000e+00)
23:34:54  FPR9 0000000000000000 (f: 0.000000, d: 0.000000e+00)
23:34:54  FPR10 bfa7c48000000000 (f: 0.000000, d: -4.642105e-02)
23:34:54  FPR11 3fe62e3000000000 (f: 0.000000, d: 6.931381e-01)
23:34:54  FPR12 0000003f00000000 (f: 0.000000, d: 1.336857e-312)
23:34:54  FPR13 bfb7440000000000 (f: 0.000000, d: -9.088135e-02)
23:34:54  FPR14 0000000000000000 (f: 0.000000, d: 0.000000e+00)
23:34:54  FPR15 0000000000000000 (f: 0.000000, d: 0.000000e+00)
23:34:54  FPR16 0000000000000000 (f: 0.000000, d: 0.000000e+00)
23:34:54  FPR17 0000000000000000 (f: 0.000000, d: 0.000000e+00)
23:34:54  FPR18 0000000000000000 (f: 0.000000, d: 0.000000e+00)
23:34:54  FPR19 0000000000000000 (f: 0.000000, d: 0.000000e+00)
23:34:54  FPR20 0000000000000000 (f: 0.000000, d: 0.000000e+00)
23:34:54  FPR21 0000000000000000 (f: 0.000000, d: 0.000000e+00)
23:34:54  FPR22 0000000000000000 (f: 0.000000, d: 0.000000e+00)
23:34:54  FPR23 0000000000000000 (f: 0.000000, d: 0.000000e+00)
23:34:54  FPR24 0000000000000000 (f: 0.000000, d: 0.000000e+00)
23:34:54  FPR25 0000000000000000 (f: 0.000000, d: 0.000000e+00)
23:34:54  FPR26 0000000000000000 (f: 0.000000, d: 0.000000e+00)
23:34:54  FPR27 0000000000000000 (f: 0.000000, d: 0.000000e+00)
23:34:54  FPR28 0000000000000000 (f: 0.000000, d: 0.000000e+00)
23:34:54  FPR29 0000000000000000 (f: 0.000000, d: 0.000000e+00)
23:34:54  FPR30 0000000000000000 (f: 0.000000, d: 0.000000e+00)
23:34:54  FPR31 0000000000000000 (f: 0.000000, d: 0.000000e+00)
23:34:54  Module=/home/jenkins/jenkins-agent/workspace/Test_openjdk19_j9_sanity.openjdk_ppc64le_linux_Nightly/openjdkbinary/j2sdk-image/lib/default/libj9gc_full29.so
23:34:54  Module_base_address=00003FFF912E0000
23:34:54  Target=2_90_20221111_114 (Linux 4.4.0-210-generic)
23:34:54  CPU=ppc64le (4 logical CPUs) (0x1fe380000 RAM)
23:34:54  ----------- Stack Backtrace -----------
23:34:54  _ZN22GC_ObjectModelDelegate29calculateObjectDetailsForCopyEP18MM_EnvironmentBaseP18MM_ForwardedHeaderPmS4_S4_+0x48 (0x00003FFF914F70A8 [libj9gc_full29.so+0x2170a8])
23:34:54   (0x0000000004600002 [<unknown>+0x0])
23:34:54  _ZN12MM_Scavenger14copyForVariantILb0EEEP8J9ObjectP22MM_EnvironmentStandardP18MM_ForwardedHeader+0x360 (0x00003FFF914C5C10 [libj9gc_full29.so+0x1e5c10])
23:34:54  _ZN12MM_Scavenger26incrementalScanCacheBySlotEP22MM_EnvironmentStandardP24MM_CopyScanCacheStandard+0x958 (0x00003FFF914C2408 [libj9gc_full29.so+0x1e2408])
23:34:54  _ZN12MM_Scavenger12completeScanEP22MM_EnvironmentStandard+0x114 (0x00003FFF914C2AB4 [libj9gc_full29.so+0x1e2ab4])
23:34:54  _ZN12MM_Scavenger24workThreadGarbageCollectEP22MM_EnvironmentStandard+0x2b4 (0x00003FFF914C2FD4 [libj9gc_full29.so+0x1e2fd4])
23:34:54  _ZN23MM_ParallelScavengeTask3runEP18MM_EnvironmentBase+0x1c (0x00003FFF9153A86C [libj9gc_full29.so+0x25a86c])
23:34:54  _ZN21MM_ParallelDispatcher16workerEntryPointEP18MM_EnvironmentBase+0x2b8 (0x00003FFF9145DF88 [libj9gc_full29.so+0x17df88])
23:34:54  _Z23dispatcher_thread_proc2P14OMRPortLibraryPv+0x160 (0x00003FFF9145D350 [libj9gc_full29.so+0x17d350])
23:34:54  omrsig_protect+0x3f4 (0x00003FFF92B89F74 [libj9prt29.so+0x39f74])
23:34:54  dispatcher_thread_proc+0x50 (0x00003FFF9145CA10 [libj9gc_full29.so+0x17ca10])
23:34:54  thread_wrapper+0x190 (0x00003FFF92B1CBC0 [libj9thr29.so+0xcbc0])
23:34:54  start_thread+0xf0 (0x00003FFF935D8040 [libpthread.so.0+0x8040])
23:34:54  clone+0x98 (0x00003FFF934F4290 [libc.so.6+0x124290])
23:34:54  ---------------------------------------

@dmitripivkine
Contributor

From failure https://openj9-jenkins.osuosl.org/job/Test_openjdk19_j9_sanity.openjdk_ppc64le_linux_Nightly/47
jdk_lang_1
java/lang/Thread/virtual/ParkWithFixedThreadPool.java

The picture is the same:
the crash is caused by a java/lang/VirtualThread object in which the only fields filled in are linkNext and linkPrevious, both holding the same (stale) pointer. The Scavenger discovered this object from a JNI Global Ref root:

> !j9object 0x3FFF8DE61D28
!J9Object 0x00003FFF8DE61D28 {
	struct J9Class* clazz = !j9class 0x3FFF902D6D00 // java/lang/VirtualThread
	Object flags = 0x0000002A;
	J lockword = 0x0000000000000008 (offset = 0) (java/lang/Object) <hidden>
	J eetop = 0x0000000000000000 (offset = 8) (java/lang/Thread)
	J tid = 0x0000000000000000 (offset = 16) (java/lang/Thread)
	Ljava/lang/String; name = !fj9object 0x0 (offset = 40) (java/lang/Thread)
	Z interrupted = 0x00000000 (offset = 152) (java/lang/Thread)
	Ljava/lang/ClassLoader; contextClassLoader = !fj9object 0x0 (offset = 48) (java/lang/Thread)
	Ljava/security/AccessControlContext; inheritedAccessControlContext = !fj9object 0x0 (offset = 56) (java/lang/Thread)
	Ljava/lang/Thread$FieldHolder; holder = !fj9object 0x0 (offset = 64) (java/lang/Thread)
	Ljava/lang/ThreadLocal$ThreadLocalMap; threadLocals = !fj9object 0x0 (offset = 72) (java/lang/Thread)
	Ljava/lang/ThreadLocal$ThreadLocalMap; inheritableThreadLocals = !fj9object 0x0 (offset = 80) (java/lang/Thread)
	Ljava/lang/Object; extentLocalBindings = !fj9object 0x0 (offset = 88) (java/lang/Thread)
	Ljava/lang/Object; interruptLock = !fj9object 0x0 (offset = 96) (java/lang/Thread)
	Ljava/lang/Object; parkBlocker = !fj9object 0x0 (offset = 104) (java/lang/Thread)
	Lsun/nio/ch/Interruptible; nioBlocker = !fj9object 0x0 (offset = 112) (java/lang/Thread)
	Ljdk/internal/vm/Continuation; cont = !fj9object 0x0 (offset = 120) (java/lang/Thread)
	Ljava/lang/Thread$UncaughtExceptionHandler; uncaughtExceptionHandler = !fj9object 0x0 (offset = 128) (java/lang/Thread)
	J threadLocalRandomSeed = 0x0000000000000000 (offset = 24) (java/lang/Thread)
	I threadLocalRandomProbe = 0x00000000 (offset = 156) (java/lang/Thread)
	I threadLocalRandomSecondarySeed = 0x00000000 (offset = 160) (java/lang/Thread)
	Ljdk/internal/vm/ThreadContainer; container = !fj9object 0x0 (offset = 136) (java/lang/Thread)
	Ljdk/internal/vm/StackableScope; headStackableScopes = !fj9object 0x0 (offset = 144) (java/lang/Thread)
	Z started = 0x00000000 (offset = 164) (java/lang/Thread)
	Z stopCalled = 0x00000000 (offset = 168) (java/lang/Thread)
	J tls = 0x0000000000000000 (offset = 32) (java/lang/Thread) <hidden>
	Ljava/util/concurrent/Executor; scheduler = !fj9object 0x0 (offset = 184) (java/lang/VirtualThread)
	Ljdk/internal/vm/Continuation; cont = !fj9object 0x0 (offset = 192) (java/lang/VirtualThread)
	Ljava/lang/Runnable; runContinuation = !fj9object 0x0 (offset = 200) (java/lang/VirtualThread)
	I state = 0x00000000 (offset = 172) (java/lang/VirtualThread)
	Z parkPermit = 0x00000000 (offset = 240) (java/lang/VirtualThread)
	Ljava/lang/Thread; carrierThread = !fj9object 0x0 (offset = 208) (java/lang/VirtualThread)
	Ljava/util/concurrent/CountDownLatch; termination = !fj9object 0x0 (offset = 216) (java/lang/VirtualThread)
	Ljava/lang/VirtualThread; linkNext = !fj9object 0x3fff8e07b208 (offset = 232) (java/lang/VirtualThread) <hidden>
	Ljava/lang/VirtualThread; linkPrevious = !fj9object 0x3fff8e07b208 (offset = 224) (java/lang/VirtualThread) <hidden>
	J inspectorCount = 0x0000000000000000 (offset = 176) (java/lang/VirtualThread) <hidden>
	I isSuspendedByJVMTI = 0x00000000 (offset = 244) (java/lang/VirtualThread) <hidden>
}

The point is that as soon as this Virtual Thread object is added to the JNI Global Refs pool, it is discoverable by the GC. If the object is copied, the Scavenger fixes up all object pointers in it (and copies the objects they refer to as well). So a stale pointer can only be introduced by VM code that updates the pointers explicitly. And it is always the case of a list with one element.
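
The hypothesis above can be illustrated with a toy model. The sketch below is not OpenJ9 code: `ToyHeap`, its dictionary-based heap, and the addresses are all hypothetical, and it assumes the list is circular so that a lone element links to itself. It only shows that a copy with a complete fix-up leaves no stale links, while a later explicit write of the old address produces the pattern seen in the dump, where linkNext and linkPrevious hold the same stale value.

```python
class ToyHeap:
    """Hypothetical model of a moving collector; addresses are plain integers."""

    def __init__(self):
        self.objects = {}      # address -> {"linkNext": addr, "linkPrevious": addr}
        self.forwarding = {}   # old address -> new address

    def allocate(self, address):
        # Assumption: a one-element circular list links the element to itself.
        self.objects[address] = {"linkNext": address, "linkPrevious": address}

    def copy(self, old, new):
        """Move an object, then fix up every link through the forwarding table."""
        self.objects[new] = self.objects.pop(old)
        self.forwarding[old] = new
        for obj in self.objects.values():
            for field, target in obj.items():
                obj[field] = self.forwarding.get(target, target)

    def stale_links(self, address):
        """Return the fields of the object that point at no live object."""
        links = self.objects[address]
        return [f for f, target in links.items() if target not in self.objects]


heap = ToyHeap()
heap.allocate(0x1000)
heap.copy(0x1000, 0x2000)
clean = heap.stale_links(0x2000)   # links were forwarded by the collector

# Simulated bug: code outside the collector writes a cached, pre-copy address.
heap.objects[0x2000]["linkNext"] = 0x1000
heap.objects[0x2000]["linkPrevious"] = 0x1000
stale = heap.stale_links(0x2000)   # both links now point at the old location
```

In this model the collector alone never leaves a stale link, which matches the reasoning that an explicit pointer update outside the GC is the likelier source.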

@pshipton
Member Author

https://openj9-jenkins.osuosl.org/job/Test_openjdk19_j9_sanity.openjdk_aarch64_linux_Nightly/50
jdk_lang_1
java/lang/Thread/virtual/HoldsLock.java

https://openj9-artifactory.osuosl.org/artifactory/ci-openj9/Test/Test_openjdk19_j9_sanity.openjdk_aarch64_linux_Nightly/50/openjdk_test_output.tar.gz

00:56:37  Type=Segmentation error vmState=0x0002000f
00:56:37  J9Generic_Signal_Number=00000018 Signal_Number=0000000b Error_Value=00000000 Signal_Code=00000001
00:56:37  Handler1=0000FFFFBE924AA4 Handler2=0000FFFFBE88BAC0 InaccessibleAddress=0061007200720018
00:56:37  R0=0000FFFFB8037C70 R1=0000FFFF20004218 R2=0000FFFF7CCCB360 R3=0000FFFF7CCCB188
00:56:37  R4=0000FFFF7CCCB190 R5=0000FFFF7CCCB198 R6=0000000000000004 R7=0000FFFFBC759BB0
00:56:37  R8=0000000000000454 R9=000000000000000E R10=00000000020E0003 R11=0000000000000000
00:56:37  R12=0000000000000000 R13=0000000000000000 R14=0000000000000000 R15=0000000000000000
00:56:37  R16=0000FFFFBD705008 R17=0000FFFFBEF0FDD0 R18=0000000000000004 R19=0061007200720000
00:56:37  R20=0000FFFF7CCCB198 R21=0000FFFF7CCCB190 R22=0000FFFF7CCCB360 R23=0000FFFFB805EE40
00:56:37  R24=0000FFFF7CCCB188 R25=0000FFFFB7DE3BC0 R26=0000000000000000 R27=0000FFFFBC759B88
00:56:37  R28=0000000000000000 R29=0000FFFF7CCCB0A0 R30=0000FFFFBD5C1AEC R31=0000FFFF7CCCB0A0
00:56:37  PC=0000FFFFBD5E54F8 SP=0000FFFF7CCCB0A0 PSTATE=0000000060000000
00:56:37  V0 000000000000015b (f: 347.000000, d: 1.714408e-321)
00:56:37  V1 0000000000000001 (f: 1.000000, d: 4.940656e-324)
00:56:37  V2 0000000000000010 (f: 16.000000, d: 7.905050e-323)
00:56:37  V3 40292cc8062ded8a (f: 103673224.000000, d: 1.258746e+01)
00:56:37  V4 bfd00ea348b88334 (f: 1220051712.000000, d: -2.508934e-01)
00:56:37  V5 3fd5575b0be00b6a (f: 199232368.000000, d: 3.334568e-01)
00:56:37  V6 3fe62e42fefa39ef (f: 4277811712.000000, d: 6.931472e-01)
00:56:37  V7 8020080280200802 (f: 2149582848.000000, d: -4.458850e-308)
00:56:37  V8 0000000000000000 (f: 0.000000, d: 0.000000e+00)
00:56:37  V9 0000000000000000 (f: 0.000000, d: 0.000000e+00)
00:56:37  V10 0000000000000000 (f: 0.000000, d: 0.000000e+00)
00:56:37  V11 0000000000000000 (f: 0.000000, d: 0.000000e+00)
00:56:37  V12 0000000000000000 (f: 0.000000, d: 0.000000e+00)
00:56:37  V13 0000000000000000 (f: 0.000000, d: 0.000000e+00)
00:56:37  V14 0000000000000000 (f: 0.000000, d: 0.000000e+00)
00:56:37  V15 0000000000000000 (f: 0.000000, d: 0.000000e+00)
00:56:37  V16 4010040140100401 (f: 1074791424.000000, d: 4.003911e+00)
00:56:37  V17 fe00ff0000000000 (f: 0.000000, d: -8.892315e+298)
00:56:37  V18 0000000000000000 (f: 0.000000, d: 0.000000e+00)
00:56:37  V19 3f9eb851eb851eb8 (f: 3951369984.000000, d: 3.000000e-02)
00:56:37  V20 3fb1eb851eb851ec (f: 515396064.000000, d: 7.000000e-02)
00:56:37  V21 0000000000000008 (f: 8.000000, d: 3.952525e-323)
00:56:37  V22 3f0000003f800000 (f: 1065353216.000000, d: 3.051759e-05)
00:56:37  V23 3fc999999999999a (f: 2576980480.000000, d: 2.000000e-01)
00:56:37  V24 3fd6666666666666 (f: 1717986944.000000, d: 3.500000e-01)
00:56:37  V25 0000000000000000 (f: 0.000000, d: 0.000000e+00)
00:56:37  V26 3fb999999999999a (f: 2576980480.000000, d: 1.000000e-01)
00:56:37  V27 000000000000000a (f: 10.000000, d: 4.940656e-323)
00:56:37  V28 0000000000000800 (f: 2048.000000, d: 1.011846e-320)
00:56:37  V29 0000000000000300 (f: 768.000000, d: 3.794424e-321)
00:56:37  V30 4030000000000000 (f: 0.000000, d: 1.600000e+01)
00:56:37  V31 000000003f400000 (f: 1061158912.000000, d: 5.242822e-315)
00:56:37  Module=/home/jenkins/workspace/Test_openjdk19_j9_sanity.openjdk_aarch64_linux_Nightly/openjdkbinary/j2sdk-image/lib/default/libj9gc_full29.so
00:56:37  Module_base_address=0000FFFFBD455000
00:56:37  Target=2_90_20221115_116 (Linux 5.4.0-113-generic)
00:56:37  CPU=aarch64 (4 logical CPUs) (0x1f0016000 RAM)
00:56:37  ----------- Stack Backtrace -----------
00:56:37  _ZN22GC_ObjectModelDelegate29calculateObjectDetailsForCopyEP18MM_EnvironmentBaseP18MM_ForwardedHeaderPmS4_S4_+0x38 (0x0000FFFFBD5E54F8 [libj9gc_full29.so+0x1904f8])
00:56:37  _ZN12MM_Scavenger14copyForVariantILb0EEEP8J9ObjectP22MM_EnvironmentStandardP18MM_ForwardedHeader+0x5c (0x0000FFFFBD5C1AEC [libj9gc_full29.so+0x16caec])
00:56:37  _ZN12MM_Scavenger26incrementalScanCacheBySlotEP22MM_EnvironmentStandardP24MM_CopyScanCacheStandard+0x7ac (0x0000FFFFBD5BF220 [libj9gc_full29.so+0x16a220])
00:56:37  _ZN12MM_Scavenger12completeScanEP22MM_EnvironmentStandard+0xdc (0x0000FFFFBD5BF820 [libj9gc_full29.so+0x16a820])
00:56:37  _ZN12MM_Scavenger24workThreadGarbageCollectEP22MM_EnvironmentStandard+0x224 (0x0000FFFFBD5BFC14 [libj9gc_full29.so+0x16ac14])
00:56:37  _ZN21MM_ParallelDispatcher16workerEntryPointEP18MM_EnvironmentBase+0x21c (0x0000FFFFBD573610 [libj9gc_full29.so+0x11e610])
00:56:37  _Z23dispatcher_thread_proc2P14OMRPortLibraryPv+0x11c (0x0000FFFFBD572DCC [libj9gc_full29.so+0x11ddcc])
00:56:37  omrsig_protect+0x21c (0x0000FFFFBE88C74C [libj9prt29.so+0x2874c])
00:56:37  dispatcher_thread_proc+0x38 (0x0000FFFFBD5727D8 [libj9gc_full29.so+0x11d7d8])
00:56:37  thread_wrapper+0xcc (0x0000FFFFBE83A3BC [libj9thr29.so+0x73bc])
00:56:37  start_thread+0x184 (0x0000FFFFBF019624 [libpthread.so.0+0x7624])
00:56:37   (0x0000FFFFBEF5C49C [libc.so.6+0xd149c])
00:56:37  ---------------------------------------

and also
java/lang/Thread/virtual/ParkWithFixedThreadPool.java

babsingh added a commit to babsingh/openj9 that referenced this issue Nov 15, 2022
The asserts verify that a virtual thread is only
- added once to the list; and
- removed IFF it was added to the list.

The previous and next fields of the virtual thread object are set to
null when it is removed from the list.

Tracepoints output important details to triage failures.

Related: eclipse-openj9#16249
Related: eclipse-openj9#16259

Signed-off-by: Babneet Singh <sbabneet@ca.ibm.com>
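
The checks described in this commit message can be sketched as follows. This is a minimal Python model, not the actual change: `VThread`, `LiveList`, and the circular layout are assumptions made for illustration. Only the three rules (add once, remove only if added, null both links on removal) come from the message.

```python
class VThread:
    """Hypothetical stand-in for a virtual thread object with two list links."""

    def __init__(self, name):
        self.name = name
        self.link_next = None
        self.link_previous = None


class LiveList:
    """Hypothetical circular doubly linked list with the asserted invariants."""

    def __init__(self):
        self.head = None

    def add(self, thread):
        # Rule 1: a thread may be added only once, so both links must be null.
        if thread.link_next is not None or thread.link_previous is not None:
            raise AssertionError(f"{thread.name} is already in the list")
        if self.head is None:
            thread.link_next = thread.link_previous = thread
            self.head = thread
        else:
            tail = self.head.link_previous
            thread.link_previous, thread.link_next = tail, self.head
            tail.link_next = thread
            self.head.link_previous = thread

    def remove(self, thread):
        # Rule 2: a thread may be removed only if it was added.
        if thread.link_next is None or thread.link_previous is None:
            raise AssertionError(f"{thread.name} is not in the list")
        if thread.link_next is thread:
            self.head = None              # it was the only element
        else:
            thread.link_previous.link_next = thread.link_next
            thread.link_next.link_previous = thread.link_previous
            if self.head is thread:
                self.head = thread.link_next
        # Rule 3: null both links so a later add or remove can be checked.
        thread.link_next = thread.link_previous = None


live = LiveList()
first = VThread("first")
live.add(first)
live.remove(first)
```

Nulling the links on removal is what makes the first two checks possible, since a non-null link then reliably means "currently in the list".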
@dmitripivkine
Contributor

Another occurrence was attributed to #16259 by mistake;
see #16259 (comment) and the follow-up
#16259 (comment)

@dmitripivkine
Contributor

This item has become mixed up: it originally referred to "Invalid JIT return" failures but later switched to investigating crashes in the Scavenger. I am not going to split the issue in two, but I will add vm/gc labels for tracking.

@pshipton
Member Author

https://openj9-jenkins.osuosl.org/job/Test_openjdk19_j9_sanity.openjdk_s390x_linux_Nightly/53
jdk_lang_1
java/lang/Thread/virtual/stress/SleepALot.java#id0

https://openj9-artifactory.osuosl.org/artifactory/ci-openj9/Test/Test_openjdk19_j9_sanity.openjdk_s390x_linux_Nightly/53/openjdk_test_output.tar.gz

22:35:26  *** Invalid JIT return address 000003FF68D2D074 in 000003FFA3F7CFD8
22:35:26  
22:35:26  03:34:46.952 0x3ff0c094600    j9vm.249    *   ** ASSERTION FAILED ** at /home/jenkins/workspace/Build_JDK19_s390x_linux_Nightly/openj9/runtime/vm/swalk.c:1632: ((0 ))

@JasonFengJ9
Member

JDK19 internal build(rhel7s390x-3-6)

openjdk version "19.0.1" 2022-10-18
IBM Semeru Runtime Open Edition 19.0.1+10 (build 19.0.1+10)
Eclipse OpenJ9 VM 19.0.1+10 (build master-5e4baa709, JRE 19 Linux s390x-64-Bit Compressed References 20221018_78 (JIT enabled, AOT enabled)
OpenJ9   - 5e4baa709
OMR      - fe4c3b9b5
JCL      - 720d535776 based on jdk-19.0.1+10)

[2022-11-19T18:59:28.895Z] variation: -Xdump:system:none -Xdump:heap:none -Xdump:system:events=gpf+abort+traceassert+corruptcache -XX:-JITServerTechPreviewMessage Mode650
[2022-11-19T18:59:28.895Z] JVM_OPTIONS:  -Xdump:system:none -Xdump:heap:none -Xdump:system:events=gpf+abort+traceassert+corruptcache -XX:-JITServerTechPreviewMessage -XX:-UseCompressedOops 

[2022-11-19T19:33:16.081Z] TEST: java/lang/Thread/virtual/stress/SleepALot.java#id0

[2022-11-19T19:33:16.081Z] STDERR:
[2022-11-19T19:33:16.081Z] 
[2022-11-19T19:33:16.081Z] 
[2022-11-19T19:33:16.081Z] *** Invalid JIT return address 000003FF5C0AB074 in 000003FFA037CFF8
[2022-11-19T19:33:16.081Z] 
[2022-11-19T19:33:16.081Z] 19:32:58.799 0x3feec02cd00    j9vm.249    *   ** ASSERTION FAILED ** at /home/jenkins/workspace/build-scripts/jobs/jdk19/jdk19-linux-s390x-openj9/workspace/build/src/openj9/runtime/vm/swalk.c:1632: ((0 ))
[2022-11-19T19:33:16.081Z] JVMDUMP039I Processing dump event "traceassert", detail "" at 2022/11/19 11:32:58 - please wait.

[2022-11-19T19:38:59.537Z] jdk_lang_1_FAILED

@dmitripivkine
Contributor

Another manifestation of the crash in the Scavenger: #16351

@ChengJin01

ChengJin01 commented Nov 29, 2022

The problem was also spotted in FFI tests at https://openj9-jenkins.osuosl.org/job/Grinder/1546/consoleText

[2022-11-29T22:12:10.111Z] test TestUpcall.testUpcallsNoScope(918, "f1_V_IIS_DIF", VOID, [INT, INT, STRUCT], [DOUBLE, INT, FLOAT]): success
[2022-11-29T22:12:10.111Z] STDERR:
[2022-11-29T22:12:10.111Z] WARNING: Using incubator modules: jdk.incubator.foreign
[2022-11-29T22:12:10.111Z] 
[2022-11-29T22:12:10.111Z] 
[2022-11-29T22:12:10.111Z] *** Invalid JIT return address 0000000010EA5AE0 in 0000000011024700
[2022-11-29T22:12:10.111Z] 
[2022-11-29T22:12:10.111Z] 22:11:12.977 0x11024400    j9vm.249    *   ** ASSERTION FAILED ** at /Users/jenkins/workspace/Build_JDK17_x86-64_mac_Nightly/openj9/runtime/vm/swalk.c:1632: ((0 ))

@pshipton
Member Author

https://openj9-jenkins.osuosl.org/job/Test_openjdk19_j9_sanity.openjdk_x86-64_windows_Nightly/60
jdk_lang_1
java/lang/Thread/virtual/ParkWithFixedThreadPool.java

https://openj9-artifactory.osuosl.org/artifactory/ci-openj9/Test/Test_openjdk19_j9_sanity.openjdk_x86-64_windows_Nightly/60/openjdk_test_output.tar.gz

22:09:33  Unhandled exception
22:09:33  Type=Segmentation error vmState=0x0002000f
22:09:33  Windows_ExceptionCode=c0000005 J9Generic_Signal=00000004 ExceptionAddress=00007FFF7186D216 ContextFlags=0010005f
22:09:33  Handler1=00007FFF7344C400 Handler2=00007FFF736FAA50 InaccessibleReadAddress=0000000000000018
22:09:33  RDI=000000C1F77BF8B0 RSI=000000C1D7463450 RAX=000000C1D6983280 RBX=0000000000000000
22:09:33  RCX=000000C1D723F4C8 RDX=000000C1F660C998 R8=0000000000000000 R9=000000C1D723F430
22:09:33  R10=0000000000000000 R11=0000000000000000 R12=0000000000000000 R13=000000C1F77BF8B0
22:09:33  R14=0000000000000000 R15=000000C1F77BF598
22:09:33  RIP=00007FFF7186D216 RSP=000000C1F77BF4C0 RBP=000000C1F77BF630 EFLAGS=0000000000010246
22:09:33  FS=0053 ES=002B DS=002B
22:09:33  XMM0 00007ff6ef3733e4 (f: 4013372416.000000, d: 6.951432e-310)
22:09:33  XMM1 0000000000000000 (f: 0.000000, d: 0.000000e+00)
22:09:33  XMM2 00007ff6ef3733d4 (f: 4013372416.000000, d: 6.951432e-310)
22:09:33  XMM3 0000000000000000 (f: 0.000000, d: 0.000000e+00)
22:09:33  XMM4 0000000000000000 (f: 0.000000, d: 0.000000e+00)
22:09:33  XMM5 0000000000000000 (f: 0.000000, d: 0.000000e+00)
22:09:33  XMM6 0000000000000000 (f: 0.000000, d: 0.000000e+00)
22:09:33  XMM7 0000000000000000 (f: 0.000000, d: 0.000000e+00)
22:09:33  XMM8 0000000000000000 (f: 0.000000, d: 0.000000e+00)
22:09:33  XMM9 0000000000000000 (f: 0.000000, d: 0.000000e+00)
22:09:33  XMM10 0000000000000000 (f: 0.000000, d: 0.000000e+00)
22:09:33  XMM11 0000000000000000 (f: 0.000000, d: 0.000000e+00)
22:09:33  XMM12 0000000000000000 (f: 0.000000, d: 0.000000e+00)
22:09:33  XMM13 0000000000000000 (f: 0.000000, d: 0.000000e+00)
22:09:33  XMM14 0000000000000000 (f: 0.000000, d: 0.000000e+00)
22:09:33  XMM15 0000000000000000 (f: 0.000000, d: 0.000000e+00)
22:09:33  Module=F:\Users\jenkins\workspace\Test_openjdk19_j9_sanity.openjdk_x86-64_windows_Nightly\openjdkbinary\j2sdk-image\bin\default\j9gc_full29.dll
22:09:33  Module_base_address=00007FFF71700000 Offset_in_DLL=000000000016d216
22:09:33  Target=2_90_20221128_141 (Windows Server 2012 R2 6.3 build 9600)
22:09:33  CPU=amd64 (8 logical CPUs) (0x1ffb9c000 RAM)
22:09:33  ----------- Stack Backtrace -----------
22:09:33  J9VMDllMain+0x16c206 (0x00007FFF7186D216 [j9gc_full29+0x16d216])
22:09:33  J9VMDllMain+0x12b19f (0x00007FFF7182C1AF [j9gc_full29+0x12c1af])
22:09:33  J9VMDllMain+0x1190b1 (0x00007FFF7181A0C1 [j9gc_full29+0x11a0c1])
22:09:33  J9VMDllMain+0x1260d2 (0x00007FFF718270E2 [j9gc_full29+0x1270e2])
22:09:33  J9VMDllMain+0x125b90 (0x00007FFF71826BA0 [j9gc_full29+0x126ba0])
22:09:33  J9VMDllMain+0x14eb28 (0x00007FFF7184FB38 [j9gc_full29+0x14fb38])
22:09:33  J9VMDllMain+0x14fbc1 (0x00007FFF71850BD1 [j9gc_full29+0x150bd1])
22:09:33  j9port_isCompatible+0x1a08b (0x00007FFF736FCC3B [j9prt29+0x1cc3b])
22:09:33  J9VMDllMain+0x14fd1b (0x00007FFF71850D2B [j9gc_full29+0x150d2b])
22:09:33  omrthread_get_category+0xa42 (0x00007FFF73764232 [j9thr29+0x4232])
22:09:33  _o_strcat_s+0x5e (0x00007FFF73C2C1AE [ucrtbase+0x1c1ae])
22:09:33  BaseThreadInitThunk+0x22 (0x00007FFF859A13F2 [KERNEL32+0x13f2])
22:09:33  RtlUserThreadStart+0x34 (0x00007FFF860254F4 [ntdll+0x154f4])
22:09:33  ---------------------------------------

@dmitripivkine
Contributor

I think we are collecting these Scavenger crashes in #16351 (stale next/previous pointer in a single-element Virtual Threads list).

@tajila
Contributor

tajila commented Dec 5, 2022

I'm going to close this issue, as the items discussed are being tracked elsewhere.

@tajila tajila closed this as completed Dec 5, 2022
@pshipton
Member Author

pshipton commented Dec 6, 2022

@tobi is there a better place to track "invalid JIT return address"? This occurred in the testing last night.

https://openj9-jenkins.osuosl.org/job/Test_openjdk19_j9_sanity.openjdk_ppc64_aix_Nightly/63/
jdk_lang_1
java/lang/Thread/virtual/stress/GetStackTraceALot.java#id0

https://openj9-artifactory.osuosl.org/artifactory/ci-openj9/Test/Test_openjdk19_j9_sanity.openjdk_ppc64_aix_Nightly/63/openjdk_test_output.tar.gz

23:45:13  STDERR:
23:45:13  
23:45:13  
23:45:13  *** Invalid JIT return address 0000000000000000 in 000001002312DE00
23:45:13  
23:45:13  04:41:21.095 0x1002312db00    j9vm.249    *   ** ASSERTION FAILED ** at /home/jenkins/workspace/Build_JDK19_ppc64_aix_Nightly/openj9/runtime/vm/swalk.c:1632: ((0 ))

@pshipton pshipton reopened this Dec 6, 2022
@tajila
Contributor

tajila commented Dec 6, 2022

@tobi is there a better place to track "invalid JIT return address"? This occurred in the testing last night.

Yes #15939

@pshipton pshipton closed this as completed Dec 6, 2022