accessing past end of object and spl panic spl-err.c:48:vpanic()) SPL PANIC and txg_quiesce hung #2789

Closed
inevity opened this issue Oct 11, 2014 · 2 comments

Comments

@inevity

inevity commented Oct 11, 2014

CentOS 6.4, ZFS 0.6.0rc-14, with SA set.

First, a kernel warning, which may not be related:

Oct 10 11:30:04 kernel: WARNING: at fs/inode.c:727 unlock_new_inode+0x62/0x70() (Tainted: P W --------------- )
Oct 10 11:30:04 kernel: Hardware name: Tecal RH2288H V2-12L
Oct 10 11:30:04 kernel: Modules linked in: tcp_diag inet_diag zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) spl(U) zlib_deflate bonding 8021q garp stp llc ipv6 vhost_net macvtap macvlan tun kvm_intel kvm uinput microcode ses enclosure sg serio_raw sb_edac edac_core iTCO_wdt iTCO_vendor_support i2c_i801 i2c_core ioatdma igb dca ptp pps_core ext3 jbd mbcache sd_mod crc_t10dif ahci wmi isci libsas mpt2sas scsi_transport_sas raid_class dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]
Oct 10 11:30:04 kernel: Pid: 4039, comm: glusterfsd Tainted: P W --------------- 2.6.32-358.el6.x86_64 #1
Oct 10 11:30:04 kernel: Call Trace:
Oct 10 11:30:04 kernel: [] ? warn_slowpath_common+0x87/0xc0
Oct 10 11:30:04 kernel: [] ? warn_slowpath_null+0x1a/0x20
Oct 10 11:30:04 kernel: [] ? unlock_new_inode+0x62/0x70
Oct 10 11:30:04 kernel: [] ? zfs_znode_alloc+0x485/0x540 [zfs]
Oct 10 11:30:04 kernel: [] ? dmu_object_info_from_dnode+0x144/0x1b0 [zfs]
Oct 10 11:30:04 kernel: [] ? zfs_zget+0x148/0x1d0 [zfs]
Oct 10 11:30:04 kernel: [] ? zfs_dirent_lock+0x47c/0x540 [zfs]
Oct 10 11:30:04 kernel: [] ? __d_lookup+0xa7/0x150
Oct 10 11:30:04 kernel: [] ? zfs_create+0x14f/0x6f0 [zfs]
Oct 10 11:30:04 kernel: [] ? zpl_create+0xa5/0x140 [zfs]
Oct 10 11:30:04 kernel: [] ? generic_permission+0x23/0xb0
Oct 10 11:30:04 kernel: [] ? vfs_create+0xb4/0xe0
Oct 10 11:30:04 kernel: [] ? sys_mknodat+0x280/0x2a0
Oct 10 11:30:04 kernel: [] ? vfs_lstat+0x1e/0x20
Oct 10 11:30:04 kernel: [] ? sys_newlstat+0x24/0x50
Oct 10 11:30:04 kernel: [] ? sys_mknod+0x1a/0x20(this is mknod )
Oct 10 11:30:04 kernel: [] ? system_call_fastpath+0x16/0x1b
Oct 10 11:30:04 kernel: ---[ end trace 5d559ad876223bbd ]---

But then:
Oct 10 18:57:31 kernel: zfs: accessing past end of object 29/6d43 (size=1536 access=48+9723)

Oct 10 18:57:31 kernel: SPLError: 4037:0:(spl-err.c:48:vpanic()) SPL PANIC
Oct 10 18:57:31 kernel: SPL: Showing stack for process 4037
Oct 10 18:57:31 kernel: Pid: 4037, comm: glusterfsd Tainted: P W --------------- 2.6.32-358.el6.x86_64 #1
Oct 10 18:57:31 kernel: Call Trace:
Oct 10 18:57:31 kernel: [] ? spl_debug_dumpstack+0x27/0x40 [spl]
Oct 10 18:57:31 kernel: [] ? spl_debug_bug+0x81/0xd0 [spl]
Oct 10 18:57:31 kernel: [] ? vpanic+0x65/0x90 [spl]
Oct 10 18:57:31 kernel: [] ? mutex_lock+0x1e/0x50
Oct 10 18:57:31 kernel: [] ? vcmn_err+0x162/0x180 [spl]
Oct 10 18:57:31 kernel: [] ? spl_rw_clear_owner+0x39/0x50 [zfs]
Oct 10 18:57:31 kernel: [] ? dmu_zfetch+0xd2e/0xe40 [zfs]
Oct 10 18:57:31 kernel: [] ? __kmalloc+0x20c/0x220
Oct 10 18:57:31 kernel: [] ? zfs_panic_recover+0x52/0x60 [zfs]
Oct 10 18:57:31 kernel: [] ? kmem_free_debug+0x4b/0x150 [spl]
Oct 10 18:57:31 kernel: [] ? dmu_buf_hold_array_by_dnode+0x405/0x570 [zfs]
Oct 10 18:57:31 kernel: [] ? dmu_write_uio_dnode+0x46/0x140 [zfs]
Oct 10 18:57:31 kernel: [] ? dmu_object_set_blocksize+0x5b/0x70 [zfs]
Oct 10 18:57:31 kernel: [] ? dmu_write_uio_dbuf+0x46/0x60 [zfs]
Oct 10 18:57:31 kernel: [] ? zfs_write+0xc7c/0xca0 [zfs]
Oct 10 18:57:31 kernel: [] ? nvlist_lookup_byte_array+0x16/0x20 [znvpair]
Oct 10 18:57:31 kernel: [] ? __zpl_xattr_get+0x1e4/0x200 [zfs]
Oct 10 18:57:31 kernel: [] ? kvasprintf+0x70/0x90
Oct 10 18:57:31 kernel: [] ? zpl_xattr_trusted_get+0x90/0xa0 [zfs]
Oct 10 18:57:31 kernel: [] ? getxattr+0x9c/0x170
Oct 10 18:57:31 kernel: [] ? zpl_write_common+0x52/0x70 [zfs]
Oct 10 18:57:31 kernel: [] ? zpl_write+0x68/0xa0 [zfs]
Oct 10 18:57:31 kernel: [] ? security_file_permission+0x16/0x20
Oct 10 18:57:31 kernel: [] ? vfs_write+0xb8/0x1a0
Oct 10 18:57:31 kernel: [] ? sys_pwrite64+0x82/0xa0
Oct 10 18:57:31 kernel: [] ? system_call_fastpath+0x16/0x1b
Oct 10 18:57:31 kernel: SPL: Dumping log to /tmp/spl-log.1412938651.4037

Then txg_quiesce hung:
Oct 10 19:00:50 kernel: INFO: task txg_quiesce:3862 blocked for more than 120 seconds.
Oct 10 19:00:50 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct 10 19:00:50 kernel: txg_quiesce D 0000000000000001 0 3862 2 0x00000000
Oct 10 19:00:50 kernel: ffff88086ef5dd40 0000000000000046 0000000000000000 ffff88007d0dc620
Oct 10 19:00:50 kernel: ffff88086ef5dcf0 ffffffff811681bc ffff88086ef5dde0 ffffffff000080d0
Oct 10 19:00:50 kernel: ffff88086e8645f8 ffff88086ef5dfd8 000000000000fb88 ffff88086e8645f8
Oct 10 19:00:50 kernel: Call Trace:
Oct 10 19:00:50 kernel: [] ? __kmalloc+0x20c/0x220
Oct 10 19:00:50 kernel: [] cv_wait_common+0x105/0x1c0 [spl]
Oct 10 19:00:50 kernel: [] ? autoremove_wake_function+0x0/0x40
Oct 10 19:00:50 kernel: [] ? __bitmap_weight+0x8c/0xb0
Oct 10 19:00:50 kernel: [] __cv_wait+0x15/0x20 [spl]
Oct 10 19:00:50 kernel: [] txg_quiesce_thread+0x243/0x3a0 [zfs]
Oct 10 19:00:50 kernel: [] ? set_user_nice+0xc9/0x130
Oct 10 19:00:50 kernel: [] ? txg_quiesce_thread+0x0/0x3a0 [zfs]
Oct 10 19:00:50 kernel: [] thread_generic_wrapper+0x68/0x80 [spl]
Oct 10 19:00:50 kernel: [] ? thread_generic_wrapper+0x0/0x80 [spl]
Oct 10 19:00:50 kernel: [] kthread+0x96/0xa0
Oct 10 19:00:50 kernel: [] child_rip+0xa/0x20
Oct 10 19:00:50 kernel: [] ? kthread+0x0/0xa0
Oct 10 19:00:50 kernel: [] ? child_rip+0x0/0x20

And glusterfsd hung:
Oct 10 19:00:50 kernel: INFO: task glusterfsd:4019 blocked for more than 120 seconds.
Oct 10 19:00:50 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct 10 19:00:50 kernel: glusterfsd D 0000000000000002 0 4019 1 0x00000000
Oct 10 19:00:50 kernel: ffff880845345b48 0000000000000082 0000000000000000 0000000000000000
Oct 10 19:00:50 kernel: 0000000000000000 0000000000000000 0000000000000000 ffff88086cc28ae0
Oct 10 19:00:50 kernel: ffff88086cc29098 ffff880845345fd8 000000000000fb88 ffff88086cc29098
Oct 10 19:00:50 kernel: Call Trace:
Oct 10 19:00:50 kernel: [] cv_wait_common+0x105/0x1c0 [spl]
Oct 10 19:00:50 kernel: [] ? autoremove_wake_function+0x0/0x40
Oct 10 19:00:50 kernel: [] ? avl_find+0x60/0xb0 [zavl]
Oct 10 19:00:50 kernel: [] __cv_wait+0x15/0x20 [spl]
Oct 10 19:00:50 kernel: [] zfs_range_lock+0x2ac/0x5c0 [zfs]
Oct 10 19:00:50 kernel: [] zfs_write+0x5e7/0xca0 [zfs]
Oct 10 19:00:50 kernel: [] ? nvlist_lookup_byte_array+0x16/0x20 [znvpair]
Oct 10 19:00:50 kernel: [] ? __zpl_xattr_get+0x1e4/0x200 [zfs]
Oct 10 19:00:50 kernel: [] ? kvasprintf+0x70/0x90
Oct 10 19:00:50 kernel: [] ? zpl_xattr_get+0xe1/0x150 [zfs]
Oct 10 19:00:50 kernel: [] ? zpl_xattr_trusted_get+0x90/0xa0 [zfs]
Oct 10 19:00:50 kernel: [] ? getxattr+0x9c/0x170
Oct 10 19:00:50 kernel: [] zpl_write_common+0x52/0x70 [zfs]
Oct 10 19:00:50 kernel: [] zpl_write+0x68/0xa0 [zfs]
Oct 10 19:00:50 kernel: [] ? security_file_permission+0x16/0x20
Oct 10 19:00:50 kernel: [] vfs_write+0xb8/0x1a0
Oct 10 19:00:50 kernel: [] sys_pwrite64+0x82/0xa0
Oct 10 19:00:50 kernel: [] system_call_fastpath+0x16/0x1b

[root@ ~]# printf "%d %d\n" 0x29 0x6d43
41 27971
and
zdb -vvv zpool/zfs 27971
Dataset zpool/zfs [ZPL], ID 41, cr_txg 6, 34.1T, 1715436 objects, rootbp DVA[0]=<0:1f420d886000:2000> DVA[1]=<0:202bb426000:2000> [L0 DMU objset] fletcher4 lzjb LE contiguous unique double size=800L/200P birth=2140654L/2140654P fill=1715436 cksum=15de8d179c:7883b2b2ca9:166aca7799eb3:2f46b7899a8dca

Object  lvl   iblk   dblk  dsize  lsize   %full  type
 27971    1    16K  1.50K  14.0K  1.50K  100.00  ZFS directory
                                    168   bonus  System attributes
dnode flags: USED_BYTES USERUSED_ACCOUNTED
dnode maxblkid: 0
path    /.glusterfs/01/9d
uid     0
gid     0
atime   Wed Jun 18 13:08:26 2014
mtime   Tue Oct  7 05:12:28 2014
ctime   Tue Oct  7 05:12:28 2014
crtime  Wed Jun 18 13:08:26 2014
gen 205159
mode    40700
size    23
parent  1062
links   2
pflags  40800000044
microzap: 1536 bytes, 21 entries

    019da042-d2a2-489b-8b3c-d17648a1fa9b = 3218803 (type: Regular File)
    019d2df5-536e-4154-8aef-c4db3fb15008 = 3213185 (type: Regular File)
    019d49ab-3e15-4021-b1a0-b578e167da19 = 4819408 (type: Regular File)
    019d36e4-f383-418c-9055-6ed715a8b5d0 = 5805790 (type: Regular File)
    019d3823-cf6b-4ea5-b3f8-7ecbb0bc0f6f = 4848299 (type: Regular File)
    019da4b3-f54f-41d2-8243-8677882721d6 = 5790439 (type: Regular File)
    019d5ec6-ca6b-424b-a316-a99d61cee6b2 = 4745890 (type: Regular File)
    019dbc1a-a136-4a86-a713-80eeba6e7361 = 3718312 (type: Symbolic Link)
    019d7d62-d8d4-43ce-be46-43c814a0f126 = 2692060 (type: Regular File)
    019d44ab-c83b-4f40-b781-c0a5ae13b178 = 3726858 (type: Regular File)
    019d34bc-dfd8-42b3-a378-805bad7816a4 = 455696 (type: Regular File)
    019d6e19-7791-4819-be5c-4d1a5cebc4b0 = 4221896 (type: Regular File)
    019d788b-aa27-45f9-ade3-842e563a1754 = 3746255 (type: Symbolic Link)
    019d6f61-27fe-411f-ac82-223880ef577e = 5379858 (type: Regular File)
    019d5eb9-d836-4e80-a7d2-d58aab850d23 = 4266457 (type: Regular File)
    019df70b-a27c-4b48-869c-dc1e7625abae = 2150471 (type: Regular File)
    019d53af-eba7-4d9d-abbb-fcf392fc7301 = 4331154 (type: Regular File)
    019d31e0-cbb7-4b3c-8b9c-4fbf51e0eea2 = 1719186 (type: Regular File)
    019d5c4e-ec44-41ee-9ce6-c67ff4eb2bed = 3295792 (type: Regular File)
    019d0a4f-5205-43e6-b78a-19ce7f6098ba = 605846 (type: Regular File)
    019d2a46-f6f0-4877-9f54-50e258bc1e4e = 4786511 (type: Symbolic Link)

ls -la /mnt/zpool/zfs/.glusterfs/01 |grep 9d
drwx------ 2 root root 23 Oct 7 05:12 9d
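
For anyone reading a similar panic, the fields of the "accessing past end of object" line can be decoded in one step instead of by hand with printf. This is only a rough sketch: the helper name `decode_past_end` is made up for illustration, and it assumes the two IDs are hexadecimal (as the printf conversion above treats them) and that `access=` is an offset plus a length in decimal bytes.

```python
import re

# Assumed layout of the message, based only on the log line in this report:
#   accessing past end of object <dataset hex>/<object hex> (size=<bytes> access=<offset>+<length>)
PATTERN = re.compile(
    r"accessing past end of object (?P<ds>[0-9a-fA-F]+)/(?P<obj>[0-9a-fA-F]+) "
    r"\(size=(?P<size>\d+) access=(?P<off>\d+)\+(?P<len>\d+)\)"
)


def decode_past_end(line):
    """Return the decoded fields of the panic line, or None if it does not match."""
    m = PATTERN.search(line)
    if m is None:
        return None
    size = int(m.group("size"))
    offset = int(m.group("off"))
    length = int(m.group("len"))
    return {
        "dataset_id": int(m.group("ds"), 16),   # hex -> decimal, the ID zdb prints for the dataset
        "object_id": int(m.group("obj"), 16),   # hex -> decimal, the object number passed to zdb
        "size": size,
        "offset": offset,
        "length": length,
        # How many bytes the requested range extends past the reported size.
        "overrun": offset + length - size,
    }


line = ("Oct 10 18:57:31 kernel: zfs: accessing past end of object 29/6d43 "
        "(size=1536 access=48+9723)")
info = decode_past_end(line)
print(info)
```

For the line in this report it gives dataset 41 and object 27971, matching the printf output, and shows the write asking for 48+9723 bytes against an object whose reported size is 1536, which matches the 1.50K block that zdb shows for the directory.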

I found issue #121, which looks the same as this one.

Issue #2765 involved a file, but this one involves a directory.
How can I troubleshoot this?

@behlendorf
Contributor

This issue was fixed quite a long time ago. Please upgrade to the latest stable release, which is 0.6.3. I see you're still running 0.6.0rc-14.

@inevity
Author

inevity commented Oct 14, 2014

Thank you.

@inevity inevity closed this as completed Oct 14, 2014