
Large kmem_alloc with zfs 0.6.5.7 #4979

Closed
koplover opened this issue Aug 17, 2016 · 11 comments

@koplover

Noticed the following stack traces on our 0.6.5.7 system; see the attached syslog snippet for more context and additional instances:

Aug 17 10:41:26 zdiskdd0000-001d-00-00 kernel: [ 3.708980] Large kmem_alloc(59048, 0x1000), please file an issue at:
Aug 17 10:41:26 zdiskdd0000-001d-00-00 kernel: [ 3.708980] https://github.com/zfsonlinux/zfs/issues/new
Aug 17 10:41:26 zdiskdd0000-001d-00-00 kernel: [ 3.708994] CPU: 2 PID: 416 Comm: zpool Tainted: P OE 3.19.0-60-zdomu #67~14.04.1
Aug 17 10:41:26 zdiskdd0000-001d-00-00 kernel: [ 3.708999] 0000000000000000 ffff88028afbfbe8 ffffffff817940ec 0000000000000000
Aug 17 10:41:26 zdiskdd0000-001d-00-00 kernel: [ 3.709006] 000000000000c210 ffff88028afbfc28 ffffffffc04d2b93 ffff88028afbfc28
Aug 17 10:41:26 zdiskdd0000-001d-00-00 kernel: [ 3.709011] ffff88028f42b000 ffff88028f42b000 0000000000000000 0000000000001cd5
Aug 17 10:41:26 zdiskdd0000-001d-00-00 kernel: [ 3.709017] Call Trace:
Aug 17 10:41:26 zdiskdd0000-001d-00-00 kernel: [ 3.709027] [] dump_stack+0x63/0x81
Aug 17 10:41:26 zdiskdd0000-001d-00-00 kernel: [ 3.709038] [] spl_kmem_zalloc+0x113/0x180 [spl]
Aug 17 10:41:26 zdiskdd0000-001d-00-00 kernel: [ 3.709084] [] vdev_metaslab_init+0xa5/0x200 [zfs]
Aug 17 10:41:26 zdiskdd0000-001d-00-00 kernel: [ 3.709119] [] vdev_load+0xc4/0xd0 [zfs]
Aug 17 10:41:26 zdiskdd0000-001d-00-00 kernel: [ 3.709150] [] vdev_load+0x34/0xd0 [zfs]
Aug 17 10:41:26 zdiskdd0000-001d-00-00 kernel: [ 3.709180] [] spa_load+0xfe6/0x1b50 [zfs]
Aug 17 10:41:26 zdiskdd0000-001d-00-00 kernel: [ 3.709212] [] ? spa_add+0x5ba/0x650 [zfs]
Aug 17 10:41:26 zdiskdd0000-001d-00-00 kernel: [ 3.709243] [] spa_tryimport+0x9e/0x440 [zfs]
Aug 17 10:41:26 zdiskdd0000-001d-00-00 kernel: [ 3.709278] [] zfs_ioc_pool_tryimport+0x49/0x80 [zfs]
Aug 17 10:41:26 zdiskdd0000-001d-00-00 kernel: [ 3.709312] [] zfsdev_ioctl+0x4bc/0x500 [zfs]
Aug 17 10:41:26 zdiskdd0000-001d-00-00 kernel: [ 3.709319] [] ? __call_rcu+0xda/0x2e0
Aug 17 10:41:26 zdiskdd0000-001d-00-00 kernel: [ 3.709326] [] ? dput+0x24/0x180
Aug 17 10:41:26 zdiskdd0000-001d-00-00 kernel: [ 3.709330] [] do_vfs_ioctl+0x2f8/0x510
Aug 17 10:41:26 zdiskdd0000-001d-00-00 kernel: [ 3.709334] [] SyS_ioctl+0x81/0xa0
Aug 17 10:41:26 zdiskdd0000-001d-00-00 kernel: [ 3.709340] [] system_call_fastpath+0x16/0x1b

If there is any more information required, let me know and I'll gather it.

zfserror.txt

@ironMann
Contributor

Probably a duplicate of #4752. The fix is in master: de5ec6f

@koplover See #4752 for a workaround.

@koplover
Author

@ironMann Many thanks for the speedy response.

If I understand the referenced thread correctly, the default allocation maximum is already large enough that this error should not occur.

zdb output gives:

diskconvm:
    version: 5000
    name: 'diskconvm'
    state: 0
    txg: 1149881
    pool_guid: 82958690655894621
    errata: 0
    hostname: 'zdiskdd'
    vdev_children: 2
    vdev_tree:
        type: 'root'
        id: 0
        guid: 82958690655894621
        children[0]:
            type: 'mirror'
            id: 0
            guid: 9744604207154990652
            metaslab_array: 35
            metaslab_shift: 29
            ashift: 12
            asize: 3962662551552
            is_log: 0
            create_txg: 4
            children[0]:
                type: 'disk'
                id: 0
                guid: 5770934025122296376
                path: '/dev/disk/by-partid/ata-MB4000GCVBU_WMC130027610-part6'
                whole_disk: 0
                create_txg: 4
            children[1]:
                type: 'disk'
                id: 1
                guid: 11065621042494068812
                path: '/dev/disk/by-partid/ata-MB4000GCVBU_WMC130027304-part6'
                whole_disk: 0
                create_txg: 4
        children[1]:
            type: 'disk'
            id: 1
            guid: 14336061620723897028
            path: '/dev/disk/by-partuuid/1a7ecdfe-7c56-4336-8646-c13f93bdd6d7'
            whole_disk: 0
            metaslab_array: 4053
            metaslab_shift: 25
            ashift: 9
            asize: 4993843200
            is_log: 1
            create_txg: 628170
    features_for_read:
        com.delphix:hole_birth
        com.delphix:embedded_data

And cat /sys/module/spl/parameters/spl_kmem_alloc_max gives 2097152 = 2 MiB.

My reading of the zdb output for the children's asize / metaslab_shift is that this should be enough - am I missing something?
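For what it's worth, the reported 59048-byte allocation is consistent with one 8-byte pointer per metaslab on the large mirror vdev. A quick sanity check, under the assumption (not confirmed in this thread) that vdev_metaslab_init allocates an array of asize >> metaslab_shift metaslab pointers:

```shell
# Sanity check: does one 8-byte pointer per metaslab explain the 59048 bytes?
# Assumption: vdev_metaslab_init allocates asize >> metaslab_shift pointers.
asize=3962662551552      # mirror vdev asize from the zdb output above
ms_shift=29              # metaslab_shift from the zdb output above

count=$(( asize >> ms_shift ))
bytes=$(( count * 8 ))

echo "metaslabs: $count"   # metaslabs: 7381
echo "alloc: $bytes"       # alloc: 59048
```

So the allocation size scales with vdev size, which is why it only trips the warning on multi-terabyte vdevs.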

@ironMann
Contributor

@koplover spl_kmem_alloc_max is an absolute limit on allocation size. However, SPL will also warn when a reasonable threshold is exceeded (32 KiB by default), which is why you are seeing this log.

To remove the warning, try raising the spl_kmem_alloc_warn SPL module parameter above 59048.
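A sketch of how that could be done, assuming the usual sysfs/modprobe.d interfaces for SPL module parameters; the value 65536 is just an example (any value above 59048 would silence this particular warning):

```shell
# Pick a warning threshold above the reported allocation size (59048 bytes).
# 65536 (the next power of two) is an arbitrary choice, not a requirement.
WARN=65536

# Apply at runtime (takes effect immediately, lost on reboot):
#   echo "$WARN" > /sys/module/spl/parameters/spl_kmem_alloc_warn

# Persist across reboots via a modprobe option:
#   echo "options spl spl_kmem_alloc_warn=$WARN" > /etc/modprobe.d/spl.conf

echo "$WARN"
```

The writes are shown commented out since they require root and an SPL module loaded.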

@koplover
Author

@ironMann OK, that makes sense. As it is only a warning, I take it that the trace is completely harmless in this case, and setting the value is just a nicety to keep the trace out of syslog, with no impact on the running system - correct?

I'll set this anyway to clean up our log - many thanks again for your help

@ironMann
Contributor

As it is only a warning, I take it that the trace is completely harmless in this case, and setting the value is just a nicety to keep the trace out of syslog, with no impact on the running system - correct?

Precisely

@behlendorf behlendorf added this to the 0.6.5.8 milestone Aug 17, 2016
@behlendorf
Contributor

I've tagged this 0.6.5.8 so we can cherry-pick the fix from master for the next point release.

@stevleibelt

@behlendorf

Did I get something wrong, or why is this ticket still open if you write "we cherry pick the fix ..."?

Don't get me wrong, I don't mean to be rude.

@mailinglists35

mailinglists35 commented Sep 5, 2016

It means the fix for this issue will be included in the next release. The ticket is left open so there is a list of tickets to close after the release is made, or immediately before it. If the ticket were closed long before the release, it would be difficult to locate later. It's just a way to manage tickets and releases.

If you look here https://github.com/zfsonlinux/zfs/issues?q=is%3Aopen+is%3Aissue+milestone%3A0.6.5.8 you will find all the issues that, once fixed, will allow 0.6.5.8 to be released.

@stevleibelt

Hi @mailinglists35,

thanks for your quick reply. I know the link; it simply confused me to see tickets that, going by the comments, are finished but still open.
In the end, I am one of the Arch Linux users who are simply waiting for 0.6.5.8 to be released so they can update to kernel 4.7, since Arch Linux switched to that kernel quite a while ago.

@behlendorf
Contributor

Frankly, GitHub's tools for managing issues are pretty minimal, so we're making do with what we have. But yes, the idea is to move issues which need to be applied to the upcoming release into that milestone so we can track them and make sure they get applied.

@behlendorf
Contributor

Fix applied to 0.6.5.8.
