Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kernel BUG at mm/filemap.c:2047! #5

Open
cmock opened this issue Aug 12, 2012 · 4 comments
Open

kernel BUG at mm/filemap.c:2047! #5

cmock opened this issue Aug 12, 2012 · 4 comments

Comments

@cmock
Copy link

cmock commented Aug 12, 2012

There's a reproducible kernel bug in 3.2.23-rpi1+ -- whenever cricket (http://cricket.sourceforge.net/) tries to create an RRD file, this occurs; it does not happen on other file creations...

[  589.838325] BUG: sleeping function called from invalid context at kernel/mute
x.c:85
[  589.838355] in_atomic(): 1, irqs_disabled(): 0, pid: 2330, name: collector
[  589.838439] [<c0012100>] (unwind_backtrace+0x0/0xe0) from [<c032c240>] (mutex
_lock+0x18/0x40)
[  589.838486] [<c032c240>] (mutex_lock+0x18/0x40) from [<c0090714>] (generic_fi
le_aio_write+0x44/0xc8)
[  589.838539] [<c0090714>] (generic_file_aio_write+0x44/0xc8) from [<c0128dfc>]
 (ext4_file_write+0x230/0x294)
[  589.838594] [<c0128dfc>] (ext4_file_write+0x230/0x294) from [<c00c5e74>] (do_
sync_write+0x9c/0xd8)
[  589.838641] [<c00c5e74>] (do_sync_write+0x9c/0xd8) from [<c00c6728>] (vfs_wri
te+0xa8/0x180)
[  589.838671] [<c00c6728>] (vfs_write+0xa8/0x180) from [<c00c6a18>] (sys_write+
0x34/0x68)
[  589.838709] [<c00c6a18>] (sys_write+0x34/0x68) from [<c000d580>] (ret_fast_sy
scall+0x0/0x30)
[  589.838824] ------------[ cut here ]------------
[  589.846909] kernel BUG at mm/filemap.c:2047!
[  589.854554] Internal error: Oops - undefined instruction: 0 [#1]
[  589.863946] Modules linked in: tun nls_utf8 nls_cp437 vfat fat spidev smsc95x
x usbnet leds_gpio spi_bcm2708 i2c_bcm2708
[  589.881948] CPU: 0    Not tainted  (3.2.23-rpi1+ #3)
[  589.890476] PC is at iov_iter_copy_from_user_atomic+0x24/0xb4
[  589.899789] LR is at 0x0
[  589.905814] pc : [<c008e878>]    lr : [<00000000>]    psr: 40000013
[  589.905826] sp : cdb41d38  ip : cdb41d78  fp : 00000000
[  589.924437] r10: 00000000  r9 : cdb40000  r8 : 00001000
[  589.933187] r7 : cdb40000  r6 : ce652ea0  r5 : 00000000  r4 : 00001000
[  589.943248] r3 : 00001000  r2 : 00000000  r1 : cdb40000  r0 : c0687a20
[  589.953231] Flags: nZcv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
[  589.963801] Control: 00c5387d  Table: 0e948008  DAC: 00000015
[  589.973045] Process collector (pid: 2330, stack limit = 0xcdb40268)
[  589.982858] Stack: (0xcdb41d38 to 0xcdb42000)
[  589.990740] 1d20:                                                       00001000 00000000
[  590.005879] 1d40: 00000000 c008ea28 00001000 00000000 cdb41d70 cdb41d74 00000000 cdb13e40
[  590.021207] 1d60: c0344410 00001000 00000000 00000000 c0687a20 00000000 cdb41ec0 00000001
[  590.036709] 1d80: 00000000 00001000 ce652de0 00000000 00000000 ce652de0 ce652e44 00000000
[  590.052632] 1da0: cdb13e40 cdb13e40 00001000 c0090660 00000000 00000000 cdb41f00 00001000
[  590.068918] 1dc0: 00000000 c000d580 00000000 cdb41ec8 cdb41ec0 cdb41f00 ce652ea0 00000001
[  590.085584] 1de0: 00001000 ce652e44 cdb41ec8 00000001 cdb13e40 c00121b4 be8b38d4 00000000
[  590.102480] 1e00: 00000000 00000000 ce652e44 cdb41ec8 00000001 cdb13e40 cdb41ec0 c0090730
[  590.119572] 1e20: 00000001 00000000 91827364 cdb41e2c cdb41e2c cdb41e34 cdb41e34 00000000
[  590.136744] 1e40: 00000000 00000000 00000000 00000000 cdb13e40 ce652de0 cdb41ec8 cdb41ec0
[  590.154185] 1e60: 00000001 c0128dfc 00000000 00000000 cdb41ef8 cdb13e40 00000042 c00d3230
[  590.171889] 1e80: ce64e5d8 cdad1b40 c050d5c0 2cdd052d 00000008 fffffdee 00000000 cdb41f88
[  590.189984] 1ea0: cdb13e40 00001000 cdb40000 00000000 be8b38d4 c00c5e74 00000
[  590.189984] 1ea0: cdb13e40 00001000 cdb40000 00000000 be8b38d4 c00c5e74 00000000 00000000
[  590.208384] 1ec0: be8b2818 00001000 ce80e1c0 ce623238 00000000 00000001 ffffffff cdb13e40
[  590.226938] 1ee0: 00000000 00000000 00000000 00000000 cdb02c00 00003be0 00000000 00000000
[  590.245733] 1f00: 00000000 00000000 00000000 00000000 00001000 ce64f330 00001000 00000000
[  590.264786] 1f20: 00000000 00000001 00000000 7ff80000 00000001 cdb13e40 cdb41f88 00001000
[  590.283864] 1f40: cdb13e40 be8b2818 cdb41f88 cdb13e40 be8b2818 cdb41f88 be8b2818 c00c6728
[  590.302927] 1f60: cdb13e40 be8b2818 00000000 00000000 cdb13e40 be8b2818 00001000 c00c6a18
[  590.321992] 1f80: 00000004 00000000 00000000 00000000 01796e78 01763838 015defec 00000004
[  590.341069] 1fa0: c000d704 c000d580 01796e78 01763838 00000004 be8b2818 00001000 be8b2818
[  590.360124] 1fc0: 01796e78 01763838 015defec 00000004 00000003 00000000 00000001 be8b38d4
[  590.379187] 1fe0: 402ae000 be8b2800 404a7c14 4036b54c 60000010 00000004 00000000 00000000
[  590.398279] [<c008e878>] (iov_iter_copy_from_user_atomic+0x24/0xb4) from [<c008ea28>] (generic_file_buffered_write+0x120/0x264)
[  590.420789] [<c008ea28>] (generic_file_buffered_write+0x120/0x264) from [<c0090660>] (__generic_file_aio_write+0x358/0x3c8)
[  590.442962] [<c0090660>] (__generic_file_aio_write+0x358/0x3c8) from [<c0090730>] (generic_file_aio_write+0x60/0xc8)
[  590.464513] [<c0090730>] (generic_file_aio_write+0x60/0xc8) from [<c0128dfc>] (ext4_file_write+0x230/0x294)
[  590.485288] [<c0128dfc>] (ext4_file_write+0x230/0x294) from [<c00c5e74>] (do_sync_write+0x9c/0xd8)
[  590.505262] [<c00c5e74>] (do_sync_write+0x9c/0xd8) from [<c00c6728>] (vfs_write+0xa8/0x180)
[  590.524586] [<c00c6728>] (vfs_write+0xa8/0x180) from [<c00c6a18>] (sys_write+0x34/0x68)
[  590.538200] [<c00c6a18>] (sys_write+0x34/0x68) from [<c000d580>] (ret_fast_syscall+0x0/0x30)
[  590.557486] Code: e1a04003 e591e004 e3de5101 1a000000 (e7f001f2) 
[  590.569294] ---[ end trace a5ea58d4db817366 ]---
@bootc
Copy link
Owner

bootc commented Aug 13, 2012

Interesting. Give me a couple of days to build and test 3.2.27 and let's see if it still occurs in that one.

@bootc
Copy link
Owner

bootc commented Aug 13, 2012

One more thing - does cricket reliably trigger this bug? If I install it on my RPi should I be able to replicate this easily?

@bootc
Copy link
Owner

bootc commented Aug 13, 2012

And lastly, I notice cricket hasn't been updated since 2004. You may want to give Munin or Cacti a look if you want something newer - not that this is in any way a fix to your problem.

@cmock
Copy link
Author

cmock commented Aug 13, 2012

I have a simple and reproducible test case now:

rrdtool create /tmp/test.rrd --start 1344258765 --step 300 DS:ds0:GAUGE:1800:0:U DS:ds1:GAUGE:1800:0:U DS:ds2:GAUGE:1800:0:U DS:ds3:GAUGE:1800:0:U DS:ds4:DERIVE:1800:0:U DS:ds5:DERIVE:1800:0:U DS:ds6:DERIVE:1800:0:U RRA:AVERAGE:0.5:1:600 RRA:AVERAGE:0.5:6:600 RRA:AVERAGE:0.5:24:600 RRA:MAX:0.5:24:600 RRA:AVERAGE:0.5:288:600 RRA:MAX:0.5:288:600

You need to apt-get install rrdtool, obviously, and it can be run as an ordinary user.

Note that I'm using raspian, so this is armhf -- no idea if that is relevant.

Package versions:

ii  rrdtool        1.4.7-1        time-series data storage and display system 
ii  libc6:armhf    2.13-33        Embedded GNU C Library: Shared libraries
ii  libcairo2:armh 1.12.2-2       The Cairo 2D vector graphics library
ii  libdbi1        0.8.4-6        DB Independent Abstraction Layer for C -- sh
ii  libglib2.0-0:a 2.32.3-1       GLib library of C routines
ii  libpango1.0-0: 1.30.0-1       Layout and rendering of internationalized te
ii  libpng12-0:arm 1.2.49-1       PNG library - runtime
ii  librrd4        1.4.7-1        time-series data storage and display system 
ii  libxml2:armhf  2.8.0+dfsg1-5  GNOME XML library

bootc pushed a commit that referenced this issue Aug 25, 2012
commit 3cf003c upstream.

Jian found that when he ran fsx on a 32 bit arch with a large wsize the
process and one of the bdi writeback kthreads would sometimes deadlock
with a stack trace like this:

crash> bt
PID: 2789   TASK: f02edaa0  CPU: 3   COMMAND: "fsx"
 #0 [eed63cbc] schedule at c083c5b3
 #1 [eed63d80] kmap_high at c0500ec8
 #2 [eed63db] cifs_async_writev at f7fabcd7 [cifs]
 #3 [eed63df0] cifs_writepages at f7fb7f5c [cifs]
 #4 [eed63e50] do_writepages at c04f3e32
 #5 [eed63e54] __filemap_fdatawrite_range at c04e152a
 #6 [eed63ea4] filemap_fdatawrite at c04e1b3e
 #7 [eed63eb4] cifs_file_aio_write at f7fa111a [cifs]
 torvalds#8 [eed63ecc] do_sync_write at c052d202
 torvalds#9 [eed63f74] vfs_write at c052d4ee
torvalds#10 [eed63f94] sys_write at c052df4c
torvalds#11 [eed63fb0] ia32_sysenter_target at c0409a98
    EAX: 00000004  EBX: 00000003  ECX: abd73b73  EDX: 012a65c6
    DS:  007b      ESI: 012a65c6  ES:  007b      EDI: 00000000
    SS:  007b      ESP: bf8db178  EBP: bf8db1f8  GS:  0033
    CS:  0073      EIP: 40000424  ERR: 00000004  EFLAGS: 00000246

Each task would kmap part of its address array before getting stuck, but
not enough to actually issue the write.

This patch fixes this by serializing the marshal_iov operations for
async reads and writes. The idea here is to ensure that cifs
aggressively tries to populate a request before attempting to fulfill
another one. As soon as all of the pages are kmapped for a request, then
we can unlock and allow another one to proceed.

There's no need to do this serialization on non-CONFIG_HIGHMEM arches
however, so optimize all of this out when CONFIG_HIGHMEM isn't set.

Reported-by: Jian Li <jiali@redhat.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
bootc pushed a commit that referenced this issue Aug 25, 2012
…d reasons

commit 5cf02d0 upstream.

We've had some reports of a deadlock where rpciod ends up with a stack
trace like this:

    PID: 2507   TASK: ffff88103691ab40  CPU: 14  COMMAND: "rpciod/14"
     #0 [ffff8810343bf2f0] schedule at ffffffff814dabd9
     #1 [ffff8810343bf3b8] nfs_wait_bit_killable at ffffffffa038fc04 [nfs]
     #2 [ffff8810343bf3c8] __wait_on_bit at ffffffff814dbc2f
     #3 [ffff8810343bf418] out_of_line_wait_on_bit at ffffffff814dbcd8
     #4 [ffff8810343bf488] nfs_commit_inode at ffffffffa039e0c1 [nfs]
     #5 [ffff8810343bf4f8] nfs_release_page at ffffffffa038bef6 [nfs]
     #6 [ffff8810343bf528] try_to_release_page at ffffffff8110c670
     #7 [ffff8810343bf538] shrink_page_list.clone.0 at ffffffff81126271
     torvalds#8 [ffff8810343bf668] shrink_inactive_list at ffffffff81126638
     torvalds#9 [ffff8810343bf818] shrink_zone at ffffffff8112788f
    torvalds#10 [ffff8810343bf8c8] do_try_to_free_pages at ffffffff81127b1e
    torvalds#11 [ffff8810343bf958] try_to_free_pages at ffffffff8112812f
    torvalds#12 [ffff8810343bfa08] __alloc_pages_nodemask at ffffffff8111fdad
    torvalds#13 [ffff8810343bfb28] kmem_getpages at ffffffff81159942
    torvalds#14 [ffff8810343bfb58] fallback_alloc at ffffffff8115a55a
    torvalds#15 [ffff8810343bfbd8] ____cache_alloc_node at ffffffff8115a2d9
    torvalds#16 [ffff8810343bfc38] kmem_cache_alloc at ffffffff8115b09b
    torvalds#17 [ffff8810343bfc78] sk_prot_alloc at ffffffff81411808
    torvalds#18 [ffff8810343bfcb8] sk_alloc at ffffffff8141197c
    torvalds#19 [ffff8810343bfce8] inet_create at ffffffff81483ba6
    torvalds#20 [ffff8810343bfd38] __sock_create at ffffffff8140b4a7
    torvalds#21 [ffff8810343bfd98] xs_create_sock at ffffffffa01f649b [sunrpc]
    torvalds#22 [ffff8810343bfdd8] xs_tcp_setup_socket at ffffffffa01f6965 [sunrpc]
    torvalds#23 [ffff8810343bfe38] worker_thread at ffffffff810887d0
    torvalds#24 [ffff8810343bfee8] kthread at ffffffff8108dd96
    torvalds#25 [ffff8810343bff48] kernel_thread at ffffffff8100c1ca

rpciod is trying to allocate memory for a new socket to talk to the
server. The VM ends up calling ->releasepage to get more memory, and it
tries to do a blocking commit. That commit can't succeed however without
a connected socket, so we deadlock.

Fix this by setting PF_FSTRANS on the workqueue task prior to doing the
socket allocation, and having nfs_release_page check for that flag when
deciding whether to do a commit call. Also, set PF_FSTRANS
unconditionally in rpc_async_schedule since that function can also do
allocations sometimes.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants