
lockdep false positive - move txg_kick() outside of ->dp_lock #9094

Merged
merged 1 commit into openzfs:master on Jul 31, 2019

Conversation

jdike
Contributor

@jdike jdike commented Jul 29, 2019

This fixes the lockdep warning reported below by breaking a link
between ->tx_sync_lock and ->dp_lock.

This patch is originally from Brian Behlendorf and was slightly
simplified by me.

Signed-off-by: Jeff Dike <jdike@akamai.com>

Motivation and Context

Fixes a lockdep false positive

Description

======================================================

WARNING: possible circular locking dependency detected
4.19.55-4.19.2-debug-b494b4b34cd8ef26 #1 Tainted: G           O
------------------------------------------------------
ccid2/5165 is trying to acquire lock:
000000001e2f33d9 (&tx->tx_sync_lock){+.+.}, at: txg_kick+0x61/0x420 [zfs]
but task is already holding lock:
00000000ca317575 (&dp->dp_lock){+.+.}, at: dsl_pool_need_dirty_delay+0x8b/0x3f0 [zfs]
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #3 (&dp->dp_lock){+.+.}:
       dsl_pool_dirty_space+0x70/0x2d0 [zfs]
       dbuf_dirty+0x778/0x31d0 [zfs]
       dmu_write_uio_dnode+0xfe/0x300 [zfs]
       dmu_write_uio_dbuf+0xa0/0x100 [zfs]
       zfs_write+0x18bd/0x1f70 [zfs]
       zpl_write_common_iovec+0x15e/0x380 [zfs]
       zpl_iter_write+0x1c6/0x2a0 [zfs]
       __vfs_write+0x3a2/0x5b0
       vfs_write+0x15d/0x460
       ksys_write+0xed/0x210
       do_syscall_64+0x9b/0x410
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
-> #2 (&db->db_mtx){+.+.}:
       dmu_buf_will_dirty+0x6b/0x6c0 [zfs]
       bpobj_iterate_impl+0xbe6/0x1410 [zfs]
       spa_sync+0x1756/0x29b0 [zfs]
       txg_sync_thread+0x6d6/0x1370 [zfs]
       thread_generic_wrapper+0x1ff/0x2a0 [spl]
       kthread+0x2e7/0x3e0
       ret_from_fork+0x3a/0x50
-> #1 (&bpo->bpo_lock){+.+.}:
       bpobj_space+0x63/0x470 [zfs]
       dsl_scan_active+0x340/0x3d0 [zfs]
       txg_sync_thread+0x3f2/0x1370 [zfs]
       thread_generic_wrapper+0x1ff/0x2a0 [spl]
       kthread+0x2e7/0x3e0
       ret_from_fork+0x3a/0x50
-> #0 (&tx->tx_sync_lock){+.+.}:
       __mutex_lock+0xef/0x1380
       txg_kick+0x61/0x420 [zfs]
       dsl_pool_need_dirty_delay+0x1c7/0x3f0 [zfs]
       dmu_tx_assign+0x13b/0xe40 [zfs]
       zfs_write+0xb5e/0x1f70 [zfs]
       zpl_write_common_iovec+0x15e/0x380 [zfs]
       zpl_iter_write+0x1c6/0x2a0 [zfs]
       __vfs_write+0x3a2/0x5b0
       vfs_write+0x15d/0x460
       ksys_write+0xed/0x210
       do_int80_syscall_32+0xf1/0x470
       entry_INT80_compat+0x8a/0xa0
other info that might help us debug this:
Chain exists of:
  &tx->tx_sync_lock --> &db->db_mtx --> &dp->dp_lock
 Possible unsafe locking scenario:
       CPU0                    CPU1
       ----                    ----
  lock(&dp->dp_lock);
                               lock(&db->db_mtx);
                               lock(&dp->dp_lock);
  lock(&tx->tx_sync_lock);
 *** DEADLOCK ***
3 locks held by ccid2/5165:
 #0: 00000000e0fb8e9d (&f->f_pos_lock){+.+.}, at: __fdget_pos+0xb1/0xe0
 #1: 00000000ad27a3c1 (sb_writers#12){.+.+}, at: vfs_write+0x332/0x460
 #2: 00000000ca317575 (&dp->dp_lock){+.+.}, at: dsl_pool_need_dirty_delay+0x8b/0x3f0 [zfs]
stack backtrace:
CPU: 5 PID: 5165 Comm: ccid2 Tainted: G           O      4.19.55-4.19.2-debug-b494b4b34cd8ef26 #1
Hardware name: Dell Inc. PowerEdge R510/0W844P, BIOS 1.1.4 11/04/2009
Call Trace:
 dump_stack+0x91/0xeb
 print_circular_bug.isra.16+0x30b/0x5b0
 ? save_trace+0xd6/0x240
 __lock_acquire+0x41be/0x4f10
 ? ftrace_caller_op_ptr+0xe/0xe
 ? debug_show_all_locks+0x2d0/0x2d0
 ? debug_show_all_locks+0x2d0/0x2d0
 ? kernel_text_address+0x6d/0x100
 ? __save_stack_trace+0x73/0xd0
 ? quarantine_put+0xb0/0x150
 ? lock_acquire+0x153/0x330
 lock_acquire+0x153/0x330
 ? txg_kick+0x61/0x420 [zfs]
 ? lock_contended+0xd60/0xd60
 ? txg_kick+0x61/0x420 [zfs]
 __mutex_lock+0xef/0x1380
 ? txg_kick+0x61/0x420 [zfs]
 ? txg_kick+0x61/0x420 [zfs]
 ? txg_kick+0x61/0x420 [zfs]
 ? __mutex_add_waiter+0x160/0x160
 ? sched_clock+0x5/0x10
 ? __mutex_add_waiter+0x160/0x160
 ? __cv_destroy+0x9e/0x180 [spl]
 ? cv_destroy_wakeup+0xc0/0xc0 [spl]
 ? __mutex_unlock_slowpath+0xf3/0x660
 ? zio_wait+0x419/0x660 [zfs]
 ? txg_kick+0x61/0x420 [zfs]
 txg_kick+0x61/0x420 [zfs]
 dsl_pool_need_dirty_delay+0x1c7/0x3f0 [zfs]
 dmu_tx_assign+0x13b/0xe40 [zfs]
 ? dmu_tx_count_write+0x42f/0x700 [zfs]
 zfs_write+0xb5e/0x1f70 [zfs]
 ? sched_clock_cpu+0x18/0x170
 ? mark_held_locks+0xca/0x120
 ? sched_clock+0x5/0x10
 ? zfs_close+0x1e0/0x1e0 [zfs]
 ? sched_clock+0x5/0x10
 ? sched_clock_cpu+0x18/0x170
 ? __lock_acquire+0xe3b/0x4f10
 ? sched_clock+0x5/0x10
 ? sched_clock_cpu+0x18/0x170
 zpl_write_common_iovec+0x15e/0x380 [zfs]
 ? __lock_acquire+0xe3b/0x4f10
 ? zpl_read_common_iovec+0x2b0/0x2b0 [zfs]
 zpl_iter_write+0x1c6/0x2a0 [zfs]
 __vfs_write+0x3a2/0x5b0
 ? __fdget_pos+0xb1/0xe0
 ? kernel_read+0x130/0x130
 ? __mutex_add_waiter+0x160/0x160
 ? lock_acquire+0x153/0x330
 ? lock_acquire+0x153/0x330
 ? rcu_read_lock_sched_held+0x117/0x130
 ? rcu_sync_lockdep_assert+0x76/0xb0
 ? __sb_start_write+0xf0/0x260
 vfs_write+0x15d/0x460
 ksys_write+0xed/0x210
 ? __ia32_sys_read+0xb0/0xb0
 ? trace_hardirqs_on_thunk+0x1a/0x1c
 ? do_int80_syscall_32+0x17/0x470
 do_int80_syscall_32+0xf1/0x470
 entry_INT80_compat+0x8a/0xa0

How Has This Been Tested?

Soaked the patch over the weekend with our test suite.

Types of changes

  • [x] Bug fix (non-breaking change which fixes an issue)
  • [ ] New feature (non-breaking change which adds functionality)
  • [ ] Performance enhancement (non-breaking change which improves efficiency)
  • [x] Code cleanup (non-breaking change which makes code smaller or more readable)
  • [ ] Breaking change (fix or feature that would cause existing functionality to change)
  • [ ] Documentation (a change to man pages or other documentation)

Checklist:

@codecov

codecov bot commented Jul 30, 2019

Codecov Report

Merging #9094 into master will decrease coverage by 0.1%.
The diff coverage is 100%.


@@            Coverage Diff             @@
##           master    #9094      +/-   ##
==========================================
- Coverage   79.07%   78.97%   -0.11%     
==========================================
  Files         400      401       +1     
  Lines      121653   121648       -5     
==========================================
- Hits        96202    96070     -132     
- Misses      25451    25578     +127
Flag      Coverage Δ
#kernel   79.66% <100%> (-0.03%) ⬇️
#user     66.69% <100%> (-0.01%) ⬇️

Continue to review the full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update cae97c8...0b50205.

@ahrens ahrens added the Status: Code Review Needed Ready for review and testing label Jul 30, 2019
@behlendorf behlendorf requested a review from ahrens July 30, 2019 16:26
@ahrens ahrens requested review from sdimitro and removed request for sdimitro July 30, 2019 21:13
@ahrens ahrens added Status: Accepted Ready to integrate (reviewed, tested) and removed Status: Code Review Needed Ready for review and testing labels Jul 30, 2019
@ahrens
Member

ahrens commented Jul 30, 2019

If you get a chance, could you modify the commit message to fix this style check:

error: commit message body contains line over 72 characters

IMO, we could do without the full stack traces in the commit message. Maybe you could replace them with a simpler description saying which locks are involved and the expected ordering.

@jdike
Contributor Author

jdike commented Jul 30, 2019 via email

@jdike
Contributor Author

jdike commented Jul 31, 2019 via email

@gmelikov
Member

@jdike you can update the PR just by pushing to the same branch (in this case, jdike/zfs). To change the existing commit, use git commit --amend, or use git rebase (https://git-scm.com/docs/git-rebase) to squash several commits into one.

Example to change commit description and/or files:

# make sure you're on the same branch
git add -u                # stage the files you have altered
git commit --amend        # fold the changes into the last commit and/or edit its description
git push origin HEAD -f   # force-push the changes

@ahrens
Member

ahrens commented Jul 31, 2019

@jdike This PR hasn't been merged yet (i.e. it is not yet part of zfsonlinux), so you can still modify it, including the commit message, by force-pushing to your branch on github. Thanks @gmelikov for the more detailed instructions.

This fixes a lockdep warning by breaking a link between ->tx_sync_lock
and ->dp_lock.

The deadlock envisioned by lockdep is this:
    thread 1 holds db->db_mtx and tries to get dp->dp_lock:
	dsl_pool_dirty_space+0x70/0x2d0 [zfs]
	dbuf_dirty+0x778/0x31d0 [zfs]

    thread 2 holds bpo->bpo_lock and tries to get db->db_mtx:
        dmu_buf_will_dirty_impl
        dmu_buf_will_dirty+0x6b/0x6c0 [zfs]
        bpobj_iterate_impl+0xbe6/0x1410 [zfs]

    thread 3 holds tx->tx_sync_lock and tries to get bpo->bpo_lock:
        bpobj_space+0x63/0x470 [zfs]
        dsl_scan_active+0x340/0x3d0 [zfs]
        txg_sync_thread+0x3f2/0x1370 [zfs]

    thread 4 holds dp->dp_lock and tries to get tx->tx_sync_lock:
       txg_kick+0x61/0x420 [zfs]
       dsl_pool_need_dirty_delay+0x1c7/0x3f0 [zfs]

This patch is originally from Brian Behlendorf and was slightly
simplified by me.

It breaks the cycle in thread 4 by moving dsl_pool_need_dirty_delay()'s
call to txg_kick() outside the section protected by dp->dp_lock.

Signed-off-by: Jeff Dike <jdike@akamai.com>
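
The shape of the fix is the classic pattern of deciding under one lock and acting only after dropping it. The following is a minimal, standalone C sketch of that pattern, not the merged ZFS diff: the pthread mutexes, the dirty_sync_bytes threshold, and the simplified helpers are illustrative stand-ins for dp->dp_lock, tx->tx_sync_lock, dsl_pool_need_dirty_delay(), and txg_kick().

/*
 * Sketch only.  Calling txg_kick() (which takes tx_sync_lock) while
 * dp_lock is held creates the dp_lock -> tx_sync_lock edge that
 * completes lockdep's cycle.  Deciding under dp_lock, releasing it,
 * and only then calling txg_kick() removes that edge.
 */
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

static pthread_mutex_t dp_lock = PTHREAD_MUTEX_INITIALIZER;      /* stand-in for dp->dp_lock */
static pthread_mutex_t tx_sync_lock = PTHREAD_MUTEX_INITIALIZER; /* stand-in for tx->tx_sync_lock */
static uint64_t dirty_total;                  /* protected by dp_lock */
static const uint64_t dirty_sync_bytes = 64;  /* illustrative threshold */

static void
txg_kick(void)
{
	/* Like the real txg_kick(), this takes tx_sync_lock internally. */
	pthread_mutex_lock(&tx_sync_lock);
	printf("kicking txg\n");
	pthread_mutex_unlock(&tx_sync_lock);
}

static bool
need_dirty_delay(void)
{
	uint64_t dirty;
	bool kick;

	pthread_mutex_lock(&dp_lock);
	dirty = dirty_total;
	kick = (dirty > dirty_sync_bytes);
	pthread_mutex_unlock(&dp_lock);   /* drop dp_lock first... */

	if (kick)
		txg_kick();               /* ...then take tx_sync_lock */

	return (dirty > dirty_sync_bytes);
}

int
main(void)
{
	dirty_total = 128;
	printf("need delay: %d\n", need_dirty_delay());
	return (0);
}

Because tx_sync_lock is now only ever acquired after dp_lock has been released, the dp_lock -> tx_sync_lock edge in the dependency chain above disappears and the four-lock cycle can no longer close.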
@jdike
Contributor Author

jdike commented Jul 31, 2019

I fixed the commit message and the commit now looks better. I don't seem to be able to create a new pull request. I don't know whether I'm done and this commit can be merged; if not, let me know what I need to do...

@ahrens
Member

ahrens commented Jul 31, 2019

You don't need to create a new pull request. You've successfully updated this one. You are all set, I'll get this merged. Thanks!

@ahrens ahrens merged commit 48be0df into openzfs:master Jul 31, 2019
allanjude pushed a commit to allanjude/zfs that referenced this pull request Aug 12, 2019
This fixes a lockdep warning by breaking a link between ->tx_sync_lock
and ->dp_lock.

The deadlock envisioned by lockdep is this:
    thread 1 holds db->db_mtx and tries to get dp->dp_lock:
	dsl_pool_dirty_space+0x70/0x2d0 [zfs]
	dbuf_dirty+0x778/0x31d0 [zfs]

    thread 2 holds bpo->bpo_lock and tries to get db->db_mtx:
        dmu_buf_will_dirty_impl
        dmu_buf_will_dirty+0x6b/0x6c0 [zfs]
        bpobj_iterate_impl+0xbe6/0x1410 [zfs]

    thread 3 holds tx->tx_sync_lock and tries to get bpo->bpo_lock:
        bpobj_space+0x63/0x470 [zfs]
        dsl_scan_active+0x340/0x3d0 [zfs]
        txg_sync_thread+0x3f2/0x1370 [zfs]

    thread 4 holds dp->dp_lock and tries to get tx->tx_sync_lock:
       txg_kick+0x61/0x420 [zfs]
       dsl_pool_need_dirty_delay+0x1c7/0x3f0 [zfs]

This patch is originally from Brian Behlendorf and was slightly
simplified by me.

It breaks the cycle in thread 4 by moving dsl_pool_need_dirty_delay()'s
call to txg_kick() outside the section protected by dp->dp_lock.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matt Ahrens <mahrens@delphix.com>
Signed-off-by: Jeff Dike <jdike@akamai.com>
Closes openzfs#9094
allanjude pushed a commit to allanjude/zfs that referenced this pull request Oct 13, 2019
tonyhutter pushed a commit to tonyhutter/zfs that referenced this pull request Dec 24, 2019
tonyhutter pushed a commit to tonyhutter/zfs that referenced this pull request Dec 27, 2019
tonyhutter pushed a commit that referenced this pull request Jan 23, 2020