[SRU][B][PULL v2] bcache: fix hung task timeout in bch_bucket_alloc()

Andrea Righi
BugLink: https://bugs.launchpad.net/bugs/1784665

[Impact]

The bcache_allocator thread can take the following call path:

 bch_allocator_thread()
  -> bch_prio_write()
     -> bch_bucket_alloc()
        -> wait on &ca->set->bucket_wait

But the wake-up event on bucket_wait is supposed to come from
bch_allocator_thread() itself, so the thread ends up waiting on itself
and deadlocks.
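
To make the self-wait easier to see, here is a toy userspace model in
Python. It is only a sketch, not kernel code: ToyCache, bucket_alloc(),
refill() and the timeout are all assumptions made for illustration (the
kernel wait has no timeout, which is why the real thread hangs). The only
thing it mirrors is the shape of the call path above: the thread that
would refill the free list blocks before it gets the chance to do so.

```python
import threading

class ToyCache:
    """Toy model; names echo the kernel ones, nothing else does."""
    def __init__(self):
        self.free = []                          # stand-in for the free bucket list
        self.bucket_wait = threading.Condition()

    def bucket_alloc(self, timeout):
        # Blocking allocation: sleep on bucket_wait until a bucket shows up.
        # The timeout exists only so the demo terminates.
        with self.bucket_wait:
            ok = self.bucket_wait.wait_for(lambda: bool(self.free), timeout)
            return self.free.pop() if ok else None

    def refill(self, bucket):
        # In this model only the allocator thread ever calls refill().
        with self.bucket_wait:
            self.free.append(bucket)
            self.bucket_wait.notify_all()

def allocator_thread(cache, result):
    # "prio write" step: needs a bucket before the thread gets to refill.
    result["bucket"] = cache.bucket_alloc(timeout=0.2)
    # Reached only because the model times out; without the timeout the
    # thread would sleep forever, since nobody else calls refill().
    cache.refill(1)

cache = ToyCache()
result = {}
t = threading.Thread(target=allocator_thread, args=(cache, result))
t.start()
t.join()
```

After the run, result["bucket"] is None: the wait expired because the only
possible waker was the waiting thread itself.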

[Test Case]

This is a simple script that can easily trigger the deadlock condition:
https://launchpadlibrarian.net/381282009/bcache-basic-repro.sh

A better test case has also been provided in LP: #1796292:
https://bugs.launchpad.net/curtin/+bug/1796292/+attachment/5280353/+files/curtin-nvme.sh

[Fix]

Fix by making the call to bch_prio_write() non-blocking, so that
bch_allocator_thread() never waits on itself. Moreover, make sure to
wake up the garbage collector thread when bch_prio_write() is failing to
allocate buckets to increase the chance of freeing up more buckets.
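
The idea can be sketched with a small Python toy model. This is not the
patch itself: ToyCacheNoWait, bucket_alloc_nowait(), wake_up_gc() and
allocator_pass() are hypothetical names, and the "garbage collector" here
simply frees one bucket on the spot. It only illustrates the pattern of
trying the allocation without sleeping, poking the GC on failure, and
letting the caller retry on a later pass.

```python
import threading

class ToyCacheNoWait:
    """Toy model of a non-blocking allocation path (illustrative only)."""
    def __init__(self):
        self.free = []                  # stand-in for the free bucket list
        self.lock = threading.Lock()
        self.gc_wakeups = 0

    def bucket_alloc_nowait(self):
        # Never sleeps: returns a bucket or None right away.
        with self.lock:
            return self.free.pop() if self.free else None

    def wake_up_gc(self):
        # Assumption: in this model the GC frees one bucket immediately.
        with self.lock:
            self.gc_wakeups += 1
            self.free.append(f"bucket-{self.gc_wakeups}")

def allocator_pass(cache):
    # One pass of the allocator's "prio write" step.
    bucket = cache.bucket_alloc_nowait()
    if bucket is None:
        cache.wake_up_gc()   # improve the odds for the next pass
        return None          # give up for now instead of waiting on ourselves
    return bucket

cache = ToyCacheNoWait()
first = allocator_pass(cache)    # no buckets yet: returns at once, GC poked
second = allocator_pass(cache)   # the GC freed a bucket in the meantime
```

In this model the first pass fails without blocking and the second one
succeeds, so the allocator keeps making progress.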

In addition, it would be safer to also import the following upstream
bcache fixes (all clean cherry picks):

ce4c3e19e5201424357a0c82176633b32a98d2ec bcache: Replace bch_read_string_list() by __sysfs_match_string()
ecb37ce9baac653cc09e2b631393dde3df82979f bcache: Move couple of functions to sysfs.c
04cbc21137bfa4d7b8771a5b14f3d6c9b2aee671 bcache: Move couple of string arrays to sysfs.c
5f2b18ec8e1643410a2369f06888951cdedea0bf bcache: Fix a compiler warning in bcache_device_init()
20d3a518713e394efa5a899c84574b4b79ec5098 bcache: Reduce the number of sparse complaints about lock imbalances
42361469ae84c851e40cb1f94c8c9a14cdd94039 bcache: Suppress more warnings about set-but-not-used variables
f0d3814090ac77de94c42b7124c37ece23629197 bcache: Remove an unused variable
47344e330eabc1515cbe6061eb337100a3ab6d37 bcache: Fix kernel-doc warnings
9dfbdec7b7fea1ff1b7b5d5d12980dbc7dca46c7 bcache: Annotate switch fall-through
4a4e443835a43a79113cc237c472c0d268eb1e1c bcache: Add __printf annotation to __bch_check_keys()
fd01991d5c20098c5c1ffc4dca6c821cc60a2f74 bcache: Fix indentation
ca71df31661a0518ed58a1a59cf1993962153ebb bcache: fix using of loop variable in memory shrink
f3641c3abd1da978ee969b0203b71b86ec1bfa93 bcache: fix error return value in memory shrink
688892b3bc05e25da94866e32210e5f503f16f69 bcache: fix incorrect sysfs output value of strip size
09a44ca2114737e0932257619c16a2b50c7807f1 bcache: use pr_info() to inform duplicated CACHE_SET_IO_DISABLE set
c4dc2497d50d9c6fb16aa0d07b6a14f3b2adb1e0 bcache: fix high CPU occupancy during journal
a728eacbbdd229d1d903e46261c57d5206f87a4a bcache: add journal statistic
616486ab52ab7f9739b066d958bdd20e65aefd74 bcache: fix writeback target calc on large devices
eb8cbb6df38f6e5124a3d5f1f8a3dbf519537c60 bcache: improve bcache_reboot()
9951379b0ca88c95876ad9778b9099e19a95d566 bcache: never writeback a discard operation

[Regression Potential]

The upstream fixes are all clean cherry picks from stable (most of them
are small cleanups), so regression potential is minimal.

The only special patch is "UBUNTU: SAUCE: bcache: fix deadlock in
bcache_allocator" that is addressing the main deadlock bug (that seems
to be a mainline bug - not fixed yet). We should spend more time trying
to reproduce this deadlock with a mainline kernel and post the patch to
the LKML for review / feedback.

However, considering that this patch seems to fix/prevent the specific
deadlock problem reported in this bug (tested on the affected platform),
it should be considered safe to apply it as it is for now, to prevent
potential hung task timeout conditions.

Changes in v2:
 - fix potential buckets leak in "UBUNTU: SAUCE: bcache: fix deadlock in
   bcache_allocator"

---
The following changes since commit 0bec748bb0dbb97ef4075b42843c054678a10bf9:

  UBUNTU: upstream stable to v4.14.136, v4.19.64 (2019-08-07 01:53:42 -0400)

are available in the Git repository at:

  git://git.launchpad.net/~arighi/+git/bionic-linux bcache-fix-v2

for you to fetch changes up to 113fddeeca479432205f61ff77d9550442bf4256:

  UBUNTU: SAUCE: bcache: fix deadlock in bcache_allocator (2019-08-07 13:51:23 +0200)

----------------------------------------------------------------
Andrea Righi (1):
      UBUNTU: SAUCE: bcache: fix deadlock in bcache_allocator

Andy Shevchenko (3):
      bcache: Move couple of string arrays to sysfs.c
      bcache: Move couple of functions to sysfs.c
      bcache: Replace bch_read_string_list() by __sysfs_match_string()

Bart Van Assche (8):
      bcache: Fix indentation
      bcache: Add __printf annotation to __bch_check_keys()
      bcache: Annotate switch fall-through
      bcache: Fix kernel-doc warnings
      bcache: Remove an unused variable
      bcache: Suppress more warnings about set-but-not-used variables
      bcache: Reduce the number of sparse complaints about lock imbalances
      bcache: Fix a compiler warning in bcache_device_init()

Coly Li (2):
      bcache: improve bcache_reboot()
      bcache: use pr_info() to inform duplicated CACHE_SET_IO_DISABLE set

Daniel Axtens (1):
      bcache: never writeback a discard operation

Michael Lyle (1):
      bcache: fix writeback target calc on large devices

Tang Junhui (5):
      bcache: add journal statistic
      bcache: fix high CPU occupancy during journal
      bcache: fix incorrect sysfs output value of strip size
      bcache: fix error return value in memory shrink
      bcache: fix using of loop variable in memory shrink

 drivers/md/bcache/alloc.c     |  5 +++-
 drivers/md/bcache/bcache.h    | 10 +++++--
 drivers/md/bcache/bset.c      |  4 +--
 drivers/md/bcache/bset.h      |  5 ++--
 drivers/md/bcache/btree.c     | 15 ++++++----
 drivers/md/bcache/closure.c   |  8 ++---
 drivers/md/bcache/extents.c   |  2 --
 drivers/md/bcache/journal.c   | 56 +++++++++++++++++++++++++----------
 drivers/md/bcache/request.c   |  1 +
 drivers/md/bcache/super.c     | 65 ++++++++++++++++++++++-------------------
 drivers/md/bcache/sysfs.c     | 68 ++++++++++++++++++++++++++++++++++---------
 drivers/md/bcache/util.c      | 60 ++++++++++----------------------------
 drivers/md/bcache/util.h      |  7 ++---
 drivers/md/bcache/writeback.c | 31 +++++++++++++++++---
 drivers/md/bcache/writeback.h | 12 +++++++-
 15 files changed, 215 insertions(+), 134 deletions(-)

--
kernel-team mailing list
[hidden email]
https://lists.ubuntu.com/mailman/listinfo/kernel-team

Re: [SRU][B][PULL v2] bcache: fix hung task timeout in bch_bucket_alloc()

Andrea Righi
On Wed, Aug 07, 2019 at 02:29:05PM +0200, Andrea Righi wrote:
...

> [Regression Potential]
>
> The upstream fixes are all clean cherry picks from stable (most of them
> are small cleanups), so regression potential is minimal.
>
> The only special patch is "UBUNTU: SAUCE: bcache: fix deadlock in
> bcache_allocator" that is addressing the main deadlock bug (that seems
> to be a mainline bug - not fixed yet). We should spend more time trying
> to reproduce this deadlock with a mainline kernel and post the patch to
> the LKML for review / feedback.
>
> However, considering that this patch seems to fix/prevent the specific
> deadlock problem reported in this bug (tested on the affected platform),
> it should be considered safe to apply it as it is for now, to prevent
> potential hung task timeout conditions.

Sorry, this comment is not valid anymore. The bug has been reproduced
with the latest mainline kernel (from Linus' git) and the fix has been
successfully tested, so we can confirm that it's definitely a mainline
bug.

-Andrea

ACK/Cmnt: [SRU][B][PULL v2] bcache: fix hung task timeout in bch_bucket_alloc()

Stefan Bader-2
In reply to this post by Andrea Righi
On 07.08.19 14:29, Andrea Righi wrote:

...
>  15 files changed, 215 insertions(+), 134 deletions(-)
>
This is quite a big change; however, it is all limited to the bcache driver
(whose maintainers are not the quickest at maintaining stable patches), so it
should be possible to test. So, on the assumption that you will be testing the
applied set with the reproducer and some generic fs stress test...

Acked-by: Stefan Bader <[hidden email]>


ACK: [SRU][B][PULL v2] bcache: fix hung task timeout in bch_bucket_alloc()

Kleber Sacilotto de Souza
In reply to this post by Andrea Righi
On 8/7/19 2:29 PM, Andrea Righi wrote:

...

Acked-by: Kleber Sacilotto de Souza <[hidden email]>

APPLIED: [SRU][B][PULL v2] bcache: fix hung task timeout in bch_bucket_alloc()

Khaled Elmously
In reply to this post by Andrea Righi
On 2019-08-07 14:29:05 , Andrea Righi wrote:

...
