[SRU][D][PATCH v2 0/3] bcache: fix hung task timeout in bch_bucket_alloc()

Andrea Righi
BugLink: https://bugs.launchpad.net/bugs/1784665

[Impact]

The bcache_allocator thread can follow this call chain:

 bch_allocator_thread()
  -> bch_prio_write()
     -> bch_bucket_alloc()
        -> wait on &ca->set->bucket_wait

But the wake-up event on bucket_wait is supposed to come from
bch_allocator_thread() itself, so the thread ends up waiting on itself
and deadlocks.
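
The self-wait described above can be illustrated with a small userspace
model. This is only a sketch: struct model_fifo, model_alloc() and
model_would_deadlock() are hypothetical names that do not exist in
bcache, and the model merely classifies the outcome instead of sleeping.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are bcache APIs. */
struct model_fifo {
	size_t used;	/* buckets currently available */
};

enum alloc_result { ALLOC_OK, ALLOC_WOULD_BLOCK, ALLOC_FAILED };

/*
 * Model of one allocation attempt. With wait == true an empty fifo means
 * the caller would sleep until somebody refills it; with wait == false
 * the caller gets an immediate failure instead.
 */
static enum alloc_result model_alloc(struct model_fifo *f, bool wait)
{
	assert(f != NULL);

	if (f->used > 0) {
		f->used--;
		return ALLOC_OK;
	}
	return wait ? ALLOC_WOULD_BLOCK : ALLOC_FAILED;
}

/*
 * A sleeping caller can only be woken by another thread refilling the
 * fifo. If the caller is itself the only refiller, nobody is left to
 * issue the wake-up.
 */
static bool model_would_deadlock(enum alloc_result r, bool caller_is_refiller)
{
	return r == ALLOC_WOULD_BLOCK && caller_is_refiller;
}
```

In this model, a blocking attempt on an empty fifo made by the refiller
itself is the only combination flagged as a deadlock; the non-blocking
attempt returns ALLOC_FAILED and leaves the caller free to do other work.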

[Test Case]

This is a simple script that can easily trigger the deadlock condition:
https://launchpadlibrarian.net/381282009/bcache-basic-repro.sh

A better test case has also been provided in LP: #1796292:
https://bugs.launchpad.net/curtin/+bug/1796292/+attachment/5280353/+files/curtin-nvme.sh

[Fix]

Fix by making the call to bch_prio_write() non-blocking, so that
bch_allocator_thread() never waits on itself. Moreover, wake up the
garbage collector thread when bch_prio_write() fails to allocate
buckets, to increase the chance of freeing up more buckets.

In addition, it would be safer to also import the following upstream
bcache fixes (all clean cherry picks):

eb8cbb6df38f6e5124a3d5f1f8a3dbf519537c60 bcache: improve bcache_reboot()
9951379b0ca88c95876ad9778b9099e19a95d566 bcache: never writeback a discard operation

[Regression Potential]

The upstream fixes are all clean cherry picks from stable (most of them
are small cleanups), so regression potential is minimal.

The only special patch is "UBUNTU: SAUCE: bcache: fix deadlock in
bcache_allocator", which addresses the main deadlock bug (it appears to
be a mainline bug that is not fixed yet). We should spend more time
trying to reproduce this deadlock with a mainline kernel and post the
patch to the LKML for review / feedback.

However, considering that this patch seems to fix/prevent the specific
deadlock problem reported in this bug (tested on the affected platform),
it should be considered safe to apply it as it is for now, to prevent
potential hung task timeout conditions.

Changes in v2:
 - fix potential bucket leak in "UBUNTU: SAUCE: bcache: fix deadlock in
   bcache_allocator"

----------------------------------------------------------------
Andrea Righi (1):
      UBUNTU: SAUCE: bcache: fix deadlock in bcache_allocator

Coly Li (1):
      bcache: improve bcache_reboot()

Daniel Axtens (1):
      bcache: never writeback a discard operation

 drivers/md/bcache/alloc.c     |  5 ++++-
 drivers/md/bcache/bcache.h    |  2 +-
 drivers/md/bcache/super.c     | 39 +++++++++++++++++++++++++++++++--------
 drivers/md/bcache/writeback.h |  3 +++
 4 files changed, 39 insertions(+), 10 deletions(-)


--
kernel-team mailing list
[hidden email]
https://lists.ubuntu.com/mailman/listinfo/kernel-team

[PATCH 1/3] bcache: never writeback a discard operation

Andrea Righi
From: Daniel Axtens <[hidden email]>

BugLink: https://bugs.launchpad.net/bugs/1784665

Some users see panics like the following when performing fstrim on a
bcached volume:

[  529.803060] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
[  530.183928] #PF error: [normal kernel read fault]
[  530.412392] PGD 8000001f42163067 P4D 8000001f42163067 PUD 1f42168067 PMD 0
[  530.750887] Oops: 0000 [#1] SMP PTI
[  530.920869] CPU: 10 PID: 4167 Comm: fstrim Kdump: loaded Not tainted 5.0.0-rc1+ #3
[  531.290204] Hardware name: HP ProLiant DL360 Gen9/ProLiant DL360 Gen9, BIOS P89 12/27/2015
[  531.693137] RIP: 0010:blk_queue_split+0x148/0x620
[  531.922205] Code: 60 38 89 55 a0 45 31 db 45 31 f6 45 31 c9 31 ff 89 4d 98 85 db 0f 84 7f 04 00 00 44 8b 6d 98 4c 89 ee 48 c1 e6 04 49 03 70 78 <8b> 46 08 44 8b 56 0c 48 8b 16 44 29 e0 39 d8 48 89 55 a8 0f 47 c3
[  532.838634] RSP: 0018:ffffb9b708df39b0 EFLAGS: 00010246
[  533.093571] RAX: 00000000ffffffff RBX: 0000000000046000 RCX: 0000000000000000
[  533.441865] RDX: 0000000000000200 RSI: 0000000000000000 RDI: 0000000000000000
[  533.789922] RBP: ffffb9b708df3a48 R08: ffff940d3b3fdd20 R09: 0000000000000000
[  534.137512] R10: ffffb9b708df3958 R11: 0000000000000000 R12: 0000000000000000
[  534.485329] R13: 0000000000000000 R14: 0000000000000000 R15: ffff940d39212020
[  534.833319] FS:  00007efec26e3840(0000) GS:ffff940d1f480000(0000) knlGS:0000000000000000
[  535.224098] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  535.504318] CR2: 0000000000000008 CR3: 0000001f4e256004 CR4: 00000000001606e0
[  535.851759] Call Trace:
[  535.970308]  ? mempool_alloc_slab+0x15/0x20
[  536.174152]  ? bch_data_insert+0x42/0xd0 [bcache]
[  536.403399]  blk_mq_make_request+0x97/0x4f0
[  536.607036]  generic_make_request+0x1e2/0x410
[  536.819164]  submit_bio+0x73/0x150
[  536.980168]  ? submit_bio+0x73/0x150
[  537.149731]  ? bio_associate_blkg_from_css+0x3b/0x60
[  537.391595]  ? _cond_resched+0x1a/0x50
[  537.573774]  submit_bio_wait+0x59/0x90
[  537.756105]  blkdev_issue_discard+0x80/0xd0
[  537.959590]  ext4_trim_fs+0x4a9/0x9e0
[  538.137636]  ? ext4_trim_fs+0x4a9/0x9e0
[  538.324087]  ext4_ioctl+0xea4/0x1530
[  538.497712]  ? _copy_to_user+0x2a/0x40
[  538.679632]  do_vfs_ioctl+0xa6/0x600
[  538.853127]  ? __do_sys_newfstat+0x44/0x70
[  539.051951]  ksys_ioctl+0x6d/0x80
[  539.212785]  __x64_sys_ioctl+0x1a/0x20
[  539.394918]  do_syscall_64+0x5a/0x110
[  539.568674]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

We have observed it when both of the following are true:
1) LVM/devmapper is involved (bcache backing device is an LVM volume) and
2) writeback cache is involved (bcache cache_mode is writeback)

On one machine, we can reliably reproduce it with:

 # echo writeback > /sys/block/bcache0/bcache/cache_mode
   (not sure whether above line is required)
 # mount /dev/bcache0 /test
 # for i in {0..10}; do
        file="$(mktemp /test/zero.XXX)"
        dd if=/dev/zero of="$file" bs=1M count=256
        sync
        rm "$file"
    done
 # fstrim -v /test

Observing this with tracepoints on, we see the following writes:

fstrim-18019 [022] .... 91107.302026: bcache_write: 73f95583-561c-408f-a93a-4cbd2498f5c8 inode 0  DS 4260112 + 196352 hit 0 bypass 1
fstrim-18019 [022] .... 91107.302050: bcache_write: 73f95583-561c-408f-a93a-4cbd2498f5c8 inode 0  DS 4456464 + 262144 hit 0 bypass 1
fstrim-18019 [022] .... 91107.302075: bcache_write: 73f95583-561c-408f-a93a-4cbd2498f5c8 inode 0  DS 4718608 + 81920 hit 0 bypass 1
fstrim-18019 [022] .... 91107.302094: bcache_write: 73f95583-561c-408f-a93a-4cbd2498f5c8 inode 0  DS 5324816 + 180224 hit 0 bypass 1
fstrim-18019 [022] .... 91107.302121: bcache_write: 73f95583-561c-408f-a93a-4cbd2498f5c8 inode 0  DS 5505040 + 262144 hit 0 bypass 1
fstrim-18019 [022] .... 91107.302145: bcache_write: 73f95583-561c-408f-a93a-4cbd2498f5c8 inode 0  DS 5767184 + 81920 hit 0 bypass 1
fstrim-18019 [022] .... 91107.308777: bcache_write: 73f95583-561c-408f-a93a-4cbd2498f5c8 inode 0  DS 6373392 + 180224 hit 1 bypass 0
<crash>

Note the final one has different hit/bypass flags.

This is because in should_writeback(), we were hitting a case where
the partial stripe condition was returning true and so
should_writeback() was returning true early.

If that hadn't been the case, it would have hit the would_skip test, and
as would_skip == s->iop.bypass == true, should_writeback() would have
returned false.

Looking at the git history from 'commit 72c270612bd3 ("bcache: Write out
full stripes")', it looks like the idea was to optimise for raid5/6:

       * If a stripe is already dirty, force writes to that stripe to
         writeback mode - to help build up full stripes of dirty data

To fix this issue, make sure that should_writeback() on a discard op
never returns true.
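
The ordering problem described above can be sketched as a userspace
model. This is an illustration under stated assumptions, not the kernel
function: struct wb_request and model_should_writeback() are
hypothetical names, the cache-mode and utilisation checks of the real
should_writeback() are omitted, and its final decision is reduced to
"return true".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the state should_writeback() reads. */
struct wb_request {
	bool is_discard;	/* models bio_op(bio) == REQ_OP_DISCARD */
	bool stripe_dirty;	/* models the partial stripe condition */
	bool would_skip;	/* models would_skip (s->iop.bypass) */
};

/*
 * Decision order of the model. 'fixed' selects whether the discard check
 * runs before the partial stripe check.
 */
static bool model_should_writeback(const struct wb_request *r, bool fixed)
{
	assert(r != NULL);

	if (fixed && r->is_discard)
		return false;	/* a discard is never written back */

	if (r->stripe_dirty)
		return true;	/* early return that the discard used to hit */

	if (r->would_skip)
		return false;

	return true;		/* simplified: the real function checks more */
}
```

With is_discard, stripe_dirty and would_skip all set, the model returns
true without the discard check and false with it, which matches the
change in the hit/bypass flags seen in the last trace line.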

More details of debugging:
https://www.spinics.net/lists/linux-bcache/msg06996.html

Previous reports:
 - https://bugzilla.kernel.org/show_bug.cgi?id=201051
 - https://bugzilla.kernel.org/show_bug.cgi?id=196103
 - https://www.spinics.net/lists/linux-bcache/msg06885.html

(Coly Li: minor modification to follow maximum 75 chars per line rule)

Cc: Kent Overstreet <[hidden email]>
Cc: [hidden email]
Fixes: 72c270612bd3 ("bcache: Write out full stripes")
Signed-off-by: Daniel Axtens <[hidden email]>
Signed-off-by: Coly Li <[hidden email]>
Signed-off-by: Jens Axboe <[hidden email]>
(cherry picked from commit 9951379b0ca88c95876ad9778b9099e19a95d566)
Signed-off-by: Andrea Righi <[hidden email]>
---
 drivers/md/bcache/writeback.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/md/bcache/writeback.h b/drivers/md/bcache/writeback.h
index 4e4c6810dc3c..8e049a16b3e6 100644
--- a/drivers/md/bcache/writeback.h
+++ b/drivers/md/bcache/writeback.h
@@ -74,6 +74,9 @@ static inline bool should_writeback(struct cached_dev *dc, struct bio *bio,
 	if (bio_op(bio) == REQ_OP_DISCARD)
 		return false;
 
+	if (bio_op(bio) == REQ_OP_DISCARD)
+		return false;
+
 	if (dc->partial_stripes_expensive &&
 	    bcache_dev_stripe_dirty(dc, bio->bi_iter.bi_sector,
 				    bio_sectors(bio)))
--
2.20.1



[PATCH 2/3] bcache: improve bcache_reboot()

Andrea Righi
In reply to this post by Andrea Righi
From: Coly Li <[hidden email]>

BugLink: https://bugs.launchpad.net/bugs/1784665

This patch tries to release the mutex bch_register_lock early, to give
the cache sets and bcache devices a chance to stop early.

This patch also extends the timeout for stopping all bcache devices from
2 seconds to 10 seconds: stopping the writeback rate update worker can
take 5 seconds, so 2 seconds is not enough.

With this patch applied, a timeout while stopping bcache devices during
system reboot or shutdown is very hard to observe any more.
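
The effect of the longer timeout can be sketched with plain arithmetic.
This is a userspace model only: MODEL_HZ is an assumed tick rate, and
model_remaining() and model_timed_out() are hypothetical helpers that
mirror the expression "start + N * HZ - jiffies"; treating a negative
remainder as "give up" is an assumption about the wait loop.

```c
#include <assert.h>

#define MODEL_HZ 250L	/* assumed tick rate, for illustration only */

/* Remaining wait in ticks, mirroring 'start + seconds * HZ - jiffies'. */
static long model_remaining(long start, long now, long seconds)
{
	assert(seconds > 0);
	return start + seconds * MODEL_HZ - now;
}

/* Nonzero once the budget is exhausted and the wait loop should give up. */
static int model_timed_out(long start, long now, long seconds)
{
	return model_remaining(start, now, seconds) < 0;
}
```

If a worker needs just over 5 seconds to stop, the 2 second budget has
already expired at that point while the 10 second budget still has time
left.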

Signed-off-by: Coly Li <[hidden email]>
Reviewed-by: Hannes Reinecke <[hidden email]>
Signed-off-by: Jens Axboe <[hidden email]>
(cherry picked from commit eb8cbb6df38f6e5124a3d5f1f8a3dbf519537c60)
Signed-off-by: Andrea Righi <[hidden email]>
---
 drivers/md/bcache/super.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
index ffc3093dd9f7..5f7b3ce09c6f 100644
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -2384,10 +2384,19 @@ static int bcache_reboot(struct notifier_block *n, unsigned long code, void *x)
 		list_for_each_entry_safe(dc, tdc, &uncached_devices, list)
 			bcache_device_stop(&dc->disk);
 
+		mutex_unlock(&bch_register_lock);
+
+		/*
+		 * Give an early chance for other kthreads and
+		 * kworkers to stop themselves
+		 */
+		schedule();
+
 		/* What's a condition variable? */
 		while (1) {
-			long timeout = start + 2 * HZ - jiffies;
+			long timeout = start + 10 * HZ - jiffies;
 
+			mutex_lock(&bch_register_lock);
 			stopped = list_empty(&bch_cache_sets) &&
 				list_empty(&uncached_devices);
 
@@ -2399,7 +2408,6 @@ static int bcache_reboot(struct notifier_block *n, unsigned long code, void *x)
 
 			mutex_unlock(&bch_register_lock);
 			schedule_timeout(timeout);
-			mutex_lock(&bch_register_lock);
 		}
 
 		finish_wait(&unregister_wait, &wait);
--
2.20.1



[PATCH 3/3] UBUNTU: SAUCE: bcache: fix deadlock in bcache_allocator

Andrea Righi
In reply to this post by Andrea Righi
The bcache_allocator thread can follow this call chain:

 bch_allocator_thread()
  -> bch_prio_write()
     -> bch_bucket_alloc()
        -> wait on &ca->set->bucket_wait

But the wake-up event on bucket_wait is supposed to come from
bch_allocator_thread() itself, which results in a deadlock:

[ 1158.490744] INFO: task bcache_allocato:15861 blocked for more than 10 seconds.
[ 1158.495929]       Not tainted 5.3.0-050300rc3-generic #201908042232
[ 1158.500653] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1158.504413] bcache_allocato D    0 15861      2 0x80004000
[ 1158.504419] Call Trace:
[ 1158.504429]  __schedule+0x2a8/0x670
[ 1158.504432]  schedule+0x2d/0x90
[ 1158.504448]  bch_bucket_alloc+0xe5/0x370 [bcache]
[ 1158.504453]  ? wait_woken+0x80/0x80
[ 1158.504466]  bch_prio_write+0x1dc/0x390 [bcache]
[ 1158.504476]  bch_allocator_thread+0x233/0x490 [bcache]
[ 1158.504491]  kthread+0x121/0x140
[ 1158.504503]  ? invalidate_buckets+0x890/0x890 [bcache]
[ 1158.504506]  ? kthread_park+0xb0/0xb0
[ 1158.504510]  ret_from_fork+0x35/0x40

Fix by making the call to bch_prio_write() non-blocking, so that
bch_allocator_thread() never waits on itself.

Moreover, make sure to wake up the garbage collector thread when
bch_prio_write() fails to allocate buckets.
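
The non-blocking path described above can be modelled in userspace as
follows. This is a sketch under stated assumptions: struct model_cache,
model_prio_write_nowait() and model_allocator_step() are hypothetical
names, the order in which the two reserves are consumed is simplified,
and waking the garbage collector is reduced to setting a flag.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of one cache; not the kernel's struct cache. */
struct model_cache {
	size_t free_prio;	/* models fifo_used(&ca->free[RESERVE_PRIO]) */
	size_t free_none;	/* models fifo_used(&ca->free[RESERVE_NONE]) */
	size_t prio_buckets;	/* models prio_buckets(ca) */
	bool needs_gc;		/* models ca->invalidate_needs_gc */
};

/*
 * Non-blocking prio write: check that enough buckets are free before
 * taking any of them, so a failure never leaves buckets half-allocated.
 */
static int model_prio_write_nowait(struct model_cache *ca)
{
	size_t i;

	assert(ca != NULL);

	if (ca->prio_buckets > ca->free_prio + ca->free_none)
		return -ENOMEM;

	/* The pre-check guarantees that every iteration finds a bucket. */
	for (i = 0; i < ca->prio_buckets; i++) {
		if (ca->free_prio > 0)
			ca->free_prio--;
		else
			ca->free_none--;
	}
	return 0;
}

/* One allocator iteration: on failure, request GC instead of sleeping. */
static void model_allocator_step(struct model_cache *ca)
{
	if (model_prio_write_nowait(ca) < 0)
		ca->needs_gc = true;	/* the patch also calls wake_up_gc() */
}
```

Failing before any bucket is taken means that an unsuccessful attempt
leaves both reserves untouched, so there is nothing to clean up.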

BugLink: https://bugs.launchpad.net/bugs/1784665
BugLink: https://bugs.launchpad.net/bugs/1796292
Signed-off-by: Andrea Righi <[hidden email]>
---
 drivers/md/bcache/alloc.c  |  5 ++++-
 drivers/md/bcache/bcache.h |  2 +-
 drivers/md/bcache/super.c  | 27 +++++++++++++++++++++------
 3 files changed, 26 insertions(+), 8 deletions(-)

diff --git a/drivers/md/bcache/alloc.c b/drivers/md/bcache/alloc.c
index 5002838ea476..0a2cdaac682e 100644
--- a/drivers/md/bcache/alloc.c
+++ b/drivers/md/bcache/alloc.c
@@ -376,7 +376,10 @@ static int bch_allocator_thread(void *arg)
 			if (!fifo_full(&ca->free_inc))
 				goto retry_invalidate;
 
-			bch_prio_write(ca);
+			if (bch_prio_write(ca, false) < 0) {
+				ca->invalidate_needs_gc = 1;
+				wake_up_gc(ca->set);
+			}
 		}
 	}
 out:
diff --git a/drivers/md/bcache/bcache.h b/drivers/md/bcache/bcache.h
index e5d2158f4f32..9f64ae22915b 100644
--- a/drivers/md/bcache/bcache.h
+++ b/drivers/md/bcache/bcache.h
@@ -979,7 +979,7 @@ bool bch_cached_dev_error(struct cached_dev *dc);
 __printf(2, 3)
 bool bch_cache_set_error(struct cache_set *c, const char *fmt, ...);
 
-void bch_prio_write(struct cache *ca);
+int bch_prio_write(struct cache *ca, bool wait);
 void bch_write_bdev_super(struct cached_dev *dc, struct closure *parent);
 
 extern struct workqueue_struct *bcache_wq;
diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
index 5f7b3ce09c6f..9176f5962aa6 100644
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -525,12 +525,29 @@ static void prio_io(struct cache *ca, uint64_t bucket, int op,
 	closure_sync(cl);
 }
 
-void bch_prio_write(struct cache *ca)
+int bch_prio_write(struct cache *ca, bool wait)
 {
 	int i;
 	struct bucket *b;
 	struct closure cl;
 
+	pr_debug("free_prio=%zu, free_none=%zu, free_inc=%zu",
+		 fifo_used(&ca->free[RESERVE_PRIO]),
+		 fifo_used(&ca->free[RESERVE_NONE]),
+		 fifo_used(&ca->free_inc));
+
+	/*
+	 * Pre-check if there are enough free buckets. In the non-blocking
+	 * scenario it's better to fail early rather than starting to allocate
+	 * buckets and do a cleanup later in case of failure.
+	 */
+	if (!wait) {
+		size_t avail = fifo_used(&ca->free[RESERVE_PRIO]) +
+			       fifo_used(&ca->free[RESERVE_NONE]);
+		if (prio_buckets(ca) > avail)
+			return -ENOMEM;
+	}
+
 	closure_init_stack(&cl);
 
 	lockdep_assert_held(&ca->set->bucket_lock);
@@ -540,9 +557,6 @@ void bch_prio_write(struct cache *ca)
 	atomic_long_add(ca->sb.bucket_size * prio_buckets(ca),
 			&ca->meta_sectors_written);
 
-	//pr_debug("free %zu, free_inc %zu, unused %zu", fifo_used(&ca->free),
-	//	 fifo_used(&ca->free_inc), fifo_used(&ca->unused));
-
 	for (i = prio_buckets(ca) - 1; i >= 0; --i) {
 		long bucket;
 		struct prio_set *p = ca->disk_buckets;
@@ -560,7 +574,7 @@ void bch_prio_write(struct cache *ca)
 		p->magic = pset_magic(&ca->sb);
 		p->csum = bch_crc64(&p->magic, bucket_bytes(ca) - 8);
 
-		bucket = bch_bucket_alloc(ca, RESERVE_PRIO, true);
+		bucket = bch_bucket_alloc(ca, RESERVE_PRIO, wait);
 		BUG_ON(bucket == -1);
 
 		mutex_unlock(&ca->set->bucket_lock);
@@ -589,6 +603,7 @@ void bch_prio_write(struct cache *ca)
 
 		ca->prio_last_buckets[i] = ca->prio_buckets[i];
 	}
+	return 0;
 }
 
 static void prio_read(struct cache *ca, uint64_t bucket)
@@ -1880,7 +1895,7 @@ static void run_cache_set(struct cache_set *c)
 
 		mutex_lock(&c->bucket_lock);
 		for_each_cache(ca, c, i)
-			bch_prio_write(ca);
+			bch_prio_write(ca, true);
 		mutex_unlock(&c->bucket_lock);
 
 		err = "cannot allocate new UUID bucket";
--
2.20.1



ACK: [SRU][D][PATCH v2 0/3] bcache: fix hung task timeout in bch_bucket_alloc()

Stefan Bader-2
In reply to this post by Andrea Righi
On 07.08.19 14:46, Andrea Righi wrote:

> BugLink: https://bugs.launchpad.net/bugs/1784665
> [...]

Acked-by: Stefan Bader <[hidden email]>



ACK: [SRU][D][PATCH v2 0/3] bcache: fix hung task timeout in bch_bucket_alloc()

Kleber Sacilotto de Souza
In reply to this post by Andrea Righi
On 8/7/19 2:46 PM, Andrea Righi wrote:

> BugLink: https://bugs.launchpad.net/bugs/1784665
> [...]

Acked-by: Kleber Sacilotto de Souza <[hidden email]>


APPLIED: [SRU][D][PATCH v2 0/3] bcache: fix hung task timeout in bch_bucket_alloc()

Khalid Elmously
In reply to this post by Andrea Righi
On 2019-08-07 14:46:14 , Andrea Righi wrote:

> BugLink: https://bugs.launchpad.net/bugs/1784665
> [...]