[SRU X][PATCH 0/6] NFS FSCache Fixes: LP: #1774336, #1776277, #1776254


Daniel Axtens
Hi all,

This is a series of patches that solves a set of bugs experienced when
an NFS share with an FSCache/CacheFiles backend experiences heavy
load. There are 3 relevant LP bugs:

LP: #1774336 - FS-Cache: Assertion failed: FS-Cache: 6 == 5 is false -
We included a sauce patch to fix this because upstream hadn't done
anything in months. Then upstream acted, taking a different patch
to the one I proposed. Revert that patch (#1) and bring in the
upstream replacement (#2, #3).

LP: #1776277 - an error when tearing down an NFS share clashes with an
attempt to use an fscache object.  This is fixed by #4.

LP: #1776254 - a race where a new cookie for an object could clash with
a cookie in the process of being destroyed. Fixed in #5 and #6.

SRU Justifications:
 - LP: #1774336 - inlined below.
 - LP: #1776277 - inlined in patch 4.
 - LP: #1776254 - inlined in patch 5.

Apologies that this is a bit messy. I didn't want to split the series
because it went into upstream as one series, but I can split it to
match the LP bugs and submit it piecewise if anyone would prefer.

== SRU Justification (LP: #1774336) ==

[Impact]
Oops during heavy NFS + FSCache use:

[81738.886634] FS-Cache:
[81738.888281] FS-Cache: Assertion failed
[81738.889461] FS-Cache: 6 == 5 is false
[81738.890625] ------------[ cut here ]------------
[81738.891706] kernel BUG at /build/linux-hVVhWi/linux-4.4.0/fs/fscache/operation.c:494!

6 == 5 represents an operation being DEAD when it was not expected to be.

[Cause]
There is a race in fscache and cachefiles.

One thread is in cachefiles_read_waiter:
 1) object->work_lock is taken.
 2) the operation is added to the to_do list.
 3) the work lock is dropped.
 4) fscache_enqueue_retrieval is called, which takes a reference.

Another thread is in cachefiles_read_copier:
 1) object->work_lock is taken
 2) an item is popped off the to_do list.
 3) object->work_lock is dropped.
 4) some processing is done on the item, and fscache_put_retrieval() is called, dropping a reference.

Now, if this sequence in cachefiles_read_copier takes place *between* steps 3 and 4 in cachefiles_read_waiter, a reference is dropped before it is taken. This causes the object's reference count to hit zero, which triggers lifecycle events for the object too soon, leading to the assertion failure later on.

(This is simplified and clarified from the original upstream analysis for this patch at https://www.redhat.com/archives/linux-cachefs/2018-February/msg00001.html and from a similar patch with a different approach to fixing the bug at https://www.redhat.com/archives/linux-cachefs/2017-June/msg00002.html)

[Fix]

 (Old sauce patch being reverted) Move fscache_enqueue_retrieval under the lock in cachefiles_read_waiter. This means that the object cannot be popped off the to_do list until it is in a fully consistent state with the reference taken.

 (New upstream patch) Explicitly take a reference to the object while it is being enqueued. Adjust another part of the code to deal with the greater range of object states this exposes.

[Testcase]
A user has run ~100 hours of NFS stress tests and not seen this bug recur.

[Regression Potential]
 - Limited to fscache/cachefiles.
 - The change makes things more conservative (taking more references) so that's reassuring.
 - There may be performance impacts but none have been observed so far.

=================

Daniel Axtens (1):
  Revert "UBUNTU: SAUCE: CacheFiles: fix a read_waiter/read_copier race"

Kiran Kumar Modukuri (5):
  fscache: Allow cancelled operations to be enqueued
  cachefiles: Fix refcounting bug in backing-file read monitoring
  fscache: Fix reference overput in fscache_attach_object() error
    handling
  cachefiles: Fix missing clear of the CACHEFILES_OBJECT_ACTIVE flag
  cachefiles: Wait rather than BUG'ing on "Unexpected object collision"

 fs/cachefiles/bind.c   |  3 ++-
 fs/cachefiles/namei.c  |  4 ++--
 fs/cachefiles/rdwr.c   | 17 ++++++++++++-----
 fs/fscache/cache.c     |  2 +-
 fs/fscache/cookie.c    |  7 ++++---
 fs/fscache/object.c    |  1 +
 fs/fscache/operation.c |  6 ++++--
 7 files changed, 26 insertions(+), 14 deletions(-)

--
2.17.1


--
kernel-team mailing list
[hidden email]
https://lists.ubuntu.com/mailman/listinfo/kernel-team

[SRU X][PATCH 1/6] Revert "UBUNTU: SAUCE: CacheFiles: fix a read_waiter/read_copier race"

Daniel Axtens
BugLink: https://bugs.launchpad.net/bugs/1774336

Revert 9bd2e21fbe3786e23ce8a23d3e51f1e04f1f7800 in Xenial.
Upstream has taken a different solution, which we're about to apply.

Signed-off-by: Daniel Axtens <[hidden email]>
---
 fs/cachefiles/rdwr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/cachefiles/rdwr.c b/fs/cachefiles/rdwr.c
index d9bb47d1e16d..c0f3da3926a0 100644
--- a/fs/cachefiles/rdwr.c
+++ b/fs/cachefiles/rdwr.c
@@ -58,9 +58,9 @@ static int cachefiles_read_waiter(wait_queue_t *wait, unsigned mode,
 
  spin_lock(&object->work_lock);
  list_add_tail(&monitor->op_link, &monitor->op->to_do);
- fscache_enqueue_retrieval(monitor->op);
  spin_unlock(&object->work_lock);
 
+ fscache_enqueue_retrieval(monitor->op);
  return 0;
 }
 
--
2.17.1



[SRU X][PATCH 2/6] fscache: Allow cancelled operations to be enqueued

Daniel Axtens
From: Kiran Kumar Modukuri <[hidden email]>

BugLink: https://bugs.launchpad.net/bugs/1774336

Alter the state-check assertion in fscache_enqueue_operation() to allow
cancelled operations to be given processing time so they can be cleaned up.

Also fix a debugging statement that was requiring such operations to have
an object assigned.

Fixes: 9ae326a69004 ("CacheFiles: A cache that backs onto a mounted filesystem")
Reported-by: Kiran Kumar Modukuri <[hidden email]>
Signed-off-by: David Howells <[hidden email]>
(cherry picked from commit d0eb06afe712b7b103b6361f40a9a0c638524669)
Signed-off-by: Daniel Axtens <[hidden email]>
---
 fs/fscache/operation.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/fs/fscache/operation.c b/fs/fscache/operation.c
index de67745e1cd7..77946d6f617d 100644
--- a/fs/fscache/operation.c
+++ b/fs/fscache/operation.c
@@ -66,7 +66,8 @@ void fscache_enqueue_operation(struct fscache_operation *op)
  ASSERT(op->processor != NULL);
  ASSERT(fscache_object_is_available(op->object));
  ASSERTCMP(atomic_read(&op->usage), >, 0);
- ASSERTCMP(op->state, ==, FSCACHE_OP_ST_IN_PROGRESS);
+ ASSERTIFCMP(op->state != FSCACHE_OP_ST_IN_PROGRESS,
+    op->state, ==,  FSCACHE_OP_ST_CANCELLED);
 
  fscache_stat(&fscache_n_op_enqueue);
  switch (op->flags & FSCACHE_OP_TYPE) {
@@ -481,7 +482,8 @@ void fscache_put_operation(struct fscache_operation *op)
  struct fscache_cache *cache;
 
  _enter("{OBJ%x OP%x,%d}",
-       op->object->debug_id, op->debug_id, atomic_read(&op->usage));
+       op->object ? op->object->debug_id : 0,
+       op->debug_id, atomic_read(&op->usage));
 
  ASSERTCMP(atomic_read(&op->usage), >, 0);
 
--
2.17.1



[SRU X][PATCH 3/6] cachefiles: Fix refcounting bug in backing-file read monitoring

Daniel Axtens
From: Kiran Kumar Modukuri <[hidden email]>

BugLink: https://bugs.launchpad.net/bugs/1774336

cachefiles_read_waiter() has the right to access a 'monitor' object by
virtue of being called under the waitqueue lock for one of the pages in its
purview.  However, it has no ref on that monitor object or on the
associated operation.

What it is allowed to do is to move the monitor object to the operation's
to_do list, but once it drops the work_lock, it's actually no longer
permitted to access that object.  However, it is trying to enqueue the
retrieval operation for processing - but it can only do this via a pointer
in the monitor object, something it shouldn't be doing.

If it doesn't enqueue the operation, the operation may not get processed.
If the order is flipped so that the enqueue is first, then it's possible
for the work processor to look at the to_do list before the monitor is
enqueued upon it.

Fix this by getting a ref on the operation so that we can trust that it
will still be there once we've added the monitor to the to_do list and
dropped the work_lock.  The op can then be enqueued after the lock is
dropped.

The bug can manifest in one of a couple of ways.  The first manifestation
looks like:

 FS-Cache:
 FS-Cache: Assertion failed
 FS-Cache: 6 == 5 is false
 ------------[ cut here ]------------
 kernel BUG at fs/fscache/operation.c:494!
 RIP: 0010:fscache_put_operation+0x1e3/0x1f0
 ...
 fscache_op_work_func+0x26/0x50
 process_one_work+0x131/0x290
 worker_thread+0x45/0x360
 kthread+0xf8/0x130
 ? create_worker+0x190/0x190
 ? kthread_cancel_work_sync+0x10/0x10
 ret_from_fork+0x1f/0x30

This is due to the operation being in the DEAD state (6) rather than
INITIALISED, COMPLETE or CANCELLED (5) because it's already passed through
fscache_put_operation().

The bug can also manifest like the following:

 kernel BUG at fs/fscache/operation.c:69!
 ...
    [exception RIP: fscache_enqueue_operation+246]
 ...
 #7 [ffff883fff083c10] fscache_enqueue_operation at ffffffffa0b793c6
 #8 [ffff883fff083c28] cachefiles_read_waiter at ffffffffa0b15a48
 #9 [ffff883fff083c48] __wake_up_common at ffffffff810af028

I'm not entirely certain as to which is line 69 in Lei's kernel, so I'm not
entirely clear which assertion failed.

Fixes: 9ae326a69004 ("CacheFiles: A cache that backs onto a mounted filesystem")
Reported-by: Lei Xue <[hidden email]>
Reported-by: Vegard Nossum <[hidden email]>
Reported-by: Anthony DeRobertis <[hidden email]>
Reported-by: NeilBrown <[hidden email]>
Reported-by: Daniel Axtens <[hidden email]>
Reported-by: Kiran Kumar Modukuri <[hidden email]>
Signed-off-by: David Howells <[hidden email]>
Reviewed-by: Daniel Axtens <[hidden email]>
(cherry picked from commit 934140ab028713a61de8bca58c05332416d037d1)
Signed-off-by: Daniel Axtens <[hidden email]>
---
 fs/cachefiles/rdwr.c | 17 ++++++++++++-----
 1 file changed, 12 insertions(+), 5 deletions(-)

diff --git a/fs/cachefiles/rdwr.c b/fs/cachefiles/rdwr.c
index c0f3da3926a0..5b68cf526887 100644
--- a/fs/cachefiles/rdwr.c
+++ b/fs/cachefiles/rdwr.c
@@ -27,6 +27,7 @@ static int cachefiles_read_waiter(wait_queue_t *wait, unsigned mode,
  struct cachefiles_one_read *monitor =
  container_of(wait, struct cachefiles_one_read, monitor);
  struct cachefiles_object *object;
+ struct fscache_retrieval *op = monitor->op;
  struct wait_bit_key *key = _key;
  struct page *page = wait->private;
 
@@ -51,16 +52,22 @@ static int cachefiles_read_waiter(wait_queue_t *wait, unsigned mode,
  list_del(&wait->task_list);
 
  /* move onto the action list and queue for FS-Cache thread pool */
- ASSERT(monitor->op);
+ ASSERT(op);
 
- object = container_of(monitor->op->op.object,
-      struct cachefiles_object, fscache);
+ /* We need to temporarily bump the usage count as we don't own a ref
+ * here otherwise cachefiles_read_copier() may free the op between the
+ * monitor being enqueued on the op->to_do list and the op getting
+ * enqueued on the work queue.
+ */
+ fscache_get_retrieval(op);
 
+ object = container_of(op->op.object, struct cachefiles_object, fscache);
  spin_lock(&object->work_lock);
- list_add_tail(&monitor->op_link, &monitor->op->to_do);
+ list_add_tail(&monitor->op_link, &op->to_do);
  spin_unlock(&object->work_lock);
 
- fscache_enqueue_retrieval(monitor->op);
+ fscache_enqueue_retrieval(op);
+ fscache_put_retrieval(op);
  return 0;
 }
 
--
2.17.1



[SRU X][PATCH 4/6] fscache: Fix reference overput in fscache_attach_object() error handling

Daniel Axtens
From: Kiran Kumar Modukuri <[hidden email]>

BugLink: https://bugs.launchpad.net/bugs/1776277

When a cookie is allocated that causes fscache_object structs to be
allocated, those objects are initialised with the cookie pointer, but
aren't blessed with a ref on that cookie unless the attachment is
successfully completed in fscache_attach_object().

If attachment fails because the parent object was dying or there was a
collision, fscache_attach_object() returns without incrementing the cookie
counter - but upon failure of this function, the object is released which
then puts the cookie, whether or not a ref was taken on the cookie.

Fix this by taking a ref on the cookie when it is assigned in
fscache_object_init(), even when we're creating a root object.

Analysis from Kiran Kumar:

This bug has been seen in 4.4.0-124-generic #148-Ubuntu kernel

BugLink: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1776277

fscache cookie ref count updated incorrectly during fscache object
allocation resulting in following Oops.

kernel BUG at /build/linux-Y09MKI/linux-4.4.0/fs/fscache/internal.h:321!
kernel BUG at /build/linux-Y09MKI/linux-4.4.0/fs/fscache/cookie.c:639!

[Cause]
Two threads are trying to operate on a cookie and two objects.

(1) One thread tries to unmount the filesystem and in the process goes over a
    huge list of objects, marking them dead and deleting them.
    cookie->usage is also decremented in following path:

      nfs_fscache_release_super_cookie
       -> __fscache_relinquish_cookie
        ->__fscache_cookie_put
        ->BUG_ON(atomic_read(&cookie->usage) <= 0);

(2) A second thread tries to lookup an object for reading data in following
    path:

    fscache_alloc_object
    1) cachefiles_alloc_object
        -> fscache_object_init
           -> assign cookie, but usage not bumped.
    2) fscache_attach_object -> fails in cant_attach_object because the
         cookie's backing object or cookie's->parent object are going away
    3) fscache_put_object
        -> cachefiles_put_object
          ->fscache_object_destroy
            ->fscache_cookie_put
               ->BUG_ON(atomic_read(&cookie->usage) <= 0);

[NOTE from dhowells] It's unclear as to the circumstances in which (2) can
take place, given that thread (1) is in nfs_kill_super(), however a
conflicting NFS mount with slightly different parameters that creates a
different superblock would do it.  A backtrace from Kiran seems to show
that this is a possibility:

    kernel BUG at/build/linux-Y09MKI/linux-4.4.0/fs/fscache/cookie.c:639!
    ...
    RIP: __fscache_cookie_put+0x3a/0x40 [fscache]
    Call Trace:
     __fscache_relinquish_cookie+0x87/0x120 [fscache]
     nfs_fscache_release_super_cookie+0x2d/0xb0 [nfs]
     nfs_kill_super+0x29/0x40 [nfs]
     deactivate_locked_super+0x48/0x80
     deactivate_super+0x5c/0x60
     cleanup_mnt+0x3f/0x90
     __cleanup_mnt+0x12/0x20
     task_work_run+0x86/0xb0
     exit_to_usermode_loop+0xc2/0xd0
     syscall_return_slowpath+0x4e/0x60
     int_ret_from_sys_call+0x25/0x9f

[Fix] Bump up the cookie usage in fscache_object_init, when it is first
being assigned a cookie atomically such that the cookie is added and bumped
up if its refcount is not zero.  Remove the assignment in
fscache_attach_object().

[Testcase]
I have run ~100 hours of NFS stress tests and not seen this bug recur.

[Regression Potential]
 - Limited to fscache/cachefiles.

Fixes: ccc4fc3d11e9 ("FS-Cache: Implement the cookie management part of the netfs API")
Signed-off-by: Kiran Kumar Modukuri <[hidden email]>
Signed-off-by: David Howells <[hidden email]>
(backported from commit f29507ce66701084c39aeb1b0ae71690cbff3554)
Signed-off-by: Daniel Axtens <[hidden email]>
---
 fs/cachefiles/bind.c | 3 ++-
 fs/fscache/cache.c   | 2 +-
 fs/fscache/cookie.c  | 7 ++++---
 fs/fscache/object.c  | 1 +
 4 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/fs/cachefiles/bind.c b/fs/cachefiles/bind.c
index 6af790fc3df8..c49944b6d9fe 100644
--- a/fs/cachefiles/bind.c
+++ b/fs/cachefiles/bind.c
@@ -218,7 +218,8 @@ static int cachefiles_daemon_add_cache(struct cachefiles_cache *cache)
    "%s",
    fsdef->dentry->d_sb->s_id);
 
- fscache_object_init(&fsdef->fscache, NULL, &cache->cache);
+ fscache_object_init(&fsdef->fscache, &fscache_fsdef_index,
+    &cache->cache);
 
  ret = fscache_add_cache(&cache->cache, &fsdef->fscache, cache->tag);
  if (ret < 0)
diff --git a/fs/fscache/cache.c b/fs/fscache/cache.c
index 56cce7fdd39e..a8fe6e63ca23 100644
--- a/fs/fscache/cache.c
+++ b/fs/fscache/cache.c
@@ -220,6 +220,7 @@ int fscache_add_cache(struct fscache_cache *cache,
 {
  struct fscache_cache_tag *tag;
 
+ ASSERTCMP(ifsdef->cookie, ==, &fscache_fsdef_index);
  BUG_ON(!cache->ops);
  BUG_ON(!ifsdef);
 
@@ -248,7 +249,6 @@ int fscache_add_cache(struct fscache_cache *cache,
  if (!cache->kobj)
  goto error;
 
- ifsdef->cookie = &fscache_fsdef_index;
  ifsdef->cache = cache;
  cache->fsdef = ifsdef;
 
diff --git a/fs/fscache/cookie.c b/fs/fscache/cookie.c
index 40d61077bead..91f56e066971 100644
--- a/fs/fscache/cookie.c
+++ b/fs/fscache/cookie.c
@@ -302,6 +302,7 @@ static int fscache_alloc_object(struct fscache_cache *cache,
  goto error;
  }
 
+ ASSERTCMP(object->cookie, ==, cookie);
  fscache_stat(&fscache_n_object_alloc);
 
  object->debug_id = atomic_inc_return(&fscache_object_debug_id);
@@ -357,6 +358,8 @@ static int fscache_attach_object(struct fscache_cookie *cookie,
 
  _enter("{%s},{OBJ%x}", cookie->def->name, object->debug_id);
 
+ ASSERTCMP(object->cookie, ==, cookie);
+
  spin_lock(&cookie->lock);
 
  /* there may be multiple initial creations of this object, but we only
@@ -396,9 +399,7 @@ static int fscache_attach_object(struct fscache_cookie *cookie,
  spin_unlock(&cache->object_list_lock);
  }
 
- /* attach to the cookie */
- object->cookie = cookie;
- atomic_inc(&cookie->usage);
+ /* Attach to the cookie.  The object already has a ref on it. */
  hlist_add_head(&object->cookie_link, &cookie->backing_objects);
 
  fscache_objlist_add(object);
diff --git a/fs/fscache/object.c b/fs/fscache/object.c
index 7a182c87f378..4ae441f65af2 100644
--- a/fs/fscache/object.c
+++ b/fs/fscache/object.c
@@ -318,6 +318,7 @@ void fscache_object_init(struct fscache_object *object,
  object->store_limit_l = 0;
  object->cache = cache;
  object->cookie = cookie;
+ atomic_inc(&cookie->usage);
  object->parent = NULL;
 #ifdef CONFIG_FSCACHE_OBJECT_LIST
  RB_CLEAR_NODE(&object->objlist_link);
--
2.17.1



[SRU X][PATCH 5/6] cachefiles: Fix missing clear of the CACHEFILES_OBJECT_ACTIVE flag

Daniel Axtens
From: Kiran Kumar Modukuri <[hidden email]>

BugLink: https://bugs.launchpad.net/bugs/1776254

In cachefiles_mark_object_active(), the new object is marked active and
then we try to add it to the active object tree.  If a conflicting object
is already present, we want to wait for that to go away.  After the wait,
we go round again and try to re-mark the object as being active - but it's
already marked active from the first time we went through and a BUG is
issued.

Fix this by clearing the CACHEFILES_OBJECT_ACTIVE flag before we try again.

Analysis from Kiran Kumar Modukuri:

[Impact]
Oops during heavy NFS + FSCache + Cachefiles

CacheFiles: Error: Overlong wait for old active object to go away.

BUG: unable to handle kernel NULL pointer dereference at 0000000000000002

CacheFiles: Error: Object already active kernel BUG at
fs/cachefiles/namei.c:163!

[Cause]
In a heavily loaded system with big files being read and truncated, the
fscache object for a cookie is being dropped and a new object is being
looked up. The new object has to wait for the old object to go away
before it can be moved to the active state.

[Fix]
Clear the flag 'CACHEFILES_OBJECT_ACTIVE' for the new object when
retrying the object lookup.

[Testcase]
Have run ~100 hours of NFS stress tests and have not seen this bug recur.

[Regression Potential]
 - Limited to fscache/cachefiles.

Fixes: 9ae326a69004 ("CacheFiles: A cache that backs onto a mounted filesystem")
Signed-off-by: Kiran Kumar Modukuri <[hidden email]>
Signed-off-by: David Howells <[hidden email]>
(backported from commit 5ce83d4bb7d8e11e8c1c687d09f4b5ae67ef3ce3)
Signed-off-by: Daniel Axtens <[hidden email]>
---
 fs/cachefiles/namei.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/cachefiles/namei.c b/fs/cachefiles/namei.c
index c4b893453e0e..2bb1bae8d3bc 100644
--- a/fs/cachefiles/namei.c
+++ b/fs/cachefiles/namei.c
@@ -190,6 +190,8 @@ try_again:
  /* an old object from a previous incarnation is hogging the slot - we
  * need to wait for it to be destroyed */
 wait_for_old_object:
+ clear_bit(CACHEFILES_OBJECT_ACTIVE, &object->flags);
+
  if (fscache_object_is_live(&xobject->fscache)) {
  pr_err("\n");
  pr_err("Error: Unexpected object collision\n");
@@ -251,7 +253,6 @@ wait_for_old_object:
  goto try_again;
 
 requeue:
- clear_bit(CACHEFILES_OBJECT_ACTIVE, &object->flags);
  cache->cache.ops->put_object(&xobject->fscache);
  _leave(" = -ETIMEDOUT");
  return -ETIMEDOUT;
--
2.17.1



[SRU X][PATCH 6/6] cachefiles: Wait rather than BUG'ing on "Unexpected object collision"

Daniel Axtens
From: Kiran Kumar Modukuri <[hidden email]>

BugLink: https://bugs.launchpad.net/bugs/1776254

If we meet a conflicting object that is marked FSCACHE_OBJECT_IS_LIVE in
the active object tree, we have been emitting a BUG after logging
information about it and the new object.

Instead, we should wait for the CACHEFILES_OBJECT_ACTIVE flag to be cleared
on the old object (or return an error).  The ACTIVE flag should be cleared
after it has been removed from the active object tree.  A timeout of 60s is
used in the wait, so we shouldn't be able to get stuck there.

Fixes: 9ae326a69004 ("CacheFiles: A cache that backs onto a mounted filesystem")
Signed-off-by: Kiran Kumar Modukuri <[hidden email]>
Signed-off-by: David Howells <[hidden email]>
(cherry picked from commit c2412ac45a8f8f1cd582723c1a139608694d410d)
Signed-off-by: Daniel Axtens <[hidden email]>
---
 fs/cachefiles/namei.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/fs/cachefiles/namei.c b/fs/cachefiles/namei.c
index 2bb1bae8d3bc..2cd5b7ac3613 100644
--- a/fs/cachefiles/namei.c
+++ b/fs/cachefiles/namei.c
@@ -196,7 +196,6 @@ wait_for_old_object:
  pr_err("\n");
  pr_err("Error: Unexpected object collision\n");
  cachefiles_printk_object(object, xobject);
- BUG();
  }
  atomic_inc(&xobject->usage);
  write_unlock(&cache->active_lock);
--
2.17.1



ACK: [SRU X][PATCH 0/6] NFS FSCache Fixes: LP: #1774336, #1776277, #1776254

Kamal Mostafa

Acked-by: Kamal Mostafa <[hidden email]>

On Thu, Aug 02, 2018 at 02:18:04PM +1000, Daniel Axtens wrote:

> [...]


ACK: [SRU X][PATCH 0/6] NFS FSCache Fixes: LP: #1774336, #1776277, #1776254

Khaled Elmously
On 2018-08-02 14:18:04, Daniel Axtens wrote:

> [...]

Acked-by: Khalid Elmously <[hidden email]>

