[SRU X][PATCH v2 0/1] Fix LP: #1793430 - cachefiles page leak


Daniel Axtens
SRU Justification
-----------------

[Description]

On a heavily loaded system where the system page cache is nearing
memory limits and fscache is enabled, fscache can leak pages while
trying to read pages from the cachefiles backend. This can happen when
two applications read the same page from a single mount, so that two
threads try to read the backing page at the same time. One of the
threads then finds that a page for the backing file or the netfs file
is already in the radix tree. During error handling, cachefiles does
not drop its reference on the backing page, which leaks the page.
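
The collision described above can be sketched in userspace. This is only
an illustrative model, not kernel code: struct model_mapping and
model_add_to_cache() are hypothetical names, and a fixed array of atomic
slots stands in for the radix tree. It shows how the second of two
inserts at the same index comes back with -EEXIST.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stddef.h>

#define MODEL_SLOTS 16

/* Hypothetical stand-in for a page cache mapping: one slot per index. */
struct model_mapping {
	_Atomic(void *) slot[MODEL_SLOTS];
};

/*
 * Try to install page at index. Only the first caller for a given index
 * succeeds; any later caller gets -EEXIST, which is the result the
 * losing thread sees in the race described above.
 */
static int model_add_to_cache(struct model_mapping *m, size_t index,
			      void *page)
{
	void *expected = NULL;

	assert(page != NULL);
	if (index >= MODEL_SLOTS)
		return -EINVAL;
	/* Succeeds only while the slot is still empty. */
	if (atomic_compare_exchange_strong(&m->slot[index], &expected, page))
		return 0;
	return -EEXIST;
}
```

The caller that receives -EEXIST still owns whatever references it took
before the insert attempt, so it has to drop them itself.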

[Fix]
The fix is straightforward: drop the page reference when the error is encountered.

[Testing]
A user has tested the fix for 12+ hours using the following method.

    1) mkdir -p /mnt/nfs ; mount -o vers=3,fsc <server_ip>:/export /mnt/nfs
    2) Create 10000 files of 2.8 MB each on the NFS mount.
    3) Start a background loop to simulate heavy VM pressure:
       (while true ; do echo 3 > /proc/sys/vm/drop_caches ; sleep 1 ; done)&
    4) Start multiple parallel readers over the data set at the same time:
       find /mnt/nfs -type f | xargs -P 80 cat > /dev/null &
       find /mnt/nfs -type f | xargs -P 80 cat > /dev/null &
       find /mnt/nfs -type f | xargs -P 80 cat > /dev/null &
       ..
       ..
       find /mnt/nfs -type f | xargs -P 80 cat > /dev/null &
       find /mnt/nfs -type f | xargs -P 80 cat > /dev/null &
    5) Finally, check with cat /proc/fs/fscache/stats | grep -i pages ;
       free -h , cat /proc/meminfo and page-types -r -b lru
       to ensure all pages are freed.
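
For step 5, comparing a /proc/meminfo field before and after the run can
be automated. The following is a minimal sketch under assumptions: the
helper name meminfo_field_kb() is hypothetical, it works on a text
buffer the caller has already read from /proc/meminfo, and which field
to watch is left to the tester.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Look up one "Name:   value kB" line in text laid out like
 * /proc/meminfo. Returns the value in kB, or -1 if the field is
 * missing or has no numeric value.
 */
static long meminfo_field_kb(const char *text, const char *name)
{
	size_t name_len;
	const char *line = text;

	assert(text != NULL && name != NULL);
	name_len = strlen(name);

	while (*line != '\0') {
		long value;

		/* Match the whole field name, not just a prefix of it. */
		if (strncmp(line, name, name_len) == 0 &&
		    line[name_len] == ':') {
			if (sscanf(line + name_len + 1, "%ld", &value) == 1)
				return value;
			return -1;
		}
		/* Advance to the start of the next line, if any. */
		line = strchr(line, '\n');
		if (line == NULL)
			break;
		line++;
	}
	return -1;
}
```

A tester could capture the buffer once before starting the readers and
once after they finish and the caches are dropped, then compare the two
values for the chosen field.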

[Regression Potential]
Limited to cachefiles.

[History]
v2: Address Seth's review. The Bionic/Cosmic version was posted as v3
upstream.

Kiran Kumar Modukuri (1):
  UBUNTU: SAUCE: cachefiles: Page leaking in
    cachefiles_read_backing_file while vmscan is active

 fs/cachefiles/rdwr.c | 8 ++++++++
 1 file changed, 8 insertions(+)

--
2.17.1


--
kernel-team mailing list
[hidden email]
https://lists.ubuntu.com/mailman/listinfo/kernel-team

[SRU X][PATCH v2 1/1] UBUNTU: SAUCE: cachefiles: Page leaking in cachefiles_read_backing_file while vmscan is active

Daniel Axtens
From: Kiran Kumar Modukuri <[hidden email]>

BugLink: https://bugs.launchpad.net/bugs/1793430

[Description]
On a heavily loaded system where the system page cache is nearing memory limits and fscache is enabled,
fscache can leak pages while trying to read pages from the cachefiles backend.
This can happen when two applications read the same page from a single mount, so that
two threads try to read the backing page at the same time. One of the threads then finds
that a page for the backing file or the netfs file is already in the radix tree. During error
handling, cachefiles does not drop its reference on the backing page, which leaks the page.

[Fix]
The fix is straightforward: drop the page reference when the error is encountered.
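
The reference accounting behind the fix can be modelled in userspace.
This is a hedged sketch, not the kernel code: struct model_page,
model_put_page() and model_handle_add_result() are hypothetical names,
and the drop_backpage flag exists only to contrast the old and new
behaviour on the -EEXIST path.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a kernel page: a reference count. */
struct model_page {
	int refcount;
};

/* Drop one reference; stands in for page_cache_release(). */
static void model_put_page(struct model_page *page)
{
	assert(page->refcount > 0);
	page->refcount--;
}

/*
 * Model of the error path after trying to add netpage to the netfs
 * page cache. The caller holds one reference on each page. On -EEXIST
 * the loop moves on to the next page, so every reference it holds must
 * be dropped first. With drop_backpage == false this mimics the old
 * code, which released only netpage.
 */
static void model_handle_add_result(struct model_page *backpage,
				    struct model_page *netpage,
				    int ret, bool drop_backpage)
{
	if (ret != -EEXIST)
		return;	/* other outcomes are outside this sketch */
	if (drop_backpage)
		model_put_page(backpage);
	model_put_page(netpage);
}
```

With drop_backpage set to false the backing page keeps a count of 1
after the call, which is the leak; with it set to true both counts
reach 0.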

[Testing]
I have tested the fix for 12+ hours using the following method.

1) mkdir -p /mnt/nfs ; mount -o vers=3,fsc <server_ip>:/export /mnt/nfs
2) Create 10000 files of 2.8 MB each on the NFS mount.
3) Start a background loop to simulate heavy VM pressure:
   (while true ; do echo 3 > /proc/sys/vm/drop_caches ; sleep 1 ; done)&
4) Start multiple parallel readers over the data set at the same time:
   find /mnt/nfs -type f | xargs -P 80 cat > /dev/null &
   find /mnt/nfs -type f | xargs -P 80 cat > /dev/null &
   find /mnt/nfs -type f | xargs -P 80 cat > /dev/null &
   ..
   ..
   find /mnt/nfs -type f | xargs -P 80 cat > /dev/null &
   find /mnt/nfs -type f | xargs -P 80 cat > /dev/null &
5) Finally, check with cat /proc/fs/fscache/stats | grep -i pages ;
   free -h , cat /proc/meminfo and page-types -r -b lru
   to ensure all pages are freed.

Reviewed-by: Daniel Axtens <[hidden email]>
Signed-off-by: Shantanu Goel <[hidden email]>
Signed-off-by: Kiran Kumar Modukuri <[hidden email]>
[dja: forward ported to current upstream]
Signed-off-by: Daniel Axtens <[hidden email]>
[backported from
 https://www.redhat.com/archives/linux-cachefs/2018-September/msg00002.html
 This is v3 of the patch. v2 has sat on the list for weeks without
 any response or forward progress. v1 was first posted in 2014 and
 was reposted this August.]
Signed-off-by: Daniel Axtens <[hidden email]>

---

Canonical v2/Upstream v3: Clean up per Seth's feedback.
---
 fs/cachefiles/rdwr.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/fs/cachefiles/rdwr.c b/fs/cachefiles/rdwr.c
index 5b68cf526887..98e2b89dc47e 100644
--- a/fs/cachefiles/rdwr.c
+++ b/fs/cachefiles/rdwr.c
@@ -513,6 +513,8 @@ static int cachefiles_read_backing_file(struct cachefiles_object *object,
 				goto installed_new_backing_page;
 			if (ret != -EEXIST)
 				goto nomem;
+			page_cache_release(newpage);
+			newpage = NULL;
 		}
 
 		/* we've installed a new backing page, so now we need
@@ -537,7 +539,10 @@ static int cachefiles_read_backing_file(struct cachefiles_object *object,
 					    netpage->index, cachefiles_gfp);
 		if (ret < 0) {
 			if (ret == -EEXIST) {
+				page_cache_release(backpage);
+				backpage = NULL;
 				page_cache_release(netpage);
+				netpage = NULL;
 				fscache_retrieval_complete(op, 1);
 				continue;
 			}
@@ -610,7 +615,10 @@ static int cachefiles_read_backing_file(struct cachefiles_object *object,
 					    netpage->index, cachefiles_gfp);
 		if (ret < 0) {
 			if (ret == -EEXIST) {
+				page_cache_release(backpage);
+				backpage = NULL;
 				page_cache_release(netpage);
+				netpage = NULL;
 				fscache_retrieval_complete(op, 1);
 				continue;
 			}
--
2.17.1



ACK: [SRU X][PATCH v2 0/1] Fix LP: #1793430 - cachefiles page leak

Kleber Souza
In reply to this post by Daniel Axtens

Same comments as Bionic's:

Acked-by: Kleber Sacilotto de Souza <[hidden email]>


ACK/Cmnt: [SRU X][PATCH v2 1/1] UBUNTU: SAUCE: cachefiles: Page leaking in cachefiles_read_backing_file while vmscan is active

Stefan Bader-2
In reply to this post by Daniel Axtens
On 24.09.2018 04:20, Daniel Axtens wrote:

> From: Kiran Kumar Modukuri <[hidden email]>
>
> BugLink: https://bugs.launchpad.net/bugs/1793430
>
> Reviewed-by: Daniel Axtens <[hidden email]>
> Signed-off-by: Shantanu Goel <[hidden email]>
> Signed-off-by: Kiran Kumar Modukuri <[hidden email]>
> [dja: forward ported to current upstream]
> Signed-off-by: Daniel Axtens <[hidden email]>
> [backported from
>  https://www.redhat.com/archives/linux-cachefs/2018-September/msg00002.html
>  This is v3 of the patch. v2 has sat on the list for weeks without
>  any response or forward progress. v1 was first posted in 2014 and
>  was reposted this August.]
> Signed-off-by: Daniel Axtens <[hidden email]>
Acked-by: Stefan Bader <[hidden email]>
>
> ---

Looks sensible so far and claims to have been well tested. It would be slightly
nicer if we had something upstream to compare against, but meh.

-Stefan




APPLIED: [SRU X][PATCH v2 0/1] Fix LP: #1793430 - cachefiles page leak

Kleber Souza
In reply to this post by Daniel Axtens

Applied to xenial/master-next branch.

Thanks,
Kleber
