[PATCH 0/4][SRU][B-HWE][D][V3] fix i386 boot crashes, LP: #1838115


Colin Ian King-2
From: Colin Ian King <[hidden email]>

BugLink: https://bugs.launchpad.net/bugs/1838115

Fix i386 boot crashes and i915 graphics corruption. Upstream fix
3f8fd02b1bf1d7b ("mm/vmalloc: Sync unmappings in
__purge_vmap_area_lazy()") fixes the core issue; the 3 other patches
are prerequisites that allow it to apply cleanly without any
backporting.

Tested on Ubuntu Bionic i386 and amd64 with some stress-testing
with stress-ng --vm 4 --brk 1 for 5 minutes.

Joerg Roedel (3):
  x86/mm: Check for pfn instead of page in vmalloc_sync_one()
  x86/mm: Sync also unmappings in vmalloc_sync_all()
  mm/vmalloc: Sync unmappings in __purge_vmap_area_lazy()

Uladzislau Rezki (Sony) (1):
  mm/vmalloc.c: add priority threshold to __purge_vmap_area_lazy()

 arch/x86/mm/fault.c | 15 ++++++---------
 mm/vmalloc.c        | 27 +++++++++++++++++++++------
 2 files changed, 27 insertions(+), 15 deletions(-)

--
2.7.4


--
kernel-team mailing list
[hidden email]
https://lists.ubuntu.com/mailman/listinfo/kernel-team
[PATCH 1/4][SRU][B-HWE][D][V3] x86/mm: Check for pfn instead of page in vmalloc_sync_one()

Colin Ian King-2
From: Joerg Roedel <[hidden email]>

BugLink: https://bugs.launchpad.net/bugs/1838115

Do not require a struct page for the mapped memory location because it
might not exist. This can happen when an ioremapped region is mapped with
2MB pages.

Fixes: 5d72b4fba40ef ('x86, mm: support huge I/O mapping capability I/F')
Signed-off-by: Joerg Roedel <[hidden email]>
Signed-off-by: Thomas Gleixner <[hidden email]>
Reviewed-by: Dave Hansen <[hidden email]>
Link: https://lkml.kernel.org/r/20190719184652.11391-2-joro@...
(cherry picked from commit 51b75b5b563a2637f9d8dc5bd02a31b2ff9e5ea0)
Signed-off-by: Colin Ian King <[hidden email]>
---
 arch/x86/mm/fault.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index 9d5c75f..2d61c7b 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -199,7 +199,7 @@ static inline pmd_t *vmalloc_sync_one(pgd_t *pgd, unsigned long address)
 	if (!pmd_present(*pmd))
 		set_pmd(pmd, *pmd_k);
 	else
-		BUG_ON(pmd_page(*pmd) != pmd_page(*pmd_k));
+		BUG_ON(pmd_pfn(*pmd) != pmd_pfn(*pmd_k));
 
 	return pmd_k;
 }
--
2.7.4



[PATCH 2/4][SRU][B-HWE][D][V3] x86/mm: Sync also unmappings in vmalloc_sync_all()

Colin Ian King-2
In reply to this post by Colin Ian King-2
From: Joerg Roedel <[hidden email]>

BugLink: https://bugs.launchpad.net/bugs/1838115

With huge-page ioremap areas the unmappings also need to be synced between
all page-tables. Otherwise it can cause data corruption when a region is
unmapped and later re-used.

Make the vmalloc_sync_one() function ready to sync unmappings and make sure
vmalloc_sync_all() iterates over all page-tables even when an unmapped PMD
is found.

Fixes: 5d72b4fba40ef ('x86, mm: support huge I/O mapping capability I/F')
Signed-off-by: Joerg Roedel <[hidden email]>
Signed-off-by: Thomas Gleixner <[hidden email]>
Reviewed-by: Dave Hansen <[hidden email]>
Link: https://lkml.kernel.org/r/20190719184652.11391-3-joro@...
(cherry picked from commit 8e998fc24de47c55b47a887f6c95ab91acd4a720)
Signed-off-by: Colin Ian King <[hidden email]>
---
 arch/x86/mm/fault.c | 13 +++++--------
 1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index 2d61c7b..df25c9c 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -193,11 +193,12 @@ static inline pmd_t *vmalloc_sync_one(pgd_t *pgd, unsigned long address)
 
 	pmd = pmd_offset(pud, address);
 	pmd_k = pmd_offset(pud_k, address);
-	if (!pmd_present(*pmd_k))
-		return NULL;
 
-	if (!pmd_present(*pmd))
+	if (pmd_present(*pmd) != pmd_present(*pmd_k))
 		set_pmd(pmd, *pmd_k);
+
+	if (!pmd_present(*pmd_k))
+		return NULL;
 	else
 		BUG_ON(pmd_pfn(*pmd) != pmd_pfn(*pmd_k));
 
@@ -219,17 +220,13 @@ void vmalloc_sync_all(void)
 		spin_lock(&pgd_lock);
 		list_for_each_entry(page, &pgd_list, lru) {
 			spinlock_t *pgt_lock;
-			pmd_t *ret;
 
 			/* the pgt_lock only for Xen */
 			pgt_lock = &pgd_page_get_mm(page)->page_table_lock;
 
 			spin_lock(pgt_lock);
-			ret = vmalloc_sync_one(page_address(page), address);
+			vmalloc_sync_one(page_address(page), address);
 			spin_unlock(pgt_lock);
-
-			if (!ret)
-				break;
 		}
 		spin_unlock(&pgd_lock);
 	}
--
2.7.4
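
Editorial sketch (not part of the patch): the behavioural change in patch 2/4
can be modelled in user space roughly as below. The names model_pmd,
model_sync_one() and model_sync_all() are hypothetical stand-ins, the real
page-table walk and locking are left out, and assert() takes the place of
BUG_ON().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a PMD entry: a present bit plus a frame number. */
struct model_pmd {
	bool present;
	unsigned long pfn;
};

/*
 * Model of the patched vmalloc_sync_one(): copy the reference (kernel)
 * entry whenever the presence bits differ, so that both new mappings and
 * unmappings propagate. Returns true if the reference entry is present.
 */
static bool model_sync_one(struct model_pmd *pmd, const struct model_pmd *pmd_k)
{
	if (pmd->present != pmd_k->present)
		*pmd = *pmd_k;

	if (!pmd_k->present)
		return false;

	/* Stands in for BUG_ON(pmd_pfn(*pmd) != pmd_pfn(*pmd_k)). */
	assert(pmd->pfn == pmd_k->pfn);
	return true;
}

/*
 * Model of the patched loop in vmalloc_sync_all(): visit every per-process
 * table, without stopping early when the reference entry is not present.
 */
static void model_sync_all(struct model_pmd *tables, size_t n,
			   const struct model_pmd *pmd_k)
{
	for (size_t i = 0; i < n; i++)
		model_sync_one(&tables[i], pmd_k);
}
```

In this model, the pre-patch code corresponds to returning before the copy
whenever the reference entry is not present and breaking out of the loop, so
an unmapping would never reach the remaining tables.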



[PATCH 3/4][SRU][B-HWE][D][V3] mm/vmalloc.c: add priority threshold to __purge_vmap_area_lazy()

Colin Ian King-2
In reply to this post by Colin Ian King-2
From: "Uladzislau Rezki (Sony)" <[hidden email]>

BugLink: https://bugs.launchpad.net/bugs/1838115

Commit 763b218ddfaf ("mm: add preempt points into __purge_vmap_area_lazy()")
introduced some preempt points, one of which prioritizes allocation
over the lazy freeing of vmap areas.

Prioritizing allocation over freeing does not work well all the time,
i.e. it should rather be a compromise.

1) The number of lazy pages directly influences the busy list length and
   thus the cost of operations such as allocation, lookup, unmap and
   remove.

2) Under heavy stress of the vmalloc subsystem I ran into a situation where
   memory usage kept growing until it hit an out_of_memory -> panic state,
   because the logic that frees vmap areas in the
   __purge_vmap_area_lazy() function was completely blocked.

Establish a threshold beyond which freeing is prioritized back over
allocation, creating a balance between the two.

Using the vmalloc test driver in "stress mode", i.e. when all available
test cases are run simultaneously on all online CPUs, applying
pressure on the vmalloc subsystem, my HiKey 960 board runs out of
memory because the __purge_vmap_area_lazy() logic simply is
not able to free pages in time.

How I run it:

1) You should build your kernel with CONFIG_TEST_VMALLOC=m
2) ./tools/testing/selftests/vm/test_vmalloc.sh stress

During this test the "vmap_lazy_nr" page count goes far beyond the acceptable
lazy_max_pages() threshold, which leads to an enormous busy list size
and other problems, including increased allocation time.

Link: http://lkml.kernel.org/r/20190124115648.9433-3-urezki@...
Signed-off-by: Uladzislau Rezki (Sony) <[hidden email]>
Reviewed-by: Andrew Morton <[hidden email]>
Cc: Michal Hocko <[hidden email]>
Cc: Matthew Wilcox <[hidden email]>
Cc: Thomas Garnier <[hidden email]>
Cc: Oleksiy Avramchenko <[hidden email]>
Cc: Steven Rostedt <[hidden email]>
Cc: Joel Fernandes <[hidden email]>
Cc: Thomas Gleixner <[hidden email]>
Cc: Ingo Molnar <[hidden email]>
Cc: Tejun Heo <[hidden email]>
Cc: Joel Fernandes <[hidden email]>
Signed-off-by: Andrew Morton <[hidden email]>
Signed-off-by: Linus Torvalds <[hidden email]>
(cherry picked from commit 68571be99f323c3c3db62a8513a43380ccefe97c)
Signed-off-by: Colin Ian King <[hidden email]>
---
 mm/vmalloc.c | 18 ++++++++++++------
 1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 1bf8fa9..07c9a18 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -661,23 +661,27 @@ static bool __purge_vmap_area_lazy(unsigned long start, unsigned long end)
 	struct llist_node *valist;
 	struct vmap_area *va;
 	struct vmap_area *n_va;
-	bool do_free = false;
+	int resched_threshold;
 
 	lockdep_assert_held(&vmap_purge_lock);
 
 	valist = llist_del_all(&vmap_purge_list);
+	if (unlikely(valist == NULL))
+		return false;
+
+	/*
+	 * TODO: to calculate a flush range without looping.
+	 * The list can be up to lazy_max_pages() elements.
+	 */
 	llist_for_each_entry(va, valist, purge_list) {
 		if (va->va_start < start)
 			start = va->va_start;
 		if (va->va_end > end)
 			end = va->va_end;
-		do_free = true;
 	}
 
-	if (!do_free)
-		return false;
-
 	flush_tlb_kernel_range(start, end);
+	resched_threshold = (int) lazy_max_pages() << 1;
 
 	spin_lock(&vmap_area_lock);
 	llist_for_each_entry_safe(va, n_va, valist, purge_list) {
@@ -685,7 +689,9 @@ static bool __purge_vmap_area_lazy(unsigned long start, unsigned long end)
 
 		__free_vmap_area(va);
 		atomic_sub(nr, &vmap_lazy_nr);
-		cond_resched_lock(&vmap_area_lock);
+
+		if (atomic_read(&vmap_lazy_nr) < resched_threshold)
+			cond_resched_lock(&vmap_area_lock);
 	}
 	spin_unlock(&vmap_area_lock);
 	return true;
--
2.7.4
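
Editorial sketch (not part of the patch): the threshold introduced in patch
3/4 can be illustrated with a small user-space model. model_purge_yields() is
a hypothetical helper that only counts how often the purge loop would reach
its cond_resched_lock() point. It assumes nothing else changes the lazy page
count while the loop runs, which does not hold in the real kernel, where
concurrent frees can add to vmap_lazy_nr.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Count how many times a purge loop would offer to reschedule.
 * lazy_nr:      pages pending lazy free (vmap_lazy_nr in the patch)
 * area_pages[]: size in pages of each area on the purge list
 * lazy_max:     stand-in for the value of lazy_max_pages()
 */
static size_t model_purge_yields(long lazy_nr, const long *area_pages,
				 size_t n_areas, long lazy_max)
{
	long resched_threshold = lazy_max * 2;	/* lazy_max_pages() << 1 */
	size_t yields = 0;

	for (size_t i = 0; i < n_areas; i++) {
		lazy_nr -= area_pages[i];	/* atomic_sub(nr, &vmap_lazy_nr) */

		/* Only yield once the backlog has dropped below the threshold. */
		if (lazy_nr < resched_threshold)
			yields++;		/* cond_resched_lock() point */
	}
	return yields;
}
```

While the backlog is at or above twice the lazy maximum, the model frees
without yielding, so freeing takes priority; below that level every freed
area is again a point where an allocation may be let in.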



[PATCH 4/4][SRU][B-HWE][D][V3] mm/vmalloc: Sync unmappings in __purge_vmap_area_lazy()

Colin Ian King-2
In reply to this post by Colin Ian King-2
From: Joerg Roedel <[hidden email]>

BugLink: https://bugs.launchpad.net/bugs/1838115

On x86-32 with PTI enabled, parts of the kernel page-tables are not shared
between processes. This can cause mappings in the vmalloc/ioremap area to
persist in some page-tables after the region is unmapped and released.

When the region is re-used the processes with the old mappings do not fault
in the new mappings but still access the old ones.

This causes undefined behavior, in reality often data corruption, kernel
oopses and panics and even spontaneous reboots.

Fix this problem by actively syncing unmaps in the vmalloc/ioremap area to
all page-tables in the system before the regions can be re-used.

References: https://bugzilla.suse.com/show_bug.cgi?id=1118689
Fixes: 5d72b4fba40ef ('x86, mm: support huge I/O mapping capability I/F')
Signed-off-by: Joerg Roedel <[hidden email]>
Signed-off-by: Thomas Gleixner <[hidden email]>
Reviewed-by: Dave Hansen <[hidden email]>
Link: https://lkml.kernel.org/r/20190719184652.11391-4-joro@...
(cherry picked from commit 3f8fd02b1bf1d7ba964485a56f2f4b53ae88c167)
Signed-off-by: Colin Ian King <[hidden email]>
---
 mm/vmalloc.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 07c9a18..ddf5ee0 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -670,6 +670,12 @@ static bool __purge_vmap_area_lazy(unsigned long start, unsigned long end)
 		return false;
 
 	/*
+	 * First make sure the mappings are removed from all page-tables
+	 * before they are freed.
+	 */
+	vmalloc_sync_all();
+
+	/*
 	 * TODO: to calculate a flush range without looping.
 	 * The list can be up to lazy_max_pages() elements.
 	 */
@@ -2308,6 +2314,9 @@ EXPORT_SYMBOL(remap_vmalloc_range);
 /*
  * Implement a stub for vmalloc_sync_all() if the architecture chose not to
  * have one.
+ *
+ * The purpose of this function is to make sure the vmalloc area
+ * mappings are identical in all page-tables in the system.
  */
 void __weak vmalloc_sync_all(void)
 {
--
2.7.4
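
Editorial sketch (not part of the patch): the failure mode described in patch
4/4 can be reproduced in a toy user-space model. All names here
(model_system, model_access(), model_purge(), model_remap()) are
hypothetical, and a single entry stands in for a whole vmalloc/ioremap
region. It is a minimal illustration under those assumptions, not a
description of the real page-table code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NPROC 2

struct model_entry {
	bool present;
	unsigned long pfn;
};

struct model_system {
	struct model_entry kernel;		/* reference page-table entry */
	struct model_entry proc[MODEL_NPROC];	/* unshared per-process copies */
};

/* Lazily fault in the reference entry, then return the frame accessed. */
static unsigned long model_access(struct model_system *s, size_t p)
{
	if (!s->proc[p].present)
		s->proc[p] = s->kernel;
	return s->proc[p].pfn;
}

/* Unmap and release the region; optionally sync the unmapping everywhere. */
static void model_purge(struct model_system *s, bool sync_unmappings)
{
	s->kernel.present = false;
	if (sync_unmappings)
		for (size_t p = 0; p < MODEL_NPROC; p++)
			s->proc[p] = s->kernel;
}

/* Re-use the same virtual region for a different physical frame. */
static void model_remap(struct model_system *s, unsigned long pfn)
{
	s->kernel.present = true;
	s->kernel.pfn = pfn;
}
```

Without the sync step, a process that touched the region before the purge
keeps its old entry and never faults in the new one, which is the stale
access the commit message describes. With the sync step, every process picks
up the new frame on its next access.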



ACK/Cmnt: [PATCH 0/4][SRU][B-HWE][D][V3] fix i386 boot crashes, LP: #1838115

Stefan Bader-2
In reply to this post by Colin Ian King-2
On 29.07.19 13:19, Colin King wrote:

> From: Colin Ian King <[hidden email]>
>
> BugLink: https://bugs.launchpad.net/bugs/1838115
>
> Fix i386 boot crashes and i915 graphics corruption. Upstream fix
> 3f8fd02b1bf1d7b (" mm/vmalloc: Sync unmappings in
> __purge_vmap_area_lazy()") fixes the core issue, and pull in 3
> other prerequisites to allow patch to apply cleanly w/o any
> backporting.
>
> Tested on Ubuntu Bionic i386 and amd64 with some stress-testing
> with stress-ng --vm 4 --brk 1 for 5 minutes.
>
> Joerg Roedel (3):
>   x86/mm: Check for pfn instead of page in vmalloc_sync_one()
>   x86/mm: Sync also unmappings in vmalloc_sync_all()
>   mm/vmalloc: Sync unmappings in __purge_vmap_area_lazy()
>
> Uladzislau Rezki (Sony) (1):
>   mm/vmalloc.c: add priority threshold to __purge_vmap_area_lazy()
>
>  arch/x86/mm/fault.c | 15 ++++++---------
>  mm/vmalloc.c        | 27 +++++++++++++++++++++------
>  2 files changed, 27 insertions(+), 15 deletions(-)
>
I guess that additional pick is not directly related, but it feels a little
like it could help kernels survive stress testing somewhat better.
All looks ok to me.

Acked-by: Stefan Bader <[hidden email]>



[Acked] [PATCH 0/4][SRU][B-HWE][D][V3] fix i386 boot crashes, LP: #1838115

Andy Whitcroft-3
In reply to this post by Colin Ian King-2
On Mon, Jul 29, 2019 at 12:19:18PM +0100, Colin King wrote:

> From: Colin Ian King <[hidden email]>
>
> BugLink: https://bugs.launchpad.net/bugs/1838115
>
> Fix i386 boot crashes and i915 graphics corruption. Upstream fix
> 3f8fd02b1bf1d7b (" mm/vmalloc: Sync unmappings in
> __purge_vmap_area_lazy()") fixes the core issue, and pull in 3
> other prerequisites to allow patch to apply cleanly w/o any
> backporting.
>
> Tested on Ubuntu Bionic i386 and amd64 with some stress-testing
> with stress-ng --vm 4 --brk 1 for 5 minutes.
>
> Joerg Roedel (3):
>   x86/mm: Check for pfn instead of page in vmalloc_sync_one()
>   x86/mm: Sync also unmappings in vmalloc_sync_all()
>   mm/vmalloc: Sync unmappings in __purge_vmap_area_lazy()
>
> Uladzislau Rezki (Sony) (1):
>   mm/vmalloc.c: add priority threshold to __purge_vmap_area_lazy()
>
>  arch/x86/mm/fault.c | 15 ++++++---------
>  mm/vmalloc.c        | 27 +++++++++++++++++++++------
>  2 files changed, 27 insertions(+), 15 deletions(-)
>

Testing is good and these are all now clean cherry-picks so:

Acked-by: Andy Whitcroft <[hidden email]>

-apw


APPLIED: [PATCH 0/4][SRU][B-HWE][D][V3] fix i386 boot crashes, LP: #1838115

Stefan Bader-2
In reply to this post by Colin Ian King-2
On 29.07.19 13:19, Colin King wrote:

> From: Colin Ian King <[hidden email]>
>
> BugLink: https://bugs.launchpad.net/bugs/1838115
>
> Fix i386 boot crashes and i915 graphics corruption. Upstream fix
> 3f8fd02b1bf1d7b (" mm/vmalloc: Sync unmappings in
> __purge_vmap_area_lazy()") fixes the core issue, and pull in 3
> other prerequisites to allow patch to apply cleanly w/o any
> backporting.
>
> Tested on Ubuntu Bionic i386 and amd64 with some stress-testing
> with stress-ng --vm 4 --brk 1 for 5 minutes.
>
> Joerg Roedel (3):
>   x86/mm: Check for pfn instead of page in vmalloc_sync_one()
>   x86/mm: Sync also unmappings in vmalloc_sync_all()
>   mm/vmalloc: Sync unmappings in __purge_vmap_area_lazy()
>
> Uladzislau Rezki (Sony) (1):
>   mm/vmalloc.c: add priority threshold to __purge_vmap_area_lazy()
>
>  arch/x86/mm/fault.c | 15 ++++++---------
>  mm/vmalloc.c        | 27 +++++++++++++++++++++------
>  2 files changed, 27 insertions(+), 15 deletions(-)
>
Applied to disco/master for re-spin (2019.07.01-13). Thanks.

-Stefan



Re: [PATCH 0/4][SRU][B-HWE][D][V3] fix i386 boot crashes, LP: #1838115

Colin Ian King-2
In reply to this post by Colin Ian King-2
And for Bionic (4.15) too, please.

On 29/07/2019 12:19, Colin King wrote:

> From: Colin Ian King <[hidden email]>
>
> BugLink: https://bugs.launchpad.net/bugs/1838115
>
> Fix i386 boot crashes and i915 graphics corruption. Upstream fix
> 3f8fd02b1bf1d7b (" mm/vmalloc: Sync unmappings in
> __purge_vmap_area_lazy()") fixes the core issue, and pull in 3
> other prerequisites to allow patch to apply cleanly w/o any
> backporting.
>
> Tested on Ubuntu Bionic i386 and amd64 with some stress-testing
> with stress-ng --vm 4 --brk 1 for 5 minutes.
>
> Joerg Roedel (3):
>   x86/mm: Check for pfn instead of page in vmalloc_sync_one()
>   x86/mm: Sync also unmappings in vmalloc_sync_all()
>   mm/vmalloc: Sync unmappings in __purge_vmap_area_lazy()
>
> Uladzislau Rezki (Sony) (1):
>   mm/vmalloc.c: add priority threshold to __purge_vmap_area_lazy()
>
>  arch/x86/mm/fault.c | 15 ++++++---------
>  mm/vmalloc.c        | 27 +++++++++++++++++++++------
>  2 files changed, 27 insertions(+), 15 deletions(-)
>



APPLIED[E]: [PATCH 0/4][SRU][B-HWE][D][V3] fix i386 boot crashes, LP: #1838115

Seth Forshee
In reply to this post by Colin Ian King-2
On Mon, Jul 29, 2019 at 12:19:18PM +0100, Colin King wrote:

> From: Colin Ian King <[hidden email]>
>
> BugLink: https://bugs.launchpad.net/bugs/1838115
>
> Fix i386 boot crashes and i915 graphics corruption. Upstream fix
> 3f8fd02b1bf1d7b (" mm/vmalloc: Sync unmappings in
> __purge_vmap_area_lazy()") fixes the core issue, and pull in 3
> other prerequisites to allow patch to apply cleanly w/o any
> backporting.
>
> Tested on Ubuntu Bionic i386 and amd64 with some stress-testing
> with stress-ng --vm 4 --brk 1 for 5 minutes.

Applied patches 1, 2, and 4 to eoan/master-next. Patch 3 was in 5.2 and
thus already present in eoan. Thanks!


APPLIED/Cmnt: [PATCH 0/4][SRU][B-HWE][D][V3] fix i386 boot crashes, LP: #1838115

Stefan Bader-2
In reply to this post by Colin Ian King-2
On 30.07.19 10:36, Colin Ian King wrote:
> And for Bionic (4.15) too, please.

Nearly missed that. Maybe next time ping someone to add a reminder for the next
cycle. Or submit again for the older release(s). Looking at the bug report there
are people claiming this should go into Xenial, too.

Applied to bionic/master-next. Thanks.

-Stefan

>
> On 29/07/2019 12:19, Colin King wrote:
>> From: Colin Ian King <[hidden email]>
>>
>> BugLink: https://bugs.launchpad.net/bugs/1838115
>>
>> Fix i386 boot crashes and i915 graphics corruption. Upstream fix
>> 3f8fd02b1bf1d7b (" mm/vmalloc: Sync unmappings in
>> __purge_vmap_area_lazy()") fixes the core issue, and pull in 3
>> other prerequisites to allow patch to apply cleanly w/o any
>> backporting.
>>
>> Tested on Ubuntu Bionic i386 and amd64 with some stress-testing
>> with stress-ng --vm 4 --brk 1 for 5 minutes.
>>
>> Joerg Roedel (3):
>>   x86/mm: Check for pfn instead of page in vmalloc_sync_one()
>>   x86/mm: Sync also unmappings in vmalloc_sync_all()
>>   mm/vmalloc: Sync unmappings in __purge_vmap_area_lazy()
>>
>> Uladzislau Rezki (Sony) (1):
>>   mm/vmalloc.c: add priority threshold to __purge_vmap_area_lazy()
>>
>>  arch/x86/mm/fault.c | 15 ++++++---------
>>  mm/vmalloc.c        | 27 +++++++++++++++++++++------
>>  2 files changed, 27 insertions(+), 15 deletions(-)
>>
>
>

