[PATCH 0/5][Yakkety SRU] Additional POWER9 support patches

[PATCH 0/5][Yakkety SRU] Additional POWER9 support patches

Seth Forshee
BugLink: http://bugs.launchpad.net/bugs/1667116

Patches requested by IBM for POWER9 support. All are clean cherry picks
from linux-next except for the last, which is a patch from the
linuxppc-dev list.

Thanks,
Seth


Aneesh Kumar K.V (5):
  powerpc/mm: Update PROTFAULT handling in the page fault path
  powerpc/mm/radix: Update pte update sequence for pte clear case
  powerpc/mm/radix: Use ptep_get_and_clear_full when clearing pte for
    full mm
  powerpc/mm/radix: Skip ptesync in pte update helpers
  UBUNTU: SAUCE: powerpc/mm/hash: Always clear UPRT and Host Radix bits
    when setting up CPU

 arch/powerpc/include/asm/book3s/64/pgtable.h | 17 +++++++++++
 arch/powerpc/include/asm/book3s/64/radix.h   | 36 +++++++++++++++--------
 arch/powerpc/kernel/cpu_setup_power.S        |  4 +++
 arch/powerpc/mm/copro_fault.c                | 10 ++++---
 arch/powerpc/mm/fault.c                      | 43 +++++++++++++++++++++-------
 5 files changed, 84 insertions(+), 26 deletions(-)


[PATCH 1/5][Yakkety SRU] powerpc/mm: Update PROTFAULT handling in the page fault path

Seth Forshee
From: "Aneesh Kumar K.V" <[hidden email]>

BugLink: http://bugs.launchpad.net/bugs/1667116

With radix, we can get a page fault with the DSISR_PROTFAULT value set in case
of a PROT_NONE or autonuma mapping. The PROT_NONE case is handled by the vma
check, where we consider the access bad. For autonuma we should fall through
and fix up the access mask correctly.

Without this patch we trigger the WARN_ON() on radix. This patch moves that
WARN_ON() inside a radix_enabled() check. I also moved the WARN_ON() outside
the if condition, making it apply to all types of faults (exec/write/read). It
is also made conditional on book3s, because BOOK3E can also get a PROTFAULT to
handle the D/I cache sync.
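
To illustrate, a minimal sketch of the read-fault path after this change,
using only names that appear in the diff below:

    /*
     * A read fault on a true PROT_NONE mapping is rejected by the
     * vm_flags check; an autonuma PROT_NONE pte inside a readable vma
     * passes it and must reach handle_mm_fault() for the NUMA fixup.
     * The WARN is therefore limited to non-radix read faults.
     */
    if (!(vma->vm_flags & (VM_READ | VM_EXEC | VM_WRITE)))
            goto bad_area;                  /* covers PROT_NONE vmas */
    if (!radix_enabled() && !is_write)
            WARN_ON_ONCE(error_code & DSISR_PROTFAULT);
    /* ... otherwise fall through to handle_mm_fault() ... */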

Signed-off-by: Aneesh Kumar K.V <[hidden email]>
Signed-off-by: Michael Ellerman <[hidden email]>
(cherry picked from linux-next commit 18061c17c8ecdbdbf1e7d1695ec44e7388b4f601)
Signed-off-by: Seth Forshee <[hidden email]>
---
 arch/powerpc/mm/copro_fault.c | 10 ++++++----
 arch/powerpc/mm/fault.c       | 43 +++++++++++++++++++++++++++++++++----------
 2 files changed, 39 insertions(+), 14 deletions(-)

diff --git a/arch/powerpc/mm/copro_fault.c b/arch/powerpc/mm/copro_fault.c
index 362954f98029..b5d0a21473fa 100644
--- a/arch/powerpc/mm/copro_fault.c
+++ b/arch/powerpc/mm/copro_fault.c
@@ -67,11 +67,13 @@ int copro_handle_mm_fault(struct mm_struct *mm, unsigned long ea,
  if (!(vma->vm_flags & (VM_READ | VM_EXEC)))
  goto out_unlock;
  /*
- * protfault should only happen due to us
- * mapping a region readonly temporarily. PROT_NONE
- * is also covered by the VMA check above.
+ * PROT_NONE is covered by the VMA check above.
+ * and hash should get a NOHPTE fault instead of
+ * a PROTFAULT in case fixup is needed for things
+ * like autonuma.
  */
- WARN_ON_ONCE(dsisr & DSISR_PROTFAULT);
+ if (!radix_enabled())
+ WARN_ON_ONCE(dsisr & DSISR_PROTFAULT);
  }
 
  ret = 0;
diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index 6c8683d30f7d..bbc698afd7d3 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -407,15 +407,6 @@ good_area:
     (cpu_has_feature(CPU_FTR_NOEXECUTE) ||
      !(vma->vm_flags & (VM_READ | VM_WRITE))))
  goto bad_area;
-
-#ifdef CONFIG_PPC_STD_MMU
- /*
- * protfault should only happen due to us
- * mapping a region readonly temporarily. PROT_NONE
- * is also covered by the VMA check above.
- */
- WARN_ON_ONCE(error_code & DSISR_PROTFAULT);
-#endif /* CONFIG_PPC_STD_MMU */
  /* a write */
  } else if (is_write) {
  if (!(vma->vm_flags & VM_WRITE))
@@ -425,8 +416,40 @@ good_area:
  } else {
  if (!(vma->vm_flags & (VM_READ | VM_EXEC | VM_WRITE)))
  goto bad_area;
- WARN_ON_ONCE(error_code & DSISR_PROTFAULT);
  }
+#ifdef CONFIG_PPC_STD_MMU
+ /*
+ * For hash translation mode, we should never get a
+ * PROTFAULT. Any update to pte to reduce access will result in us
+ * removing the hash page table entry, thus resulting in a DSISR_NOHPTE
+ * fault instead of DSISR_PROTFAULT.
+ *
+ * A pte update to relax the access will not result in a hash page table
+ * entry invalidate and hence can result in DSISR_PROTFAULT.
+ * ptep_set_access_flags() doesn't do a hpte flush. This is why we have
+ * the special !is_write in the below conditional.
+ *
+ * For platforms that doesn't supports coherent icache and do support
+ * per page noexec bit, we do setup things such that we do the
+ * sync between D/I cache via fault. But that is handled via low level
+ * hash fault code (hash_page_do_lazy_icache()) and we should not reach
+ * here in such case.
+ *
+ * For wrong access that can result in PROTFAULT, the above vma->vm_flags
+ * check should handle those and hence we should fall to the bad_area
+ * handling correctly.
+ *
+ * For embedded with per page exec support that doesn't support coherent
+ * icache we do get PROTFAULT and we handle that D/I cache sync in
+ * set_pte_at while taking the noexec/prot fault. Hence this is WARN_ON
+ * is conditional for server MMU.
+ *
+ * For radix, we can get prot fault for autonuma case, because radix
+ * page table will have them marked noaccess for user.
+ */
+ if (!radix_enabled() && !is_write)
+ WARN_ON_ONCE(error_code & DSISR_PROTFAULT);
+#endif /* CONFIG_PPC_STD_MMU */
 
  /*
  * If for any reason at all we couldn't handle the fault,
--
2.7.4



[PATCH 2/5][Yakkety SRU] powerpc/mm/radix: Update pte update sequence for pte clear case

Seth Forshee
From: "Aneesh Kumar K.V" <[hidden email]>

BugLink: http://bugs.launchpad.net/bugs/1667116

In the kernel we follow the sequence below in several code paths:
pte = ptep_get_and_clear(mm, addr, ptep)
....
set_pte_at(mm, addr, ptep, pte)

We do that for mremap, autonuma protection updates and softdirty clearing.
This means our optimization to skip the tlb flush when clearing a pte is not
valid, because on a DD1 system the follow-up set_pte_at() will be done without
the required tlb flush. Fix that by always doing the DD1-style pte update
irrespective of the new_pte value. A later patch will optimize the application
exit case.
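
In outline, the DD1-safe update sequence this enforces (as in
radix__pte_update() in the diff below):

    old_pte = __radix_pte_update(ptep, ~0, 0);      /* clear the pte   */
    asm volatile("ptesync" : : : "memory");
    radix__flush_tlb_pte_p9_dd1(old_pte, mm, addr); /* flush old xlate */
    if (new_pte)
            __radix_pte_update(ptep, 0, new_pte);   /* install new pte */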

Signed-off-by: Benjamin Herrenschmidt <[hidden email]>
Signed-off-by: Aneesh Kumar K.V <[hidden email]>
Tested-by: Michael Neuling <[hidden email]>
Signed-off-by: Michael Ellerman <[hidden email]>
(cherry picked from linux-next commit ca94573b9c69d224e50e1084a2776772f4ea030d)
Signed-off-by: Seth Forshee <[hidden email]>
---
 arch/powerpc/include/asm/book3s/64/radix.h | 12 +++---------
 1 file changed, 3 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/radix.h b/arch/powerpc/include/asm/book3s/64/radix.h
index b4d1302387a3..70a3cdcdbe47 100644
--- a/arch/powerpc/include/asm/book3s/64/radix.h
+++ b/arch/powerpc/include/asm/book3s/64/radix.h
@@ -144,16 +144,10 @@ static inline unsigned long radix__pte_update(struct mm_struct *mm,
  * new value of pte
  */
  new_pte = (old_pte | set) & ~clr;
- /*
- * If we are trying to clear the pte, we can skip
- * the below sequence and batch the tlb flush. The
- * tlb flush batching is done by mmu gather code
- */
- if (new_pte) {
- asm volatile("ptesync" : : : "memory");
- radix__flush_tlb_pte_p9_dd1(old_pte, mm, addr);
+ asm volatile("ptesync" : : : "memory");
+ radix__flush_tlb_pte_p9_dd1(old_pte, mm, addr);
+ if (new_pte)
  __radix_pte_update(ptep, 0, new_pte);
- }
  } else
  old_pte = __radix_pte_update(ptep, clr, set);
  asm volatile("ptesync" : : : "memory");
--
2.7.4



[PATCH 3/5][Yakkety SRU] powerpc/mm/radix: Use ptep_get_and_clear_full when clearing pte for full mm

Seth Forshee
From: "Aneesh Kumar K.V" <[hidden email]>

BugLink: http://bugs.launchpad.net/bugs/1667116

This helps us optimize the application exit case, where we can skip the
DD1-style pte update sequence.
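
For context, a sketch of the caller side: generic mm teardown
(zap_pte_range() in mm/memory.c) passes tlb->fullmm as the "full" argument,
which is what lets the radix helper below skip the DD1 sequence on exit:

    /* full == tlb->fullmm: the whole address space is going away */
    ptent = ptep_get_and_clear_full(mm, addr, pte, tlb->fullmm);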

Signed-off-by: Aneesh Kumar K.V <[hidden email]>
Tested-by: Michael Neuling <[hidden email]>
Signed-off-by: Michael Ellerman <[hidden email]>
(cherry picked from linux-next commit f4894b80b1ddfef00d4d2e5c58613ccef358a1b2)
Signed-off-by: Seth Forshee <[hidden email]>
---
 arch/powerpc/include/asm/book3s/64/pgtable.h | 17 +++++++++++++++++
 arch/powerpc/include/asm/book3s/64/radix.h   | 23 ++++++++++++++++++++++-
 2 files changed, 39 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
index bbea0040320a..6b2a58fb2cf5 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -371,6 +371,23 @@ static inline pte_t ptep_get_and_clear(struct mm_struct *mm,
  return __pte(old);
 }
 
+#define __HAVE_ARCH_PTEP_GET_AND_CLEAR_FULL
+static inline pte_t ptep_get_and_clear_full(struct mm_struct *mm,
+    unsigned long addr,
+    pte_t *ptep, int full)
+{
+ if (full && radix_enabled()) {
+ /*
+ * Let's skip the DD1 style pte update here. We know that
+ * this is a full mm pte clear and hence can be sure there is
+ * no parallel set_pte.
+ */
+ return radix__ptep_get_and_clear_full(mm, addr, ptep, full);
+ }
+ return ptep_get_and_clear(mm, addr, ptep);
+}
+
+
 static inline void pte_clear(struct mm_struct *mm, unsigned long addr,
      pte_t * ptep)
 {
diff --git a/arch/powerpc/include/asm/book3s/64/radix.h b/arch/powerpc/include/asm/book3s/64/radix.h
index 70a3cdcdbe47..fcf822d6c204 100644
--- a/arch/powerpc/include/asm/book3s/64/radix.h
+++ b/arch/powerpc/include/asm/book3s/64/radix.h
@@ -139,7 +139,7 @@ static inline unsigned long radix__pte_update(struct mm_struct *mm,
 
  unsigned long new_pte;
 
- old_pte = __radix_pte_update(ptep, ~0, 0);
+ old_pte = __radix_pte_update(ptep, ~0ul, 0);
  /*
  * new value of pte
  */
@@ -157,6 +157,27 @@ static inline unsigned long radix__pte_update(struct mm_struct *mm,
  return old_pte;
 }
 
+static inline pte_t radix__ptep_get_and_clear_full(struct mm_struct *mm,
+   unsigned long addr,
+   pte_t *ptep, int full)
+{
+ unsigned long old_pte;
+
+ if (full) {
+ /*
+ * If we are trying to clear the pte, we can skip
+ * the DD1 pte update sequence and batch the tlb flush. The
+ * tlb flush batching is done by mmu gather code. We
+ * still keep the cmp_xchg update to make sure we get
+ * correct R/C bit which might be updated via Nest MMU.
+ */
+ old_pte = __radix_pte_update(ptep, ~0ul, 0);
+ } else
+ old_pte = radix__pte_update(mm, addr, ptep, ~0ul, 0, 0);
+
+ return __pte(old_pte);
+}
+
 /*
  * Set the dirty and/or accessed bits atomically in a linux PTE, this
  * function doesn't need to invalidate tlb.
--
2.7.4



[PATCH 4/5][Yakkety SRU] powerpc/mm/radix: Skip ptesync in pte update helpers

Seth Forshee
From: "Aneesh Kumar K.V" <[hidden email]>

BugLink: http://bugs.launchpad.net/bugs/1667116

We do a ptesync at the start of the tlb flush, and we are sure a pte update
will be followed by a tlb flush. Hence we can skip the ptesync in the pte
update helpers.
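
In outline, the ordering this relies on, assuming every pte update is
followed by a tlb flush that itself begins with a ptesync (e.g.
radix__flush_tlb_page()):

    old_pte = __radix_pte_update(ptep, clr, set); /* store to the pte */
    /* ... */
    radix__flush_tlb_page(vma, addr);  /* issues ptesync, then tlbie:
                                          orders the pte store before
                                          the invalidation */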

Signed-off-by: Aneesh Kumar K.V <[hidden email]>
Tested-by: Michael Neuling <[hidden email]>
Signed-off-by: Michael Ellerman <[hidden email]>
(cherry picked from linux-next commit 438e69b52be776c035aa2a851ccc1709033d729b)
Signed-off-by: Seth Forshee <[hidden email]>
---
 arch/powerpc/include/asm/book3s/64/radix.h | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/radix.h b/arch/powerpc/include/asm/book3s/64/radix.h
index fcf822d6c204..77e590c77299 100644
--- a/arch/powerpc/include/asm/book3s/64/radix.h
+++ b/arch/powerpc/include/asm/book3s/64/radix.h
@@ -144,13 +144,11 @@ static inline unsigned long radix__pte_update(struct mm_struct *mm,
  * new value of pte
  */
  new_pte = (old_pte | set) & ~clr;
- asm volatile("ptesync" : : : "memory");
  radix__flush_tlb_pte_p9_dd1(old_pte, mm, addr);
  if (new_pte)
  __radix_pte_update(ptep, 0, new_pte);
  } else
  old_pte = __radix_pte_update(ptep, clr, set);
- asm volatile("ptesync" : : : "memory");
  if (!huge)
  assert_pte_locked(mm, addr);
 
@@ -195,7 +193,6 @@ static inline void radix__ptep_set_access_flags(struct mm_struct *mm,
  unsigned long old_pte, new_pte;
 
  old_pte = __radix_pte_update(ptep, ~0, 0);
- asm volatile("ptesync" : : : "memory");
  /*
  * new value of pte
  */
--
2.7.4



[PATCH 5/5][Yakkety SRU] UBUNTU: SAUCE: powerpc/mm/hash: Always clear UPRT and Host Radix bits when setting up CPU

Seth Forshee
From: "Aneesh Kumar K.V" <[hidden email]>

BugLink: http://bugs.launchpad.net/bugs/1667116

We will set LPCR with the correct value for radix during init. This makes sure
we start with a sanitized value of LPCR. In case of kexec, CPUs can have an
LPCR value based on the translation mode we were previously running.
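
The net effect of the two added instructions, written as C for clarity
(andc computes r3 & ~r4):

    /*
     * Always clear the radix-mode bits, whatever a previous
     * (e.g. kexec'd) kernel left in the register.
     */
    lpcr = (lpcr | LPCR_PECEDH | LPCR_PECE_HVEE | LPCR_HVICE)
                 & ~(LPCR_UPRT | LPCR_HR);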

Fixes: fe036a0605d60 ("powerpc/64/kexec: Fix MMU cleanup on radix")
Cc: [hidden email] # v4.9+
Acked-by: Michael Neuling <[hidden email]>
Signed-off-by: Aneesh Kumar K.V <[hidden email]>
Signed-off-by: Seth Forshee <[hidden email]>
---
 arch/powerpc/kernel/cpu_setup_power.S | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/arch/powerpc/kernel/cpu_setup_power.S b/arch/powerpc/kernel/cpu_setup_power.S
index 011093e6312a..95688b55b395 100644
--- a/arch/powerpc/kernel/cpu_setup_power.S
+++ b/arch/powerpc/kernel/cpu_setup_power.S
@@ -101,6 +101,8 @@ _GLOBAL(__setup_cpu_power9)
  mfspr r3,SPRN_LPCR
  LOAD_REG_IMMEDIATE(r4, LPCR_PECEDH | LPCR_PECE_HVEE | LPCR_HVICE)
  or r3, r3, r4
+ LOAD_REG_IMMEDIATE(r4, LPCR_UPRT | LPCR_HR)
+ andc r3, r3, r4
  bl __init_LPCR
  bl __init_HFSCR
  bl __init_tlb_power9
@@ -122,6 +124,8 @@ _GLOBAL(__restore_cpu_power9)
  mfspr   r3,SPRN_LPCR
  LOAD_REG_IMMEDIATE(r4, LPCR_PECEDH | LPCR_PECE_HVEE | LPCR_HVICE)
  or r3, r3, r4
+ LOAD_REG_IMMEDIATE(r4, LPCR_UPRT | LPCR_HR)
+ andc r3, r3, r4
  bl __init_LPCR
  bl __init_HFSCR
  bl __init_tlb_power9
--
2.7.4



ACK: [PATCH 0/5][Yakkety SRU] Additional POWER9 support patches

Stefan Bader

ACK: [PATCH 0/5][Yakkety SRU] Additional POWER9 support patches

Marcelo Henrique Cerri

Applied: [PATCH 0/5][Yakkety SRU] Additional POWER9 support patches

brad.figg

Applied to the master-next branch of Yakkety

--
Brad Figg [hidden email] http://www.canonical.com
