[SRU][Bionic][PATCH 1/1] powerpc/powernv: Fix concurrency issue with npu->mmio_atsd_usage

Khalid Elmously
From: Reza Arbab <[hidden email]>

BugLink: https://bugs.launchpad.net/bugs/1788097

We've encountered a performance issue when multiple processors stress
{get,put}_mmio_atsd_reg(). These functions contend for
mmio_atsd_usage, an unsigned long used as a bitmask.

The accesses to mmio_atsd_usage are done using test_and_set_bit_lock()
and clear_bit_unlock(). As implemented, both of these will require
a (successful) stwcx to that same cache line.

What we end up with is thread A, attempting to unlock, being slowed by
other threads repeatedly attempting to lock. A's stwcx instructions
fail and retry because the memory reservation is lost every time a
different thread beats it to the punch.

There may be a long-term way to fix this at a larger scale, but for
now resolve the immediate problem by gating our call to
test_and_set_bit_lock() with one to test_bit(), which is obviously
implemented without using a store.

Fixes: 1ab66d1fbada ("powerpc/powernv: Introduce address translation services for Nvlink2")
Signed-off-by: Reza Arbab <[hidden email]>
Acked-by: Alistair Popple <[hidden email]>
Signed-off-by: Michael Ellerman <[hidden email]>
(cherry-picked from 9eab9901b015f489199105c470de1ffc337cfabb)
Signed-off-by: Khalid Elmously <[hidden email]>
---
 arch/powerpc/platforms/powernv/npu-dma.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/npu-dma.c b/arch/powerpc/platforms/powernv/npu-dma.c
index 6c8e168e6571..18226895681e 100644
--- a/arch/powerpc/platforms/powernv/npu-dma.c
+++ b/arch/powerpc/platforms/powernv/npu-dma.c
@@ -434,8 +434,9 @@ static int get_mmio_atsd_reg(struct npu *npu)
 	int i;
 
 	for (i = 0; i < npu->mmio_atsd_count; i++) {
-		if (!test_and_set_bit_lock(i, &npu->mmio_atsd_usage))
-			return i;
+		if (!test_bit(i, &npu->mmio_atsd_usage))
+			if (!test_and_set_bit_lock(i, &npu->mmio_atsd_usage))
+				return i;
 	}
 
 	return -ENOSPC;
--
2.17.1

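For readers who want to experiment with the pattern outside the kernel, the change amounts to a "test, then test-and-set" allocation loop. Below is a minimal userspace sketch using C11 atomics. It is not the kernel implementation: the `sketch_*` helpers and `struct npu_sketch` are hypothetical stand-ins, and mapping test_and_set_bit_lock()/clear_bit_unlock() onto acquire/release read-modify-write operations is an assumption made for illustration only.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdatomic.h>

/* Hypothetical stand-in for struct npu; only the two fields used by the patch. */
struct npu_sketch {
	atomic_ulong mmio_atsd_usage;	/* one bit per register, 1 = in use */
	int mmio_atsd_count;
};

/* Plain load, no store: this is what lets a busy bit be skipped cheaply. */
static int sketch_test_bit(int nr, atomic_ulong *addr)
{
	return (atomic_load_explicit(addr, memory_order_relaxed) >> nr) & 1UL;
}

/* Atomic read-modify-write with acquire ordering; returns the old bit value. */
static int sketch_test_and_set_bit_lock(int nr, atomic_ulong *addr)
{
	unsigned long mask = 1UL << nr;

	return (atomic_fetch_or_explicit(addr, mask, memory_order_acquire) & mask) != 0;
}

/* Atomic read-modify-write with release ordering; this also needs a store. */
static void sketch_clear_bit_unlock(int nr, atomic_ulong *addr)
{
	atomic_fetch_and_explicit(addr, ~(1UL << nr), memory_order_release);
}

/* Mirrors the patched loop: only attempt the RMW when the bit looks free. */
static int sketch_get_reg(struct npu_sketch *npu)
{
	assert(npu->mmio_atsd_count <= (int)(sizeof(unsigned long) * CHAR_BIT));

	for (int i = 0; i < npu->mmio_atsd_count; i++) {
		if (!sketch_test_bit(i, &npu->mmio_atsd_usage))
			if (!sketch_test_and_set_bit_lock(i, &npu->mmio_atsd_usage))
				return i;
	}

	return -ENOSPC;
}

static void sketch_put_reg(struct npu_sketch *npu, int reg)
{
	sketch_clear_bit_unlock(reg, &npu->mmio_atsd_usage);
}
```

With mmio_atsd_count set to 2 and an empty bitmask, two calls to sketch_get_reg() hand out registers 0 and 1, a third returns -ENOSPC, and releasing register 0 with sketch_put_reg() makes it available again. The plain load can race with another thread, which is why the read-modify-write is still needed to actually claim the bit; the load only filters out attempts that would very likely fail anyway.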

--
kernel-team mailing list
[hidden email]
https://lists.ubuntu.com/mailman/listinfo/kernel-team

ACK: [SRU][Bionic][PATCH 1/1] powerpc/powernv: Fix concurrency issue with npu->mmio_atsd_usage

Stefan Bader-2
On 30.08.2018 21:44, Khalid Elmously wrote:

> From: Reza Arbab <[hidden email]>
>
> BugLink: https://bugs.launchpad.net/bugs/1788097
>
> We've encountered a performance issue when multiple processors stress
> {get,put}_mmio_atsd_reg(). These functions contend for
> mmio_atsd_usage, an unsigned long used as a bitmask.
>
> The accesses to mmio_atsd_usage are done using test_and_set_bit_lock()
> and clear_bit_unlock(). As implemented, both of these will require
> a (successful) stwcx to that same cache line.
>
> What we end up with is thread A, attempting to unlock, being slowed by
> other threads repeatedly attempting to lock. A's stwcx instructions
> fail and retry because the memory reservation is lost every time a
> different thread beats it to the punch.
>
> There may be a long-term way to fix this at a larger scale, but for
> now resolve the immediate problem by gating our call to
> test_and_set_bit_lock() with one to test_bit(), which is obviously
> implemented without using a store.
>
> Fixes: 1ab66d1fbada ("powerpc/powernv: Introduce address translation services for Nvlink2")
> Signed-off-by: Reza Arbab <[hidden email]>
> Acked-by: Alistair Popple <[hidden email]>
> Signed-off-by: Michael Ellerman <[hidden email]>
> (cherry-picked from 9eab9901b015f489199105c470de1ffc337cfabb)
> Signed-off-by: Khalid Elmously <[hidden email]>
Acked-by: Stefan Bader <[hidden email]>

> ---
>  arch/powerpc/platforms/powernv/npu-dma.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/arch/powerpc/platforms/powernv/npu-dma.c b/arch/powerpc/platforms/powernv/npu-dma.c
> index 6c8e168e6571..18226895681e 100644
> --- a/arch/powerpc/platforms/powernv/npu-dma.c
> +++ b/arch/powerpc/platforms/powernv/npu-dma.c
> @@ -434,8 +434,9 @@ static int get_mmio_atsd_reg(struct npu *npu)
>  	int i;
>  
>  	for (i = 0; i < npu->mmio_atsd_count; i++) {
> -		if (!test_and_set_bit_lock(i, &npu->mmio_atsd_usage))
> -			return i;
> +		if (!test_bit(i, &npu->mmio_atsd_usage))
> +			if (!test_and_set_bit_lock(i, &npu->mmio_atsd_usage))
> +				return i;
>  	}
>  
>  	return -ENOSPC;
>


ACK / APPLIED[C]: [SRU][Bionic][PATCH 1/1] powerpc/powernv: Fix concurrency issue with npu->mmio_atsd_usage

Seth Forshee
On Thu, Aug 30, 2018 at 03:44:44PM -0400, Khalid Elmously wrote:

> From: Reza Arbab <[hidden email]>
>
> BugLink: https://bugs.launchpad.net/bugs/1788097
>
> We've encountered a performance issue when multiple processors stress
> {get,put}_mmio_atsd_reg(). These functions contend for
> mmio_atsd_usage, an unsigned long used as a bitmask.
>
> The accesses to mmio_atsd_usage are done using test_and_set_bit_lock()
> and clear_bit_unlock(). As implemented, both of these will require
> a (successful) stwcx to that same cache line.
>
> What we end up with is thread A, attempting to unlock, being slowed by
> other threads repeatedly attempting to lock. A's stwcx instructions
> fail and retry because the memory reservation is lost every time a
> different thread beats it to the punch.
>
> There may be a long-term way to fix this at a larger scale, but for
> now resolve the immediate problem by gating our call to
> test_and_set_bit_lock() with one to test_bit(), which is obviously
> implemented without using a store.
>
> Fixes: 1ab66d1fbada ("powerpc/powernv: Introduce address translation services for Nvlink2")
> Signed-off-by: Reza Arbab <[hidden email]>
> Acked-by: Alistair Popple <[hidden email]>
> Signed-off-by: Michael Ellerman <[hidden email]>
> (cherry-picked from 9eab9901b015f489199105c470de1ffc337cfabb)
> Signed-off-by: Khalid Elmously <[hidden email]>

Acked-by: Seth Forshee <[hidden email]>

Applied to cosmic/master-next, thanks!


APPLIED: [SRU][Bionic][PATCH 1/1] powerpc/powernv: Fix concurrency issue with npu->mmio_atsd_usage

Kleber Souza
On 08/30/18 21:44, Khalid Elmously wrote:

> From: Reza Arbab <[hidden email]>
>
> BugLink: https://bugs.launchpad.net/bugs/1788097
>
> We've encountered a performance issue when multiple processors stress
> {get,put}_mmio_atsd_reg(). These functions contend for
> mmio_atsd_usage, an unsigned long used as a bitmask.
>
> The accesses to mmio_atsd_usage are done using test_and_set_bit_lock()
> and clear_bit_unlock(). As implemented, both of these will require
> a (successful) stwcx to that same cache line.
>
> What we end up with is thread A, attempting to unlock, being slowed by
> other threads repeatedly attempting to lock. A's stwcx instructions
> fail and retry because the memory reservation is lost every time a
> different thread beats it to the punch.
>
> There may be a long-term way to fix this at a larger scale, but for
> now resolve the immediate problem by gating our call to
> test_and_set_bit_lock() with one to test_bit(), which is obviously
> implemented without using a store.
>
> Fixes: 1ab66d1fbada ("powerpc/powernv: Introduce address translation services for Nvlink2")
> Signed-off-by: Reza Arbab <[hidden email]>
> Acked-by: Alistair Popple <[hidden email]>
> Signed-off-by: Michael Ellerman <[hidden email]>
> (cherry-picked from 9eab9901b015f489199105c470de1ffc337cfabb)
> Signed-off-by: Khalid Elmously <[hidden email]>
> ---
>  arch/powerpc/platforms/powernv/npu-dma.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/arch/powerpc/platforms/powernv/npu-dma.c b/arch/powerpc/platforms/powernv/npu-dma.c
> index 6c8e168e6571..18226895681e 100644
> --- a/arch/powerpc/platforms/powernv/npu-dma.c
> +++ b/arch/powerpc/platforms/powernv/npu-dma.c
> @@ -434,8 +434,9 @@ static int get_mmio_atsd_reg(struct npu *npu)
>  	int i;
>  
>  	for (i = 0; i < npu->mmio_atsd_count; i++) {
> -		if (!test_and_set_bit_lock(i, &npu->mmio_atsd_usage))
> -			return i;
> +		if (!test_bit(i, &npu->mmio_atsd_usage))
> +			if (!test_and_set_bit_lock(i, &npu->mmio_atsd_usage))
> +				return i;
>  	}
>  
>  	return -ENOSPC;
>

Applied to bionic/master-next branch.

Thanks,
Kleber