[SRU B][PATCH 0/1] powerpc: Work around PCI boot races - LP: #1805245

classic Classic list List threaded Threaded
5 messages Options
Reply | Threaded
Open this post in threaded view
|

[SRU B][PATCH 0/1] powerpc: Work around PCI boot races - LP: #1805245

Daniel Axtens
SRU Justification
=================

[Impact]

An IBM OpenPower partner reports their system with a bunch of NVMe
drives fails the NVMe init due to some drives taking PCIe EEH errors.

[Fix]

Pick patch db2173198b9513f7add8009f225afa1f1c79bcc6 upstream.

[Testing]

IBM reports that this patch fixes the user's issue.

[Regression Potential]

The patch is already in Cosmic
(db33bbe77b9594133fecf0dc290322437170627f) and in some stable trees
(1eb08e7b192d2c412175f607cf51449c916abd57 in 4.14.y).  It only affects
PowerPC.

Benjamin Herrenschmidt (1):
  powerpc/powernv/pci: Work around races in PCI bridge enabling

 arch/powerpc/platforms/powernv/pci-ioda.c | 37 +++++++++++++++++++++++
 1 file changed, 37 insertions(+)

--
2.17.1


--
kernel-team mailing list
[hidden email]
https://lists.ubuntu.com/mailman/listinfo/kernel-team
Reply | Threaded
Open this post in threaded view
|

[SRU B][PATCH 1/1] powerpc/powernv/pci: Work around races in PCI bridge enabling

Daniel Axtens
From: Benjamin Herrenschmidt <[hidden email]>

BugLink: https://bugs.launchpad.net/bugs/1805245

The generic code is racy when multiple children of a PCI bridge try to
enable it simultaneously.

This leads to drivers trying to access a device through a
not-yet-enabled bridge, and this EEH errors under various
circumstances when using parallel driver probing.

There is work going on to fix that properly in the PCI core but it
will take some time.

x86 gets away with it because (outside of hotplug), the BIOS enables
all the bridges at boot time.

This patch does the same thing on powernv by enabling all bridges that
have child devices at boot time, thus avoiding subsequent races. It's
suitable for backporting to stable and distros, while the proper PCI
fix will probably be significantly more invasive.

Signed-off-by: Benjamin Herrenschmidt <[hidden email]>
Cc: [hidden email]
Signed-off-by: Michael Ellerman <[hidden email]>
(backported from commit db2173198b9513f7add8009f225afa1f1c79bcc6
 - changed pci_err() -> dev_err())
Signed-off-by: Michael Neuling <[hidden email]>
Signed-off-by: Daniel Axtens <[hidden email]>
---
 arch/powerpc/platforms/powernv/pci-ioda.c | 37 +++++++++++++++++++++++
 1 file changed, 37 insertions(+)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index d5af700820f3..ea4630f17f74 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -3312,12 +3312,49 @@ static void pnv_pci_ioda_create_dbgfs(void)
 #endif /* CONFIG_DEBUG_FS */
 }
 
+static void pnv_pci_enable_bridge(struct pci_bus *bus)
+{
+ struct pci_dev *dev = bus->self;
+ struct pci_bus *child;
+
+ /* Empty bus ? bail */
+ if (list_empty(&bus->devices))
+ return;
+
+ /*
+ * If there's a bridge associated with that bus enable it. This works
+ * around races in the generic code if the enabling is done during
+ * parallel probing. This can be removed once those races have been
+ * fixed.
+ */
+ if (dev) {
+ int rc = pci_enable_device(dev);
+ if (rc)
+ dev_err(&dev->dev, "Error enabling bridge (%d)\n", rc);
+ pci_set_master(dev);
+ }
+
+ /* Perform the same to child busses */
+ list_for_each_entry(child, &bus->children, node)
+ pnv_pci_enable_bridge(child);
+}
+
+static void pnv_pci_enable_bridges(void)
+{
+ struct pci_controller *hose;
+
+ list_for_each_entry(hose, &hose_list, list_node)
+ pnv_pci_enable_bridge(hose->bus);
+}
+
 static void pnv_pci_ioda_fixup(void)
 {
  pnv_pci_ioda_setup_PEs();
  pnv_pci_ioda_setup_iommu_api();
  pnv_pci_ioda_create_dbgfs();
 
+ pnv_pci_enable_bridges();
+
 #ifdef CONFIG_EEH
  pnv_eeh_post_init();
 #endif
--
2.17.1


--
kernel-team mailing list
[hidden email]
https://lists.ubuntu.com/mailman/listinfo/kernel-team
Reply | Threaded
Open this post in threaded view
|

ACK/cmnt: [SRU B][PATCH 1/1] powerpc/powernv/pci: Work around races in PCI bridge enabling

Kleber Souza
On 11/26/18 11:08 PM, Daniel Axtens wrote:

> From: Benjamin Herrenschmidt <[hidden email]>
>
> BugLink: https://bugs.launchpad.net/bugs/1805245
>
> The generic code is racy when multiple children of a PCI bridge try to
> enable it simultaneously.
>
> This leads to drivers trying to access a device through a
> not-yet-enabled bridge, and this EEH errors under various
> circumstances when using parallel driver probing.
>
> There is work going on to fix that properly in the PCI core but it
> will take some time.
>
> x86 gets away with it because (outside of hotplug), the BIOS enables
> all the bridges at boot time.
>
> This patch does the same thing on powernv by enabling all bridges that
> have child devices at boot time, thus avoiding subsequent races. It's
> suitable for backporting to stable and distros, while the proper PCI
> fix will probably be significantly more invasive.
>
> Signed-off-by: Benjamin Herrenschmidt <[hidden email]>
> Cc: [hidden email]
> Signed-off-by: Michael Ellerman <[hidden email]>
> (backported from commit db2173198b9513f7add8009f225afa1f1c79bcc6
>  - changed pci_err() -> dev_err())

Please add backporting comments in square brackets below the original
sha1, so it would be like the following:

(backported from commit db2173198b9513f7add8009f225afa1f1c79bcc6)
[ changed pci_err() -> dev_err() ]


We can fix it when applying the patch.

> Signed-off-by: Michael Neuling <[hidden email]>
> Signed-off-by: Daniel Axtens <[hidden email]>

Straightforward backport, tested by the vendor.

With the fix mentioned above:

Acked-by: Kleber Sacilotto de Souza <[hidden email]>

> ---
>  arch/powerpc/platforms/powernv/pci-ioda.c | 37 +++++++++++++++++++++++
>  1 file changed, 37 insertions(+)
>
> diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
> index d5af700820f3..ea4630f17f74 100644
> --- a/arch/powerpc/platforms/powernv/pci-ioda.c
> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
> @@ -3312,12 +3312,49 @@ static void pnv_pci_ioda_create_dbgfs(void)
>  #endif /* CONFIG_DEBUG_FS */
>  }
>  
> +static void pnv_pci_enable_bridge(struct pci_bus *bus)
> +{
> + struct pci_dev *dev = bus->self;
> + struct pci_bus *child;
> +
> + /* Empty bus ? bail */
> + if (list_empty(&bus->devices))
> + return;
> +
> + /*
> + * If there's a bridge associated with that bus enable it. This works
> + * around races in the generic code if the enabling is done during
> + * parallel probing. This can be removed once those races have been
> + * fixed.
> + */
> + if (dev) {
> + int rc = pci_enable_device(dev);
> + if (rc)
> + dev_err(&dev->dev, "Error enabling bridge (%d)\n", rc);
> + pci_set_master(dev);
> + }
> +
> + /* Perform the same to child busses */
> + list_for_each_entry(child, &bus->children, node)
> + pnv_pci_enable_bridge(child);
> +}
> +
> +static void pnv_pci_enable_bridges(void)
> +{
> + struct pci_controller *hose;
> +
> + list_for_each_entry(hose, &hose_list, list_node)
> + pnv_pci_enable_bridge(hose->bus);
> +}
> +
>  static void pnv_pci_ioda_fixup(void)
>  {
>   pnv_pci_ioda_setup_PEs();
>   pnv_pci_ioda_setup_iommu_api();
>   pnv_pci_ioda_create_dbgfs();
>  
> + pnv_pci_enable_bridges();
> +
>  #ifdef CONFIG_EEH
>   pnv_eeh_post_init();
>  #endif




--
kernel-team mailing list
[hidden email]
https://lists.ubuntu.com/mailman/listinfo/kernel-team
Reply | Threaded
Open this post in threaded view
|

ACK/Cmnt: [SRU B][PATCH 1/1] powerpc/powernv/pci: Work around races in PCI bridge enabling

Stefan Bader-2
In reply to this post by Daniel Axtens
On 26.11.18 23:08, Daniel Axtens wrote:

> From: Benjamin Herrenschmidt <[hidden email]>
>
> BugLink: https://bugs.launchpad.net/bugs/1805245
>
> The generic code is racy when multiple children of a PCI bridge try to
> enable it simultaneously.
>
> This leads to drivers trying to access a device through a
> not-yet-enabled bridge, and this EEH errors under various
> circumstances when using parallel driver probing.
>
> There is work going on to fix that properly in the PCI core but it
> will take some time.
>
> x86 gets away with it because (outside of hotplug), the BIOS enables
> all the bridges at boot time.
>
> This patch does the same thing on powernv by enabling all bridges that
> have child devices at boot time, thus avoiding subsequent races. It's
> suitable for backporting to stable and distros, while the proper PCI
> fix will probably be significantly more invasive.
>
> Signed-off-by: Benjamin Herrenschmidt <[hidden email]>
> Cc: [hidden email]
> Signed-off-by: Michael Ellerman <[hidden email]>
> (backported from commit db2173198b9513f7add8009f225afa1f1c79bcc6
>  - changed pci_err() -> dev_err())
> Signed-off-by: Michael Neuling <[hidden email]>
> Signed-off-by: Daniel Axtens <[hidden email]>
Acked-by: Stefan Bader <[hidden email]>
> ---

With changes suggested by Kleber.

>  arch/powerpc/platforms/powernv/pci-ioda.c | 37 +++++++++++++++++++++++
>  1 file changed, 37 insertions(+)
>
> diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
> index d5af700820f3..ea4630f17f74 100644
> --- a/arch/powerpc/platforms/powernv/pci-ioda.c
> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
> @@ -3312,12 +3312,49 @@ static void pnv_pci_ioda_create_dbgfs(void)
>  #endif /* CONFIG_DEBUG_FS */
>  }
>  
> +static void pnv_pci_enable_bridge(struct pci_bus *bus)
> +{
> + struct pci_dev *dev = bus->self;
> + struct pci_bus *child;
> +
> + /* Empty bus ? bail */
> + if (list_empty(&bus->devices))
> + return;
> +
> + /*
> + * If there's a bridge associated with that bus enable it. This works
> + * around races in the generic code if the enabling is done during
> + * parallel probing. This can be removed once those races have been
> + * fixed.
> + */
> + if (dev) {
> + int rc = pci_enable_device(dev);
> + if (rc)
> + dev_err(&dev->dev, "Error enabling bridge (%d)\n", rc);
> + pci_set_master(dev);
> + }
> +
> + /* Perform the same to child busses */
> + list_for_each_entry(child, &bus->children, node)
> + pnv_pci_enable_bridge(child);
> +}
> +
> +static void pnv_pci_enable_bridges(void)
> +{
> + struct pci_controller *hose;
> +
> + list_for_each_entry(hose, &hose_list, list_node)
> + pnv_pci_enable_bridge(hose->bus);
> +}
> +
>  static void pnv_pci_ioda_fixup(void)
>  {
>   pnv_pci_ioda_setup_PEs();
>   pnv_pci_ioda_setup_iommu_api();
>   pnv_pci_ioda_create_dbgfs();
>  
> + pnv_pci_enable_bridges();
> +
>  #ifdef CONFIG_EEH
>   pnv_eeh_post_init();
>  #endif
>


--
kernel-team mailing list
[hidden email]
https://lists.ubuntu.com/mailman/listinfo/kernel-team

signature.asc (849 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

APPLIED/cmnt: [SRU B][PATCH 1/1] powerpc/powernv/pci: Work around races in PCI bridge enabling

Kleber Souza
In reply to this post by Daniel Axtens
On 11/26/18 11:08 PM, Daniel Axtens wrote:

> From: Benjamin Herrenschmidt <[hidden email]>
>
> BugLink: https://bugs.launchpad.net/bugs/1805245
>
> The generic code is racy when multiple children of a PCI bridge try to
> enable it simultaneously.
>
> This leads to drivers trying to access a device through a
> not-yet-enabled bridge, and this EEH errors under various
> circumstances when using parallel driver probing.
>
> There is work going on to fix that properly in the PCI core but it
> will take some time.
>
> x86 gets away with it because (outside of hotplug), the BIOS enables
> all the bridges at boot time.
>
> This patch does the same thing on powernv by enabling all bridges that
> have child devices at boot time, thus avoiding subsequent races. It's
> suitable for backporting to stable and distros, while the proper PCI
> fix will probably be significantly more invasive.
>
> Signed-off-by: Benjamin Herrenschmidt <[hidden email]>
> Cc: [hidden email]
> Signed-off-by: Michael Ellerman <[hidden email]>
> (backported from commit db2173198b9513f7add8009f225afa1f1c79bcc6
>  - changed pci_err() -> dev_err())
> Signed-off-by: Michael Neuling <[hidden email]>
> Signed-off-by: Daniel Axtens <[hidden email]>
> ---
>  arch/powerpc/platforms/powernv/pci-ioda.c | 37 +++++++++++++++++++++++
>  1 file changed, 37 insertions(+)
>
> diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
> index d5af700820f3..ea4630f17f74 100644
> --- a/arch/powerpc/platforms/powernv/pci-ioda.c
> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
> @@ -3312,12 +3312,49 @@ static void pnv_pci_ioda_create_dbgfs(void)
>  #endif /* CONFIG_DEBUG_FS */
>  }
>  
> +static void pnv_pci_enable_bridge(struct pci_bus *bus)
> +{
> + struct pci_dev *dev = bus->self;
> + struct pci_bus *child;
> +
> + /* Empty bus ? bail */
> + if (list_empty(&bus->devices))
> + return;
> +
> + /*
> + * If there's a bridge associated with that bus enable it. This works
> + * around races in the generic code if the enabling is done during
> + * parallel probing. This can be removed once those races have been
> + * fixed.
> + */
> + if (dev) {
> + int rc = pci_enable_device(dev);
> + if (rc)
> + dev_err(&dev->dev, "Error enabling bridge (%d)\n", rc);
> + pci_set_master(dev);
> + }
> +
> + /* Perform the same to child busses */
> + list_for_each_entry(child, &bus->children, node)
> + pnv_pci_enable_bridge(child);
> +}
> +
> +static void pnv_pci_enable_bridges(void)
> +{
> + struct pci_controller *hose;
> +
> + list_for_each_entry(hose, &hose_list, list_node)
> + pnv_pci_enable_bridge(hose->bus);
> +}
> +
>  static void pnv_pci_ioda_fixup(void)
>  {
>   pnv_pci_ioda_setup_PEs();
>   pnv_pci_ioda_setup_iommu_api();
>   pnv_pci_ioda_create_dbgfs();
>  
> + pnv_pci_enable_bridges();
> +
>  #ifdef CONFIG_EEH
>   pnv_eeh_post_init();
>  #endif

Applied to bionic/master-next branch, with the backport comments fixed.

Thanks,
Kleber


--
kernel-team mailing list
[hidden email]
https://lists.ubuntu.com/mailman/listinfo/kernel-team