[SRU][Artful] using vfio-pci on a combination of cn8xxx and some PCI devices results in a kernel panic

Manoj Iyer
Bug: https://launchpad.net/bugs/1770254

Please consider these three patches cherry-picked cleanly from upstream
for SRU to Artful. On Cavium cn8xxx systems, using vfio-pci on a
combination of cn8xxx and some PCI devices results in a kernel panic.
This is triggered by issuing a bus or a slot reset on the PCI device,
and observed when you pass through a PCI device from host to guest.

These patches are already in Bionic; we need them in Artful so
that Xenial linux-hwe also has the fix. The platforms of interest were
certified with Xenial and linux-hwe, so we are interested in fixing this
only in Artful.

A test kernel is available in ppa:manjo/lp1770254. I tested it on a
Cavium CN88XX system, and the results are posted to the bug report.




--
kernel-team mailing list
[hidden email]
https://lists.ubuntu.com/mailman/listinfo/kernel-team

[PATCH 1/3] PCI: Avoid bus reset if bridge itself is broken

Manoj Iyer
From: David Daney <[hidden email]>

When checking to see if a PCI bus can safely be reset, we previously
checked to see if any of the children had their PCI_DEV_FLAGS_NO_BUS_RESET
flag set.  Children marked with that flag are known not to behave well
after a bus reset.

Some PCIe root port bridges also do not behave well after a bus reset,
sometimes causing the devices behind the bridge to become unusable.

Add a check for PCI_DEV_FLAGS_NO_BUS_RESET being set in the bridge device
to allow these bridges to be flagged, and prevent their secondary buses
from being reset.

BugLink: https://launchpad.net/bugs/1770254

Signed-off-by: David Daney <[hidden email]>
[[hidden email]: fixed typo]
Signed-off-by: Jan Glauber <[hidden email]>
Signed-off-by: Bjorn Helgaas <[hidden email]>
Reviewed-by: Alex Williamson <[hidden email]>
(cherry picked from commit 357027786f3523d26f42391aa4c075b8495e5d28)
Signed-off-by: Manoj Iyer <[hidden email]>
---
 drivers/pci/pci.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index a47f55e3057a..2cce730f8ce9 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -4362,6 +4362,10 @@ static bool pci_bus_resetable(struct pci_bus *bus)
 {
 	struct pci_dev *dev;
 
+
+	if (bus->self && (bus->self->dev_flags & PCI_DEV_FLAGS_NO_BUS_RESET))
+		return false;
+
 	list_for_each_entry(dev, &bus->devices, bus_list) {
 		if (dev->dev_flags & PCI_DEV_FLAGS_NO_BUS_RESET ||
 		    (dev->subordinate && !pci_bus_resetable(dev->subordinate)))
--
2.17.0
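
For readers who want to try the logic outside the kernel, here is a minimal
userspace sketch of the check this patch adds. It is not the kernel code: the
model_* types and MODEL_NO_BUS_RESET are hypothetical stand-ins for
struct pci_dev, struct pci_bus and PCI_DEV_FLAGS_NO_BUS_RESET, and a plain
linked list replaces list_for_each_entry().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's PCI_DEV_FLAGS_NO_BUS_RESET bit. */
#define MODEL_NO_BUS_RESET (1u << 0)

struct model_bus;

/* Simplified device: flags plus an optional secondary bus if it is a bridge. */
struct model_dev {
	unsigned int dev_flags;
	struct model_bus *subordinate;	/* NULL unless this device is a bridge */
	struct model_dev *next;		/* next device on the same bus */
};

/* Simplified bus: the bridge leading to it and the devices sitting on it. */
struct model_bus {
	struct model_dev *self;		/* upstream bridge, NULL for a root bus */
	struct model_dev *devices;	/* singly linked list of children */
};

/* Mirrors the patched logic: refuse if the bridge or any child is flagged. */
static bool model_bus_resetable(const struct model_bus *bus)
{
	const struct model_dev *dev;

	assert(bus != NULL);

	/* Check added by this patch: the bridge itself may be broken. */
	if (bus->self && (bus->self->dev_flags & MODEL_NO_BUS_RESET))
		return false;

	/* Pre-existing check: a flagged child at any depth blocks the reset. */
	for (dev = bus->devices; dev; dev = dev->next) {
		if ((dev->dev_flags & MODEL_NO_BUS_RESET) ||
		    (dev->subordinate && !model_bus_resetable(dev->subordinate)))
			return false;
	}
	return true;
}
```

With a flagged bridge as bus->self, model_bus_resetable() returns false even
when every child is clean, which is the behavior the patch introduces.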



[PATCH 2/3] PCI: Mark Cavium CN8xxx to avoid bus reset

Manoj Iyer
From: David Daney <[hidden email]>

Root ports of cn8xxx do not function after bus reset when used with some
e1000e and LSI HBA devices.  Add a quirk to prevent bus reset on these root
ports.

BugLink: https://launchpad.net/bugs/1770254

Signed-off-by: David Daney <[hidden email]>
[[hidden email]: fixed typo and whitespaces]
Signed-off-by: Jan Glauber <[hidden email]>
Signed-off-by: Bjorn Helgaas <[hidden email]>
Reviewed-by: Alex Williamson <[hidden email]>
(cherry picked from commit 822155100e589f2a4891b3b2db2f901824d47e69)
Signed-off-by: Manoj Iyer <[hidden email]>
---
 drivers/pci/quirks.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 99eec22d99b7..9dcd5ed5a05b 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -3380,6 +3380,13 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ATHEROS, 0x0032, quirk_no_bus_reset);
 DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ATHEROS, 0x003c, quirk_no_bus_reset);
 DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ATHEROS, 0x0033, quirk_no_bus_reset);
 
+/*
+ * Root port on some Cavium CN8xxx chips do not successfully complete a bus
+ * reset when used with certain child devices.  After the reset, config
+ * accesses to the child may fail.
+ */
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_CAVIUM, 0xa100, quirk_no_bus_reset);
+
 static void quirk_no_pm_reset(struct pci_dev *dev)
 {
 	/*
--
2.17.0
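
As a rough illustration of what the DECLARE_PCI_FIXUP_HEADER() line does, the
userspace sketch below models a fixup table keyed on vendor and device ID.
The quirk_* names are hypothetical and the real kernel mechanism (linker
sections, fixup passes) is considerably more involved. The only values taken
from the real world are the 0xa100 device ID from the patch and 0x177d,
which is Cavium's PCI vendor ID.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's PCI_DEV_FLAGS_NO_BUS_RESET bit. */
#define QUIRK_NO_BUS_RESET	(1u << 0)
#define QUIRK_VENDOR_CAVIUM	0x177d	/* Cavium's PCI vendor ID */

/* Simplified device: just the IDs and the flag word. */
struct quirk_dev {
	uint16_t vendor;
	uint16_t device;
	unsigned int dev_flags;
};

typedef void (*quirk_hook)(struct quirk_dev *dev);

/* One table entry: run hook on devices matching vendor and device. */
struct quirk_fixup {
	uint16_t vendor;
	uint16_t device;
	quirk_hook hook;
};

/* Same effect as the kernel's quirk_no_bus_reset(): set the flag. */
static void quirk_model_no_bus_reset(struct quirk_dev *dev)
{
	dev->dev_flags |= QUIRK_NO_BUS_RESET;
}

/* Models the entry added by this patch. */
static const struct quirk_fixup quirk_table[] = {
	{ QUIRK_VENDOR_CAVIUM, 0xa100, quirk_model_no_bus_reset },
};

/* Run every matching hook, as a header fixup pass would at enumeration. */
static void quirk_apply_header_fixups(struct quirk_dev *dev)
{
	size_t i;

	assert(dev != NULL);

	for (i = 0; i < sizeof(quirk_table) / sizeof(quirk_table[0]); i++) {
		if (quirk_table[i].vendor == dev->vendor &&
		    quirk_table[i].device == dev->device)
			quirk_table[i].hook(dev);
	}
}
```

Once the flag is set on the root port, the bridge checks from patches 1/3 and
3/3 refuse to reset the bus or slot behind it.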



[PATCH 3/3] PCI: Avoid slot reset if bridge itself is broken

Manoj Iyer
From: Jan Glauber <[hidden email]>

When checking to see if a PCI slot can safely be reset, we previously
checked to see if any of the children had their PCI_DEV_FLAGS_NO_BUS_RESET
flag set.

Some PCIe root port bridges do not behave well after a slot reset, and may
cause the device in the slot to become unusable.

Add a check for PCI_DEV_FLAGS_NO_BUS_RESET being set in the bridge device
to prevent the slot from being reset.

BugLink: https://launchpad.net/bugs/1770254

Signed-off-by: Jan Glauber <[hidden email]>
Signed-off-by: Bjorn Helgaas <[hidden email]>
Reviewed-by: Alex Williamson <[hidden email]>
(cherry picked from commit 33ba90aa4d4432b884fc0ed57ba9dc12eb8fa288)
Signed-off-by: Manoj Iyer <[hidden email]>
---
 drivers/pci/pci.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 2cce730f8ce9..0ce61a52788c 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -4430,6 +4430,10 @@ static bool pci_slot_resetable(struct pci_slot *slot)
 {
 	struct pci_dev *dev;
 
+	if (slot->bus->self &&
+	    (slot->bus->self->dev_flags & PCI_DEV_FLAGS_NO_BUS_RESET))
+		return false;
+
 	list_for_each_entry(dev, &slot->bus->devices, bus_list) {
 		if (!dev->slot || dev->slot != slot)
 			continue;
--
2.17.0
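
The slot variant can be sketched in userspace the same way. The sm_* types
and SM_NO_BUS_RESET below are hypothetical stand-ins for the kernel
structures and flag, and the sketch deliberately omits the recursion into
subordinate buses. It only shows that a flagged bridge vetoes the reset
before any per-slot device is examined.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's PCI_DEV_FLAGS_NO_BUS_RESET bit. */
#define SM_NO_BUS_RESET (1u << 0)

struct sm_slot;

/* Simplified device: flags, the slot it occupies, and a list link. */
struct sm_dev {
	unsigned int dev_flags;
	struct sm_slot *slot;		/* NULL if not associated with a slot */
	struct sm_dev *next;		/* next device on the same bus */
};

/* Simplified bus: upstream bridge plus the devices on the bus. */
struct sm_bus {
	struct sm_dev *self;		/* upstream bridge, NULL for a root bus */
	struct sm_dev *devices;
};

/* Simplified slot: only records which bus it lives on. */
struct sm_slot {
	struct sm_bus *bus;
};

static bool sm_slot_resetable(const struct sm_slot *slot)
{
	const struct sm_dev *dev;

	assert(slot != NULL && slot->bus != NULL);

	/* Check added by this patch: a flagged bridge blocks the slot reset. */
	if (slot->bus->self &&
	    (slot->bus->self->dev_flags & SM_NO_BUS_RESET))
		return false;

	/* Pre-existing check, limited to devices sitting in this slot. */
	for (dev = slot->bus->devices; dev; dev = dev->next) {
		if (!dev->slot || dev->slot != slot)
			continue;
		if (dev->dev_flags & SM_NO_BUS_RESET)
			return false;
	}
	return true;
}
```

A flagged device in a different slot on the same bus does not block the
reset, whereas a flagged bridge blocks it for every slot on that bus.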



ACK: [SRU][Artful] using vfio-pci on a combination of cn8xxx and some PCI devices results in a kernel panic

Kleber Souza
On 05/10/18 21:32, Manoj Iyer wrote:

> Bug: https://launchpad.net/bugs/1770254
>
> Please consider these three patches cherry-picked cleanly from upstream
> for SRU to Artful. On Cavium cn8xxx systems, using vfio-pci on a
> combination of cn8xxx and some PCI devices results in a kernel panic.
> This is triggered by issuing a bus or a slot reset on the PCI device,
> and observed when you pass through a PCI device from host to guest.
>
> These patches are already in Bionic, we need these patches in Artful so
> that xenial linux-hwe also has the fix. The platforms of interest were
> certified with Xenial and linux-hwe, so we are interested in fixing this
> only in Artful.
>
> A test kernel is available in ppa:manjo/lp1770254 and was tested by me
> on a Cavium CN88XX system, and results are posted to the bug report.
>

Clean cherry-picks, tested, and the regression potential appears to be
small.

Acked-by: Kleber Sacilotto de Souza <[hidden email]>


ACK: [SRU][Artful] using vfio-pci on a combination of cn8xxx and some PCI devices results in a kernel panic

Khaled Elmously
On 2018-05-10 14:32:51 , Manoj Iyer wrote:

> Bug: https://launchpad.net/bugs/1770254
>
> Please consider these three patches cherry-picked cleanly from upstream
> for SRU to Artful. On Cavium cn8xxx systems, using vfio-pci on a
> combination of cn8xxx and some PCI devices results in a kernel panic.
> This is triggered by issuing a bus or a slot reset on the PCI device,
> and observed when you pass through a PCI device from host to guest.
>
> These patches are already in Bionic, we need these patches in Artful so
> that xenial linux-hwe also has the fix. The platforms of interest were
> certified with Xenial and linux-hwe, so we are interested in fixing this
> only in Artful.
>
> A test kernel is available in ppa:manjo/lp1770254 and was tested by me
> on a Cavium CN88XX system, and results are posted to the bug report.
>

Acked-by: Khalid Elmously <[hidden email]>
 


APPLIED: [SRU][Artful] using vfio-pci on a combination of cn8xxx and some PCI devices results in a kernel panic

Khaled Elmously
Applied to Artful.

On 2018-05-10 14:32:51 , Manoj Iyer wrote:

> Bug: https://launchpad.net/bugs/1770254
>
> Please consider these three patches cherry-picked cleanly from upstream
> for SRU to Artful. On Cavium cn8xxx systems, using vfio-pci on a
> combination of cn8xxx and some PCI devices results in a kernel panic.
> This is triggered by issuing a bus or a slot reset on the PCI device,
> and observed when you pass through a PCI device from host to guest.
>
> These patches are already in Bionic, we need these patches in Artful so
> that xenial linux-hwe also has the fix. The platforms of interest were
> certified with Xenial and linux-hwe, so we are interested in fixing this
> only in Artful.
>
> A test kernel is available in ppa:manjo/lp1770254 and was tested by me
> on a Cavium CN88XX system, and results are posted to the bug report.