SRU: Hardy LP Bug #214814 - PCI/ACPI: "BUG: soft lockup - CPU#0 stuck for 61s!"


TJ-17
This cherry-pick will fix the issue described in LP Bug #214814 "BUG:
soft lockup - CPU#0 stuck for 61s!". The bug has previously been tagged
for SRU but missed the 8.04.1 milestone. The upstream report is:

http://bugzilla.kernel.org/show_bug.cgi?id=10124

That patch has been tested and confirmed working.

---
commit b87e81e5c6e64ae0eae3b4f61bf07bfeec856184
Author: [hidden email] <[hidden email]>
Date:   Tue Apr 15 14:34:49 2008 -0700

    acpi: unneccessary to scan the PCI bus already scanned
---

Systems based on the Intel 450NX chipset may fail to recognise some
devices, which leads to drivers failing, unhandled IRQs, and other
serious boot failures. The cause is that this chipset has 3 PCI root
buses.
When it was first released some operating systems (read: Windows NT)
didn't always correctly discover the 2nd and 3rd PCI buses. As a result
the PCI BIOS tables were 'hacked' to have a fake bridge device on PCI
bus 0 that points to the same bus number as the 1st bus so they would be
scanned correctly by the OS.

As a result, in a well-behaved OS the 2nd and 3rd PCI buses would be
scanned twice: once as secondaries of the 1st bus, and then again as
root buses in their own right. This caused problems with devices being
discovered twice.

Unfortunately, the user's kernel-log report is misleading since the bus
has already been found to be registered and therefore ignored. The
situation can be worked around by booting with "pci=noacpi".

A fix-up for all i450NX chipsets was introduced in
arch/i386/pci/fixup.c::pci_fixup_i450nx(). Note: arch/i386 was
subsequently merged into arch/x86/. The fix-up checks the PCI config
for the subsidiary buses and, if it finds them, scans them. This adds
them to the pci_root_buses list. Later in the boot process the ACPI/PCI
code reads the ACPI DSDT table, finds the PCI bus entries (PNP0A03) and
tries to scan them again, leading to the kernel BUG.

This patch ensures that buses already scanned are recognised and reused
rather than ignored.
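
To make the difference concrete, here is a minimal user-space sketch of
the lookup-before-scan idea. It is not kernel code: every model_* name
is hypothetical, the list is a plain linked list standing in for the
root bus list, and model_scan_bus() refusing an already known bus is an
assumption based on the "already registered and therefore ignored"
behaviour described above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a root bus entry; not the kernel's struct pci_bus. */
struct model_bus {
	int domain;
	int busnum;
	struct model_bus *next;
};

/* Hypothetical global list playing the role of the root bus list. */
struct model_bus *model_root_buses;

/* Look up a root bus by (domain, busnum); returns NULL if it is not known. */
struct model_bus *model_find_bus(int domain, int busnum)
{
	struct model_bus *b;

	for (b = model_root_buses; b != NULL; b = b->next)
		if (b->domain == domain && b->busnum == busnum)
			return b;
	return NULL;
}

/*
 * Register a root bus. Assumption for this sketch: a bus that is already
 * known is ignored and NULL is returned, so a caller that scans blindly
 * cannot tell "already present" apart from "scan failed".
 */
struct model_bus *model_scan_bus(int domain, int busnum)
{
	struct model_bus *b;

	if (model_find_bus(domain, busnum) != NULL)
		return NULL;

	b = malloc(sizeof(*b));
	if (b == NULL)
		return NULL;
	b->domain = domain;
	b->busnum = busnum;
	b->next = model_root_buses;
	model_root_buses = b;
	return b;
}

/* Patched behaviour: reuse an existing entry and scan only when there is none. */
struct model_bus *model_find_or_scan(int domain, int busnum)
{
	struct model_bus *b = model_find_bus(domain, busnum);

	return b != NULL ? b : model_scan_bus(domain, busnum);
}
```

In this model the quirk's early scan corresponds to the first
model_scan_bus() call, a blind second call returns NULL, and
model_find_or_scan() hands back the entry that is already there.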

TJ.
Ubuntu Kernel ACPI Team.


--
kernel-team mailing list
[hidden email]
https://lists.ubuntu.com/mailman/listinfo/kernel-team

Re: SRU: Hardy LP Bug #214814 - PCI/ACPI: "BUG: soft lockup - CPU#0 stuck for 61s!"

Tim Gardner-2
TJ wrote:

> This cherry-pick will fix the issue described in LP Bug #214814 "BUG:
> soft lockup - CPU#0 stuck for 61s!". The bug has previously been tagged
> for SRU but missed the 8.04.1 milestone. The upstream report is:
>
> http://bugzilla.kernel.org/show_bug.cgi?id=10124
>
> That patch has been tested and confirmed working.
>
> ---
> commit b87e81e5c6e64ae0eae3b4f61bf07bfeec856184
> Author: [hidden email] <[hidden email]>
> Date:   Tue Apr 15 14:34:49 2008 -0700
>
>     acpi: unneccessary to scan the PCI bus already scanned
> ---
>
> Systems based on the Intel 450NX chipset may experience issues where
> devices aren't recognised that lead to drivers failing, unhandled IRQs,
> and other serious boot failures. The issue is caused because this
> chipset has 3 PCI root buses.
> When it was first released some operating systems (read: Windows NT)
> didn't always correctly discover the 2nd and 3rd PCI buses. As a result
> the PCI BIOS tables were 'hacked' to have a fake bridge device on PCI
> bus 0 that points to the same bus number as the 1st bus so they would be
> scanned correctly by the OS.
>
> As a result, in a well-behaved OS the 2nd and 3rd PCI buses would be
> scanned twice. Once as secondaries of the 1st bus, and then as root
> buses in their own right. This caused problems with devices being
> discovered twice.
>
> Unfortunately, the user's kernel-log report is misleading since the bus
> has already been found to be registered and therefore ignored. The
> situation can be worked around by booting with "pci=noacpi".
>
> A fix-up for all i450N chipsets was introduced in
> arch/i386/pci/fixups.c::pci_fixup_i450nx(). Note: arch/i386 was
> refactored to arch/x86/ subsequently. The fix-up checks the PCI config
> for the subsidiary buses and if it finds them scans them. This adds them
> to the root_pci_bus list. Later in the boot process the ACPI/PCI code
> reads the ACPI DSDT table, finds the PCI bus entries (PNP0A03) and tries
> to scan them again, leading to the kernel BUG.
>
> This patch ensures that buses already scanned are recognised rather than
> ignored.
>
> TJ.
> Ubuntu Kernel ACPI Team.
>
>
Applied. LP: #258143

http://kernel.ubuntu.com/git?p=ubuntu/ubuntu-hardy.git;a=commit;h=4469fffe7db317faf023fa24fe87a900af0c3524

Note - I mistakenly created a new bug report, but I should have used the
existing #214814 (which I've now marked as a duplicate). Sigh, need to
postpone SRU processing in the morning until after I've had more coffee.
That's twice this week.

--
Tim Gardner [hidden email]

From 4469fffe7db317faf023fa24fe87a900af0c3524 Mon Sep 17 00:00:00 2001
From: [hidden email] <[hidden email]>
Date: Tue, 15 Apr 2008 14:34:49 -0700
Subject: [PATCH] acpi: unneccessary to scan the PCI bus already scanned
 Bug: #258143
 http://bugzilla.kernel.org/show_bug.cgi?id=10124

this change:

      commit 08f1c192c3c32797068bfe97738babb3295bbf42
      Author: Muli Ben-Yehuda <[hidden email]>
      Date:   Sun Jul 22 00:23:39 2007 +0300

         x86-64: introduce struct pci_sysdata to facilitate sharing of ->sysdata

         This patch introduces struct pci_sysdata to x86 and x86-64, and
         converts the existing two users (NUMA, Calgary) to use it.

         This lays the groundwork for having other users of sysdata, such as
         the PCI domains work.

         The Calgary bits are tested, the NUMA bits just look ok.

replaces pcibios_scan_root by pci_scan_bus_parented...

but in pcibios_scan_root we have a check about scanned busses.

Cc: <[hidden email]>
Cc: Stian Jordet <[hidden email]>
Cc: Len Brown <[hidden email]>
Cc: Greg KH <[hidden email]>
Cc: Ingo Molnar <[hidden email]>
Cc: "Yinghai Lu" <[hidden email]>
Signed-off-by: Andrew Morton <[hidden email]>
Signed-off-by: Linus Torvalds <[hidden email]>
Signed-off-by: Tim Gardner <[hidden email]>
---
 arch/ia64/pci/pci.c |    7 ++++++-
 arch/x86/pci/acpi.c |   17 +++++++++++++++--
 2 files changed, 21 insertions(+), 3 deletions(-)

diff --git a/arch/ia64/pci/pci.c b/arch/ia64/pci/pci.c
index 488e48a..3e14a81 100644
--- a/arch/ia64/pci/pci.c
+++ b/arch/ia64/pci/pci.c
@@ -371,7 +371,12 @@ pci_acpi_scan_root(struct acpi_device *device, int domain, int bus)
 	info.name = name;
 	acpi_walk_resources(device->handle, METHOD_NAME__CRS, add_window,
 		&info);
-
+	/*
+	 * See arch/x86/pci/acpi.c.
+	 * The desired pci bus might already be scanned in a quirk. We
+	 * should handle the case here, but it appears that IA64 hasn't
+	 * such quirk. So we just ignore the case now.
+	 */
 	pbus = pci_scan_bus_parented(NULL, bus, &pci_root_ops, controller);
 	if (pbus)
 		pcibios_setup_root_windows(pbus, controller);
diff --git a/arch/x86/pci/acpi.c b/arch/x86/pci/acpi.c
index 0234f28..378136f 100644
--- a/arch/x86/pci/acpi.c
+++ b/arch/x86/pci/acpi.c
@@ -219,8 +219,21 @@ struct pci_bus * __devinit pci_acpi_scan_root(struct acpi_device *device, int do
 	if (pxm >= 0)
 		sd->node = pxm_to_node(pxm);
 #endif
+	/*
+	 * Maybe the desired pci bus has been already scanned. In such case
+	 * it is unnecessary to scan the pci bus with the given domain,busnum.
+	 */
+	bus = pci_find_bus(domain, busnum);
+	if (bus) {
+		/*
+		 * If the desired bus exits, the content of bus->sysdata will
+		 * be replaced by sd.
+		 */
+		memcpy(bus->sysdata, sd, sizeof(*sd));
+		kfree(sd);
+	} else
+		bus = pci_scan_bus_parented(NULL, busnum, &pci_root_ops, sd);
 
-	bus = pci_scan_bus_parented(NULL, busnum, &pci_root_ops, sd);
 	if (!bus)
 		kfree(sd);
 
@@ -228,7 +241,7 @@ struct pci_bus * __devinit pci_acpi_scan_root(struct acpi_device *device, int do
 	if (bus != NULL) {
 		if (pxm >= 0) {
 			printk("bus %d -> pxm %d -> node %d\n",
-				busnum, pxm, sd->node);
+				busnum, pxm, pxm_to_node(pxm));
 		}
 	}
 #endif
--
1.5.4.3
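
A note on the last hunk: on the reuse path sd is copied into
bus->sysdata and then freed, which is presumably why the printk stops
reading sd->node and recomputes the node from pxm instead. The sketch
below models that handover in user space. All sketch_* names are
hypothetical, and the identity mapping in sketch_pxm_to_node() is an
assumption made only to keep the example self-contained.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the per-bus data that the patch calls sd. */
struct sketch_sysdata {
	int node;
};

/* Hypothetical stand-in for a bus that already owns a sysdata block. */
struct sketch_bus {
	int busnum;
	struct sketch_sysdata *sysdata;
};

/* Hypothetical proximity-domain to node mapping; identity for the sketch. */
int sketch_pxm_to_node(int pxm)
{
	return pxm;
}

/*
 * Reuse path: build fresh data, copy it over the bus's existing sysdata,
 * free the temporary copy, and report the node. The node is derived from
 * pxm again because sd has been freed by the time it is reported.
 * Returns -1 if the temporary allocation fails.
 */
int sketch_adopt_sysdata(struct sketch_bus *bus, int pxm)
{
	struct sketch_sysdata *sd = malloc(sizeof(*sd));

	if (sd == NULL)
		return -1;
	sd->node = sketch_pxm_to_node(pxm);

	memcpy(bus->sysdata, sd, sizeof(*sd));
	free(sd);

	/* sd must not be dereferenced from here on. */
	return sketch_pxm_to_node(pxm);
}
```

The bus keeps its original sysdata pointer throughout; only the
contents change, so anything already holding that pointer sees the
updated node.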



Re: SRU: Hardy LP Bug #214814 - PCI/ACPI: "BUG: soft lockup - CPU#0 stuck for 61s!"

Stefan Bader-2
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Tim Gardner wrote:

> TJ wrote:
>> This cherry-pick will fix the issue described in LP Bug #214814 "BUG:
>> soft lockup - CPU#0 stuck for 61s!". The bug has previously been tagged
>> for SRU but missed the 8.04.1 milestone. The upstream report is:
>>
>> http://bugzilla.kernel.org/show_bug.cgi?id=10124
>>
>> That patch has been tested and confirmed working.
>>
>> ---
>> commit b87e81e5c6e64ae0eae3b4f61bf07bfeec856184
>> Author: [hidden email] <[hidden email]>
>> Date:   Tue Apr 15 14:34:49 2008 -0700
>>
>>     acpi: unneccessary to scan the PCI bus already scanned
>> ---
>>
>> Systems based on the Intel 450NX chipset may experience issues where
>> devices aren't recognised that lead to drivers failing, unhandled IRQs,
>> and other serious boot failures. The issue is caused because this
>> chipset has 3 PCI root buses.
>> When it was first released some operating systems (read: Windows NT)
>> didn't always correctly discover the 2nd and 3rd PCI buses. As a result
>> the PCI BIOS tables were 'hacked' to have a fake bridge device on PCI
>> bus 0 that points to the same bus number as the 1st bus so they would be
>> scanned correctly by the OS.
>>
>> As a result, in a well-behaved OS the 2nd and 3rd PCI buses would be
>> scanned twice. Once as secondaries of the 1st bus, and then as root
>> buses in their own right. This caused problems with devices being
>> discovered twice.
>>
>> Unfortunately, the user's kernel-log report is misleading since the bus
>> has already been found to be registered and therefore ignored. The
>> situation can be worked around by booting with "pci=noacpi".
>>
>> A fix-up for all i450N chipsets was introduced in
>> arch/i386/pci/fixups.c::pci_fixup_i450nx(). Note: arch/i386 was
>> refactored to arch/x86/ subsequently. The fix-up checks the PCI config
>> for the subsidiary buses and if it finds them scans them. This adds them
>> to the root_pci_bus list. Later in the boot process the ACPI/PCI code
>> reads the ACPI DSDT table, finds the PCI bus entries (PNP0A03) and tries
>> to scan them again, leading to the kernel BUG.
>>
>> This patch ensures that buses already scanned are recognised rather than
>> ignored.
>>
>> TJ.
>> Ubuntu Kernel ACPI Team.
>>
>>
>
> Applied. LP: #258143
>
> http://kernel.ubuntu.com/git?p=ubuntu/ubuntu-hardy.git;a=commit;h=4469fffe7db317faf023fa24fe87a900af0c3524
>
> Note - I mistakenly created a new bug report, but I should have used the
> existing #214814 (which I've now marked as a duplicate). Sigh, need to
> postpone SRU processing in the morning until after I've had more coffee.
> Thats twice this week.
>
>
ACK

- --

When all other means of communication fail, try words!


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFIpXhbP+TjRTJVqvQRAr6xAKDFfJvnbbLT6PKIGNadFI3X25UY1wCfZ3Kr
Nq/G6+15pqkFb+9n35hjHFE=
=9skO
-----END PGP SIGNATURE-----


Re: SRU: Hardy LP Bug #214814 - PCI/ACPI: "BUG: soft lockup - CPU#0 stuck for 61s!"

Ben Collins-4
In reply to this post by Tim Gardner-2
ACK

Plus: "The Calgary bits are tested, the NUMA bits just look ok."

Since we don't have NUMA enabled, I think we are ok with that last bit.

