[RESEND] [SRU Artful/Zesty][PATCH] ACPI APEI error handling bug fixes


Manoj Iyer
Please consider the following patches, which fix https://launchpad.net/bugs/1732990. The patches were cleanly cherry-picked from Linus's tree, and a test kernel is available in the PPA: https://launchpad.net/~centriq-team/+archive/ubuntu/lp1732990.

The kernel was tested by engineers at Qualcomm on a QDF2400 platform and the results are posted to the bug.

Thanks
Manoj Iyer


--
kernel-team mailing list
[hidden email]
https://lists.ubuntu.com/mailman/listinfo/kernel-team

[PATCH 1/2] ACPI: APEI: fix the wrong iteration of generic error status block

Manoj Iyer
From: gengdongjiu <[hidden email]>

The revision 0x300 generic error data entry has a different layout
from the older revision, but the macro that iterates over the GHES
estatus blocks does not take this difference into account. As a
result, the wrong data entry is read whenever a GHES contains a
revision 0x300 error data entry.

Update the GHES estatus iteration macro to advance with
acpi_hest_get_next(), and correct the termination condition, because
the status block's data_length covers only the error data and not the
status block header.

Convert the CPER estatus checking and printing iteration logic
to use the same macro.

BugLink: https://launchpad.net/bugs/1732990

Signed-off-by: Dongjiu Geng <[hidden email]>
Tested-by: Tyler Baicar <[hidden email]>
Reviewed-by: Borislav Petkov <[hidden email]>
Signed-off-by: Rafael J. Wysocki <[hidden email]>
(cherry picked from commit c4335fdd38227788178953c101b77180504d7ea0)
Signed-off-by: Manoj Iyer <[hidden email]>
---
 drivers/acpi/apei/apei-internal.h |  5 -----
 drivers/firmware/efi/cper.c       | 12 ++----------
 include/acpi/ghes.h               |  5 +++++
 3 files changed, 7 insertions(+), 15 deletions(-)

diff --git a/drivers/acpi/apei/apei-internal.h b/drivers/acpi/apei/apei-internal.h
index 6e9f14c0a71b..cb4126051f62 100644
--- a/drivers/acpi/apei/apei-internal.h
+++ b/drivers/acpi/apei/apei-internal.h
@@ -120,11 +120,6 @@ int apei_exec_collect_resources(struct apei_exec_context *ctx,
 struct dentry;
 struct dentry *apei_get_debugfs_dir(void);
 
-#define apei_estatus_for_each_section(estatus, section) \
- for (section = (struct acpi_hest_generic_data *)(estatus + 1); \
-     (void *)section - (void *)estatus < estatus->data_length; \
-     section = (void *)(section+1) + section->error_data_length)
-
 static inline u32 cper_estatus_len(struct acpi_hest_generic_status *estatus)
 {
  if (estatus->raw_data_length)
diff --git a/drivers/firmware/efi/cper.c b/drivers/firmware/efi/cper.c
index 48a8f69da42a..bf3672a81e49 100644
--- a/drivers/firmware/efi/cper.c
+++ b/drivers/firmware/efi/cper.c
@@ -606,7 +606,6 @@ void cper_estatus_print(const char *pfx,
  const struct acpi_hest_generic_status *estatus)
 {
  struct acpi_hest_generic_data *gdata;
- unsigned int data_len;
  int sec_no = 0;
  char newpfx[64];
  __u16 severity;
@@ -617,14 +616,10 @@ void cper_estatus_print(const char *pfx,
        "It has been corrected by h/w "
        "and requires no further action");
  printk("%s""event severity: %s\n", pfx, cper_severity_str(severity));
- data_len = estatus->data_length;
- gdata = (struct acpi_hest_generic_data *)(estatus + 1);
  snprintf(newpfx, sizeof(newpfx), "%s%s", pfx, INDENT_SP);
 
- while (data_len >= acpi_hest_get_size(gdata)) {
+ apei_estatus_for_each_section(estatus, gdata) {
  cper_estatus_print_section(newpfx, gdata, sec_no);
- data_len -= acpi_hest_get_record_size(gdata);
- gdata = acpi_hest_get_next(gdata);
  sec_no++;
  }
 }
@@ -653,15 +648,12 @@ int cper_estatus_check(const struct acpi_hest_generic_status *estatus)
  if (rc)
  return rc;
  data_len = estatus->data_length;
- gdata = (struct acpi_hest_generic_data *)(estatus + 1);
 
- while (data_len >= acpi_hest_get_size(gdata)) {
+ apei_estatus_for_each_section(estatus, gdata) {
  gedata_len = acpi_hest_get_error_length(gdata);
  if (gedata_len > data_len - acpi_hest_get_size(gdata))
  return -EINVAL;
-
  data_len -= acpi_hest_get_record_size(gdata);
- gdata = acpi_hest_get_next(gdata);
  }
  if (data_len)
  return -EINVAL;
diff --git a/include/acpi/ghes.h b/include/acpi/ghes.h
index 9f26e01186ae..9061c5c743b3 100644
--- a/include/acpi/ghes.h
+++ b/include/acpi/ghes.h
@@ -113,6 +113,11 @@ static inline void *acpi_hest_get_next(struct acpi_hest_generic_data *gdata)
  return (void *)(gdata) + acpi_hest_get_record_size(gdata);
 }
 
+#define apei_estatus_for_each_section(estatus, section) \
+ for (section = (struct acpi_hest_generic_data *)(estatus + 1); \
+     (void *)section - (void *)(estatus + 1) < estatus->data_length; \
+     section = acpi_hest_get_next(section))
+
 int ghes_notify_sea(void);
 
 #endif /* GHES_H */
--
2.14.1
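
For readers following the iteration change in patch 1, here is a minimal userspace sketch in C. It is not kernel code: struct status_hdr, struct section_hdr, V3_EXTRA_BYTES and every function name below are hypothetical simplifications of acpi_hest_generic_status, acpi_hest_generic_data and the acpi_hest_get_*() helpers, and the 8-byte size of the extra 0x300 fields is an assumption made for illustration. It only tries to show why a fixed-size step mis-walks a block that contains a revision 0x300 entry, and why the end check should be measured from the first section.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct status_hdr {                 /* models acpi_hest_generic_status */
	uint32_t data_length;       /* bytes of section data after this header */
};

struct section_hdr {                /* models acpi_hest_generic_data */
	uint16_t revision;          /* 0x300 entries carry a longer header */
	uint16_t reserved;
	uint32_t error_data_length; /* payload bytes after the section header */
};

#define V3_EXTRA_BYTES 8            /* assumed size of the extra 0x300 fields */

/* Header size depends on the revision, in the spirit of acpi_hest_get_size(). */
static size_t section_hdr_size(const struct section_hdr *s)
{
	return sizeof(*s) + ((s->revision >> 8) >= 3 ? V3_EXTRA_BYTES : 0);
}

/* Revision-aware step, in the spirit of acpi_hest_get_next(). */
static struct section_hdr *section_next(struct section_hdr *s)
{
	return (struct section_hdr *)((char *)s + section_hdr_size(s) +
				      s->error_data_length);
}

/* Shape of the fixed macro: offsets are measured from the first section. */
#define for_each_section(st, sec) \
	for (sec = (struct section_hdr *)((st) + 1); \
	     (size_t)((char *)sec - (char *)((st) + 1)) < (st)->data_length; \
	     sec = section_next(sec))

/* Shape of the old macro: fixed-size step, offsets measured from the header. */
#define for_each_section_old(st, sec) \
	for (sec = (struct section_hdr *)((st) + 1); \
	     (size_t)((char *)sec - (char *)(st)) < (st)->data_length; \
	     sec = (struct section_hdr *)((char *)(sec + 1) + \
					  sec->error_data_length))

/*
 * Build a zeroed block holding one 0x300 section (4-byte payload) followed
 * by one 0x200 section (8-byte payload). The caller frees the result.
 */
static struct status_hdr *build_block(void)
{
	struct status_hdr *st = calloc(1, 128);
	struct section_hdr *sec;

	assert(st != NULL);
	sec = (struct section_hdr *)(st + 1);
	sec->revision = 0x300;
	sec->error_data_length = 4;
	st->data_length = section_hdr_size(sec) + 4;

	sec = section_next(sec);
	sec->revision = 0x200;
	sec->error_data_length = 8;
	st->data_length += section_hdr_size(sec) + 8;
	return st;
}

/* Count the sections visited by either walker. */
static int count_sections(struct status_hdr *st, int use_old)
{
	struct section_hdr *sec;
	int n = 0;

	if (use_old) {
		for_each_section_old(st, sec)
			n++;
	} else {
		for_each_section(st, sec)
			n++;
	}
	return n;
}
```

On the block returned by build_block(), count_sections(st, 0) visits the two real sections, while count_sections(st, 1) also lands on a bogus third entry inside the longer 0x300 header. That is the kind of mis-stepping the commit message describes, reproduced here only under the simplified layout assumed above.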



[PATCH 2/2] ACPI / APEI: clear error status before acknowledging the error

Manoj Iyer
In reply to this post by Manoj Iyer
From: Tyler Baicar <[hidden email]>

Currently we acknowledge errors before clearing the error status.
Firmware could therefore populate a new error in between the
acknowledgment and the status clearing, in which case the second
error's status would be cleared without ever being handled.
So, clear the error status before acknowledging the error.

Also, make sure to acknowledge the error if the error status read
fails.

BugLink: https://launchpad.net/bugs/1732990

Signed-off-by: Tyler Baicar <[hidden email]>
Reviewed-by: Borislav Petkov <[hidden email]>
Signed-off-by: Rafael J. Wysocki <[hidden email]>
(cherry picked from commit aaf2c2fb0f51f91c699039440862b6ae9c25c10e)
Signed-off-by: Manoj Iyer <[hidden email]>
---
 drivers/acpi/apei/ghes.c | 16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index d661d452b238..8d43b1cecfbe 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -743,17 +743,19 @@ static int ghes_proc(struct ghes *ghes)
  }
  ghes_do_proc(ghes, ghes->estatus);
 
+out:
+ ghes_clear_estatus(ghes);
+
+ if (rc == -ENOENT)
+ return rc;
+
  /*
  * GHESv2 type HEST entries introduce support for error acknowledgment,
  * so only acknowledge the error if this support is present.
  */
- if (is_hest_type_generic_v2(ghes)) {
- rc = ghes_ack_error(ghes->generic_v2);
- if (rc)
- return rc;
- }
-out:
- ghes_clear_estatus(ghes);
+ if (is_hest_type_generic_v2(ghes))
+ return ghes_ack_error(ghes->generic_v2);
+
  return rc;
 }
 
--
2.14.1
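
The ordering that patch 2 establishes can be sketched in plain C as follows. This is a hedged model, not the kernel implementation: struct fake_ghes, record(), fake_ack() and fake_proc() are hypothetical names, the status read is reduced to a preset return code, and the event log exists only to make the order of operations visible. The control flow mirrors the patched ghes_proc() shown in the diff: clear first, return early on -ENOENT, then acknowledge on GHESv2 sources.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical model of an error source; not the kernel's struct ghes. */
struct fake_ghes {
	bool is_v2;   /* GHESv2 sources support acknowledgment */
	int read_rc;  /* what reading the status block returns */
	char log[8];  /* records H(andle), C(lear), A(ck) in order */
};

/* Append one event letter to the log. */
static void record(struct fake_ghes *g, char event)
{
	size_t n = strlen(g->log);

	assert(n + 1 < sizeof(g->log));
	g->log[n] = event;
	g->log[n + 1] = '\0';
}

/* Stand-in for acknowledging the error to firmware; always succeeds here. */
static int fake_ack(struct fake_ghes *g)
{
	record(g, 'A');
	return 0;
}

/* Same control flow as the patched ghes_proc() in the diff above. */
static int fake_proc(struct fake_ghes *g)
{
	int rc = g->read_rc;

	g->log[0] = '\0';
	if (rc)
		goto out;
	record(g, 'H');         /* process the error */
out:
	record(g, 'C');         /* clear the status before any acknowledgment */
	if (rc == -ENOENT)
		return rc;      /* nothing was reported, so nothing to ack */
	if (g->is_v2)
		return fake_ack(g);
	return rc;
}
```

In this model a successful read on a v2 source logs "HCA", a read that returns -ENOENT logs only "C", and a read that fails with another error on a v2 source logs "CA" and returns the acknowledgment result, which matches the "acknowledge the error if the error status read fails" part of the commit message.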



Re: [RESEND] [SRU Artful/Zesty][PATCH] ACPI APEI error handling bug fixes

Paolo Pisati-5
In reply to this post by Manoj Iyer
Can you resend using the standard SRU format[1], with a cover letter
plus patches, cherry-picking from Linus's tree instead of linux-next?

1: https://wiki.ubuntu.com/Kernel/Dev/StablePatchFormat
--
bye,
p.
