[PATCH] [B/snapdragon] [SRU] Kernel hangs during msm init

Paolo Pisati-5
BugLink: https://bugs.launchpad.net/bugs/1841911

Impact:

Ubuntu-snapdragon-4.15.0-1061.68 hangs during boot around msm init.
Sometimes we get the following stack trace, or the boot completes and the board hangs during reboot:

...
[ 8.113018] msm_dsi_manager_register: failed to register mipi dsi host for DSI 0
[ 8.131081] msm 1a00000.mdss: failed to bind 1a98000.dsi (ops dsi_ops [msm]): -517
[ 8.138234] msm 1a00000.mdss: master bind failed: -517
[ 8.145551] platform 1a01000.mdp: Dropping the link to 1ef0000.iommu
[ 8.150545] iommu: Removing device 1a01000.mdp from group 1
[ 8.157051] ------------[ cut here ]------------
[ 8.162369] WARNING: CPU: 1 PID: 1316 at /build/linux-snapdragon-t5G9R3/linux-snapdragon-4.15.0/drivers/iommu/qcom_iommu.c:336 qcom_iommu_domain_free+0x74/0x88
[ 8.167166] Modules linked in: adv7511_drm cec rc_core msm(+) mdt_loader
[ 8.181137] CPU: 1 PID: 1316 Comm: systemd-udevd Not tainted 4.15.0-1061-snapdragon #68-Ubuntu
[ 8.188079] Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT)
[ 8.196501] pstate: 60400005 (nZCv daif +PAN -UAO)
[ 8.203356] pc : qcom_iommu_domain_free+0x74/0x88
[ 8.207955] lr : qcom_iommu_domain_free+0x74/0x88
[ 8.212727] sp : ffff00000cbeb680
[ 8.217412] x29: ffff00000cbeb680 x28: ffff8000396d84b8
[ 8.220713] x27: ffff8000396d84b0 x26: ffff8000396d84c0
[ 8.226096] x25: ffff80003d057c10 x24: ffff8000396d8420
[ 8.231391] x23: 0000000000000003 x22: ffff80003ce40258
[ 8.236686] x21: ffff80000203ad00 x20: ffff80000203af30
[ 8.241981] x19: ffff80000203af00 x18: ffffffffffffffff
[ 8.247275] x17: 0000000000000000 x16: 0000000000000004
[ 8.252570] x15: ffff000009549c08 x14: 0720072007200720
[ 8.257866] x13: 0720072007200720 x12: 0720072007200720
[ 8.263161] x11: ffff000009549e80 x10: ffff00000871d340
[ 8.268456] x9 : 0720072007200720 x8 : 0000000000000005
[ 8.273751] x7 : 0720072d072d072d x6 : 000000000000014c
[ 8.279046] x5 : ffff000008610250 x4 : 0000000000000000
[ 8.284345] x3 : 0000000000000000 x2 : a59fa8ece8469a00
[ 8.289637] x1 : 0000000000000000 x0 : 0000000000000024
[ 8.294932] Call trace:
[ 8.300227] qcom_iommu_domain_free+0x74/0x88
[ 8.302400] iommu_group_release+0x54/0x90
[ 8.306914] kobject_put+0x8c/0x1f0
[ 8.310905] kobject_del.part.0+0x3c/0x50
[ 8.314292] kobject_put+0x74/0x1f0
[ 8.318455] iommu_group_remove_device+0x10c/0x198
[ 8.321756] qcom_iommu_remove_device+0x58/0x70
[ 8.326617] iommu_bus_notifier+0xa8/0x120
[ 8.331045] notifier_call_chain+0x5c/0xa0
[ 8.335210] blocking_notifier_call_chain+0x64/0x88
[ 8.339294] device_del+0x234/0x368
[ 8.344066] platform_device_del.part.3+0x2c/0x98
[ 8.347539] platform_device_unregister+0x24/0x38
[ 8.352410] of_platform_device_destroy+0xb8/0xc0
[ 8.357087] device_for_each_child+0x58/0xb0
[ 8.361775] of_platform_depopulate+0x4c/0x68
[ 8.366350] msm_pdev_probe+0x2c4/0x388 [msm]
[ 8.370369] platform_drv_probe+0x60/0xc0
[ 8.374707] driver_probe_device+0x2ec/0x458
[ 8.378701] __driver_attach+0xdc/0x128
[ 8.383042] bus_for_each_dev+0x78/0xd8
[ 8.386598] driver_attach+0x30/0x40
[ 8.390418] bus_add_driver+0x20c/0x2a8
[ 8.394237] driver_register+0x6c/0x110
[ 8.397797] __platform_driver_register+0x54/0x60
[ 8.401841] msm_drm_register+0x54/0x80 [msm]
[ 8.406481] do_one_initcall+0x58/0x160
[ 8.410818] do_init_module+0x64/0x1d8
[ 8.414463] load_module+0x1378/0x15c8
[ 8.418282] SyS_finit_module+0x100/0x118
[ 8.422016] el0_svc_naked+0x30/0x34
[ 8.426095] ---[ end trace 800d0885aa276bfd ]---

Fix:

During the Ubuntu-snapdragon-4.15.0-1061.68 cycle, we picked up one upstream stable patch that calls of_platform_depopulate() on msm in case of probe deferral (or during removal). That patch triggers a WARN_ON() during the teardown of the IOMMU, followed by the kernel hang. Unless we want to backport the new msm dri driver (and all the relevant dependencies), the fix is to revert the stable patch that calls of_platform_depopulate().
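
To make the behavioural difference concrete, here is a minimal user-space sketch of the two error paths. It is not kernel code: fake_depopulate() and fake_component_master_add() are hypothetical stand-ins for of_platform_depopulate() and component_master_add_with_match(), and the stub is assumed to always fail with probe deferral (the -517 seen in the log above, i.e. -EPROBE_DEFER).

```c
#define EPROBE_DEFER 517	/* same numeric value as the kernel's EPROBE_DEFER */

/* Hypothetical counter: how many times the depopulate stand-in ran. */
static int depopulate_calls;

/* Hypothetical stand-in for of_platform_depopulate(). */
static void fake_depopulate(void)
{
	depopulate_calls++;
}

/* Hypothetical stand-in for component_master_add_with_match();
 * assumed to always fail with probe deferral in this sketch. */
static int fake_component_master_add(void)
{
	return -EPROBE_DEFER;
}

/* Error path as introduced by the patch being reverted:
 * any failure, including deferral, tears the child devices down. */
static int probe_with_depopulate(void)
{
	int ret = fake_component_master_add();

	if (ret)
		goto fail;

	return 0;

fail:
	fake_depopulate();
	return ret;
}

/* Error path after the revert: the error is simply passed back
 * and the child devices are left alone. */
static int probe_after_revert(void)
{
	return fake_component_master_add();
}
```

In this model both variants return -517, but only the first one runs the depopulate step, which corresponds to the of_platform_depopulate() frame under msm_pdev_probe() in the call trace.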

How to test:

Boot a patched kernel and check that the stack trace above no longer shows up, on boot or on reboot.
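
If you want to automate the check, something along these lines could be used on captured kernel log text. This is only a sketch: the function name log_has_iommu_warning() and the matching rule (a line that contains both "WARNING:" and "qcom_iommu_domain_free") are my own assumptions, derived from the trace quoted above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Return true if the log text has a line that contains "WARNING:"
 * followed, on the same line, by "qcom_iommu_domain_free".
 * Hypothetical helper, not part of any existing tool.
 */
static bool log_has_iommu_warning(const char *log)
{
	const char *needle = "qcom_iommu_domain_free";
	size_t nlen = strlen(needle);
	const char *p = log;

	while ((p = strstr(p, "WARNING:")) != NULL) {
		/* Limit the search to the rest of the current line. */
		const char *eol = strchr(p, '\n');
		size_t len = eol ? (size_t)(eol - p) : strlen(p);

		for (size_t i = 0; i + nlen <= len; i++) {
			if (memcmp(p + i, needle, nlen) == 0)
				return true;
		}

		/* Skip past this line and look for the next WARNING. */
		p += len;
	}

	return false;
}
```

Feed it the saved output of dmesg from a boot (and, if possible, from the serial console during a reboot); a patched kernel should make it return false.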

Regression:

None: I'm reverting a patch that wasn't there before and clearly wasn't tested with our downstream BSP.

Paolo Pisati (1):
  Revert "drm/msm: Depopulate platform on probe failure"

 drivers/gpu/drm/msm/msm_drv.c | 14 +++-----------
 1 file changed, 3 insertions(+), 11 deletions(-)

--
2.7.4


--
kernel-team mailing list
[hidden email]
https://lists.ubuntu.com/mailman/listinfo/kernel-team

[PATCH] Revert "drm/msm: Depopulate platform on probe failure"

Paolo Pisati-5
BugLink: https://bugs.launchpad.net/bugs/1841911

This reverts commit 49a03629d321c852553ecbc3f0e71b22d7233b11.

Signed-off-by: Paolo Pisati <[hidden email]>
---
 drivers/gpu/drm/msm/msm_drv.c | 14 +++-----------
 1 file changed, 3 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
index 8eb6bfa..0a3ea30 100644
--- a/drivers/gpu/drm/msm/msm_drv.c
+++ b/drivers/gpu/drm/msm/msm_drv.c
@@ -1138,24 +1138,16 @@ static int msm_pdev_probe(struct platform_device *pdev)
 
 	ret = add_gpu_components(&pdev->dev, &match);
 	if (ret)
-		goto fail;
+		return ret;
 
 	/* on all devices that I am aware of, iommu's which can map
 	 * any address the cpu can see are used:
 	 */
 	ret = dma_set_mask_and_coherent(&pdev->dev, ~0);
 	if (ret)
-		goto fail;
-
-	ret = component_master_add_with_match(&pdev->dev, &msm_drm_ops, match);
-	if (ret)
-		goto fail;
-
-	return 0;
+		return ret;
 
-fail:
-	of_platform_depopulate(&pdev->dev);
-	return ret;
+	return component_master_add_with_match(&pdev->dev, &msm_drm_ops, match);
 }
 
 static int msm_pdev_remove(struct platform_device *pdev)
--
2.7.4



ACK: [PATCH] Revert "drm/msm: Depopulate platform on probe failure"

Kleber Souza
On 8/29/19 2:28 PM, Paolo Pisati wrote:
> BugLink: https://bugs.launchpad.net/bugs/1841911
>
> This reverts commit 49a03629d321c852553ecbc3f0e71b22d7233b11.
>
> Signed-off-by: Paolo Pisati <[hidden email]>

Acked-by: Kleber Sacilotto de Souza <[hidden email]>

> ---
>  drivers/gpu/drm/msm/msm_drv.c | 14 +++-----------
>  1 file changed, 3 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
> index 8eb6bfa..0a3ea30 100644
> --- a/drivers/gpu/drm/msm/msm_drv.c
> +++ b/drivers/gpu/drm/msm/msm_drv.c
> @@ -1138,24 +1138,16 @@ static int msm_pdev_probe(struct platform_device *pdev)
>  
>  	ret = add_gpu_components(&pdev->dev, &match);
>  	if (ret)
> -		goto fail;
> +		return ret;
>  
>  	/* on all devices that I am aware of, iommu's which can map
>  	 * any address the cpu can see are used:
>  	 */
>  	ret = dma_set_mask_and_coherent(&pdev->dev, ~0);
>  	if (ret)
> -		goto fail;
> -
> -	ret = component_master_add_with_match(&pdev->dev, &msm_drm_ops, match);
> -	if (ret)
> -		goto fail;
> -
> -	return 0;
> +		return ret;
>  
> -fail:
> -	of_platform_depopulate(&pdev->dev);
> -	return ret;
> +	return component_master_add_with_match(&pdev->dev, &msm_drm_ops, match);
>  }
>  
>  static int msm_pdev_remove(struct platform_device *pdev)
>



ACK: [PATCH] [B/snapdragon] [SRU] Kernel hangs during msm init

Khalid Elmously
In reply to this post by Paolo Pisati-5
On 2019-08-29 14:28:25, Paolo Pisati wrote:

> BugLink: https://bugs.launchpad.net/bugs/1841911
>
> [...]


APPLIED: [PATCH] [B/snapdragon] [SRU] Kernel hangs during msm init

Khalid Elmously
In reply to this post by Paolo Pisati-5
On 2019-08-29 14:28:25, Paolo Pisati wrote:

> BugLink: https://bugs.launchpad.net/bugs/1841911
>
> [...]
