[SRU][Groovy][PATCH] drm/i915: Mark ininitial fb obj as WT on eLLC machines to avoid rcu lockup during fbdev init

Kamal Mostafa-2
From: Ville Syrjälä <[hidden email]>

BugLink: https://bugs.launchpad.net/bugs/1903397

Currently we leave the cache_level of the initial fb obj
set to NONE. This means that on eLLC machines the first pin_to_display()
will try to switch it to WT, which requires a vma unbind+bind.
If that happens during fbdev initialization, rcu does not
seem to be operational, which causes the unbind to get stuck. To
most appearances this looks like a dead machine on boot.

Avoid the unbind by marking the object's cache_level
as WT when creating it. We still do an explicit ggtt pin,
which will rewrite the PTEs anyway, so they will match whatever
cache level we set.

Cc: <[hidden email]> # v5.7+
Suggested-by: Chris Wilson <[hidden email]>
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2381
Signed-off-by: Ville Syrjälä <[hidden email]>
Reviewed-by: Chris Wilson <[hidden email]>
Signed-off-by: Chris Wilson <[hidden email]>
Link: https://patchwork.freedesktop.org/patch/msgid/20201007120329.17076-1-ville.syrjala@...
Link: https://patchwork.freedesktop.org/patch/msgid/20201015122138.30161-1-chris@...
(cherry picked from commit d46b60a2e8d246f1f0faa38e52f4f5a73858c338)
Signed-off-by: Rodrigo Vivi <[hidden email]>
(cherry picked from commit 1664ffee760a5d98952318fdd9b198fae396d660)
Signed-off-by: Kamal Mostafa <[hidden email]>
---
 drivers/gpu/drm/i915/display/intel_display.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c
index 26996e1839e2..4bd383ff0880 100644
--- a/drivers/gpu/drm/i915/display/intel_display.c
+++ b/drivers/gpu/drm/i915/display/intel_display.c
@@ -3431,6 +3431,14 @@ initial_plane_vma(struct drm_i915_private *i915,
 	if (IS_ERR(obj))
 		return NULL;
 
+	/*
+	 * Mark it WT ahead of time to avoid changing the
+	 * cache_level during fbdev initialization. The
+	 * unbind there would get stuck waiting for rcu.
+	 */
+	i915_gem_object_set_cache_coherency(obj, HAS_WT(i915) ?
+					    I915_CACHE_WT : I915_CACHE_NONE);
+
 	switch (plane_config->tiling) {
 	case I915_TILING_NONE:
 		break;
--
2.17.1


--
kernel-team mailing list
[hidden email]
https://lists.ubuntu.com/mailman/listinfo/kernel-team

ACK: [SRU][Groovy][PATCH] drm/i915: Mark ininitial fb obj as WT on eLLC machines to avoid rcu lockup during fbdev init

Kelsey Skunberg
On 2020-11-20 09:21:53, Kamal Mostafa wrote:

> From: Ville Syrjälä <[hidden email]>
>
> BugLink: https://bugs.launchpad.net/bugs/1903397
>
> Currently we leave the cache_level of the initial fb obj
> set to NONE. This means that on eLLC machines the first pin_to_display()
> will try to switch it to WT, which requires a vma unbind+bind.
> If that happens during fbdev initialization, rcu does not
> seem to be operational, which causes the unbind to get stuck. To
> most appearances this looks like a dead machine on boot.
>
> Avoid the unbind by marking the object's cache_level
> as WT when creating it. We still do an explicit ggtt pin,
> which will rewrite the PTEs anyway, so they will match whatever
> cache level we set.
>
> Cc: <[hidden email]> # v5.7+
> Suggested-by: Chris Wilson <[hidden email]>
> Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2381
> Signed-off-by: Ville Syrjälä <[hidden email]>
> Reviewed-by: Chris Wilson <[hidden email]>
> Signed-off-by: Chris Wilson <[hidden email]>
> Link: https://patchwork.freedesktop.org/patch/msgid/20201007120329.17076-1-ville.syrjala@...
> Link: https://patchwork.freedesktop.org/patch/msgid/20201015122138.30161-1-chris@...
> (cherry picked from commit d46b60a2e8d246f1f0faa38e52f4f5a73858c338)
> Signed-off-by: Rodrigo Vivi <[hidden email]>
> (cherry picked from commit 1664ffee760a5d98952318fdd9b198fae396d660)
> Signed-off-by: Kamal Mostafa <[hidden email]>
> ---

Clean cherry pick. Applies and builds ok. LGTM. Thank you, Kamal!

Acked-by: Kelsey Skunberg <[hidden email]>



ACK/Cmnt: [SRU][Groovy][PATCH] drm/i915: Mark ininitial fb obj as WT on eLLC machines to avoid rcu lockup during fbdev init

Stefan Bader-2
In reply to this post by Kamal Mostafa-2
On 20.11.20 18:21, Kamal Mostafa wrote:

> From: Ville Syrjälä <[hidden email]>
>
> BugLink: https://bugs.launchpad.net/bugs/1903397
>
> Currently we leave the cache_level of the initial fb obj
> set to NONE. This means that on eLLC machines the first pin_to_display()
> will try to switch it to WT, which requires a vma unbind+bind.
> If that happens during fbdev initialization, rcu does not
> seem to be operational, which causes the unbind to get stuck. To
> most appearances this looks like a dead machine on boot.
>
> Avoid the unbind by marking the object's cache_level
> as WT when creating it. We still do an explicit ggtt pin,
> which will rewrite the PTEs anyway, so they will match whatever
> cache level we set.
>
> Cc: <[hidden email]> # v5.7+
> Suggested-by: Chris Wilson <[hidden email]>
> Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2381
> Signed-off-by: Ville Syrjälä <[hidden email]>
> Reviewed-by: Chris Wilson <[hidden email]>
> Signed-off-by: Chris Wilson <[hidden email]>
> Link: https://patchwork.freedesktop.org/patch/msgid/20201007120329.17076-1-ville.syrjala@...
> Link: https://patchwork.freedesktop.org/patch/msgid/20201015122138.30161-1-chris@...
> (cherry picked from commit d46b60a2e8d246f1f0faa38e52f4f5a73858c338)
> Signed-off-by: Rodrigo Vivi <[hidden email]>
> (cherry picked from commit 1664ffee760a5d98952318fdd9b198fae396d660)
> Signed-off-by: Kamal Mostafa <[hidden email]>
Acked-by: Stefan Bader <[hidden email]>
> ---

Maybe the bug report (or at least the justification) could be clarified to state
that this only appears to be a hang; in reality there is just no video output.

-Stefan




APPLIED: [SRU][Groovy][PATCH] drm/i915: Mark ininitial fb obj as WT on eLLC machines to avoid rcu lockup during fbdev init

Kleber Souza
In reply to this post by Kamal Mostafa-2
On 20.11.20 18:21, Kamal Mostafa wrote:

> From: Ville Syrjälä <[hidden email]>
>
> BugLink: https://bugs.launchpad.net/bugs/1903397
>
> Currently we leave the cache_level of the initial fb obj
> set to NONE. This means that on eLLC machines the first pin_to_display()
> will try to switch it to WT, which requires a vma unbind+bind.
> If that happens during fbdev initialization, rcu does not
> seem to be operational, which causes the unbind to get stuck. To
> most appearances this looks like a dead machine on boot.
>
> Avoid the unbind by marking the object's cache_level
> as WT when creating it. We still do an explicit ggtt pin,
> which will rewrite the PTEs anyway, so they will match whatever
> cache level we set.
>
> Cc: <[hidden email]> # v5.7+
> Suggested-by: Chris Wilson <[hidden email]>
> Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2381
> Signed-off-by: Ville Syrjälä <[hidden email]>
> Reviewed-by: Chris Wilson <[hidden email]>
> Signed-off-by: Chris Wilson <[hidden email]>
> Link: https://patchwork.freedesktop.org/patch/msgid/20201007120329.17076-1-ville.syrjala@...
> Link: https://patchwork.freedesktop.org/patch/msgid/20201015122138.30161-1-chris@...
> (cherry picked from commit d46b60a2e8d246f1f0faa38e52f4f5a73858c338)
> Signed-off-by: Rodrigo Vivi <[hidden email]>
> (cherry picked from commit 1664ffee760a5d98952318fdd9b198fae396d660)
> Signed-off-by: Kamal Mostafa <[hidden email]>
> ---

Applied to groovy/linux.

Thanks,
Kleber
