[SRU][Bionic][B-OEM][PATCH 0/1] fix the hang problem for nvidia p1000 graphic card


Hui Wang
BugLink: https://bugs.launchpad.net/bugs/1791569

This patch is already in 4.18, so there is no need to send it to cosmic.

Because of context conflicts, applying the upstream patch as-is would require
first applying a large number of prerequisite patches, which could introduce
regressions. I therefore adapted the patch: it differs slightly from the
original patch but keeps the same logic, and it applies to the bionic kernel.

[Impact]
We have two NVIDIA graphics cards, and the nouveau driver in the bionic kernel does
not work with both of them: one of the cards hangs during the boot process. We
compared the lspci output and the VBIOS version of the two cards, and they are the
same. According to NVIDIA's reply, the cards may differ in their computational units
(https://devtalk.nvidia.com/default/topic/1038973/linux/2-same-quadro-p1000-cards-but-only-one-can-install-ubuntu-/).
Kernel 4.18 fixed this problem, and a bisect identified this patch as the fix.

[Fix]
Backport an upstream patch to fix this problem. Without this patch, the number of TPCs
(texture processing clusters) per GPC is hardcoded to 5 for some NVIDIA GPU families,
but in practice the TPC number of many families is not 5. The P1000 graphics card
belongs to the GP107 family, where it is 3 instead of 5.


[Test Case]
Tested this patch with P1000, P2000, P620 and P500 graphics cards; all work as well as before.

[Regression Potential]
Very low. This patch comes from upstream, and I have tested it with many NVIDIA
graphics cards; they all worked as well as before.



Ben Skeggs (1):
  drm/nouveau/gr/gf100-: virtualise tpc_mask + apply fixes from traces

 drivers/gpu/drm/nouveau/nvkm/engine/gr/ctxgf100.c |  6 ++++++
 drivers/gpu/drm/nouveau/nvkm/engine/gr/ctxgf100.h | 12 +++++++-----
 drivers/gpu/drm/nouveau/nvkm/engine/gr/ctxgm200.c | 17 ++++++++++++++---
 drivers/gpu/drm/nouveau/nvkm/engine/gr/ctxgm20b.c |  2 +-
 drivers/gpu/drm/nouveau/nvkm/engine/gr/ctxgp100.c | 17 +++++++++--------
 drivers/gpu/drm/nouveau/nvkm/engine/gr/ctxgp102.c |  2 ++
 drivers/gpu/drm/nouveau/nvkm/engine/gr/ctxgp107.c |  2 ++
 drivers/gpu/drm/nouveau/nvkm/engine/gr/gf100.h    |  2 ++
 drivers/gpu/drm/nouveau/nvkm/engine/gr/gm200.c    |  1 +
 drivers/gpu/drm/nouveau/nvkm/engine/gr/gp100.c    |  2 ++
 drivers/gpu/drm/nouveau/nvkm/engine/gr/gp102.c    |  2 ++
 drivers/gpu/drm/nouveau/nvkm/engine/gr/gp107.c    |  2 ++
 drivers/gpu/drm/nouveau/nvkm/engine/gr/gp10b.c    |  2 ++
 13 files changed, 52 insertions(+), 17 deletions(-)

--
2.7.4


--
kernel-team mailing list
[hidden email]
https://lists.ubuntu.com/mailman/listinfo/kernel-team

[SRU][Bionic][B-OEM][PATCH 1/1] drm/nouveau/gr/gf100-: virtualise tpc_mask + apply fixes from traces

Hui Wang
From: Ben Skeggs <[hidden email]>

BugLink: http://bugs.launchpad.net/bugs/1791569

We weren't placing higher TPC IDs in the right place on some configurations.

[Due to the context difference, the ctxgm200.c and ctxgp100.c are changed
a bit against the original patch, after this change, they have the same logic
as the original patch. -- Hui's comment]

Signed-off-by: Ben Skeggs <[hidden email]>
(backported from commit fc36076441bae141893bd79899d19aa1b5fdf524)
Signed-off-by: Hui Wang <[hidden email]>
---
 drivers/gpu/drm/nouveau/nvkm/engine/gr/ctxgf100.c |  6 ++++++
 drivers/gpu/drm/nouveau/nvkm/engine/gr/ctxgf100.h | 12 +++++++-----
 drivers/gpu/drm/nouveau/nvkm/engine/gr/ctxgm200.c | 17 ++++++++++++++---
 drivers/gpu/drm/nouveau/nvkm/engine/gr/ctxgm20b.c |  2 +-
 drivers/gpu/drm/nouveau/nvkm/engine/gr/ctxgp100.c | 17 +++++++++--------
 drivers/gpu/drm/nouveau/nvkm/engine/gr/ctxgp102.c |  2 ++
 drivers/gpu/drm/nouveau/nvkm/engine/gr/ctxgp107.c |  2 ++
 drivers/gpu/drm/nouveau/nvkm/engine/gr/gf100.h    |  2 ++
 drivers/gpu/drm/nouveau/nvkm/engine/gr/gm200.c    |  1 +
 drivers/gpu/drm/nouveau/nvkm/engine/gr/gp100.c    |  2 ++
 drivers/gpu/drm/nouveau/nvkm/engine/gr/gp102.c    |  2 ++
 drivers/gpu/drm/nouveau/nvkm/engine/gr/gp107.c    |  2 ++
 drivers/gpu/drm/nouveau/nvkm/engine/gr/gp10b.c    |  2 ++
 13 files changed, 52 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/gr/ctxgf100.c b/drivers/gpu/drm/nouveau/nvkm/engine/gr/ctxgf100.c
index 8810150..1540fde 100644
--- a/drivers/gpu/drm/nouveau/nvkm/engine/gr/ctxgf100.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/gr/ctxgf100.c
@@ -1200,6 +1200,7 @@ void
 gf100_grctx_generate_r406800(struct gf100_gr *gr)
 {
  struct nvkm_device *device = gr->base.engine.subdev.device;
+ const struct gf100_grctx_func *func = gr->func->grctx;
  u64 tpc_mask = 0, tpc_set = 0;
  u8  tpcnr[GPC_MAX];
  int gpc, tpc;
@@ -1228,6 +1229,11 @@ gf100_grctx_generate_r406800(struct gf100_gr *gr)
  nvkm_wr32(device, 0x406c04 + (i * 0x20), upper_32_bits(tpc_set ^ tpc_mask));
  }
  }
+
+ if (func->tpc_mask)
+ func->tpc_mask(gr);
+ if (func->smid_config)
+ func->smid_config(gr);
 }
 
 void
diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/gr/ctxgf100.h b/drivers/gpu/drm/nouveau/nvkm/engine/gr/ctxgf100.h
index 5199e5a..74a4bd1 100644
--- a/drivers/gpu/drm/nouveau/nvkm/engine/gr/ctxgf100.h
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/gr/ctxgf100.h
@@ -48,6 +48,8 @@ struct gf100_grctx_func {
  u32 attrib_nr;
  u32 alpha_nr_max;
  u32 alpha_nr;
+ void (*tpc_mask)(struct gf100_gr *);
+ void (*smid_config)(struct gf100_gr *);
 };
 
 extern const struct gf100_grctx_func gf100_grctx;
@@ -83,10 +85,6 @@ void gk104_grctx_generate_pagepool(struct gf100_grctx *);
 void gk104_grctx_generate_unkn(struct gf100_gr *);
 void gk104_grctx_generate_r418bb8(struct gf100_gr *);
 
-void gm107_grctx_generate_bundle(struct gf100_grctx *);
-void gm107_grctx_generate_pagepool(struct gf100_grctx *);
-void gm107_grctx_generate_attrib(struct gf100_grctx *);
-
 extern const struct gf100_grctx_func gk110_grctx;
 extern const struct gf100_grctx_func gk110b_grctx;
 extern const struct gf100_grctx_func gk208_grctx;
@@ -95,16 +93,20 @@ extern const struct gf100_grctx_func gm107_grctx;
 void gm107_grctx_generate_bundle(struct gf100_grctx *);
 void gm107_grctx_generate_pagepool(struct gf100_grctx *);
 void gm107_grctx_generate_attrib(struct gf100_grctx *);
+void gm107_grctx_generate_sm_id(struct gf100_gr *, int, int, int);
 
 extern const struct gf100_grctx_func gm200_grctx;
+
 void gm200_grctx_generate_tpcid(struct gf100_gr *);
-void gm200_grctx_generate_405b60(struct gf100_gr *);
+void gm200_grctx_generate_tpc_mask(struct gf100_gr *);
+void gm200_grctx_generate_smid_config(struct gf100_gr *);
 
 extern const struct gf100_grctx_func gm20b_grctx;
 
 extern const struct gf100_grctx_func gp100_grctx;
 void gp100_grctx_generate_main(struct gf100_gr *, struct gf100_grctx *);
 void gp100_grctx_generate_pagepool(struct gf100_grctx *);
+void gp100_grctx_generate_smid_config(struct gf100_gr *);
 
 extern const struct gf100_grctx_func gp102_grctx;
 void gp102_grctx_generate_attrib(struct gf100_grctx *);
diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/gr/ctxgm200.c b/drivers/gpu/drm/nouveau/nvkm/engine/gr/ctxgm200.c
index db209d3..93fae26 100644
--- a/drivers/gpu/drm/nouveau/nvkm/engine/gr/ctxgm200.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/gr/ctxgm200.c
@@ -46,7 +46,7 @@ gm200_grctx_generate_tpcid(struct gf100_gr *gr)
 }
 
 void
-gm200_grctx_generate_405b60(struct gf100_gr *gr)
+gm200_grctx_generate_smid_config(struct gf100_gr *gr)
 {
  struct nvkm_device *device = gr->base.engine.subdev.device;
  const u32 dist_nr = DIV_ROUND_UP(gr->tpc_total, 4);
@@ -77,6 +77,15 @@ gm200_grctx_generate_405b60(struct gf100_gr *gr)
  nvkm_wr32(device, 0x405ba0 + (i * 4), gpcs[i]);
 }
 
+void
+gm200_grctx_generate_tpc_mask(struct gf100_gr *gr)
+{
+ u32 tmp, i;
+ for (tmp = 0, i = 0; i < gr->gpc_nr; i++)
+ tmp |= ((1 << gr->tpc_nr[i]) - 1) << (i * gr->func->tpc_nr);
+ nvkm_wr32(gr->base.engine.subdev.device, 0x4041c4, tmp);
+}
+
 static void
 gm200_grctx_generate_main(struct gf100_gr *gr, struct gf100_grctx *info)
 {
@@ -105,10 +114,10 @@ gm200_grctx_generate_main(struct gf100_gr *gr, struct gf100_grctx *info)
  nvkm_wr32(device, 0x405b00, (gr->tpc_total << 8) | gr->gpc_nr);
 
  for (tmp = 0, i = 0; i < gr->gpc_nr; i++)
- tmp |= ((1 << gr->tpc_nr[i]) - 1) << (i * 4);
+ tmp |= ((1 << gr->tpc_nr[i]) - 1) << (i * gr->func->tpc_nr);
  nvkm_wr32(device, 0x4041c4, tmp);
 
- gm200_grctx_generate_405b60(gr);
+ gm200_grctx_generate_smid_config(gr);
 
  gf100_gr_icmd(gr, gr->fuc_bundle);
  nvkm_wr32(device, 0x404154, idle_timeout);
@@ -133,4 +142,6 @@ gm200_grctx = {
  .attrib_nr = 0x400,
  .alpha_nr_max = 0x1800,
  .alpha_nr = 0x1000,
+ .tpc_mask = gm200_grctx_generate_tpc_mask,
+ .smid_config = gm200_grctx_generate_smid_config,
 };
diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/gr/ctxgm20b.c b/drivers/gpu/drm/nouveau/nvkm/engine/gr/ctxgm20b.c
index e5702e3..26b2866 100644
--- a/drivers/gpu/drm/nouveau/nvkm/engine/gr/ctxgm20b.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/gr/ctxgm20b.c
@@ -68,7 +68,7 @@ gm20b_grctx_generate_main(struct gf100_gr *gr, struct gf100_grctx *info)
  tmp |= ((1 << gr->tpc_nr[i]) - 1) << (i * 4);
  nvkm_wr32(device, 0x4041c4, tmp);
 
- gm200_grctx_generate_405b60(gr);
+ gm200_grctx_generate_smid_config(gr);
 
  gf100_gr_wait_idle(gr);
 
diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/gr/ctxgp100.c b/drivers/gpu/drm/nouveau/nvkm/engine/gr/ctxgp100.c
index 88ea322..850a1b10 100644
--- a/drivers/gpu/drm/nouveau/nvkm/engine/gr/ctxgp100.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/gr/ctxgp100.c
@@ -89,13 +89,12 @@ gp100_grctx_generate_attrib(struct gf100_grctx *info)
  mmio_wr32(info, 0x41befc, 0x00000000);
 }
 
-static void
-gp100_grctx_generate_405b60(struct gf100_gr *gr)
+void
+gp100_grctx_generate_smid_config(struct gf100_gr *gr)
 {
  struct nvkm_device *device = gr->base.engine.subdev.device;
  const u32 dist_nr = DIV_ROUND_UP(gr->tpc_total, 4);
- u32 dist[TPC_MAX / 4] = {};
- u32 gpcs[GPC_MAX * 2] = {};
+ u32 dist[TPC_MAX / 4] = {}, gpcs[16] = {};
  u8  tpcnr[GPC_MAX];
  int tpc, gpc, i;
 
@@ -112,12 +111,12 @@ gp100_grctx_generate_405b60(struct gf100_gr *gr)
  tpc = gr->tpc_nr[gpc] - tpcnr[gpc]--;
 
  dist[i / 4] |= ((gpc << 4) | tpc) << ((i % 4) * 8);
- gpcs[gpc + (gr->gpc_nr * (tpc / 4))] |= i << (tpc * 8);
+ gpcs[gpc + (gr->func->gpc_nr * (tpc / 4))] |= i << (tpc * 8);
  }
 
  for (i = 0; i < dist_nr; i++)
  nvkm_wr32(device, 0x405b60 + (i * 4), dist[i]);
- for (i = 0; i < gr->gpc_nr * 2; i++)
+ for (i = 0; i < ARRAY_SIZE(gpcs); i++)
  nvkm_wr32(device, 0x405ba0 + (i * 4), gpcs[i]);
 }
 
@@ -149,10 +148,10 @@ gp100_grctx_generate_main(struct gf100_gr *gr, struct gf100_grctx *info)
  nvkm_wr32(device, 0x405b00, (gr->tpc_total << 8) | gr->gpc_nr);
 
  for (tmp = 0, i = 0; i < gr->gpc_nr; i++)
- tmp |= ((1 << gr->tpc_nr[i]) - 1) << (i * 5);
+ tmp |= ((1 << gr->tpc_nr[i]) - 1) << (i * gr->func->tpc_nr);
  nvkm_wr32(device, 0x4041c4, tmp);
 
- gp100_grctx_generate_405b60(gr);
+ gp100_grctx_generate_smid_config(gr);
 
  gf100_gr_icmd(gr, gr->fuc_bundle);
  nvkm_wr32(device, 0x404154, idle_timeout);
@@ -174,4 +173,6 @@ gp100_grctx = {
  .attrib_nr = 0x440,
  .alpha_nr_max = 0xc00,
  .alpha_nr = 0x800,
+ .tpc_mask = gm200_grctx_generate_tpc_mask,
+ .smid_config = gp100_grctx_generate_smid_config,
 };
diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/gr/ctxgp102.c b/drivers/gpu/drm/nouveau/nvkm/engine/gr/ctxgp102.c
index 7a66b4c..234b0a8 100644
--- a/drivers/gpu/drm/nouveau/nvkm/engine/gr/ctxgp102.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/gr/ctxgp102.c
@@ -94,4 +94,6 @@ gp102_grctx = {
  .attrib_nr = 0x320,
  .alpha_nr_max = 0xc00,
  .alpha_nr = 0x800,
+ .tpc_mask = gm200_grctx_generate_tpc_mask,
+ .smid_config = gp100_grctx_generate_smid_config,
 };
diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/gr/ctxgp107.c b/drivers/gpu/drm/nouveau/nvkm/engine/gr/ctxgp107.c
index 8da91a0..62443c9 100644
--- a/drivers/gpu/drm/nouveau/nvkm/engine/gr/ctxgp107.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/gr/ctxgp107.c
@@ -44,4 +44,6 @@ gp107_grctx = {
  .attrib_nr = 0x540,
  .alpha_nr_max = 0xc00,
  .alpha_nr = 0x800,
+ .tpc_mask = gm200_grctx_generate_tpc_mask,
+ .smid_config = gp100_grctx_generate_smid_config,
 };
diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/gr/gf100.h b/drivers/gpu/drm/nouveau/nvkm/engine/gr/gf100.h
index d7c2adb..b9ed3aa 100644
--- a/drivers/gpu/drm/nouveau/nvkm/engine/gr/gf100.h
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/gr/gf100.h
@@ -135,6 +135,8 @@ struct gf100_gr_func {
  struct gf100_gr_ucode *ucode;
  } gpccs;
  int (*rops)(struct gf100_gr *);
+ int gpc_nr;
+ int tpc_nr;
  int ppc_nr;
  const struct gf100_grctx_func *grctx;
  struct nvkm_sclass sclass[];
diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/gr/gm200.c b/drivers/gpu/drm/nouveau/nvkm/engine/gr/gm200.c
index 6435f12..c5cca42 100644
--- a/drivers/gpu/drm/nouveau/nvkm/engine/gr/gm200.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/gr/gm200.c
@@ -213,6 +213,7 @@ gm200_gr = {
  .init_rop_active_fbps = gm200_gr_init_rop_active_fbps,
  .init_ppc_exceptions = gk104_gr_init_ppc_exceptions,
  .rops = gm200_gr_rops,
+ .tpc_nr = 4,
  .ppc_nr = 2,
  .grctx = &gm200_grctx,
  .sclass = {
diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/gr/gp100.c b/drivers/gpu/drm/nouveau/nvkm/engine/gr/gp100.c
index 867a5f7..b87d865 100644
--- a/drivers/gpu/drm/nouveau/nvkm/engine/gr/gp100.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/gr/gp100.c
@@ -164,6 +164,8 @@ gp100_gr = {
  .init_ppc_exceptions = gk104_gr_init_ppc_exceptions,
  .init_num_active_ltcs = gp100_gr_init_num_active_ltcs,
  .rops = gm200_gr_rops,
+ .gpc_nr = 6,
+ .tpc_nr = 5,
  .ppc_nr = 2,
  .grctx = &gp100_grctx,
  .sclass = {
diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/gr/gp102.c b/drivers/gpu/drm/nouveau/nvkm/engine/gr/gp102.c
index 61e3a0b..8d1b09b 100644
--- a/drivers/gpu/drm/nouveau/nvkm/engine/gr/gp102.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/gr/gp102.c
@@ -49,6 +49,8 @@ gp102_gr = {
  .init_swdx_pes_mask = gp102_gr_init_swdx_pes_mask,
  .init_num_active_ltcs = gp100_gr_init_num_active_ltcs,
  .rops = gm200_gr_rops,
+ .gpc_nr = 6,
+ .tpc_nr = 5,
  .ppc_nr = 3,
  .grctx = &gp102_grctx,
  .sclass = {
diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/gr/gp107.c b/drivers/gpu/drm/nouveau/nvkm/engine/gr/gp107.c
index f727232..7ca037e 100644
--- a/drivers/gpu/drm/nouveau/nvkm/engine/gr/gp107.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/gr/gp107.c
@@ -35,6 +35,8 @@ gp107_gr = {
  .init_swdx_pes_mask = gp102_gr_init_swdx_pes_mask,
  .init_num_active_ltcs = gp100_gr_init_num_active_ltcs,
  .rops = gm200_gr_rops,
+ .gpc_nr = 2,
+ .tpc_nr = 3,
  .ppc_nr = 1,
  .grctx = &gp107_grctx,
  .sclass = {
diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/gr/gp10b.c b/drivers/gpu/drm/nouveau/nvkm/engine/gr/gp10b.c
index 5f3d161..775c4cf 100644
--- a/drivers/gpu/drm/nouveau/nvkm/engine/gr/gp10b.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/gr/gp10b.c
@@ -41,6 +41,8 @@ gp10b_gr = {
  .init_ppc_exceptions = gk104_gr_init_ppc_exceptions,
  .init_num_active_ltcs = gp10b_gr_init_num_active_ltcs,
  .rops = gm200_gr_rops,
+ .gpc_nr = 1,
+ .tpc_nr = 2,
  .ppc_nr = 1,
  .grctx = &gp102_grctx,
  .sclass = {
--
2.7.4



ACK/Cmnt: [SRU][Bionic][B-OEM][PATCH 1/1] drm/nouveau/gr/gf100-: virtualise tpc_mask + apply fixes from traces

Stefan Bader-2
On 11.09.2018 07:24, Hui Wang wrote:

> From: Ben Skeggs <[hidden email]>
>
> BugLink: http://bugs.launchpad.net/bugs/1791569
>
> We weren't placing higher TPC IDs in the right place on some configurations.
>
> [Due to the context difference, the ctxgm200.c and ctxgp100.c are changed
> a bit against the original patch, after this change, they have the same logic
> as the original patch. -- Hui's comment]
>
> Signed-off-by: Ben Skeggs <[hidden email]>
> (backported from commit fc36076441bae141893bd79899d19aa1b5fdf524)
> Signed-off-by: Hui Wang <[hidden email]>
Acked-by: Stefan Bader <[hidden email]>
> ---

I see at least some good regression testing with other models of NVidia GPUs.
The impact is otherwise hard to tell from the changes.
I guess when this gets applied to Bionic there is no need to process it
separately for bionic/linux-oem.

-Stefan


