[bionic:linux-bluefield][PATCH 0/2] Integrate Fixes from Mellanox 2019-11-13

Wen-chien Jesse Sung
BugLink: https://launchpad.net/bugs/1853245

=== Impact ===
Some issues were found while Mellanox was verifying linux-bluefield. These are
the patches they sent to address those issues.

=== Risk of Regression ===
Should be reasonably low since:
* Patches are clean cherry-picks from upstream.
* Verified on target hardware.

Bodong Wang (2):
  {IB, net}/mlx5: E-Switch, Use index of rep for vport to IB port
    mapping
  RDMA/mlx5: Cleanup rep when doing unload

 drivers/infiniband/hw/mlx5/ib_rep.c           | 22 +++++++++++--------
 drivers/infiniband/hw/mlx5/mlx5_ib.h          |  1 -
 .../mellanox/mlx5/core/eswitch_offloads.c     |  1 +
 include/linux/mlx5/eswitch.h                  |  2 ++
 4 files changed, 16 insertions(+), 10 deletions(-)

--
2.20.1


--
kernel-team mailing list
[hidden email]
https://lists.ubuntu.com/mailman/listinfo/kernel-team

[bionic:linux-bluefield][PATCH 1/2] {IB, net}/mlx5: E-Switch, Use index of rep for vport to IB port mapping

Wen-chien Jesse Sung
From: Bodong Wang <[hidden email]>

BugLink: https://launchpad.net/bugs/1853245

In single IB device mode, the mapping between vport number and rep
relies on a counter. However, for dynamic vport allocation, it is
desirable to keep a consistent mapping between eswitch vport and IB port.

Hence, simplify the code by removing the free-running counter and
instead using the vport index that the eswitch makes available during
the load/unload sequence.
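
To illustrate the difference, here is a minimal user-space sketch, not the
driver code. The types `struct rep` and `struct ibdev`, the constant
`MAX_PORTS`, and the helpers `load_with_counter()` and `load_with_index()`
are hypothetical stand-ins invented for this example; only the field names
`vport`, `vport_index`, and `free_port` come from the patch.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PORTS 8 /* hypothetical table size, not a driver constant */

/* Simplified stand-in for struct mlx5_eswitch_rep. */
struct rep {
	unsigned short vport;       /* eswitch vport number */
	unsigned short vport_index; /* index assigned once at init time */
};

/* Simplified stand-in for struct mlx5_ib_dev. */
struct ibdev {
	const struct rep *port[MAX_PORTS];
	int free_port; /* the free-running counter the patch removes */
};

/* Old scheme: the slot depends on how many loads happened before. */
static int load_with_counter(struct ibdev *dev, const struct rep *rep)
{
	int slot;

	if (dev->free_port >= MAX_PORTS)
		return -1;
	slot = dev->free_port++;
	dev->port[slot] = rep;
	return slot;
}

/* New scheme: the slot is the rep's own index, stable across reloads. */
static int load_with_index(struct ibdev *dev, const struct rep *rep)
{
	if (rep->vport_index >= MAX_PORTS)
		return -1;
	dev->port[rep->vport_index] = rep;
	return rep->vport_index;
}
```

With the counter, loading the same reps in a different order, or loading a
rep again after an unload, places it in a different slot. With the index,
a given rep always maps to the same IB port slot.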

Signed-off-by: Bodong Wang <[hidden email]>
Suggested-by: Parav Pandit <[hidden email]>
Reviewed-by: Parav Pandit <[hidden email]>
Reviewed-by: Mark Bloch <[hidden email]>
Signed-off-by: Saeed Mahameed <[hidden email]>
(cherry picked from commit 2f69e591e4531d3192841a4eb2bd9b512f5a8b66)
---
 drivers/infiniband/hw/mlx5/ib_rep.c                        | 4 ++--
 drivers/infiniband/hw/mlx5/mlx5_ib.h                       | 1 -
 drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c | 1 +
 include/linux/mlx5/eswitch.h                               | 2 ++
 4 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/infiniband/hw/mlx5/ib_rep.c b/drivers/infiniband/hw/mlx5/ib_rep.c
index c09b21db444a..0381031ad3ec 100644
--- a/drivers/infiniband/hw/mlx5/ib_rep.c
+++ b/drivers/infiniband/hw/mlx5/ib_rep.c
@@ -14,7 +14,7 @@ mlx5_ib_set_vport_rep(struct mlx5_core_dev *dev, struct mlx5_eswitch_rep *rep)
  int vport_index;
 
  ibdev = mlx5_ib_get_uplink_ibdev(dev->priv.eswitch);
- vport_index = ibdev->free_port++;
+ vport_index = rep->vport_index;
 
  ibdev->port[vport_index].rep = rep;
  write_lock(&ibdev->port[vport_index].roce.netdev_lock);
@@ -50,7 +50,7 @@ mlx5_ib_vport_rep_load(struct mlx5_core_dev *dev, struct mlx5_eswitch_rep *rep)
  }
 
  ibdev->is_rep = true;
- vport_index = ibdev->free_port++;
+ vport_index = rep->vport_index;
  ibdev->port[vport_index].rep = rep;
  ibdev->port[vport_index].roce.netdev =
  mlx5_ib_get_rep_netdev(dev->priv.eswitch, rep->vport);
diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h
index 7df7857eb7e4..22b13a5f006a 100644
--- a/drivers/infiniband/hw/mlx5/mlx5_ib.h
+++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h
@@ -979,7 +979,6 @@ struct mlx5_ib_dev {
  u16 devx_whitelist_uid;
  struct mlx5_srq_table   srq_table;
  struct mlx5_async_ctx   async_ctx;
- int free_port;
 };
 
 static inline struct mlx5_ib_cq *to_mibcq(struct mlx5_core_cq *mcq)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
index a8212402dec4..fa43bb839acc 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
@@ -1428,6 +1428,7 @@ int esw_offloads_init_reps(struct mlx5_eswitch *esw)
 
  mlx5_esw_for_all_reps(esw, vport_index, rep) {
  rep->vport = mlx5_eswitch_index_to_vport_num(esw, vport_index);
+ rep->vport_index = vport_index;
  ether_addr_copy(rep->hw_id, hw_id);
 
  for (rep_type = 0; rep_type < NUM_REP_TYPES; rep_type++)
diff --git a/include/linux/mlx5/eswitch.h b/include/linux/mlx5/eswitch.h
index 5c4645d7d69b..bb98cb3a9e87 100644
--- a/include/linux/mlx5/eswitch.h
+++ b/include/linux/mlx5/eswitch.h
@@ -46,6 +46,8 @@ struct mlx5_eswitch_rep {
  u16       vport;
  u8       hw_id[ETH_ALEN];
  u16       vlan;
+ /* Only IB rep is using vport_index */
+ u16       vport_index;
  u32       vlan_refcount;
 };
 
--
2.20.1



[bionic:linux-bluefield][PATCH 2/2] RDMA/mlx5: Cleanup rep when doing unload

Wen-chien Jesse Sung
From: Bodong Wang <[hidden email]>

BugLink: https://launchpad.net/bugs/1853245

When an IB rep is loaded, the netdev for the same vport is saved for later
reference. However, it is not cleaned up on unload. For ECPF, the
kernel crashes when the driver refers to the already removed netdev.

The following steps lead to the call trace shown below:
1. Create n VFs from host PF
2. Destroy the VFs
3. Run "rdma link" from ARM

Call trace:
  mlx5_ib_get_netdev+0x9c/0xe8 [mlx5_ib]
  mlx5_query_port_roce+0x268/0x558 [mlx5_ib]
  mlx5_ib_rep_query_port+0x14/0x34 [mlx5_ib]
  ib_query_port+0x9c/0xfc [ib_core]
  fill_port_info+0x74/0x28c [ib_core]
  nldev_port_get_doit+0x1a8/0x1e8 [ib_core]
  rdma_nl_rcv_msg+0x16c/0x1c0 [ib_core]
  rdma_nl_rcv+0xe8/0x144 [ib_core]
  netlink_unicast+0x184/0x214
  netlink_sendmsg+0x288/0x354
  sock_sendmsg+0x18/0x2c
  __sys_sendto+0xbc/0x138
  __arm64_sys_sendto+0x28/0x34
  el0_svc_common+0xb0/0x100
  el0_svc_handler+0x6c/0x84
  el0_svc+0x8/0xc

Clean up the rep and netdev reference when unloading the IB rep.
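
The idea can be sketched in plain user-space C as follows. This is an
illustrative model under stated assumptions, not the driver code: the
types and the helpers `rep_load()`, `rep_unload()`, and
`query_port_ifindex()` are hypothetical, locking is omitted (the real
code takes `roce.netdev_lock`), and the NULL guard in `rep_unload()` is
an addition made only for this sketch.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PORTS 4 /* hypothetical table size, not a driver constant */

struct netdev { int ifindex; };

/* Simplified stand-in for struct mlx5_eswitch_rep. */
struct rep {
	unsigned short vport_index; /* assumed to be < MAX_PORTS */
	void *priv;                 /* back-pointer to the owning IB device */
};

struct port {
	struct rep *rep;
	struct netdev *netdev;
};

struct ibdev { struct port port[MAX_PORTS]; };

/* Load: remember the rep and its netdev in the slot for this vport. */
static void rep_load(struct ibdev *dev, struct rep *rep, struct netdev *nd)
{
	struct port *port = &dev->port[rep->vport_index];

	port->rep = rep;
	port->netdev = nd;
	rep->priv = dev;
}

/* Unload: drop every reference taken at load time. */
static void rep_unload(struct rep *rep)
{
	struct ibdev *dev = rep->priv;
	struct port *port;

	if (!dev) /* guard added for the sketch only */
		return;
	port = &dev->port[rep->vport_index];
	port->netdev = NULL;
	port->rep = NULL;
	rep->priv = NULL;
}

/* Query: returns -1 instead of dereferencing a removed netdev. */
static int query_port_ifindex(const struct ibdev *dev, unsigned int idx)
{
	const struct netdev *nd = dev->port[idx].netdev;

	return nd ? nd->ifindex : -1;
}
```

Without the cleanup in `rep_unload()`, a later query on the same slot
would still see the old `netdev` pointer after the netdev itself is gone,
which corresponds to the crash in the call trace above.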

Fixes: 26628e2d58c9 ("RDMA/mlx5: Move to single device multiport ports in switchdev mode")
Signed-off-by: Bodong Wang <[hidden email]>
Reviewed-by: Mark Bloch <[hidden email]>
Reviewed-by: Parav Pandit <[hidden email]>
Signed-off-by: Saeed Mahameed <[hidden email]>
(cherry picked from commit b8ca123860ee556a8d42ab8c5c2afa469817a813)
---
 drivers/infiniband/hw/mlx5/ib_rep.c | 18 +++++++++++-------
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/drivers/infiniband/hw/mlx5/ib_rep.c b/drivers/infiniband/hw/mlx5/ib_rep.c
index 0381031ad3ec..b200dfae4d1f 100644
--- a/drivers/infiniband/hw/mlx5/ib_rep.c
+++ b/drivers/infiniband/hw/mlx5/ib_rep.c
@@ -17,6 +17,7 @@ mlx5_ib_set_vport_rep(struct mlx5_core_dev *dev, struct mlx5_eswitch_rep *rep)
  vport_index = rep->vport_index;
 
  ibdev->port[vport_index].rep = rep;
+ rep->rep_data[REP_IB].priv = ibdev;
  write_lock(&ibdev->port[vport_index].roce.netdev_lock);
  ibdev->port[vport_index].roce.netdev =
  mlx5_ib_get_rep_netdev(dev->priv.eswitch, rep->vport);
@@ -68,15 +69,18 @@ mlx5_ib_vport_rep_load(struct mlx5_core_dev *dev, struct mlx5_eswitch_rep *rep)
 static void
 mlx5_ib_vport_rep_unload(struct mlx5_eswitch_rep *rep)
 {
- struct mlx5_ib_dev *dev;
-
- if (!rep->rep_data[REP_IB].priv ||
-    rep->vport != MLX5_VPORT_UPLINK)
- return;
+ struct mlx5_ib_dev *dev = mlx5_ib_rep_to_dev(rep);
+ struct mlx5_ib_port *port;
 
- dev = mlx5_ib_rep_to_dev(rep);
- __mlx5_ib_remove(dev, dev->profile, MLX5_IB_STAGE_MAX);
+ port = &dev->port[rep->vport_index];
+ write_lock(&port->roce.netdev_lock);
+ port->roce.netdev = NULL;
+ write_unlock(&port->roce.netdev_lock);
  rep->rep_data[REP_IB].priv = NULL;
+ port->rep = NULL;
+
+ if (rep->vport == MLX5_VPORT_UPLINK)
+ __mlx5_ib_remove(dev, dev->profile, MLX5_IB_STAGE_MAX);
 }
 
 static void *mlx5_ib_vport_get_proto_dev(struct mlx5_eswitch_rep *rep)
--
2.20.1



NAK: [bionic:linux-bluefield][PATCH 0/2] Integrate Fixes from Mellanox 2019-11-13

Anthony Wong-2
On Wed, Nov 20, 2019 at 03:39:01PM +0800, Wen-chien Jesse Sung wrote:

> BugLink: https://launchpad.net/bugs/1853245
>
> === Impact ===
> Some issues were found while Mellanox was verifying linux-bluefield. These are
> the patches they sent to address those issues.
>
> === Risk of Regression ===
> Should be reasonably low since:
> * Patches are clean cherry-picks from upstream.
> * Verified on target hardware.
>
> Bodong Wang (2):
>   {IB, net}/mlx5: E-Switch, Use index of rep for vport to IB port
>     mapping
>   RDMA/mlx5: Cleanup rep when doing unload
>
>  drivers/infiniband/hw/mlx5/ib_rep.c           | 22 +++++++++++--------
>  drivers/infiniband/hw/mlx5/mlx5_ib.h          |  1 -
>  .../mellanox/mlx5/core/eswitch_offloads.c     |  1 +
>  include/linux/mlx5/eswitch.h                  |  2 ++
>  4 files changed, 16 insertions(+), 10 deletions(-)

Superseded by v2, so NAK this.
