[SRU][FOCAL][PATCH 0/1] net/mlx5: Fix a race when moving command interface to polling mode


William Breathitt Gray
[Impact]

As part of driver unload, the driver destroys the command EQ (via a FW
command). Once the command EQ is destroyed, FW no longer generates EQEs
for any command the driver sends afterwards, so the driver must poll for
the status of later commands.

The driver's command mode metadata is updated before the command EQ is
actually destroyed. This can lead to double completion handling by the
driver (polling and interrupt) if a command is executed and completed by
FW after the mode was changed but before the EQ was destroyed.

[Fix]

Fix that by using the mlx5_cmd_allowed_opcode mechanism to guarantee
that only the DESTROY_EQ command can be executed during this time period.
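
For illustration only, the race and the gate can be modelled in a few
lines of userspace C. This is a sketch under stated assumptions, not the
driver code: struct cmd_iface, submit(), racing_cmd_during_teardown()
and the opcode values are hypothetical stand-ins, and the model simply
counts how many completion paths (polling, interrupt) would handle a
command that races with the teardown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the driver's constants; values are illustrative. */
#define OPCODE_ALLOW_ALL 0xffffu
#define OP_DESTROY_EQ    0x302u
#define OP_OTHER         0x100u   /* any command that is not DESTROY_EQ */

enum cmd_mode { MODE_EVENTS, MODE_POLLING };

/* Simplified model of the command interface state (not the real struct). */
struct cmd_iface {
	uint16_t allowed_opcode;  /* gate: one opcode, or OPCODE_ALLOW_ALL */
	enum cmd_mode mode;       /* how the driver expects completions */
	bool cmd_eq_alive;        /* FW still generates EQEs for commands */
};

static bool opcode_allowed(const struct cmd_iface *c, uint16_t op)
{
	return c->allowed_opcode == OPCODE_ALLOW_ALL || c->allowed_opcode == op;
}

/*
 * Returns -1 if the gate rejects the command, otherwise the number of
 * completion paths that would handle it. A result of 2 is the double
 * completion described above.
 */
static int submit(const struct cmd_iface *c, uint16_t op)
{
	int handlers = 0;

	if (!opcode_allowed(c, op))
		return -1;
	if (c->mode == MODE_POLLING)
		handlers++;       /* driver polls for the status */
	if (c->cmd_eq_alive)
		handlers++;       /* FW raises an EQE, interrupt path runs */
	return handlers;
}

/*
 * Walks the teardown order and reports what a racing non-DESTROY_EQ
 * command sees in the window between the mode change and the EQ destroy.
 */
static int racing_cmd_during_teardown(bool use_gate)
{
	struct cmd_iface c = { OPCODE_ALLOW_ALL, MODE_EVENTS, true };
	int seen;

	if (use_gate)
		c.allowed_opcode = OP_DESTROY_EQ;  /* close the gate first */
	c.mode = MODE_POLLING;                     /* metadata flips early */
	seen = submit(&c, OP_OTHER);               /* racing command */
	assert(opcode_allowed(&c, OP_DESTROY_EQ)); /* teardown must proceed */
	c.cmd_eq_alive = false;                    /* EQ actually destroyed */
	c.allowed_opcode = OPCODE_ALLOW_ALL;       /* reopen the gate */
	return seen;
}
```

In this model, racing_cmd_during_teardown(false) reports two completion
paths for the racing command, while racing_cmd_during_teardown(true)
reports that the command is rejected before it can reach FW.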

[Where problems could occur]

The scope of the changes in this patch is narrow: only the
destroy_async_eqs() function is touched. If a problem occurs, it will
occur during the small window when the driver unloads. Regression
potential is low, however, because only the DESTROY_EQ command should
execute during this window.

Eran Ben Elisha (1):
  net/mlx5: Fix a race when moving command interface to polling mode

 drivers/net/ethernet/mellanox/mlx5/core/eq.c | 2 ++
 1 file changed, 2 insertions(+)

--
2.27.0


--
kernel-team mailing list
[hidden email]
https://lists.ubuntu.com/mailman/listinfo/kernel-team

[SRU][FOCAL][PATCH 1/1] net/mlx5: Fix a race when moving command interface to polling mode

William Breathitt Gray
From: Eran Ben Elisha <[hidden email]>

BugLink: https://bugs.launchpad.net/bugs/1905574

As part of driver unload, it destroys the commands EQ (via FW command).
As the commands EQ is destroyed, FW will not generate EQEs for any command
that driver sends afterwards. Driver should poll for later commands status.

Driver commands mode metadata is updated before the commands EQ is
actually destroyed. This can lead for double completion handle by the
driver (polling and interrupt), if a command is executed and completed by
FW after the mode was changed, but before the EQ was destroyed.

Fix that by using the mlx5_cmd_allowed_opcode mechanism to guarantee
that only DESTROY_EQ command can be executed during this time period.

Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters")
Signed-off-by: Eran Ben Elisha <[hidden email]>
Reviewed-by: Moshe Shemesh <[hidden email]>
Signed-off-by: Saeed Mahameed <[hidden email]>
(cherry-picked from commit 432161ea26d6d5e5c3f7306d9407d26ed1e1953e)
Signed-off-by: William Breathitt Gray <[hidden email]>
---
 drivers/net/ethernet/mellanox/mlx5/core/eq.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eq.c b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
index 0a20938b4aad..938c4a46f9de 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eq.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
@@ -695,8 +695,10 @@ static void destroy_async_eqs(struct mlx5_core_dev *dev)
 
 	cleanup_async_eq(dev, &table->pages_eq, "pages");
 	cleanup_async_eq(dev, &table->async_eq, "async");
+	mlx5_cmd_allowed_opcode(dev, MLX5_CMD_OP_DESTROY_EQ);
 	mlx5_cmd_use_polling(dev);
 	cleanup_async_eq(dev, &table->cmd_eq, "cmd");
+	mlx5_cmd_allowed_opcode(dev, CMD_ALLOWED_OPCODE_ALL);
 	mlx5_eq_notifier_unregister(dev, &table->cq_err_nb);
 }
 
--
2.27.0



NACK: [SRU][FOCAL][PATCH 1/1] net/mlx5: Fix a race when moving command interface to polling mode

William Breathitt Gray
On Tue, Jan 19, 2021 at 11:00:15AM -0500, William Breathitt Gray wrote:

> From: Eran Ben Elisha <[hidden email]>
>
> BugLink: https://bugs.launchpad.net/bugs/1905574
>
> As part of driver unload, it destroys the commands EQ (via FW command).
> As the commands EQ is destroyed, FW will not generate EQEs for any command
> that driver sends afterwards. Driver should poll for later commands status.
>
> Driver commands mode metadata is updated before the commands EQ is
> actually destroyed. This can lead for double completion handle by the
> driver (polling and interrupt), if a command is executed and completed by
> FW after the mode was changed, but before the EQ was destroyed.
>
> Fix that by using the mlx5_cmd_allowed_opcode mechanism to guarantee
> that only DESTROY_EQ command can be executed during this time period.
>
> Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters")
> Signed-off-by: Eran Ben Elisha <[hidden email]>
> Reviewed-by: Moshe Shemesh <[hidden email]>
> Signed-off-by: Saeed Mahameed <[hidden email]>
> (cherry-picked from commit 432161ea26d6d5e5c3f7306d9407d26ed1e1953e)

The cherry pick line has a typo: "cherry-picked" should read
"cherry picked", as generated by git cherry-pick -x.

Nacked-by: William Breathitt Gray <[hidden email]>

> Signed-off-by: William Breathitt Gray <[hidden email]>
> ---
>  drivers/net/ethernet/mellanox/mlx5/core/eq.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eq.c b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
> index 0a20938b4aad..938c4a46f9de 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/eq.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
> @@ -695,8 +695,10 @@ static void destroy_async_eqs(struct mlx5_core_dev *dev)
>  
>   cleanup_async_eq(dev, &table->pages_eq, "pages");
>   cleanup_async_eq(dev, &table->async_eq, "async");
> + mlx5_cmd_allowed_opcode(dev, MLX5_CMD_OP_DESTROY_EQ);
>   mlx5_cmd_use_polling(dev);
>   cleanup_async_eq(dev, &table->cmd_eq, "cmd");
> + mlx5_cmd_allowed_opcode(dev, CMD_ALLOWED_OPCODE_ALL);
>   mlx5_eq_notifier_unregister(dev, &table->cq_err_nb);
>  }
>  
> --
> 2.27.0
>
