[SRU][EOAN][PATCH 0/1] Fix for LP:#1852663

Gerald Yang
BugLink: https://bugs.launchpad.net/bugs/1852663

[Impact]
When VFs are assigned to VMs and the VMs are then deleted, a general protection fault occurs in i40e_config_vf_promiscuous_mode:

general protection fault: 0000 [#1] SMP PTI
CPU: 54 PID: 6200 Comm: libvirtd Not tainted 5.3.0-21-generic #22~18.04.1-Ubuntu
Hardware name: HPE ProLiant DL380 Gen10/ProLiant DL380 Gen10, BIOS U30 05/21/2019
RIP: 0010:i40e_config_vf_promiscuous_mode+0x172/0x330 [i40e]
Code: 48 8b 00 83 d1 00 48 85 c0 75 ef 49 83 c4 08 4c 39 e6 75 dd 85 c9 74 73 0f b6 45 c0 45 31 d2 89 45 d0 4d 8b 3e 4d 85 ff 74 53 <41> 0f b7 4f 16 66 81 f9 ff 0f 77 3f 0f b7 b3 ea 0c 00 00 8b 55 d0
RSP: 0018:ffffb987b5c77760 EFLAGS: 00010202
RAX: 0000000000000000 RBX: ffff9bb5df5a9000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000006000000 RDI: ffff9bace4bce350
RBP: ffffb987b5c777b0 R08: 0000000000000000 R09: ffff9bace56a9da0
R10: 0000000000000000 R11: 0000000000000100 R12: ffff9bb5df5a9a28
R13: ffff9bace4bce008 R14: ffff9bb5df5a9338 R15: 26c2b975d54f5980
FS: 00007f9f07fff700(0000) GS:ffff9bfcff480000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fa73c9c0e10 CR3: 000000f6ab37a002 CR4: 00000000007626e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
i40e_ndo_set_vf_port_vlan+0x1a2/0x440 [i40e]
do_setlink+0x53f/0xee0
? update_load_avg+0x596/0x620
? update_curr+0x7a/0x1d0
? __switch_to_asm+0x40/0x70
? __switch_to_asm+0x34/0x70
? __switch_to_asm+0x40/0x70
? __switch_to_asm+0x34/0x70
rtnl_setlink+0x113/0x150
rtnetlink_rcv_msg+0x296/0x340
? aa_label_sk_perm.part.4+0x10f/0x160
? _cond_resched+0x19/0x40
? rtnl_calcit.isra.30+0x120/0x120
netlink_rcv_skb+0x51/0x120
rtnetlink_rcv+0x15/0x20
netlink_unicast+0x1a4/0x250
netlink_sendmsg+0x2d7/0x3d0
sock_sendmsg+0x63/0x70
___sys_sendmsg+0x2a9/0x320
? aa_label_sk_perm.part.4+0x10f/0x160
? _raw_spin_unlock_bh+0x1e/0x20
? release_sock+0x8f/0xa0
__sys_sendmsg+0x63/0xa0
? __sys_sendmsg+0x63/0xa0
__x64_sys_sendmsg+0x1f/0x30
do_syscall_64+0x5a/0x130
entry_SYSCALL_64_after_hwframe+0x44/0xa9
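
One way to read the register dump, offered as a hedged aside: the bytes at the fault marker, `<41> 0f b7 4f 16`, decode to a 16-bit load from offset 0x16 off R15, and R15 holds 26c2b975d54f5980. That value is not a canonical x86-64 address under 4-level paging (the CR4 value above does not have the LA57 bit set), and a load through a non-canonical pointer raises a general protection fault rather than a page fault. This would fit a list walk following a stale or corrupted node pointer, though the trace alone does not prove it. The helper below is a hypothetical userspace sketch of the canonical-address check, not kernel code:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of the kernel or of this patch).
 * Returns true if addr is canonical for 48-bit virtual addresses,
 * i.e. bits 63..47 are all copies of bit 47.
 */
static bool is_canonical_48(uint64_t addr)
{
    uint64_t upper = addr >> 47;    /* the 17 bits 63..47 */

    return upper == 0 || upper == 0x1ffff;
}
```

Applied to the dump, the R15 value fails this check, while pointer-like values such as RBX (ffff9bb5df5a9000) pass it.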

The same fault occurs when deleting a k8s pod that uses a VF. The bug was reported on the e1000-devel mailing list:
https://sourceforge.net/p/e1000/mailman/message/36766306/

The fix, suggested by Billy McFall, adds locking around accesses to the hash list (vsi->mac_filter_hash). It is not upstream yet.
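
The idea of protecting the hash list can be sketched in userspace as follows. This is only an analogue under stated assumptions: a pthread mutex stands in for the driver's `mac_filter_hash_lock` spinlock, and `struct filter`, `struct filter_table`, `table_init`, `table_add`, `table_del`, and `table_count_vlans` are hypothetical names, not i40e code. The point is that writers and the reader take the same lock, so the reader never follows a pointer to an entry that is being unlinked:

```c
#include <pthread.h>
#include <stdlib.h>

#define NBUCKETS   8
#define MAX_VLANID 4095

struct filter {
    int vlan;
    struct filter *next;
};

struct filter_table {
    pthread_mutex_t lock;               /* guards every bucket list */
    struct filter *buckets[NBUCKETS];
};

static void table_init(struct filter_table *t)
{
    pthread_mutex_init(&t->lock, NULL);
    for (int i = 0; i < NBUCKETS; i++)
        t->buckets[i] = NULL;
}

static unsigned int bucket_of(int vlan)
{
    return (unsigned int)vlan % NBUCKETS;
}

/* Insert a filter; returns 0 on success, -1 if allocation fails. */
static int table_add(struct filter_table *t, int vlan)
{
    struct filter *f = malloc(sizeof(*f));

    if (!f)
        return -1;
    f->vlan = vlan;
    pthread_mutex_lock(&t->lock);
    f->next = t->buckets[bucket_of(vlan)];
    t->buckets[bucket_of(vlan)] = f;
    pthread_mutex_unlock(&t->lock);
    return 0;
}

/* Unlink and free the first filter with this vlan; returns 1 if found. */
static int table_del(struct filter_table *t, int vlan)
{
    struct filter **link, *victim = NULL;

    pthread_mutex_lock(&t->lock);
    for (link = &t->buckets[bucket_of(vlan)]; *link; link = &(*link)->next) {
        if ((*link)->vlan == vlan) {
            victim = *link;
            *link = victim->next;
            break;
        }
    }
    pthread_mutex_unlock(&t->lock);
    free(victim);                       /* free(NULL) is a no-op */
    return victim != NULL;
}

/* Walk every bucket under the lock and count filters with a valid VLAN ID. */
static int table_count_vlans(struct filter_table *t)
{
    int count = 0;

    pthread_mutex_lock(&t->lock);
    for (int i = 0; i < NBUCKETS; i++)
        for (struct filter *f = t->buckets[i]; f; f = f->next)
            if (f->vlan >= 0 && f->vlan <= MAX_VLANID)
                count++;
    pthread_mutex_unlock(&t->lock);
    return count;
}
```

Without the lock in `table_count_vlans`, a concurrent `table_del` could free an entry while the walk still holds a pointer to it, which is the userspace equivalent of the race described here.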

[Test Case]
Spin up some VMs with VFs, then delete all VMs

[Regression Potential]
Low. The fix only adds locking around a hash list traversal, so it should not introduce regressions.

Gerald Yang (1):
  UBUNTU: SAUCE: i40e: fix GPF when deleting VM

 drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

--
2.17.1


--
kernel-team mailing list
[hidden email]
https://lists.ubuntu.com/mailman/listinfo/kernel-team

[SRU][EOAN][PATCH 1/1]UBUNTU: SAUCE: i40e Fix GPF when deleting VMs

Gerald Yang
BugLink: https://bugs.launchpad.net/bugs/1852663

Fix a general protection fault in i40e_config_vf_promiscuous_mode

When deleting VMs with VFs created by i40e, a general protection
fault occurs in i40e_config_vf_promiscuous_mode due to a race
condition on vsi->mac_filter_hash.
It also happens when deleting a pod with VFs.

This issue was reported on the e1000-devel mailing list:
https://sourceforge.net/p/e1000/mailman/message/36766306/

Suggested-by: Billy McFall <[hidden email]>
Signed-off-by: Gerald Yang <[hidden email]>
---
 drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
index 02b09a8ad54c..6f78db626031 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
@@ -1121,6 +1121,7 @@ static i40e_status i40e_config_vf_promiscuous_mode(struct i40e_vf *vf,
        struct i40e_pf *pf = vf->pf;
        struct i40e_hw *hw = &pf->hw;
        struct i40e_mac_filter *f;
+       struct hlist_node *h;
        i40e_status aq_ret = 0;
        struct i40e_vsi *vsi;
        int bkt;
@@ -1160,7 +1161,8 @@ static i40e_status i40e_config_vf_promiscuous_mode(struct i40e_vf *vf,
                }
                return aq_ret;
        } else if (i40e_getnum_vf_vsi_vlan_filters(vsi)) {
-               hash_for_each(vsi->mac_filter_hash, bkt, f, hlist) {
+               spin_lock_bh(&vsi->mac_filter_hash_lock);
+               hash_for_each_safe(vsi->mac_filter_hash, bkt, h, f, hlist) {
                        if (f->vlan < 0 || f->vlan > I40E_MAX_VLANID)
                                continue;
                        aq_ret = i40e_aq_set_vsi_mc_promisc_on_vlan(hw,
@@ -1193,6 +1195,7 @@ static i40e_status i40e_config_vf_promiscuous_mode(struct i40e_vf *vf,
                                        i40e_aq_str(&pf->hw, aq_err));
                        }
                }
+               spin_unlock_bh(&vsi->mac_filter_hash_lock);
                return aq_ret;
        }
        aq_ret = i40e_aq_set_vsi_multicast_promiscuous(hw, vsi->seid, allmulti,
--
2.17.1
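
As a hedged aside on the `hash_for_each_safe()` change: the `_safe` iterators take an extra cursor (`h` in the patch) that holds the next node, so the loop body may unlink or free the current entry without breaking the walk. The sketch below shows that pattern on a plain singly linked list in userspace. `struct node`, `push`, `length`, and `prune_out_of_range` are hypothetical names used only for illustration, and the sketch deliberately leaves out locking:

```c
#include <stdlib.h>

struct node {
    int vlan;
    struct node *next;
};

/* Prepend a node; returns the new head, or the old head if malloc fails. */
static struct node *push(struct node *head, int vlan)
{
    struct node *n = malloc(sizeof(*n));

    if (!n)
        return head;
    n->vlan = vlan;
    n->next = head;
    return n;
}

static int length(const struct node *head)
{
    int len = 0;

    for (; head; head = head->next)
        len++;
    return len;
}

/* Free every node whose vlan lies outside [0, max_vlan]; returns the count. */
static int prune_out_of_range(struct node **head, int max_vlan)
{
    struct node **link = head;
    struct node *cur, *next;
    int freed = 0;

    for (cur = *head; cur; cur = next) {
        next = cur->next;               /* saved before cur may be freed */
        if (cur->vlan < 0 || cur->vlan > max_vlan) {
            *link = next;               /* unlink cur from the list */
            free(cur);
            freed++;
        } else {
            link = &cur->next;
        }
    }
    return freed;
}
```

A loop that read `cur->next` only after the body had run would dereference freed memory whenever an entry was removed; saving `next` first is what the extra cursor is for.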


ACK: [SRU][EOAN][PATCH 1/1]UBUNTU: SAUCE: i40e Fix GPF when deleting VMs

Sultan Alsawaf
Acked-by: Sultan Alsawaf <[hidden email]>


ACK: [SRU][EOAN][PATCH 1/1]UBUNTU: SAUCE: i40e Fix GPF when deleting VMs

Connor Kuehl
In reply to this post by Gerald Yang
On 11/14/19 7:54 PM, Gerald Yang wrote:

> BugLink: https://bugs.launchpad.net/bugs/1852663
>
> Fix a general protection fault in i40e_config_vf_promiscuous_mode
>
> When deleting VMs with VFs created by i40e, a general protection
> fault occurs in i40e_config_vf_promiscuous_mode due to a race
> condition on vsi->mac_filter_hash.
> It also happens when deleting a pod with VFs.
>
> This issue was reported on the e1000-devel mailing list:
> https://sourceforge.net/p/e1000/mailman/message/36766306/
>
> Suggested-by: Billy McFall <[hidden email]>
> Signed-off-by: Gerald Yang <[hidden email]>

This seems reasonable to me; the lock looks well-managed. There doesn't
appear to have been much discussion about acceptance on the mailing list
it was submitted to, so it would be good to keep an eye on that.

Acked-by: Connor Kuehl <[hidden email]>

APPLIED: [SRU][EOAN][PATCH 0/1] Fix for LP:#1852663

Khaled Elmously
In reply to this post by Gerald Yang

APPLIED[Unstable]: [SRU][EOAN][PATCH 0/1] Fix for LP:#1852663

Seth Forshee
In reply to this post by Gerald Yang
On Fri, Nov 15, 2019 at 11:54:02AM +0800, Gerald Yang wrote:

> BugLink: https://bugs.launchpad.net/bugs/1852663

Applied to unstable/master, thanks!
