[SRU][Trusty][Xenial][PATCH] Fix for LP:#1764956

[SRU][Trusty][Xenial][PATCH] Fix for LP:#1764956

Gavin Guo
BugLink: https://launchpad.net/bugs/1764956

[Impact]
IBRS can be mistakenly left enabled in the host after switching from an
IBRS-enabled VM, which causes performance overhead in the host.
Conversely, IBRS can be mistakenly disabled in the VM when
context-switching from the host, which could be considered a security
issue (a potential CVE).

[Fix]
The patch fixes the logic in x86_virt_spec_ctrl(): it checks
ibrs_enabled and ORs hostval with SPEC_CTRL_IBRS, because
x86_spec_ctrl_base is zero by default. The upstream implementation
differs from Xenial's: upstream doesn't use IBRS as the formal fix, so
the base value defaults to zero.
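
A minimal userspace sketch of the host-value part of this change, assuming
the Xenial x86_virt_spec_ctrl() follows the upstream structure.
model_host_spec_ctrl() is a hypothetical name used only for illustration;
x86_spec_ctrl_base and ibrs_enabled are modelled as plain parameters:

```c
#include <stdint.h>

/* IBRS is bit 0 of IA32_SPEC_CTRL (MSR 0x48). */
#define SPEC_CTRL_IBRS (1ULL << 0)

/*
 * Hypothetical model, not kernel code: compute the value the host expects
 * in SPEC_CTRL. spec_ctrl_base stands in for x86_spec_ctrl_base (zero by
 * default) and ibrs_enabled for the Ubuntu runtime control.
 */
static uint64_t model_host_spec_ctrl(uint64_t spec_ctrl_base, int ibrs_enabled)
{
	uint64_t hostval = spec_ctrl_base;

	/* The check added by the patch. */
	if (ibrs_enabled)
		hostval |= SPEC_CTRL_IBRS;

	return hostval;
}
```

With the default base of zero, model_host_spec_ctrl(0, 0) yields 0 and
model_host_spec_ctrl(0, 2) yields SPEC_CTRL_IBRS; without the added check
the host value would stay 0 in both cases.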

In addition, after a VM exit the SPEC_CTRL register must be saved
manually by reading the SPEC_CTRL MSR, because the MSR intercept is
disabled by default in hardware_setup() (v4.4) and vmx_init() (v3.13).
Guest accesses to the SPEC_CTRL MSR are direct and don't trigger a
trap, so vmx_set_msr() is never called.
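
The stale-value problem described here can be modelled in userspace. This
is only a sketch: struct model_cpu and both functions are hypothetical
names, and it assumes that the restore path writes the MSR only when the
host value differs from the saved guest value, as the upstream
x86_virt_spec_ctrl() does:

```c
#include <stdbool.h>
#include <stdint.h>

#define SPEC_CTRL_IBRS 0x1ULL

/* Hypothetical model state; none of these names exist in the kernel. */
struct model_cpu {
	uint64_t msr_spec_ctrl; /* value currently in the hardware MSR */
	uint64_t vcpu_shadow;   /* stands in for vcpu->arch.spec_ctrl  */
};

/*
 * The guest writes the MSR directly. With the intercept disabled there is
 * no trap, so the shadow copy is not updated.
 */
static void guest_write_passthrough(struct model_cpu *c, uint64_t val)
{
	c->msr_spec_ctrl = val;
}

/*
 * VM exit: optionally re-read the MSR into the shadow copy (the step the
 * patch adds), then restore the host value only if it differs from what
 * the shadow copy says the guest had.
 */
static void vm_exit(struct model_cpu *c, uint64_t hostval, bool reread_msr)
{
	if (reread_msr)
		c->vcpu_shadow = c->msr_spec_ctrl;

	if (hostval != c->vcpu_shadow)
		c->msr_spec_ctrl = hostval;
}
```

In this model, a guest that sets IBRS and exits with reread_msr == false
leaves the bit set in the MSR while the host expects 0, which matches the
symptom of case 1 in the test section. With reread_msr == true the host
value is restored.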

The v3.13 patch hasn't been tested, but it can be viewed at:
http://kernel.ubuntu.com/git/gavinguo/ubuntu-trusty-amd64.git/log/?h=sf00191076-sru

The v4.4 patch:
http://kernel.ubuntu.com/git/gavinguo/ubuntu-xenial.git/log/?h=sf00191076-spectre-v2-regres-backport-juerg

[Test]

The patch has been tested on 4.4.0-140.166 and works fine.

Reproduction environment:
Guest kernel version: 4.4.0-138.164
Host kernel version: 4.4.0-140.166

(host IBRS, guest IBRS)

- 1). (0, 1).
The case can be reproduced by the following instructions:
guest$ echo 1 | sudo tee /proc/sys/kernel/ibrs_enabled
1

<Several minutes later...>

host$ cat /proc/sys/kernel/ibrs_enabled
0
host$ for i in {0..55}; do sudo rdmsr 0x48 -p $i; done
11111111111111000000000000000000010010100000000000000000

The IBRS bit in the SPEC_CTRL MSR is mistakenly set on some of the
host CPUs.

host$ taskset -c 5 stress-ng -c 1 --cpu-ops 2500
stress-ng: info:  [11264] defaulting to a 86400 second run per stressor
stress-ng: info:  [11264] dispatching hogs: 1 cpu
stress-ng: info:  [11264] cache allocate: default cache size: 35840K
stress-ng: info:  [11264] successful run completed in 33.48s

The host kernel doesn't notice that the IBRS bit is set, so the
situation is the same as after "echo 2 > /proc/sys/kernel/ibrs_enabled"
in the host. The stress-ng run is a pure userspace CPU-bound
computation, so its performance drops to about 1/3: without IBRS
enabled, the same run takes about 10s.

- 2). (1, 1): disabling IBRS in the host should give (0, 1), but the
state actually becomes (0, 0). The guest IBRS has been mistakenly disabled.

guest$ echo 2 | sudo tee /proc/sys/kernel/ibrs_enabled
guest$ for i in {0..55}; do sudo rdmsr 0x48 -p $i; done
11111111111111111111111111111111111111111111111111111111

host$ echo 2 | sudo tee /proc/sys/kernel/ibrs_enabled
host$ for i in {0..55}; do sudo rdmsr 0x48 -p $i; done
11111111111111111111111111111111111111111111111111111111
host$ echo 0 | sudo tee /proc/sys/kernel/ibrs_enabled
host$ for i in {0..55}; do sudo rdmsr 0x48 -p $i; done
00000000000000000000000000000000000000000000000000000000

guest$ for i in {0..55}; do sudo rdmsr 0x48 -p $i; done
00000000000000000000000000000000000000000000000000000000
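
The per-CPU dumps above can be summarized with a small helper. This is a
hypothetical sketch, not part of the patch: count_ibrs_cpus() is an
invented name, and it assumes the dump holds one hex digit per CPU (the
SPEC_CTRL value as printed by rdmsr):

```c
#include <stddef.h>

/*
 * Hypothetical helper: given one hex digit per CPU, count the CPUs whose
 * IBRS bit (bit 0 of SPEC_CTRL) is set. Characters that are not hex
 * digits, such as newlines, are skipped.
 */
static size_t count_ibrs_cpus(const char *dump)
{
	size_t n = 0;

	for (; *dump; dump++) {
		int v;

		if (*dump >= '0' && *dump <= '9')
			v = *dump - '0';
		else if (*dump >= 'a' && *dump <= 'f')
			v = *dump - 'a' + 10;
		else if (*dump >= 'A' && *dump <= 'F')
			v = *dump - 'A' + 10;
		else
			continue;

		if (v & 1)
			n++;
	}

	return n;
}
```

A result of zero on a host with ibrs_enabled set to 0 is the expected
state; any other count indicates CPUs where a guest value was left behind.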

Gavin Guo (1):
  UBUNTU: SAUCE: x86/speculation: Fix the IBRS synchronization

 arch/x86/kernel/cpu/bugs.c |  7 +++++++
 arch/x86/kvm/vmx.c         | 37 +++++++++++++++++++++++++++++++++++++
 2 files changed, 44 insertions(+)

--
2.7.4


--
kernel-team mailing list
[hidden email]
https://lists.ubuntu.com/mailman/listinfo/kernel-team

[SRU][Trusty][Xenial][PATCH] UBUNTU: SAUCE: x86/speculation: Fix the IBRS synchronization

Gavin Guo
BugLink: https://launchpad.net/bugs/1764956

The Ubuntu v4.4 kernel uses in-house patches for IBRS. The backports
still have issues that leave the IBRS state wrong when
context-switching between a VM and the host. For example, IBRS can be
mistakenly left enabled in the host after switching from an IBRS-enabled
VM, which causes performance overhead in the host. Conversely, IBRS can
be mistakenly disabled in the VM when context-switching from the host,
which could be considered a security issue (a potential CVE).

Detailed analysis of the different situations:

Reproduction environment:
Guest kernel version: 4.4.0-138.164
Host kernel version: 4.4.0-140.166

(host IBRS, guest IBRS)

- 1). (0, 1).
The case can be reproduced by the following instructions:
guest$ echo 1 | sudo tee /proc/sys/kernel/ibrs_enabled
1

<Several minutes later...>

host$ cat /proc/sys/kernel/ibrs_enabled
0
host$ for i in {0..55}; do sudo rdmsr 0x48 -p $i; done
11111111111111000000000000000000010010100000000000000000

The IBRS bit in the SPEC_CTRL MSR is mistakenly set on some of the
host CPUs.

host$ taskset -c 5 stress-ng -c 1 --cpu-ops 2500
stress-ng: info:  [11264] defaulting to a 86400 second run per stressor
stress-ng: info:  [11264] dispatching hogs: 1 cpu
stress-ng: info:  [11264] cache allocate: default cache size: 35840K
stress-ng: info:  [11264] successful run completed in 33.48s

The host kernel doesn't notice that the IBRS bit is set, so the
situation is the same as after "echo 2 > /proc/sys/kernel/ibrs_enabled"
in the host. The stress-ng run is a pure userspace CPU-bound
computation, so its performance drops to about 1/3: without IBRS
enabled, the same run takes about 10s.

- 2). (1, 1): disabling IBRS in the host should give (0, 1), but the
state actually becomes (0, 0). The guest IBRS has been mistakenly disabled.

guest$ echo 2 | sudo tee /proc/sys/kernel/ibrs_enabled
guest$ for i in {0..55}; do sudo rdmsr 0x48 -p $i; done
11111111111111111111111111111111111111111111111111111111

host$ echo 2 | sudo tee /proc/sys/kernel/ibrs_enabled
host$ for i in {0..55}; do sudo rdmsr 0x48 -p $i; done
11111111111111111111111111111111111111111111111111111111
host$ echo 0 | sudo tee /proc/sys/kernel/ibrs_enabled
host$ for i in {0..55}; do sudo rdmsr 0x48 -p $i; done
00000000000000000000000000000000000000000000000000000000

guest$ for i in {0..55}; do sudo rdmsr 0x48 -p $i; done
00000000000000000000000000000000000000000000000000000000

Fixes: 4d8d3dbed275 ("UBUNTU: SAUCE: x86/bugs, KVM: Support the combination ...")
Fixes: f676aa34b402 ("x86/kvm: add MSR_IA32_SPEC_CTRL and MSR_IA32_PRED_CMD ...")
Signed-off-by: Gavin Guo <[hidden email]>
---
 arch/x86/kernel/cpu/bugs.c |  7 +++++++
 arch/x86/kvm/vmx.c         | 37 +++++++++++++++++++++++++++++++++++++
 2 files changed, 44 insertions(+)

diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 86d08522a721..09c328275c2b 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -185,6 +185,13 @@ x86_virt_spec_ctrl(u64 guest_spec_ctrl, u64 guest_virt_spec_ctrl, bool setguest)
  guestval = hostval & ~x86_spec_ctrl_mask;
  guestval |= guest_spec_ctrl & x86_spec_ctrl_mask;
 
+ /*
+ * Check the host IBRS status to make IBRS register update
+ * correctly.
+ */
+ if (ibrs_enabled)
+ hostval |= SPEC_CTRL_IBRS;
+
  /* SSBD controlled in MSR_SPEC_CTRL */
  if (static_cpu_has(X86_FEATURE_SPEC_CTRL_SSBD))
  hostval |= ssbd_tif_to_spec_ctrl(ti->flags);
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 22012bcc4ef6..b1e3f64f4aee 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -1840,6 +1840,32 @@ static void update_exception_bitmap(struct kvm_vcpu *vcpu)
  vmcs_write32(EXCEPTION_BITMAP, eb);
 }
 
+/*
+ * Check if MSR is intercepted for currently loaded MSR bitmap.
+ */
+static bool msr_write_intercepted(struct kvm_vcpu *vcpu, u32 msr)
+{
+ /*
+ * The longmode_only = "false" for MSR_IA32_SPEC_CTRL MSR register in
+ * hardware_setup function. So, the vmx_msr_bitmap_legacy bitmap is
+ * used. Refer to vmx_disable_intercept_for_msr function for the detail.
+ */
+ unsigned long *msr_bitmap = vmx_msr_bitmap_legacy;
+ int f = sizeof(unsigned long);
+
+ if (!cpu_has_vmx_msr_bitmap())
+ return true;
+
+ if (msr <= 0x1fff) {
+ return !!test_bit(msr, msr_bitmap + 0x800 / f);
+ } else if ((msr >= 0xc0000000) && (msr <= 0xc0001fff)) {
+ msr &= 0x1fff;
+ return !!test_bit(msr, msr_bitmap + 0xc00 / f);
+ }
+
+ return true;
+}
+
 static void clear_atomic_switch_msr_special(struct vcpu_vmx *vmx,
  unsigned long entry, unsigned long exit)
 {
@@ -9011,6 +9037,17 @@ static void __noclone vmx_vcpu_run(struct kvm_vcpu *vcpu)
 #endif
       );
 
+ /*
+ * In Ubuntu v4.4, the MSR_IA32_SPEC_CTRL trap is disabled in
+ * hardware_setup function. The guest SPEC_CTRL register needs to be
+ * saved to make the status correct. Refer to "commit f676aa34b402
+ * x86/kvm: add MSR_IA32_SPEC_CTRL and MSR_IA32_PRED_CMD to kvm" The
+ * related upstream commit is "commit 28b387fb74d KVM/VMX: Allow direct
+ * access to MSR_IA32_SPEC_CTRL"
+ */
+ if (!msr_write_intercepted(vcpu, MSR_IA32_SPEC_CTRL))
+ vcpu->arch.spec_ctrl = native_read_msr(MSR_IA32_SPEC_CTRL);
+
  x86_spec_ctrl_restore_host(vcpu->arch.spec_ctrl, 0);
 
  /* Eliminate branch target predictions from guest mode */
--
2.7.4
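
The bitmap arithmetic in msr_write_intercepted() follows the VMX MSR-bitmap
layout: a 4 KiB page in which the write bitmap for MSRs 0x0-0x1fff starts at
byte 0x800 and the one for MSRs 0xc0000000-0xc0001fff at byte 0xc00. A
byte-indexed userspace sketch of the same lookup, with hypothetical model_*
names and a plain byte array in place of vmx_msr_bitmap_legacy:

```c
#include <stdbool.h>
#include <stdint.h>

#define MSR_BITMAP_BYTES 4096

/*
 * Hypothetical helper: map an MSR number to its byte index and bit in the
 * write-intercept half of the bitmap. Returns false for MSRs that the
 * bitmap does not cover.
 */
static bool model_locate_write_bit(uint32_t msr, uint32_t *byte, uint32_t *bit)
{
	uint32_t base;

	if (msr <= 0x1fff) {
		base = 0x800;            /* write bitmap, low MSR range  */
	} else if (msr >= 0xc0000000 && msr <= 0xc0001fff) {
		base = 0xc00;            /* write bitmap, high MSR range */
		msr &= 0x1fff;
	} else {
		return false;
	}

	*byte = base + msr / 8;
	*bit = msr % 8;
	return true;
}

/* A set bit means the write traps; uncovered MSRs always trap. */
static bool model_msr_write_intercepted(const uint8_t *bitmap, uint32_t msr)
{
	uint32_t byte, bit;

	if (!model_locate_write_bit(msr, &byte, &bit))
		return true;

	return (bitmap[byte] >> bit) & 1;
}

/* Clear the bit so that guest writes to this MSR go straight to hardware. */
static void model_disable_write_intercept(uint8_t *bitmap, uint32_t msr)
{
	uint32_t byte, bit;

	if (model_locate_write_bit(msr, &byte, &bit))
		bitmap[byte] &= (uint8_t)~(1u << bit);
}
```

For MSR_IA32_SPEC_CTRL (0x48) this resolves to bit 0 of byte 0x809. The
kernel version indexes the same page through unsigned long pointers, which
is why the patch divides the byte offsets by sizeof(unsigned long).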



NACK: [SRU][Trusty][Xenial][PATCH] Fix for LP:#1764956

Juerg Haefliger
In reply to this post by Gavin Guo
Thanks for this Gavin!

I recognize the problem and as discussed yesterday, I need to
investigate some more. It seems we're missing a couple of patches to
make IBRS (and probably IBPB) passthrough work correctly. So rather than
patching this up I prefer to backport the relevant commits.

After taking a quick peek, it seems we're missing (or at least
partially) the following (from linux-stable):

fc00dde96099 KVM/SVM: Allow direct access to MSR_IA32_SPEC_CTRL
e5a83419c957 KVM/VMX: Allow direct access to MSR_IA32_SPEC_CTRL
755502f810c6 KVM/VMX: Emulate MSR_IA32_ARCH_CAPABILITIES
7013129a4034 KVM/x86: Add IBPB support

And probably more plus whatever is needed for our runtime controls (from
your patch).

...Juerg


On Thu, 22 Nov 2018 22:09:31 +0800
Gavin Guo <[hidden email]> wrote:

> BugLink: https://launchpad.net/bugs/1764956


Re: NACK: [SRU][Trusty][Xenial][PATCH] Fix for LP:#1764956

Gavin Guo
Hi Juerg,

On Fri, Nov 23, 2018 at 5:14 PM Juerg Haefliger
<[hidden email]> wrote:

>
> Thanks for this Gavin!
>
> I recognize the problem and as discussed yesterday, I need to
> investigate some more. It seems we're missing a couple of patches to
> make IBRS (and probably IBPB) passthrough work correctly. So rather than
> patching this up I prefer to backport the relevant commits.
>
> After taking a quick peek, it seems we're missing (or at least
> partially) the following (from linux-stable):
>
> fc00dde96099 KVM/SVM: Allow direct access to MSR_IA32_SPEC_CTRL
> e5a83419c957 KVM/VMX: Allow direct access to MSR_IA32_SPEC_CTRL
> 755502f810c6 KVM/VMX: Emulate MSR_IA32_ARCH_CAPABILITIES
> 7013129a4034 KVM/x86: Add IBPB support
>
> And probably more plus whatever is needed for our runtime controls (from
> your patch).

Agree, I'll also look into the patches for future compatibility
issues. Thank you for the time looking into this. :)



Re: NACK: [SRU][Trusty][Xenial][PATCH] Fix for LP:#1764956

Juerg Haefliger
On Mon, 26 Nov 2018 22:55:49 +0800
Gavin Guo <[hidden email]> wrote:

> Hi Juerg,
>
> On Fri, Nov 23, 2018 at 5:14 PM Juerg Haefliger
> <[hidden email]> wrote:
> >
> > Thanks for this Gavin!
> >
> > I recognize the problem and as discussed yesterday, I need to
> > investigate some more. It seems we're missing a couple of patches to
> > make IBRS (and probably IBPB) passthrough work correctly. So rather
> > than patching this up I prefer to backport the relevant commits.
> >
> > After taking a quick peek, it seems we're missing (or at least
> > partially) the following (from linux-stable):
> >
> > fc00dde96099 KVM/SVM: Allow direct access to MSR_IA32_SPEC_CTRL
> > e5a83419c957 KVM/VMX: Allow direct access to MSR_IA32_SPEC_CTRL
> > 755502f810c6 KVM/VMX: Emulate MSR_IA32_ARCH_CAPABILITIES
> > 7013129a4034 KVM/x86: Add IBPB support
> >
> > And probably more plus whatever is needed for our runtime controls
> > (from your patch).  
>
> Agree, I'll also look into the patches for future compatibility
> issues. Thank you for the time looking into this. :)
Gavin, can you test
https://code.launchpad.net/~juergh/+git/xenial-linux branch lp1764956?

Thanks
...Juerg


> >
> >
> > ...Juerg


Re: NACK: [SRU][Trusty][Xenial][PATCH] Fix for LP:#1764956

Gavin Guo
Hi Juerg,

On Wed, Nov 28, 2018 at 11:05 PM Juerg Haefliger
<[hidden email]> wrote:

>
> On Mon, 26 Nov 2018 22:55:49 +0800
> Gavin Guo <[hidden email]> wrote:
>
> > Hi Juerg,
> >
> > On Fri, Nov 23, 2018 at 5:14 PM Juerg Haefliger
> > <[hidden email]> wrote:
> > >
> > > Thanks for this Gavin!
> > >
> > > I recognize the problem and as discussed yesterday, I need to
> > > investigate some more. It seems we're missing a couple of patches to
> > > make IBRS (and probably IBPB) passthrough work correctly. So rather
> > > than patching this up I prefer to backport the relevant commits.
> > >
> > > After taking a quick peek, it seems we're missing (or at least
> > > partially) the following (from linux-stable):
> > >
> > > fc00dde96099 KVM/SVM: Allow direct access to MSR_IA32_SPEC_CTRL
> > > e5a83419c957 KVM/VMX: Allow direct access to MSR_IA32_SPEC_CTRL
> > > 755502f810c6 KVM/VMX: Emulate MSR_IA32_ARCH_CAPABILITIES
> > > 7013129a4034 KVM/x86: Add IBPB support
> > >
> > > And probably more plus whatever is needed for our runtime controls
> > > (from your patch).
> >
> > Agree, I'll also look into the patches for future compatibility
> > issues. Thank you for the time looking into this. :)
>
> Gavin, can you test
> https://code.launchpad.net/~juergh/+git/xenial-linux branch lp1764956?

Thank you for the prompt response, please let me spend some time
rebasing the patch and do some experimentation. I'll keep you posted
about the result.

>
> Thanks
> ...Juerg
>
>
> > >
> > >
> > > ...Juerg


[SRU][Trusty][Xenial][PATCH v2] UBUNTU: SAUCE: x86/speculation: Fix the IBRS synchronization

Gavin Guo
BugLink: https://launchpad.net/bugs/1764956

The Ubuntu v4.4 kernel uses in-house patches for IBRS. The backports
still have issues that leave the IBRS state wrong when
context-switching between a VM and the host. For example, IBRS can be
mistakenly left enabled in the host after switching from an IBRS-enabled
VM, which causes performance overhead in the host. Conversely, IBRS can
be mistakenly disabled in the VM when context-switching from the host,
which could be considered a security issue (a potential CVE).

Detailed analysis of the different situations:

Reproduction environment:
Guest kernel version: 4.4.0-138.164
Host kernel version: 4.4.0-140.166

(host IBRS, guest IBRS)

- 1). (0, 1).
The case can be reproduced by the following instructions:
guest$ echo 1 | sudo tee /proc/sys/kernel/ibrs_enabled
1

<Several minutes later...>

host$ cat /proc/sys/kernel/ibrs_enabled
0
host$ for i in {0..55}; do sudo rdmsr 0x48 -p $i; done
11111111111111000000000000000000010010100000000000000000

The IBRS bit in the SPEC_CTRL MSR is mistakenly set on some of the
host CPUs.

host$ taskset -c 5 stress-ng -c 1 --cpu-ops 2500
stress-ng: info:  [11264] defaulting to a 86400 second run per stressor
stress-ng: info:  [11264] dispatching hogs: 1 cpu
stress-ng: info:  [11264] cache allocate: default cache size: 35840K
stress-ng: info:  [11264] successful run completed in 33.48s

The host kernel doesn't notice that the IBRS bit is set, so the
situation is the same as after "echo 2 > /proc/sys/kernel/ibrs_enabled"
in the host. The stress-ng run is a pure userspace CPU-bound
computation, so its performance drops to about 1/3: without IBRS
enabled, the same run takes about 10s.

- 2). (1, 1): disabling IBRS in the host should give (0, 1), but the
state actually becomes (0, 0). The guest IBRS has been mistakenly disabled.

guest$ echo 2 | sudo tee /proc/sys/kernel/ibrs_enabled
guest$ for i in {0..55}; do sudo rdmsr 0x48 -p $i; done
11111111111111111111111111111111111111111111111111111111

host$ echo 2 | sudo tee /proc/sys/kernel/ibrs_enabled
host$ for i in {0..55}; do sudo rdmsr 0x48 -p $i; done
11111111111111111111111111111111111111111111111111111111
host$ echo 0 | sudo tee /proc/sys/kernel/ibrs_enabled
host$ for i in {0..55}; do sudo rdmsr 0x48 -p $i; done
00000000000000000000000000000000000000000000000000000000

guest$ for i in {0..55}; do sudo rdmsr 0x48 -p $i; done
00000000000000000000000000000000000000000000000000000000

Fixes: 4d8d3dbed275 ("UBUNTU: SAUCE: x86/bugs, KVM: Support the combination ...")
Fixes: f676aa34b402 ("x86/kvm: add MSR_IA32_SPEC_CTRL and MSR_IA32_PRED_CMD ...")
Signed-off-by: Gavin Guo <[hidden email]>
---
 arch/x86/kernel/cpu/bugs.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 60907abf12f5..e5f1ba148e3c 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -185,6 +185,13 @@ x86_virt_spec_ctrl(u64 guest_spec_ctrl, u64 guest_virt_spec_ctrl, bool setguest)
  guestval = hostval & ~x86_spec_ctrl_mask;
  guestval |= guest_spec_ctrl & x86_spec_ctrl_mask;
 
+ /*
+ * Check the host IBRS status to make IBRS register update
+ * correctly.
+ */
+ if (ibrs_enabled)
+ hostval |= SPEC_CTRL_IBRS;
+
  /* SSBD controlled in MSR_SPEC_CTRL */
  if (static_cpu_has(X86_FEATURE_SPEC_CTRL_SSBD))
  hostval |= ssbd_tif_to_spec_ctrl(ti->flags);
--
2.7.4



Re: [SRU][Trusty][Xenial][PATCH v2] UBUNTU: SAUCE: x86/speculation: Fix the IBRS synchronization

Juerg Haefliger
On Fri, 30 Nov 2018 17:44:37 +0800
Gavin Guo <[hidden email]> wrote:

> BugLink: https://launchpad.net/bugs/1764956
>
> diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
> index 60907abf12f5..e5f1ba148e3c 100644
> --- a/arch/x86/kernel/cpu/bugs.c
> +++ b/arch/x86/kernel/cpu/bugs.c
> @@ -185,6 +185,13 @@ x86_virt_spec_ctrl(u64 guest_spec_ctrl, u64 guest_virt_spec_ctrl, bool setguest)
>   guestval = hostval & ~x86_spec_ctrl_mask;
>   guestval |= guest_spec_ctrl & x86_spec_ctrl_mask;
>  
> + /*
> + * Check the host IBRS status to make IBRS register update
> + * correctly.
> + */
> + if (ibrs_enabled)
> + hostval |= SPEC_CTRL_IBRS;
This means we're setting IBRS on the host even if ibrs_enabled == 1. I
don't think that's correct. With ibrs_enabled == 1 we could VMENTER with
IBRS cleared but with the above we will set it on VMEXIT which is
incorrect, no?

...Juerg


> +
>   /* SSBD controlled in MSR_SPEC_CTRL */
>   if (static_cpu_has(X86_FEATURE_SPEC_CTRL_SSBD))
>   hostval |= ssbd_tif_to_spec_ctrl(ti->flags);



Re: [SRU][Trusty][Xenial][PATCH v2] UBUNTU: SAUCE: x86/speculation: Fix the IBRS synchronization

Gavin Guo
On Mon, Dec 3, 2018 at 9:18 PM Juerg Haefliger
<[hidden email]> wrote:

>
> On Fri, 30 Nov 2018 17:44:37 +0800
> Gavin Guo <[hidden email]> wrote:
>
> > BugLink: https://launchpad.net/bugs/1764956
> >
> > diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
> > index 60907abf12f5..e5f1ba148e3c 100644
> > --- a/arch/x86/kernel/cpu/bugs.c
> > +++ b/arch/x86/kernel/cpu/bugs.c
> > @@ -185,6 +185,13 @@ x86_virt_spec_ctrl(u64 guest_spec_ctrl, u64 guest_virt_spec_ctrl, bool setguest)
> >               guestval = hostval & ~x86_spec_ctrl_mask;
> >               guestval |= guest_spec_ctrl & x86_spec_ctrl_mask;
> >
> > +             /*
> > +              * Check the host IBRS status to make IBRS regsiter update
> > +              * correctly.
> > +              */
> > +             if (ibrs_enabled)
> > +                     hostval |= SPEC_CTRL_IBRS;
>
> This means we're setting IBRS on the host even if ibrs_enabled == 1. I
> don't think that's correct. With ibrs_enabled == 1 we could VMENTER with
> IBRS cleared but with the above we will set it on VMEXIT which is
> incorrect, no?

Good point! The reason it should be set to one follows from the
definition of "ibrs_enabled == 1": in that mode, IBRS is enabled _ONLY_
in _KERNEL_ mode. In user space, IBRS is disabled.

ibrs_enabled can be set to 1 via two paths:

i). Boot time
start_kernel -> check_bugs -> spectre_v2_select_mitigation ->
set_ibrs_enabled(1)

ii). Sysctl path
ibrs_enabled_handler -> set_ibrs_enabled(1)

set_ibrs_enabled(1) doesn't write the IBRS bit in the SPEC_CTRL MSR. It
just assigns one to the ibrs_enabled variable.

So, where does the IBRS bit actually get enabled in kernel mode?

When ibrs_enabled == 1, ENABLE_IBRS doesn't jump over __ASM_ENABLE_IBRS,
so IBRS is set. For 0 or 2, bit 0 of ibrs_enabled is clear, so it jumps
over __ASM_ENABLE_IBRS to "10:" and ENABLE_IBRS doesn't set IBRS.

.macro ENABLE_IBRS
        testl   $1, ibrs_enabled
        jz      10f
        __ASM_ENABLE_IBRS
        jmp 20f
10:
        lfence
20:
.endm

The ENABLE_IBRS macro is expanded at the points where user space
switches to kernel space, such as:

arch/x86/entry/entry_64.S
1). ENTRY(entry_SYSCALL_64)
2). common_interrupt
3). error_entry
4). ENTRY(nmi)


DISABLE_IBRS clears the IBRS bit when switching back from kernel space
to user space, to preserve user-space performance.

.macro DISABLE_IBRS
        testl   $1, ibrs_enabled
        jz      9f
        __ASM_DISABLE_IBRS
9:
.endm
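
To make the bit test in the two macros concrete, here is a minimal
user-space C model of them. It is only a sketch: spec_ctrl_msr,
model_reset(), model_kernel_entry() and model_kernel_exit() are
hypothetical names for illustration, the model tracks only the IBRS bit
(bit 0 of MSR 0x48), and the lfence on the skipped path is not modeled.

```c
#include <assert.h>
#include <stdint.h>

#define SPEC_CTRL_IBRS 0x1ULL   /* bit 0 of IA32_SPEC_CTRL (MSR 0x48) */

/* Hypothetical stand-ins for kernel state, for illustration only. */
unsigned int ibrs_enabled;      /* sysctl value: 0, 1 or 2 */
uint64_t spec_ctrl_msr;         /* models one CPU's SPEC_CTRL MSR */

/* Put the model into a known state. */
void model_reset(unsigned int mode, uint64_t msr)
{
    assert(mode <= 2);
    ibrs_enabled = mode;
    spec_ctrl_msr = msr;
}

/* Models ENABLE_IBRS: "testl $1, ibrs_enabled; jz 10f" skips the
 * MSR write whenever bit 0 is clear, i.e. for modes 0 and 2. */
void model_kernel_entry(void)
{
    if (ibrs_enabled & 1)
        spec_ctrl_msr |= SPEC_CTRL_IBRS;
}

/* Models DISABLE_IBRS: same bit test, clears IBRS for mode 1 only. */
void model_kernel_exit(void)
{
    if (ibrs_enabled & 1)
        spec_ctrl_msr &= ~SPEC_CTRL_IBRS;
}
```

In this model only mode 1 toggles the bit on each user/kernel
transition; for 0 and 2 both macros leave the MSR as they found it.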

Finally, the reasoning is that the CPU is in kernel mode before the
VMENTRY, so with "ibrs_enabled == 1" the IBRS bit should already have
been set on the user-to-kernel transition described above, and the
SPEC_CTRL IBRS bit should be one. Let's assume the guest clears the
IBRS bit: after the VMEXIT we should set the SPEC_CTRL IBRS bit back to
one to re-enable it in kernel mode. Alternatively, we could save the
IBRS bit status before the VMENTRY when "ibrs_enabled == 1", but I
think that's redundant. Or can you think of any case where, in kernel
mode before VMENTRY, the SPEC_CTRL IBRS bit could be zero?
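
The argument can be checked with a small user-space model of the
patched hunk. This is a sketch under assumptions, not the kernel code:
model_virt_spec_ctrl() and its parameters are hypothetical, base and
mask stand in for x86_spec_ctrl_base and x86_spec_ctrl_mask, SSBD
handling is left out, and the final "write the MSR only if hostval and
guestval differ" step is assumed from the upstream shape of
x86_virt_spec_ctrl() rather than shown in the diff above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SPEC_CTRL_IBRS 0x1ULL   /* bit 0 of IA32_SPEC_CTRL (MSR 0x48) */

/* Returns the modeled SPEC_CTRL value after a VMENTER (setguest == true)
 * or a VMEXIT (setguest == false). current_msr is what the MSR holds
 * when the function is called. */
uint64_t model_virt_spec_ctrl(uint64_t base, uint64_t mask,
                              uint64_t guest_spec_ctrl,
                              unsigned int ibrs_enabled,
                              bool setguest, uint64_t current_msr)
{
    uint64_t hostval = base;
    uint64_t guestval;

    assert(ibrs_enabled <= 2);

    /* Same order as the patched hunk: guestval is derived first. */
    guestval = hostval & ~mask;
    guestval |= guest_spec_ctrl & mask;

    /* The added check: the host runs with IBRS set for modes 1 and 2. */
    if (ibrs_enabled)
        hostval |= SPEC_CTRL_IBRS;

    /* Assumed tail: only touch the MSR when the two values differ. */
    if (hostval != guestval)
        return setguest ? guestval : hostval;
    return current_msr;
}
```

With base 0, mask SPEC_CTRL_IBRS, ibrs_enabled == 1 and a guest that
cleared IBRS, the model returns SPEC_CTRL_IBRS on VMEXIT, which is the
re-enable behaviour described above.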


>
> ...Juerg
>
>
> > +
> >               /* SSBD controlled in MSR_SPEC_CTRL */
> >               if (static_cpu_has(X86_FEATURE_SPEC_CTRL_SSBD))
> >                       hostval |= ssbd_tif_to_spec_ctrl(ti->flags);
>
