[Xenial][SRU][CVE-2018-20784][PATCHv2 0/6] Fix for fair scheduler

[Xenial][SRU][CVE-2018-20784][PATCHv2 0/6] Fix for fair scheduler

Connor Kuehl
https://people.canonical.com/~ubuntu-security/cve/2018/CVE-2018-20784.html

From the link above:

        "In the Linux kernel before 4.20.2, kernel/sched/fair.c mishandles leaf
        cfs_rq's, which allows attackers to cause a denial of service (infinite
        loop in update_blocked_averages) or possibly have unspecified other impact
        by inducing a high load."

In the first submission for this backport, I had only backported c40f7d74c741
("sched/fair: Fix infinite loop in update_blocked_averages() by reverting a9e7f6544b9c")
which ends up reverting a fix that was SRU'd previously.

Here is that previous SRU: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1747896

This submission contains the fix for this CVE as well as the patches
necessary to preserve the fix from that previous SRU (these were found
by Tyler Hicks. Thank you, Tyler!)

For posterity, my testing was as follows (taken from the bug
description in the LP report above):

I booted Xenial in a 64-bit VM twice: the first time without this CVE
backport applied, the second time with it applied. I ran the reproducer
in both cases and saw the same CPU utilization (both cores I allocated
to my VM were at 100%) and stable memory pressure; in both cases memory
usage hovered around 120MB ± 3-5MB.

With this submission, I also observed the same results while watching
the cfs_rqs. With and without this backport applied, the number of
cfs_rqs fluctuated between 13 and 18. This is an improvement over the
previous submission of this CVE backport, whose results differed when
watching the cfs_rqs (namely, they'd remain at 61 for the duration of
the tests).

Linus Torvalds (1):
  sched/fair: Fix infinite loop in update_blocked_averages() by
    reverting a9e7f6544b9c

Peter Zijlstra (1):
  sched/fair: Add tmp_alone_branch assertion

Vincent Guittot (4):
  sched/fair: Fix hierarchical order in rq->leaf_cfs_rq_list
  sched/fair: Fix insertion in rq->leaf_cfs_rq_list
  sched/fair: Optimize update_blocked_averages()
  sched/fair: Fix O(nr_cgroups) in the load balancing path

 kernel/sched/core.c  |   1 +
 kernel/sched/fair.c  | 140 ++++++++++++++++++++++++++++++++++++-------
 kernel/sched/sched.h |   1 +
 3 files changed, 119 insertions(+), 23 deletions(-)

--
2.17.1


[Xenial][SRU][CVE-2018-20784][PATCH 1/6] sched/fair: Fix infinite loop in update_blocked_averages() by reverting a9e7f6544b9c

Connor Kuehl
From: Linus Torvalds <[hidden email]>

CVE-2018-20784

Zhipeng Xie, Xie XiuQi and Sargun Dhillon reported lockups in the
scheduler under high loads, starting at around the v4.18 time frame,
and Zhipeng Xie tracked it down to bugs in the rq->leaf_cfs_rq_list
manipulation.

Do a (manual) revert of:

  a9e7f6544b9c ("sched/fair: Fix O(nr_cgroups) in load balance path")

It turns out that the list_del_leaf_cfs_rq() introduced by this commit
has a surprising property that was not considered in followup commits
such as:

  9c2791f936ef ("sched/fair: Fix hierarchical order in rq->leaf_cfs_rq_list")

As Vincent Guittot explains:

 "I think that there is a bigger problem with commit a9e7f6544b9c and
  cfs_rq throttling:

  Let's take the example of the following topology TG2 --> TG1 --> root:

   1) The 1st time a task is enqueued, we will add TG2 cfs_rq then TG1
      cfs_rq to leaf_cfs_rq_list and we are sure to do the whole branch in
      one path because it has never been used and can't be throttled so
      tmp_alone_branch will point to leaf_cfs_rq_list at the end.

   2) Then TG1 is throttled

   3) and we add TG3 as a new child of TG1.

   4) The 1st enqueue of a task on TG3 will add TG3 cfs_rq just before TG1
      cfs_rq and tmp_alone_branch will stay  on rq->leaf_cfs_rq_list.

  With commit a9e7f6544b9c, we can del a cfs_rq from rq->leaf_cfs_rq_list.
  So if the load of TG1 cfs_rq becomes NULL before step 2) above, TG1
  cfs_rq is removed from the list.
  Then at step 4), TG3 cfs_rq is added at the beginning of rq->leaf_cfs_rq_list
  but tmp_alone_branch still points to TG3 cfs_rq because its throttled
  parent can't be enqueued when the lock is released.
  tmp_alone_branch doesn't point to rq->leaf_cfs_rq_list whereas it should.

  So if TG3 cfs_rq is removed or destroyed before tmp_alone_branch
  points to another TG cfs_rq, the next TG cfs_rq that is added
  will be linked outside rq->leaf_cfs_rq_list - which is bad.

  In addition, we can break the ordering of the cfs_rq in
  rq->leaf_cfs_rq_list but this ordering is used to update and
  propagate the update from leaf down to root."

Instead of trying to work through all these cases and trying to reproduce
the very high loads that produced the lockup to begin with, simplify
the code temporarily by reverting a9e7f6544b9c - which change was clearly
not thought through completely.

This (hopefully) gives us a kernel that doesn't lock up so people
can continue to enjoy their holidays without worrying about regressions. ;-)

[ mingo: Wrote changelog, fixed weird spelling in code comment while at it. ]

Analyzed-by: Xie XiuQi <[hidden email]>
Analyzed-by: Vincent Guittot <[hidden email]>
Reported-by: Zhipeng Xie <[hidden email]>
Reported-by: Sargun Dhillon <[hidden email]>
Reported-by: Xie XiuQi <[hidden email]>
Tested-by: Zhipeng Xie <[hidden email]>
Tested-by: Sargun Dhillon <[hidden email]>
Signed-off-by: Linus Torvalds <[hidden email]>
Acked-by: Vincent Guittot <[hidden email]>
Cc: <[hidden email]> # v4.13+
Cc: Bin Li <[hidden email]>
Cc: Mike Galbraith <[hidden email]>
Cc: Peter Zijlstra <[hidden email]>
Cc: Tejun Heo <[hidden email]>
Cc: Thomas Gleixner <[hidden email]>
Fixes: a9e7f6544b9c ("sched/fair: Fix O(nr_cgroups) in load balance path")
Link: http://lkml.kernel.org/r/1545879866-27809-1-git-send-email-xiexiuqi@...
Signed-off-by: Ingo Molnar <[hidden email]>
(backported from commit c40f7d74c741a907cfaeb73a7697081881c497d0)
[ Connor Kuehl: context adjustments were required to remove
  'cfs_rq_is_decayed()' and to merge the changes to
  'update_blocked_averages()'. ]
Signed-off-by: Connor Kuehl <[hidden email]>
---
 kernel/sched/fair.c | 48 ++++++++++++---------------------------------
 1 file changed, 12 insertions(+), 36 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 7ad5fa326de1..cfaf7e3458db 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -313,10 +313,9 @@ static inline void list_del_leaf_cfs_rq(struct cfs_rq *cfs_rq)
  }
 }
 
-/* Iterate thr' all leaf cfs_rq's on a runqueue */
-#define for_each_leaf_cfs_rq_safe(rq, cfs_rq, pos) \
- list_for_each_entry_safe(cfs_rq, pos, &rq->leaf_cfs_rq_list, \
- leaf_cfs_rq_list)
+/* Iterate through all leaf cfs_rq's on a runqueue: */
+#define for_each_leaf_cfs_rq(rq, cfs_rq) \
+ list_for_each_entry_rcu(cfs_rq, &rq->leaf_cfs_rq_list, leaf_cfs_rq_list)
 
 /* Do the two (enqueued) entities belong to the same group ? */
 static inline struct cfs_rq *
@@ -409,8 +408,8 @@ static inline void list_del_leaf_cfs_rq(struct cfs_rq *cfs_rq)
 {
 }
 
-#define for_each_leaf_cfs_rq_safe(rq, cfs_rq, pos) \
- for (cfs_rq = &rq->cfs, pos = NULL; cfs_rq; cfs_rq = pos)
+#define for_each_leaf_cfs_rq(rq, cfs_rq) \
+ for (cfs_rq = &rq->cfs; cfs_rq; cfs_rq = NULL)
 
 static inline struct sched_entity *parent_entity(struct sched_entity *se)
 {
@@ -4128,9 +4127,9 @@ static void destroy_cfs_bandwidth(struct cfs_bandwidth *cfs_b)
 
 static void __maybe_unused update_runtime_enabled(struct rq *rq)
 {
- struct cfs_rq *cfs_rq, *pos;
+ struct cfs_rq *cfs_rq;
 
- for_each_leaf_cfs_rq_safe(rq, cfs_rq, pos) {
+ for_each_leaf_cfs_rq(rq, cfs_rq) {
  struct cfs_bandwidth *cfs_b = &cfs_rq->tg->cfs_bandwidth;
 
  raw_spin_lock(&cfs_b->lock);
@@ -4143,7 +4142,7 @@ static void __maybe_unused unthrottle_offline_cfs_rqs(struct rq *rq)
 {
  struct cfs_rq *cfs_rq, *pos;
 
- for_each_leaf_cfs_rq_safe(rq, cfs_rq, pos) {
+ for_each_leaf_cfs_rq(rq, cfs_rq) {
  if (!cfs_rq->runtime_enabled)
  continue;
 
@@ -6034,27 +6033,10 @@ static void attach_tasks(struct lb_env *env)
 
 #ifdef CONFIG_FAIR_GROUP_SCHED
 
-static inline bool cfs_rq_is_decayed(struct cfs_rq *cfs_rq)
-{
- if (cfs_rq->load.weight)
- return false;
-
- if (cfs_rq->avg.load_sum)
- return false;
-
- if (cfs_rq->avg.util_sum)
- return false;
-
- if (cfs_rq->runnable_load_sum)
- return false;
-
- return true;
-}
-
 static void update_blocked_averages(int cpu)
 {
  struct rq *rq = cpu_rq(cpu);
- struct cfs_rq *cfs_rq, *pos;
+ struct cfs_rq *cfs_rq;
  unsigned long flags;
 
  raw_spin_lock_irqsave(&rq->lock, flags);
@@ -6064,7 +6046,7 @@ static void update_blocked_averages(int cpu)
  * Iterates the task_group tree in a bottom up fashion, see
  * list_add_leaf_cfs_rq() for details.
  */
- for_each_leaf_cfs_rq_safe(rq, cfs_rq, pos) {
+ for_each_leaf_cfs_rq(rq, cfs_rq) {
 
  /* throttled entities do not contribute to load */
  if (throttled_hierarchy(cfs_rq))
@@ -6073,12 +6055,6 @@ static void update_blocked_averages(int cpu)
  if (update_cfs_rq_load_avg(cfs_rq_clock_task(cfs_rq), cfs_rq))
  update_tg_load_avg(cfs_rq, 0);
 
- /*
- * There can be a lot of idle CPU cgroups.  Don't let fully
- * decayed cfs_rqs linger on the list.
- */
- if (cfs_rq_is_decayed(cfs_rq))
- list_del_leaf_cfs_rq(cfs_rq);
  }
  raw_spin_unlock_irqrestore(&rq->lock, flags);
 }
@@ -8484,10 +8460,10 @@ const struct sched_class fair_sched_class = {
 #ifdef CONFIG_SCHED_DEBUG
 void print_cfs_stats(struct seq_file *m, int cpu)
 {
- struct cfs_rq *cfs_rq, *pos;
+ struct cfs_rq *cfs_rq;
 
  rcu_read_lock();
- for_each_leaf_cfs_rq_safe(cpu_rq(cpu), cfs_rq, pos)
+ for_each_leaf_cfs_rq(cpu_rq(cpu), cfs_rq)
  print_cfs_rq(m, cpu, cfs_rq);
  rcu_read_unlock();
 }
--
2.17.1


[Xenial][SRU][CVE-2018-20784][PATCH 2/6] sched/fair: Fix hierarchical order in rq->leaf_cfs_rq_list

Connor Kuehl
From: Vincent Guittot <[hidden email]>

CVE-2018-20784

Fix the insertion of cfs_rq in rq->leaf_cfs_rq_list to ensure that a
child will always be called before its parent.

The hierarchical order in shares update list has been introduced by
commit:

  67e86250f8ea ("sched: Introduce hierarchal order on shares update list")

With the current implementation a child can still be put after its
parent.

Let's take the example of:

       root
        \
         b
         /\
         c d*
           |
           e*

with root -> b -> c already enqueued but not d -> e so the
leaf_cfs_rq_list looks like: head -> c -> b -> root -> tail

The branch d -> e will be added the first time that they are enqueued,
starting with e then d.

When e is added, its parent is not already on the list, so e is put at
the tail: head -> c -> b -> root -> e -> tail

Then, d is added at the head because its parent is already on the
list: head -> d -> c -> b -> root -> e -> tail

e is not placed at the right position and will be called last,
whereas it should be called at the beginning.

Because it follows the bottom-up enqueue sequence, we are sure that we
will finish by adding either a cfs_rq without a parent or a cfs_rq with a
parent that is already on the list. We can use this event to detect
when we have finished adding a new branch. For the others, whose
parents are not already added, we have to ensure that they will be
added after their children that have just been inserted in the steps
before, and after any potential parents that are already in the list.
The easiest way is to put the cfs_rq just after the last inserted one
and to keep track of it until the branch is fully added.

Signed-off-by: Vincent Guittot <[hidden email]>
Signed-off-by: Peter Zijlstra (Intel) <[hidden email]>
Acked-by: Dietmar Eggemann <[hidden email]>
Cc: Linus Torvalds <[hidden email]>
Cc: [hidden email]
Cc: Peter Zijlstra <[hidden email]>
Cc: Thomas Gleixner <[hidden email]>
Cc: [hidden email]
Cc: [hidden email]
Cc: [hidden email]
Cc: [hidden email]
Link: http://lkml.kernel.org/r/1478598827-32372-3-git-send-email-vincent.guittot@...
Signed-off-by: Ingo Molnar <[hidden email]>
(backported from commit 9c2791f936ef5fd04a118b5c284f2c9a95f4a647)
[ Connor Kuehl: offset adjustments ]
Signed-off-by: Connor Kuehl <[hidden email]>
---
 kernel/sched/core.c  |  1 +
 kernel/sched/fair.c  | 54 ++++++++++++++++++++++++++++++++++++++------
 kernel/sched/sched.h |  1 +
 3 files changed, 49 insertions(+), 7 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 5197e9cac5db..e92bf2a0465f 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -7523,6 +7523,7 @@ void __init sched_init(void)
 #ifdef CONFIG_FAIR_GROUP_SCHED
  root_task_group.shares = ROOT_TASK_GROUP_LOAD;
  INIT_LIST_HEAD(&rq->leaf_cfs_rq_list);
+ rq->tmp_alone_branch = &rq->leaf_cfs_rq_list;
  /*
  * How much cpu bandwidth does root_task_group get?
  *
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index cfaf7e3458db..97e45a9352ec 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -286,19 +286,59 @@ static inline struct cfs_rq *group_cfs_rq(struct sched_entity *grp)
 static inline void list_add_leaf_cfs_rq(struct cfs_rq *cfs_rq)
 {
  if (!cfs_rq->on_list) {
+ struct rq *rq = rq_of(cfs_rq);
+ int cpu = cpu_of(rq);
  /*
  * Ensure we either appear before our parent (if already
  * enqueued) or force our parent to appear after us when it is
- * enqueued.  The fact that we always enqueue bottom-up
- * reduces this to two cases.
+ * enqueued. The fact that we always enqueue bottom-up
+ * reduces this to two cases and a special case for the root
+ * cfs_rq. Furthermore, it also means that we will always reset
+ * tmp_alone_branch either when the branch is connected
+ * to a tree or when we reach the beg of the tree
  */
  if (cfs_rq->tg->parent &&
-    cfs_rq->tg->parent->cfs_rq[cpu_of(rq_of(cfs_rq))]->on_list) {
- list_add_rcu(&cfs_rq->leaf_cfs_rq_list,
- &rq_of(cfs_rq)->leaf_cfs_rq_list);
- } else {
+    cfs_rq->tg->parent->cfs_rq[cpu]->on_list) {
+ /*
+ * If parent is already on the list, we add the child
+ * just before. Thanks to circular linked property of
+ * the list, this means to put the child at the tail
+ * of the list that starts by parent.
+ */
+ list_add_tail_rcu(&cfs_rq->leaf_cfs_rq_list,
+ &(cfs_rq->tg->parent->cfs_rq[cpu]->leaf_cfs_rq_list));
+ /*
+ * The branch is now connected to its tree so we can
+ * reset tmp_alone_branch to the beginning of the
+ * list.
+ */
+ rq->tmp_alone_branch = &rq->leaf_cfs_rq_list;
+ } else if (!cfs_rq->tg->parent) {
+ /*
+ * cfs rq without parent should be put
+ * at the tail of the list.
+ */
  list_add_tail_rcu(&cfs_rq->leaf_cfs_rq_list,
- &rq_of(cfs_rq)->leaf_cfs_rq_list);
+ &rq->leaf_cfs_rq_list);
+ /*
+ * We have reach the beg of a tree so we can reset
+ * tmp_alone_branch to the beginning of the list.
+ */
+ rq->tmp_alone_branch = &rq->leaf_cfs_rq_list;
+ } else {
+ /*
+ * The parent has not already been added so we want to
+ * make sure that it will be put after us.
+ * tmp_alone_branch points to the beg of the branch
+ * where we will add parent.
+ */
+ list_add_rcu(&cfs_rq->leaf_cfs_rq_list,
+ rq->tmp_alone_branch);
+ /*
+ * update tmp_alone_branch to points to the new beg
+ * of the branch
+ */
+ rq->tmp_alone_branch = &cfs_rq->leaf_cfs_rq_list;
  }
 
  cfs_rq->on_list = 1;
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index dcbf0f4806fc..d76fd0a84725 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -606,6 +606,7 @@ struct rq {
 #ifdef CONFIG_FAIR_GROUP_SCHED
  /* list of leaf cfs_rq on this cpu: */
  struct list_head leaf_cfs_rq_list;
+ struct list_head *tmp_alone_branch;
 #endif /* CONFIG_FAIR_GROUP_SCHED */
 
  /*
--
2.17.1
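
For readers less familiar with the kernel's list primitives: the "add the
child just before the parent" trick in the hunk above relies only on the
list being circular and doubly linked, so a tail-wise insert at the
parent's entry lands the new node immediately before it. Below is a
minimal userspace sketch of that property (an illustration only, not
kernel code):

/*
 * Toy circular doubly-linked list mirroring the shape of the kernel's
 * struct list_head, showing why inserting "tail-wise" at the parent's
 * node places the child right before the parent.
 */
#include <stdio.h>

struct node {
	const char *name;
	struct node *prev, *next;
};

/* Insert 'new' just before 'at', i.e. what list_add_tail(new, at) does. */
static void add_tail(struct node *new, struct node *at)
{
	new->prev = at->prev;
	new->next = at;
	at->prev->next = new;
	at->prev = new;
}

int main(void)
{
	struct node head = { "head", &head, &head };
	struct node parent = { "parent", NULL, NULL };
	struct node child = { "child", NULL, NULL };
	struct node *p;

	add_tail(&parent, &head);   /* list: head -> parent */
	add_tail(&child, &parent);  /* child ends up just before parent */

	for (p = head.next; p != &head; p = p->next)
		printf("%s\n", p->name); /* prints: child, then parent */

	return 0;
}

This is the property list_add_tail_rcu() exploits above; the _rcu variant
only adds the publication ordering needed for lockless readers traversing
the list.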


[Xenial][SRU][CVE-2018-20784][PATCH 3/6] sched/fair: Add tmp_alone_branch assertion

Connor Kuehl
From: Peter Zijlstra <[hidden email]>

CVE-2018-20784

The magic in list_add_leaf_cfs_rq() requires that at the end of
enqueue_task_fair():

  rq->tmp_alone_branch == &rq->leaf_cfs_rq_list

If this is violated, list integrity is compromised for list entries
and the tmp_alone_branch pointer might dangle.

Also, reflow list_add_leaf_cfs_rq() while there. This loses one
indentation level and generates a form that's convenient for the next
patch.

Signed-off-by: Peter Zijlstra (Intel) <[hidden email]>
Cc: Linus Torvalds <[hidden email]>
Cc: Mike Galbraith <[hidden email]>
Cc: Peter Zijlstra <[hidden email]>
Cc: Thomas Gleixner <[hidden email]>
Signed-off-by: Ingo Molnar <[hidden email]>
(backported from commit 5d299eabea5a251fbf66e8277704b874bbba92dc)
[ Connor Kuehl: Instead of backporting the SCHED_WARN_ON macro, just use
  the macro it wraps around (WARN_ON_ONCE) for assert_list_leaf_cfs_rq. ]
Signed-off-by: Connor Kuehl <[hidden email]>
---
 kernel/sched/fair.c | 126 +++++++++++++++++++++++++-------------------
 1 file changed, 71 insertions(+), 55 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 97e45a9352ec..2c400c1eb282 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -285,64 +285,69 @@ static inline struct cfs_rq *group_cfs_rq(struct sched_entity *grp)
 
 static inline void list_add_leaf_cfs_rq(struct cfs_rq *cfs_rq)
 {
- if (!cfs_rq->on_list) {
- struct rq *rq = rq_of(cfs_rq);
- int cpu = cpu_of(rq);
+ struct rq *rq = rq_of(cfs_rq);
+ int cpu = cpu_of(rq);
+
+ if (cfs_rq->on_list)
+ return;
+
+ cfs_rq->on_list = 1;
+
+ /*
+ * Ensure we either appear before our parent (if already
+ * enqueued) or force our parent to appear after us when it is
+ * enqueued. The fact that we always enqueue bottom-up
+ * reduces this to two cases and a special case for the root
+ * cfs_rq. Furthermore, it also means that we will always reset
+ * tmp_alone_branch either when the branch is connected
+ * to a tree or when we reach the top of the tree
+ */
+ if (cfs_rq->tg->parent &&
+    cfs_rq->tg->parent->cfs_rq[cpu]->on_list) {
  /*
- * Ensure we either appear before our parent (if already
- * enqueued) or force our parent to appear after us when it is
- * enqueued. The fact that we always enqueue bottom-up
- * reduces this to two cases and a special case for the root
- * cfs_rq. Furthermore, it also means that we will always reset
- * tmp_alone_branch either when the branch is connected
- * to a tree or when we reach the beg of the tree
+ * If parent is already on the list, we add the child
+ * just before. Thanks to circular linked property of
+ * the list, this means to put the child at the tail
+ * of the list that starts by parent.
  */
- if (cfs_rq->tg->parent &&
-    cfs_rq->tg->parent->cfs_rq[cpu]->on_list) {
- /*
- * If parent is already on the list, we add the child
- * just before. Thanks to circular linked property of
- * the list, this means to put the child at the tail
- * of the list that starts by parent.
- */
- list_add_tail_rcu(&cfs_rq->leaf_cfs_rq_list,
- &(cfs_rq->tg->parent->cfs_rq[cpu]->leaf_cfs_rq_list));
- /*
- * The branch is now connected to its tree so we can
- * reset tmp_alone_branch to the beginning of the
- * list.
- */
- rq->tmp_alone_branch = &rq->leaf_cfs_rq_list;
- } else if (!cfs_rq->tg->parent) {
- /*
- * cfs rq without parent should be put
- * at the tail of the list.
- */
- list_add_tail_rcu(&cfs_rq->leaf_cfs_rq_list,
- &rq->leaf_cfs_rq_list);
- /*
- * We have reach the beg of a tree so we can reset
- * tmp_alone_branch to the beginning of the list.
- */
- rq->tmp_alone_branch = &rq->leaf_cfs_rq_list;
- } else {
- /*
- * The parent has not already been added so we want to
- * make sure that it will be put after us.
- * tmp_alone_branch points to the beg of the branch
- * where we will add parent.
- */
- list_add_rcu(&cfs_rq->leaf_cfs_rq_list,
- rq->tmp_alone_branch);
- /*
- * update tmp_alone_branch to points to the new beg
- * of the branch
- */
- rq->tmp_alone_branch = &cfs_rq->leaf_cfs_rq_list;
- }
+ list_add_tail_rcu(&cfs_rq->leaf_cfs_rq_list,
+ &(cfs_rq->tg->parent->cfs_rq[cpu]->leaf_cfs_rq_list));
+ /*
+ * The branch is now connected to its tree so we can
+ * reset tmp_alone_branch to the beginning of the
+ * list.
+ */
+ rq->tmp_alone_branch = &rq->leaf_cfs_rq_list;
+ return;
+ }
 
- cfs_rq->on_list = 1;
+ if (!cfs_rq->tg->parent) {
+ /*
+ * cfs rq without parent should be put
+ * at the tail of the list.
+ */
+ list_add_tail_rcu(&cfs_rq->leaf_cfs_rq_list,
+ &rq->leaf_cfs_rq_list);
+ /*
+ * We have reach the top of a tree so we can reset
+ * tmp_alone_branch to the beginning of the list.
+ */
+ rq->tmp_alone_branch = &rq->leaf_cfs_rq_list;
+ return;
  }
+
+ /*
+ * The parent has not already been added so we want to
+ * make sure that it will be put after us.
+ * tmp_alone_branch points to the begin of the branch
+ * where we will add parent.
+ */
+ list_add_rcu(&cfs_rq->leaf_cfs_rq_list, rq->tmp_alone_branch);
+ /*
+ * update tmp_alone_branch to points to the new begin
+ * of the branch
+ */
+ rq->tmp_alone_branch = &cfs_rq->leaf_cfs_rq_list;
 }
 
 static inline void list_del_leaf_cfs_rq(struct cfs_rq *cfs_rq)
@@ -353,7 +358,12 @@ static inline void list_del_leaf_cfs_rq(struct cfs_rq *cfs_rq)
  }
 }
 
-/* Iterate through all leaf cfs_rq's on a runqueue: */
+static inline void assert_list_leaf_cfs_rq(struct rq *rq)
+{
+ WARN_ON_ONCE((rq->tmp_alone_branch != &rq->leaf_cfs_rq_list));
+}
+
+/* Iterate through all cfs_rq's on a runqueue in bottom-up order */
 #define for_each_leaf_cfs_rq(rq, cfs_rq) \
  list_for_each_entry_rcu(cfs_rq, &rq->leaf_cfs_rq_list, leaf_cfs_rq_list)
 
@@ -448,6 +458,10 @@ static inline void list_del_leaf_cfs_rq(struct cfs_rq *cfs_rq)
 {
 }
 
+static inline void assert_list_leaf_cfs_rq(struct rq *rq)
+{
+}
+
 #define for_each_leaf_cfs_rq(rq, cfs_rq) \
  for (cfs_rq = &rq->cfs; cfs_rq; cfs_rq = NULL)
 
@@ -4400,6 +4414,8 @@ static void dequeue_task_fair(struct rq *rq, struct task_struct *p, int flags)
  if (!se)
  sub_nr_running(rq, 1);
 
+ assert_list_leaf_cfs_rq(rq);
+
  hrtick_update(rq);
 }
 
--
2.17.1
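
Regarding the backport note above: upstream's SCHED_WARN_ON() is only a
thin wrapper, roughly the following (reproduced from memory of upstream
kernel/sched/sched.h, so the !CONFIG_SCHED_DEBUG stub in particular may
differ), which is why using WARN_ON_ONCE() directly in
assert_list_leaf_cfs_rq() preserves the same warn-once behaviour:

#ifdef CONFIG_SCHED_DEBUG
# define SCHED_WARN_ON(x)	WARN_ONCE(x, #x)
#else
# define SCHED_WARN_ON(x)	({ (void)(x), 0; })
#endif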


[Xenial][SRU][CVE-2018-20784][PATCH 4/6] sched/fair: Fix insertion in rq->leaf_cfs_rq_list

Connor Kuehl
From: Vincent Guittot <[hidden email]>

CVE-2018-20784

Sargun reported a crash:

  "I picked up c40f7d74c741a907cfaeb73a7697081881c497d0 sched/fair: Fix
   infinite loop in update_blocked_averages() by reverting a9e7f6544b9c
   and put it on top of 4.19.13. In addition to this, I uninlined
   list_add_leaf_cfs_rq for debugging.

   This revealed a new bug that we didn't get to because we kept getting
   crashes from the previous issue. When we are running with cgroups that
   are rapidly changing, with CFS bandwidth control, and in addition
   using the cpusets cgroup, we see this crash. Specifically, it seems to
   occur with cgroups that are throttled and we change the allowed
   cpuset."

The algorithm used to order cfs_rq in rq->leaf_cfs_rq_list assumes that
it will walk down to root the 1st time a cfs_rq is used and we will finish
by adding either a cfs_rq without a parent or a cfs_rq with a parent that is
already on the list. But this is not always true in the presence of throttling.
Because a cfs_rq can be throttled even if it has never been used but other CPUs
of the cgroup have already used all the bandwidth, we are not sure to go down to
the root and add all cfs_rq in the list.

Ensure that all cfs_rq will be added in the list even if they are throttled.

[ mingo: Fix !CGROUPS build. ]

Reported-by: Sargun Dhillon <[hidden email]>
Signed-off-by: Vincent Guittot <[hidden email]>
Signed-off-by: Peter Zijlstra (Intel) <[hidden email]>
Cc: Linus Torvalds <[hidden email]>
Cc: Mike Galbraith <[hidden email]>
Cc: Peter Zijlstra <[hidden email]>
Cc: Thomas Gleixner <[hidden email]>
Cc: [hidden email]
Fixes: 9c2791f936ef ("sched/fair: Fix hierarchical order in rq->leaf_cfs_rq_list")
Link: https://lkml.kernel.org/r/1548825767-10799-1-git-send-email-vincent.guittot@...
Signed-off-by: Ingo Molnar <[hidden email]>
(backported from commit f6783319737f28e4436a69611853a5a098cbe974)
[ Connor Kuehl: the hunk for 'dequeue_task_fair' required manual
  placement due to context adjustments. ]
Signed-off-by: Connor Kuehl <[hidden email]>
---
 kernel/sched/fair.c | 33 ++++++++++++++++++++++++++++-----
 1 file changed, 28 insertions(+), 5 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 2c400c1eb282..d8c0966385de 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -283,13 +283,13 @@ static inline struct cfs_rq *group_cfs_rq(struct sched_entity *grp)
  return grp->my_q;
 }
 
-static inline void list_add_leaf_cfs_rq(struct cfs_rq *cfs_rq)
+static inline bool list_add_leaf_cfs_rq(struct cfs_rq *cfs_rq)
 {
  struct rq *rq = rq_of(cfs_rq);
  int cpu = cpu_of(rq);
 
  if (cfs_rq->on_list)
- return;
+ return rq->tmp_alone_branch == &rq->leaf_cfs_rq_list;
 
  cfs_rq->on_list = 1;
 
@@ -318,7 +318,7 @@ static inline void list_add_leaf_cfs_rq(struct cfs_rq *cfs_rq)
  * list.
  */
  rq->tmp_alone_branch = &rq->leaf_cfs_rq_list;
- return;
+ return true;
  }
 
  if (!cfs_rq->tg->parent) {
@@ -333,7 +333,7 @@ static inline void list_add_leaf_cfs_rq(struct cfs_rq *cfs_rq)
  * tmp_alone_branch to the beginning of the list.
  */
  rq->tmp_alone_branch = &rq->leaf_cfs_rq_list;
- return;
+ return true;
  }
 
  /*
@@ -348,6 +348,7 @@ static inline void list_add_leaf_cfs_rq(struct cfs_rq *cfs_rq)
  * of the branch
  */
  rq->tmp_alone_branch = &cfs_rq->leaf_cfs_rq_list;
+ return false;
 }
 
 static inline void list_del_leaf_cfs_rq(struct cfs_rq *cfs_rq)
@@ -450,8 +451,9 @@ static inline struct cfs_rq *group_cfs_rq(struct sched_entity *grp)
  return NULL;
 }
 
-static inline void list_add_leaf_cfs_rq(struct cfs_rq *cfs_rq)
+static inline bool list_add_leaf_cfs_rq(struct cfs_rq *cfs_rq)
 {
+ return true;
 }
 
 static inline void list_del_leaf_cfs_rq(struct cfs_rq *cfs_rq)
@@ -4217,6 +4219,12 @@ static void __maybe_unused unthrottle_offline_cfs_rqs(struct rq *rq)
 }
 
 #else /* CONFIG_CFS_BANDWIDTH */
+
+static inline bool cfs_bandwidth_used(void)
+{
+ return false;
+}
+
 static inline u64 cfs_rq_clock_task(struct cfs_rq *cfs_rq)
 {
  return rq_clock_task(rq_of(cfs_rq));
@@ -4414,6 +4422,21 @@ static void dequeue_task_fair(struct rq *rq, struct task_struct *p, int flags)
  if (!se)
  sub_nr_running(rq, 1);
 
+ if (cfs_bandwidth_used()) {
+ /*
+ * When bandwidth control is enabled; the cfs_rq_throttled()
+ * breaks in the above iteration can result in incomplete
+ * leaf list maintenance, resulting in triggering the assertion
+ * below.
+ */
+ for_each_sched_entity(se) {
+ cfs_rq = cfs_rq_of(se);
+
+ if (list_add_leaf_cfs_rq(cfs_rq))
+ break;
+ }
+ }
+
  assert_list_leaf_cfs_rq(rq);
 
  hrtick_update(rq);
--
2.17.1


[Xenial][SRU][CVE-2018-20784][PATCH 5/6] sched/fair: Optimize update_blocked_averages()

Connor Kuehl
From: Vincent Guittot <[hidden email]>

CVE-2018-20784

Removing a cfs_rq from rq->leaf_cfs_rq_list can break the parent/child
ordering of the list when it is added back. In order to remove an
empty and fully decayed cfs_rq, we must remove its children too, so they
will be added back in the right order next time.

With a normal decay of PELT, a parent will be empty and fully decayed
if all children are empty and fully decayed too. In such a case, we just
have to ensure that the whole branch will be added when a new task is
enqueued. This is the default behavior since:

  commit f6783319737f ("sched/fair: Fix insertion in rq->leaf_cfs_rq_list")

In case of throttling, the PELT of a throttled cfs_rq will not be updated
whereas the parent's will. This breaks the assumption made above unless we
remove the children of a cfs_rq that is throttled. Then, they will be
added back when unthrottled and a sched_entity will be enqueued.

As throttled cfs_rqs are now removed from the list, we can remove the
associated test in update_blocked_averages().

Signed-off-by: Vincent Guittot <[hidden email]>
Signed-off-by: Peter Zijlstra (Intel) <[hidden email]>
Cc: Linus Torvalds <[hidden email]>
Cc: Peter Zijlstra <[hidden email]>
Cc: Thomas Gleixner <[hidden email]>
Cc: [hidden email]
Cc: [hidden email]
Cc: [hidden email]
Cc: [hidden email]
Link: https://lkml.kernel.org/r/1549469662-13614-2-git-send-email-vincent.guittot@...
Signed-off-by: Ingo Molnar <[hidden email]>
(backported from commit 31bc6aeaab1d1de8959b67edbed5c7a4b3cdbe7c)
[ Connor Kuehl: offset adjustments and the hunk in
  'update_blocked_averages' required manual placement. ]
Signed-off-by: Connor Kuehl <[hidden email]>
---
 kernel/sched/fair.c | 27 +++++++++++++++++++++------
 1 file changed, 21 insertions(+), 6 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index d8c0966385de..8bc9a5e24380 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -354,6 +354,18 @@ static inline bool list_add_leaf_cfs_rq(struct cfs_rq *cfs_rq)
 static inline void list_del_leaf_cfs_rq(struct cfs_rq *cfs_rq)
 {
  if (cfs_rq->on_list) {
+ struct rq *rq = rq_of(cfs_rq);
+
+ /*
+ * With cfs_rq being unthrottled/throttled during an enqueue,
+ * it can happen the tmp_alone_branch points the a leaf that
+ * we finally want to del. In this case, tmp_alone_branch moves
+ * to the prev element but it will point to rq->leaf_cfs_rq_list
+ * at the end of the enqueue.
+ */
+ if (rq->tmp_alone_branch == &cfs_rq->leaf_cfs_rq_list)
+ rq->tmp_alone_branch = cfs_rq->leaf_cfs_rq_list.prev;
+
  list_del_rcu(&cfs_rq->leaf_cfs_rq_list);
  cfs_rq->on_list = 0;
  }
@@ -3659,6 +3671,10 @@ static int tg_unthrottle_up(struct task_group *tg, void *data)
  /* adjust cfs_rq_clock_task() */
  cfs_rq->throttled_clock_task_time += rq_clock_task(rq) -
      cfs_rq->throttled_clock_task;
+
+ /* Add cfs_rq with already running entity in the list */
+ if (cfs_rq->nr_running >= 1)
+ list_add_leaf_cfs_rq(cfs_rq);
  }
 #endif
 
@@ -3671,8 +3687,10 @@ static int tg_throttle_down(struct task_group *tg, void *data)
  struct cfs_rq *cfs_rq = tg->cfs_rq[cpu_of(rq)];
 
  /* group is entering throttled state, stop time */
- if (!cfs_rq->throttle_count)
+ if (!cfs_rq->throttle_count) {
  cfs_rq->throttled_clock_task = rq_clock_task(rq);
+ list_del_leaf_cfs_rq(cfs_rq);
+ }
  cfs_rq->throttle_count++;
 
  return 0;
@@ -3775,6 +3793,8 @@ void unthrottle_cfs_rq(struct cfs_rq *cfs_rq)
  break;
  }
 
+ assert_list_leaf_cfs_rq(rq);
+
  if (!se)
  add_nr_running(rq, task_delta);
 
@@ -6126,11 +6146,6 @@ static void update_blocked_averages(int cpu)
  * list_add_leaf_cfs_rq() for details.
  */
  for_each_leaf_cfs_rq(rq, cfs_rq) {
-
- /* throttled entities do not contribute to load */
- if (throttled_hierarchy(cfs_rq))
- continue;
-
  if (update_cfs_rq_load_avg(cfs_rq_clock_task(cfs_rq), cfs_rq))
  update_tg_load_avg(cfs_rq, 0);
 
--
2.17.1


[Xenial][SRU][CVE-2018-20784][PATCH 6/6] sched/fair: Fix O(nr_cgroups) in the load balancing path

Connor Kuehl
From: Vincent Guittot <[hidden email]>

CVE-2018-20784

This re-applies the commit reverted here:

  commit c40f7d74c741 ("sched/fair: Fix infinite loop in update_blocked_averages() by reverting a9e7f6544b9c")

I.e. now that cfs_rq can be safely removed/added in the list, we can re-apply:

 commit a9e7f6544b9c ("sched/fair: Fix O(nr_cgroups) in load balance path")

Signed-off-by: Vincent Guittot <[hidden email]>
Signed-off-by: Peter Zijlstra (Intel) <[hidden email]>
Cc: Linus Torvalds <[hidden email]>
Cc: Peter Zijlstra <[hidden email]>
Cc: Thomas Gleixner <[hidden email]>
Cc: [hidden email]
Cc: [hidden email]
Cc: [hidden email]
Cc: [hidden email]
Link: https://lkml.kernel.org/r/1549469662-13614-3-git-send-email-vincent.guittot@...
Signed-off-by: Ingo Molnar <[hidden email]>
(backported from commit 039ae8bcf7a5f4476f4487e6bf816885fb3fb617)
[ Connor Kuehl: In 'cfs_rq_is_decayed' the data member
  'runnable_load_sum' belongs to struct cfs_rq and not sched_avg, so
  update that. Some instances of 'for_each_leaf_cfs_rq' required manual
  updating to the new 'for_each_leaf_cfs_rq_safe' and the last hunk for
  'update_blocked_averages' required manual placement. ]
Signed-off-by: Connor Kuehl <[hidden email]>
---
 kernel/sched/fair.c | 48 +++++++++++++++++++++++++++++++++------------
 1 file changed, 36 insertions(+), 12 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 8bc9a5e24380..d16eec4b8d76 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -376,9 +376,10 @@ static inline void assert_list_leaf_cfs_rq(struct rq *rq)
  WARN_ON_ONCE((rq->tmp_alone_branch != &rq->leaf_cfs_rq_list));
 }
 
-/* Iterate through all cfs_rq's on a runqueue in bottom-up order */
-#define for_each_leaf_cfs_rq(rq, cfs_rq) \
- list_for_each_entry_rcu(cfs_rq, &rq->leaf_cfs_rq_list, leaf_cfs_rq_list)
+/* Iterate thr' all leaf cfs_rq's on a runqueue */
+#define for_each_leaf_cfs_rq_safe(rq, cfs_rq, pos) \
+ list_for_each_entry_safe(cfs_rq, pos, &rq->leaf_cfs_rq_list, \
+ leaf_cfs_rq_list)
 
 /* Do the two (enqueued) entities belong to the same group ? */
 static inline struct cfs_rq *
@@ -476,8 +477,8 @@ static inline void assert_list_leaf_cfs_rq(struct rq *rq)
 {
 }
 
-#define for_each_leaf_cfs_rq(rq, cfs_rq) \
- for (cfs_rq = &rq->cfs; cfs_rq; cfs_rq = NULL)
+#define for_each_leaf_cfs_rq_safe(rq, cfs_rq, pos) \
+ for (cfs_rq = &rq->cfs, pos = NULL; cfs_rq; cfs_rq = pos)
 
 static inline struct sched_entity *parent_entity(struct sched_entity *se)
 {
@@ -4203,9 +4204,9 @@ static void destroy_cfs_bandwidth(struct cfs_bandwidth *cfs_b)
 
 static void __maybe_unused update_runtime_enabled(struct rq *rq)
 {
- struct cfs_rq *cfs_rq;
+ struct cfs_rq *cfs_rq, *pos;
 
- for_each_leaf_cfs_rq(rq, cfs_rq) {
+ for_each_leaf_cfs_rq_safe(rq, cfs_rq, pos) {
  struct cfs_bandwidth *cfs_b = &cfs_rq->tg->cfs_bandwidth;
 
  raw_spin_lock(&cfs_b->lock);
@@ -4218,7 +4219,7 @@ static void __maybe_unused unthrottle_offline_cfs_rqs(struct rq *rq)
 {
  struct cfs_rq *cfs_rq, *pos;
 
- for_each_leaf_cfs_rq(rq, cfs_rq) {
+ for_each_leaf_cfs_rq_safe(rq, cfs_rq, pos) {
  if (!cfs_rq->runtime_enabled)
  continue;
 
@@ -6132,10 +6133,27 @@ static void attach_tasks(struct lb_env *env)
 
 #ifdef CONFIG_FAIR_GROUP_SCHED
 
+static inline bool cfs_rq_is_decayed(struct cfs_rq *cfs_rq)
+{
+ if (cfs_rq->load.weight)
+ return false;
+
+ if (cfs_rq->avg.load_sum)
+ return false;
+
+ if (cfs_rq->avg.util_sum)
+ return false;
+
+ if (cfs_rq->runnable_load_sum)
+ return false;
+
+ return true;
+}
+
 static void update_blocked_averages(int cpu)
 {
  struct rq *rq = cpu_rq(cpu);
- struct cfs_rq *cfs_rq;
+ struct cfs_rq *cfs_rq, *pos;
  unsigned long flags;
 
  raw_spin_lock_irqsave(&rq->lock, flags);
@@ -6145,9 +6163,15 @@ static void update_blocked_averages(int cpu)
  * Iterates the task_group tree in a bottom up fashion, see
  * list_add_leaf_cfs_rq() for details.
  */
- for_each_leaf_cfs_rq(rq, cfs_rq) {
+ for_each_leaf_cfs_rq_safe(rq, cfs_rq, pos) {
  if (update_cfs_rq_load_avg(cfs_rq_clock_task(cfs_rq), cfs_rq))
  update_tg_load_avg(cfs_rq, 0);
+ /*
+ * There can be a lot of idle CPU cgroups.  Don't let fully
+ * decayed cfs_rqs linger on the list.
+ */
+ if (cfs_rq_is_decayed(cfs_rq))
+ list_del_leaf_cfs_rq(cfs_rq);
 
  }
  raw_spin_unlock_irqrestore(&rq->lock, flags);
@@ -8554,10 +8578,10 @@ const struct sched_class fair_sched_class = {
 #ifdef CONFIG_SCHED_DEBUG
 void print_cfs_stats(struct seq_file *m, int cpu)
 {
- struct cfs_rq *cfs_rq;
+ struct cfs_rq *cfs_rq, *pos;
 
  rcu_read_lock();
- for_each_leaf_cfs_rq(cpu_rq(cpu), cfs_rq)
+ for_each_leaf_cfs_rq_safe(cpu_rq(cpu), cfs_rq, pos)
  print_cfs_rq(m, cpu, cfs_rq);
  rcu_read_unlock();
 }
--
2.17.1


ACK/Cmnt: [Xenial][SRU][CVE-2018-20784][PATCHv2 0/6] Fix for fair scheduler

Stefan Bader
On 17.10.19 19:51, Connor Kuehl wrote:

> https://people.canonical.com/~ubuntu-security/cve/2018/CVE-2018-20784.html
>
> From the link above:
>
> "In the Linux kernel before 4.20.2, kernel/sched/fair.c mishandles leaf
> cfs_rq's, which allows attackers to cause a denial of service (infinite
> loop in update_blocked_averages) or possibly have unspecified other impact
> by inducing a high load."
>
> In the first submission for this backport, I had only backported c40f7d74c741
> ("sched/fair: Fix infinite loop in update_blocked_averages() by reverting a9e7f6544b9c")
> which ends up reverting a fix that was SRU'd previously.
>
> Here is that previous SRU: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1747896
>
> This submission contains the fix for this CVE as well as the necessary
> patches for preserving the fix for the previously SRU'd patch (these
> were found by Tyler Hicks. Thank you, Tyler!)
>
> For posterity, my testing was as follows (taken from the LP report
> above in the bug description):
>
> I booted Xenial in a 64 bit VM twice. The first time was without this
> CVE backport applied. The second time was with it applied. I ran the
> reproducer in both cases and experienced the same CPU utilization (both
> cores I allocated to my VM were at 100%) and in both cases I experienced
> stable memory pressure. They would both hover around 120MB ± 3-5MB.
>
> With this submission, I also observed the same results with watching the
> cfs_rqs. With and without this backport applied, the cfs_rqs fluctuated
> between 13-18. This is an improvement over the previous submission of
> this CVE backport since the previous version experienced different
> results by watching the cfs_rqs (namely, they'd remain at 61 for the
> duration of the tests).
>
> Linus Torvalds (1):
>   sched/fair: Fix infinite loop in update_blocked_averages() by
>     reverting a9e7f6544b9c
>
> Peter Zijlstra (1):
>   sched/fair: Add tmp_alone_branch assertion
>
> Vincent Guittot (4):
>   sched/fair: Fix hierarchical order in rq->leaf_cfs_rq_list
>   sched/fair: Fix insertion in rq->leaf_cfs_rq_list
>   sched/fair: Optimize update_blocked_averages()
>   sched/fair: Fix O(nr_cgroups) in the load balancing path
>
>  kernel/sched/core.c  |   1 +
>  kernel/sched/fair.c  | 140 ++++++++++++++++++++++++++++++++++++-------
>  kernel/sched/sched.h |   1 +
>  3 files changed, 119 insertions(+), 23 deletions(-)
>
Rather complex change but security relevant and promising test results.

Acked-by: Stefan Bader <[hidden email]>


Re: [Xenial][SRU][CVE-2018-20784][PATCHv2 0/6] Fix for fair scheduler

Connor Kuehl
Just a ping for this patch set.

On 10/17/19 10:51 AM, Connor Kuehl wrote:

> https://people.canonical.com/~ubuntu-security/cve/2018/CVE-2018-20784.html
>
>  From the link above:
>
> "In the Linux kernel before 4.20.2, kernel/sched/fair.c mishandles leaf
> cfs_rq's, which allows attackers to cause a denial of service (infinite
> loop in update_blocked_averages) or possibly have unspecified other impact
> by inducing a high load."
>
> In the first submission for this backport, I had only backported c40f7d74c741
> ("sched/fair: Fix infinite loop in update_blocked_averages() by reverting a9e7f6544b9c")
> which ends up reverting a fix that was SRU'd previously.
>
> Here is that previous SRU: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1747896
>
> This submission contains the fix for this CVE as well as the necessary
> patches for preserving the fix for the previously SRU'd patch (these
> were found by Tyler Hicks. Thank you, Tyler!)
>
> For posterity, my testing was as follows (taken from the LP report
> above in the bug description):
>
> I booted Xenial in a 64 bit VM twice. The first time was without this
> CVE backport applied. The second time was with it applied. I ran the
> reproducer in both cases and experienced the same CPU utilization (both
> cores I allocated to my VM were at 100%) and in both cases I experienced
> stable memory pressure. They would both hover around 120MB ± 3-5MB.
>
> With this submission, I also observed the same results with watching the
> cfs_rqs. With and without this backport applied, the cfs_rqs fluctuated
> between 13-18. This is an improvement over the previous submission of
> this CVE backport since the previous version experienced different
> results by watching the cfs_rqs (namely, they'd remain at 61 for the
> duration of the tests).
>
> Linus Torvalds (1):
>    sched/fair: Fix infinite loop in update_blocked_averages() by
>      reverting a9e7f6544b9c
>
> Peter Zijlstra (1):
>    sched/fair: Add tmp_alone_branch assertion
>
> Vincent Guittot (4):
>    sched/fair: Fix hierarchical order in rq->leaf_cfs_rq_list
>    sched/fair: Fix insertion in rq->leaf_cfs_rq_list
>    sched/fair: Optimize update_blocked_averages()
>    sched/fair: Fix O(nr_cgroups) in the load balancing path
>
>   kernel/sched/core.c  |   1 +
>   kernel/sched/fair.c  | 140 ++++++++++++++++++++++++++++++++++++-------
>   kernel/sched/sched.h |   1 +
>   3 files changed, 119 insertions(+), 23 deletions(-)
>


ACK: [Xenial][SRU][CVE-2018-20784][PATCHv2 0/6] Fix for fair scheduler

Sultan Alsawaf
On Thu, Oct 17, 2019 at 10:51:27AM -0700, Connor Kuehl wrote:

> https://people.canonical.com/~ubuntu-security/cve/2018/CVE-2018-20784.html
>
> From the link above:
>
> "In the Linux kernel before 4.20.2, kernel/sched/fair.c mishandles leaf
> cfs_rq's, which allows attackers to cause a denial of service (infinite
> loop in update_blocked_averages) or possibly have unspecified other impact
> by inducing a high load."
>
> In the first submission for this backport, I had only backported c40f7d74c741
> ("sched/fair: Fix infinite loop in update_blocked_averages() by reverting a9e7f6544b9c")
> which ends up reverting a fix that was SRU'd previously.
>
> Here is that previous SRU: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1747896
>
> This submission contains the fix for this CVE as well as the necessary
> patches for preserving the fix for the previously SRU'd patch (these
> were found by Tyler Hicks. Thank you, Tyler!)
>
> For posterity, my testing was as follows (taken from the LP report
> above in the bug description):
>
> I booted Xenial in a 64 bit VM twice. The first time was without this
> CVE backport applied. The second time was with it applied. I ran the
> reproducer in both cases and experienced the same CPU utilization (both
> cores I allocated to my VM were at 100%) and in both cases I experienced
> stable memory pressure. They would both hover around 120MB ± 3-5MB.
>
> With this submission, I also observed the same results with watching the
> cfs_rqs. With and without this backport applied, the cfs_rqs fluctuated
> between 13-18. This is an improvement over the previous submission of
> this CVE backport since the previous version experienced different
> results by watching the cfs_rqs (namely, they'd remain at 61 for the
> duration of the tests).
>
> Linus Torvalds (1):
>   sched/fair: Fix infinite loop in update_blocked_averages() by
>     reverting a9e7f6544b9c
>
> Peter Zijlstra (1):
>   sched/fair: Add tmp_alone_branch assertion
>
> Vincent Guittot (4):
>   sched/fair: Fix hierarchical order in rq->leaf_cfs_rq_list
>   sched/fair: Fix insertion in rq->leaf_cfs_rq_list
>   sched/fair: Optimize update_blocked_averages()
>   sched/fair: Fix O(nr_cgroups) in the load balancing path
>
>  kernel/sched/core.c  |   1 +
>  kernel/sched/fair.c  | 140 ++++++++++++++++++++++++++++++++++++-------
>  kernel/sched/sched.h |   1 +
>  3 files changed, 119 insertions(+), 23 deletions(-)
>
> --
> 2.17.1
>
>
> --
> kernel-team mailing list
> [hidden email]
> https://lists.ubuntu.com/mailman/listinfo/kernel-team

Acked-by: Sultan Alsawaf <[hidden email]>

APPLIED: [Xenial][SRU][CVE-2018-20784][PATCHv2 0/6] Fix for fair scheduler

Khaled Elmously
On 2019-10-17 10:51:27 , Connor Kuehl wrote:

> https://people.canonical.com/~ubuntu-security/cve/2018/CVE-2018-20784.html
>
> From the link above:
>
> "In the Linux kernel before 4.20.2, kernel/sched/fair.c mishandles leaf
> cfs_rq's, which allows attackers to cause a denial of service (infinite
> loop in update_blocked_averages) or possibly have unspecified other impact
> by inducing a high load."
>
> In the first submission for this backport, I had only backported c40f7d74c741
> ("sched/fair: Fix infinite loop in update_blocked_averages() by reverting a9e7f6544b9c")
> which ends up reverting a fix that was SRU'd previously.
>
> Here is that previous SRU: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1747896
>
> This submission contains the fix for this CVE as well as the necessary
> patches for preserving the fix for the previously SRU'd patch (these
> were found by Tyler Hicks. Thank you, Tyler!)
>
> For posterity, my testing was as follows (taken from the LP report
> above in the bug description):
>
> I booted Xenial in a 64 bit VM twice. The first time was without this
> CVE backport applied. The second time was with it applied. I ran the
> reproducer in both cases and experienced the same CPU utilization (both
> cores I allocated to my VM were at 100%) and in both cases I experienced
> stable memory pressure. They would both hover around 120MB ± 3-5MB.
>
> With this submission, I also observed the same results with watching the
> cfs_rqs. With and without this backport applied, the cfs_rqs fluctuated
> between 13-18. This is an improvement over the previous submission of
> this CVE backport since the previous version experienced different
> results by watching the cfs_rqs (namely, they'd remain at 61 for the
> duration of the tests).
>
> Linus Torvalds (1):
>   sched/fair: Fix infinite loop in update_blocked_averages() by
>     reverting a9e7f6544b9c
>
> Peter Zijlstra (1):
>   sched/fair: Add tmp_alone_branch assertion
>
> Vincent Guittot (4):
>   sched/fair: Fix hierarchical order in rq->leaf_cfs_rq_list
>   sched/fair: Fix insertion in rq->leaf_cfs_rq_list
>   sched/fair: Optimize update_blocked_averages()
>   sched/fair: Fix O(nr_cgroups) in the load balancing path
>
>  kernel/sched/core.c  |   1 +
>  kernel/sched/fair.c  | 140 ++++++++++++++++++++++++++++++++++++-------
>  kernel/sched/sched.h |   1 +
>  3 files changed, 119 insertions(+), 23 deletions(-)
>
> --
> 2.17.1
>
>
> --
> kernel-team mailing list
> [hidden email]
> https://lists.ubuntu.com/mailman/listinfo/kernel-team
