[SRU][Xenial][PATCH 0/6] Backport Ceph CRUSH_TUNABLES5 support


[SRU][Xenial][PATCH 0/6] Backport Ceph CRUSH_TUNABLES5 support

Billy Olsen
From: Billy Olsen <[hidden email]>

BugLink: https://bugs.launchpad.net/bugs/1728739

[Impact]
Attempting to use the kernel rbd driver to map images from a
Jewel (Xenial) or Luminous (Artful & Pike Ubuntu Cloud Archive)
server causes the map to fail. This is due to Ceph's addition
of the new CRUSH_TUNABLES5 feature, which is not understood by
the 4.4 kernel client.
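
For context, feature negotiation is a simple mask comparison; a
minimal sketch (illustrative names, not the actual libceph handshake
code):

    /* Sketch only; CEPH_FEATURE_CRUSH_TUNABLES5 is bit 58 (patch 5/6). */
    #define CEPH_FEATURE_CRUSH_TUNABLES5 (1ULL<<58)

    static bool features_ok(u64 server_requires, u64 client_supports)
    {
            /* A Jewel/Luminous cluster with optimal tunables requires
             * bit 58; an unpatched 4.4 client does not advertise it,
             * so the session (and hence the rbd map) is refused. */
            return (server_requires & ~client_supports) == 0;
    }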

[Fix]
Backport the five original patches (clean cherry-picks) that added
CRUSH_TUNABLES5 support to the upstream 4.5 kernel, plus an
additional patch that allows the client to understand the new
v7 format of the MOSDOpReply message.

- https://www.spinics.net/lists/ceph-devel/msg28421.html
- https://www.spinics.net/lists/ceph-devel/msg28458.html

[Test Case]
1. Deploy a Jewel/Luminous Ceph cluster with crush tunables set
   to optimal.
2. Create an RBD image suitable for the kernel client:
   $ rbd create --pool rbd --image-feature layering --size 1G test
3. Map the rbd device on the local node:
   $ rbd map --pool rbd test
4. Verify that the map succeeds and a device appears (e.g.
   /dev/rbd0). On an unpatched 4.4 kernel, step 3 fails with a
   feature set mismatch.

[Regression Potential]
Minimal. The changes are limited to the kernel's Ceph client
(libceph/crush), and the new code paths primarily affect clients
connecting to clusters that enable the new tunables.

[Notes]
Only applicable to the Xenial 4.4 LTS kernel, since the code was
included upstream in 4.5.

Ilya Dryomov (6):
  crush: ensure bucket id is valid before indexing buckets array
  crush: ensure take bucket value is valid
  crush: add chooseleaf_stable tunable
  crush: decode and initialize chooseleaf_stable
  libceph: advertise support for TUNABLES5
  libceph: MOSDOpReply v7 encoding

 include/linux/ceph/ceph_features.h | 16 +++++++++++++++-
 include/linux/crush/crush.h        |  8 +++++++-
 net/ceph/crush/mapper.c            | 33 ++++++++++++++++++++++++++-------
 net/ceph/osd_client.c              | 10 ++++++++++
 net/ceph/osdmap.c                  | 19 ++++++++++++++-----
 5 files changed, 72 insertions(+), 14 deletions(-)

--
2.14.1



[SRU][Xenial][PATCH 1/6] crush: ensure bucket id is valid before indexing buckets array

Billy Olsen
From: Ilya Dryomov <[hidden email]>

BugLink: https://bugs.launchpad.net/bugs/1728739

We were indexing the buckets array without verifying the index was
within the [0,max_buckets) range.  This could happen because
a multistep rule does not have enough buckets and has CRUSH_ITEM_NONE
for an intermediate result, which would feed in CRUSH_ITEM_NONE and
make us crash.
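
For concreteness, a sketch of the failure mode (CRUSH_ITEM_NONE is
0x7fffffff per crush.h; this is not part of the patch):

    int w_i = CRUSH_ITEM_NONE;  /* intermediate result of a prior step */
    int bno = -1 - w_i;         /* negative bucket id -> array index */
    /* bno is INT_MIN here, so without the new
     * (bno < 0 || bno >= map->max_buckets) check,
     * map->buckets[bno] is a wild out-of-bounds read. */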

Reflects ceph.git commit 976a24a326da8931e689ee22fce35feab5b67b76.

Signed-off-by: Ilya Dryomov <[hidden email]>
Reviewed-by: Sage Weil <[hidden email]>
(cherry picked from commit f224a6915f266921507bb6e50a82f87a3de5b4b5)
Signed-off-by: Billy Olsen <[hidden email]>
---
 net/ceph/crush/mapper.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/net/ceph/crush/mapper.c b/net/ceph/crush/mapper.c
index 393bfb22d5bb..97ecf6f262aa 100644
--- a/net/ceph/crush/mapper.c
+++ b/net/ceph/crush/mapper.c
@@ -888,6 +888,7 @@ int crush_do_rule(const struct crush_map *map,
  osize = 0;
 
  for (i = 0; i < wsize; i++) {
+ int bno;
  /*
  * see CRUSH_N, CRUSH_N_MINUS macros.
  * basically, numrep <= 0 means relative to
@@ -900,6 +901,13 @@ int crush_do_rule(const struct crush_map *map,
  continue;
  }
  j = 0;
+ /* make sure bucket id is valid */
+ bno = -1 - w[i];
+ if (bno < 0 || bno >= map->max_buckets) {
+ /* w[i] is probably CRUSH_ITEM_NONE */
+ dprintk("  bad w[i] %d\n", w[i]);
+ continue;
+ }
  if (firstn) {
  int recurse_tries;
  if (choose_leaf_tries)
@@ -911,7 +919,7 @@ int crush_do_rule(const struct crush_map *map,
  recurse_tries = choose_tries;
  osize += crush_choose_firstn(
  map,
- map->buckets[-1-w[i]],
+ map->buckets[bno],
  weight, weight_max,
  x, numrep,
  curstep->arg2,
@@ -930,7 +938,7 @@ int crush_do_rule(const struct crush_map *map,
     numrep : (result_max-osize));
  crush_choose_indep(
  map,
- map->buckets[-1-w[i]],
+ map->buckets[bno],
  weight, weight_max,
  x, out_size, numrep,
  curstep->arg2,
--
2.14.1



[SRU][Xenial][PATCH 2/6] crush: ensure take bucket value is valid

Billy Olsen
From: Ilya Dryomov <[hidden email]>

BugLink: https://bugs.launchpad.net/bugs/1728739

Ensure that the take argument is a valid bucket ID before indexing the
buckets array.
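
As background, CRUSH ids >= 0 name devices while negative ids name
buckets stored at index (-1 - id); a sketch of why the extra check
matters (not part of the patch):

    int id = curstep->arg1;
    int idx = -1 - id;  /* e.g. bucket id -5 -> map->buckets[4] */
    /* For a bogus positive id such as CRUSH_ITEM_NONE (0x7fffffff),
     * idx is negative yet still satisfies idx < map->max_buckets,
     * hence the new "idx >= 0" test before map->buckets[idx]. */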

Reflects ceph.git commit 93ec538e8a667699876b72459b8ad78966d89c61.

Signed-off-by: Ilya Dryomov <[hidden email]>
Reviewed-by: Sage Weil <[hidden email]>
(cherry picked from commit 56a4f3091dceb7dfc14dc3ef1d5f59fe39ba4447)
Signed-off-by: Billy Olsen <[hidden email]>
---
 net/ceph/crush/mapper.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/ceph/crush/mapper.c b/net/ceph/crush/mapper.c
index 97ecf6f262aa..abb700621e4a 100644
--- a/net/ceph/crush/mapper.c
+++ b/net/ceph/crush/mapper.c
@@ -835,7 +835,8 @@ int crush_do_rule(const struct crush_map *map,
  case CRUSH_RULE_TAKE:
  if ((curstep->arg1 >= 0 &&
      curstep->arg1 < map->max_devices) ||
-    (-1-curstep->arg1 < map->max_buckets &&
+    (-1-curstep->arg1 >= 0 &&
+     -1-curstep->arg1 < map->max_buckets &&
      map->buckets[-1-curstep->arg1])) {
  w[0] = curstep->arg1;
  wsize = 1;
--
2.14.1



[SRU][Xenial][PATCH 3/6] crush: add chooseleaf_stable tunable

Billy Olsen
From: Ilya Dryomov <[hidden email]>

BugLink: https://bugs.launchpad.net/bugs/1728739

Add a tunable to fix a bug where chooseleaf may cause unnecessary
pg migrations when some device fails.
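
The behavioural core of the change is where the replica counter
starts in crush_choose_firstn(); a hedged paraphrase of the diff
below:

    for (rep = stable ? 0 : outpos; rep < numrep && count > 0; rep++) {
            /* With stable set, every recursive chooseleaf descent
             * starts at r=0, so the leaf picked under a surviving
             * bucket no longer depends on the replica's position in
             * the output vector; when one device fails, only its own
             * replica remaps instead of shuffling the others. */
    }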

Reflects ceph.git commit fdb3f664448e80d984470f32f04e2e6f03ab52ec.

Signed-off-by: Ilya Dryomov <[hidden email]>
Reviewed-by: Sage Weil <[hidden email]>
(cherry picked from commit dc6ae6d8e7726bad4f1c87244b49cac851746c65)
Signed-off-by: Billy Olsen <[hidden email]>
---
 include/linux/crush/crush.h |  8 +++++++-
 net/ceph/crush/mapper.c     | 18 ++++++++++++++----
 2 files changed, 21 insertions(+), 5 deletions(-)

diff --git a/include/linux/crush/crush.h b/include/linux/crush/crush.h
index 48b49305716b..be8f12b8f195 100644
--- a/include/linux/crush/crush.h
+++ b/include/linux/crush/crush.h
@@ -59,7 +59,8 @@ enum {
  CRUSH_RULE_SET_CHOOSELEAF_TRIES = 9, /* override chooseleaf_descend_once */
  CRUSH_RULE_SET_CHOOSE_LOCAL_TRIES = 10,
  CRUSH_RULE_SET_CHOOSE_LOCAL_FALLBACK_TRIES = 11,
- CRUSH_RULE_SET_CHOOSELEAF_VARY_R = 12
+ CRUSH_RULE_SET_CHOOSELEAF_VARY_R = 12,
+ CRUSH_RULE_SET_CHOOSELEAF_STABLE = 13
 };
 
 /*
@@ -205,6 +206,11 @@ struct crush_map {
  * mappings line up a bit better with previous mappings. */
  __u8 chooseleaf_vary_r;
 
+ /* if true, it makes chooseleaf firstn to return stable results (if
+ * no local retry) so that data migrations would be optimal when some
+ * device fails. */
+ __u8 chooseleaf_stable;
+
 #ifndef __KERNEL__
  /*
  * version 0 (original) of straw_calc has various flaws.  version 1
diff --git a/net/ceph/crush/mapper.c b/net/ceph/crush/mapper.c
index abb700621e4a..5fcfb98f309e 100644
--- a/net/ceph/crush/mapper.c
+++ b/net/ceph/crush/mapper.c
@@ -403,6 +403,7 @@ static int is_out(const struct crush_map *map,
  * @local_retries: localized retries
  * @local_fallback_retries: localized fallback retries
  * @recurse_to_leaf: true if we want one device under each item of given type (chooseleaf instead of choose)
+ * @stable: stable mode starts rep=0 in the recursive call for all replicas
  * @vary_r: pass r to recursive calls
  * @out2: second output vector for leaf items (if @recurse_to_leaf)
  * @parent_r: r value passed from the parent
@@ -419,6 +420,7 @@ static int crush_choose_firstn(const struct crush_map *map,
        unsigned int local_fallback_retries,
        int recurse_to_leaf,
        unsigned int vary_r,
+       unsigned int stable,
        int *out2,
        int parent_r)
 {
@@ -433,13 +435,13 @@ static int crush_choose_firstn(const struct crush_map *map,
  int collide, reject;
  int count = out_size;
 
- dprintk("CHOOSE%s bucket %d x %d outpos %d numrep %d tries %d recurse_tries %d local_retries %d local_fallback_retries %d parent_r %d\n",
+ dprintk("CHOOSE%s bucket %d x %d outpos %d numrep %d tries %d recurse_tries %d local_retries %d local_fallback_retries %d parent_r %d stable %d\n",
  recurse_to_leaf ? "_LEAF" : "",
  bucket->id, x, outpos, numrep,
  tries, recurse_tries, local_retries, local_fallback_retries,
- parent_r);
+ parent_r, stable);
 
- for (rep = outpos; rep < numrep && count > 0 ; rep++) {
+ for (rep = stable ? 0 : outpos; rep < numrep && count > 0 ; rep++) {
  /* keep trying until we get a non-out, non-colliding item */
  ftotal = 0;
  skip_rep = 0;
@@ -512,13 +514,14 @@ static int crush_choose_firstn(const struct crush_map *map,
  if (crush_choose_firstn(map,
  map->buckets[-1-item],
  weight, weight_max,
- x, outpos+1, 0,
+ x, stable ? 1 : outpos+1, 0,
  out2, outpos, count,
  recurse_tries, 0,
  local_retries,
  local_fallback_retries,
  0,
  vary_r,
+ stable,
  NULL,
  sub_r) <= outpos)
  /* didn't get leaf */
@@ -816,6 +819,7 @@ int crush_do_rule(const struct crush_map *map,
  int choose_local_fallback_retries = map->choose_local_fallback_tries;
 
  int vary_r = map->chooseleaf_vary_r;
+ int stable = map->chooseleaf_stable;
 
  if ((__u32)ruleno >= map->max_rules) {
  dprintk(" bad ruleno %d\n", ruleno);
@@ -870,6 +874,11 @@ int crush_do_rule(const struct crush_map *map,
  vary_r = curstep->arg1;
  break;
 
+ case CRUSH_RULE_SET_CHOOSELEAF_STABLE:
+ if (curstep->arg1 >= 0)
+ stable = curstep->arg1;
+ break;
+
  case CRUSH_RULE_CHOOSELEAF_FIRSTN:
  case CRUSH_RULE_CHOOSE_FIRSTN:
  firstn = 1;
@@ -932,6 +941,7 @@ int crush_do_rule(const struct crush_map *map,
  choose_local_fallback_retries,
  recurse_to_leaf,
  vary_r,
+ stable,
  c+osize,
  0);
  } else {
--
2.14.1



[SRU][Xenial][PATCH 4/6] crush: decode and initialize chooseleaf_stable

Billy Olsen
From: Ilya Dryomov <[hidden email]>

BugLink: https://bugs.launchpad.net/bugs/1728739

Decode the new chooseleaf_stable tunable from the crush map.
Also add missing \n's in the dout() messages while at it.
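
As a reading aid, the tunables tail decoded below has this layout
(field order and sizes taken from the patch itself, not a separate
spec):

    /*
     *   u32 choose_local_tries
     *   u32 choose_local_fallback_tries
     *   u32 choose_total_tries
     *   u32 chooseleaf_descend_once
     *   u8  chooseleaf_vary_r
     *   u8  straw_calc_version    (skipped by the kernel client)
     *   u32 allowed_bucket_algs   (skipped by the kernel client)
     *   u8  chooseleaf_stable     (new here)
     *
     * Each ceph_decode_need() can jump to "done", so an older map
     * that ends early leaves the later tunables at their defaults.
     */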

Signed-off-by: Ilya Dryomov <[hidden email]>
Reviewed-by: Sage Weil <[hidden email]>
(cherry picked from commit b9b519b78cfbef9ed1b7aabca63eaaa9d1682a71)
Signed-off-by: Billy Olsen <[hidden email]>
---
 net/ceph/osdmap.c | 19 ++++++++++++++-----
 1 file changed, 14 insertions(+), 5 deletions(-)

diff --git a/net/ceph/osdmap.c b/net/ceph/osdmap.c
index bc95e48d5cfb..94a9acbee51e 100644
--- a/net/ceph/osdmap.c
+++ b/net/ceph/osdmap.c
@@ -342,23 +342,32 @@ static struct crush_map *crush_decode(void *pbyval, void *end)
         c->choose_local_tries = ceph_decode_32(p);
         c->choose_local_fallback_tries =  ceph_decode_32(p);
         c->choose_total_tries = ceph_decode_32(p);
-        dout("crush decode tunable choose_local_tries = %d",
+        dout("crush decode tunable choose_local_tries = %d\n",
              c->choose_local_tries);
-        dout("crush decode tunable choose_local_fallback_tries = %d",
+        dout("crush decode tunable choose_local_fallback_tries = %d\n",
              c->choose_local_fallback_tries);
-        dout("crush decode tunable choose_total_tries = %d",
+        dout("crush decode tunable choose_total_tries = %d\n",
              c->choose_total_tries);
 
  ceph_decode_need(p, end, sizeof(u32), done);
  c->chooseleaf_descend_once = ceph_decode_32(p);
- dout("crush decode tunable chooseleaf_descend_once = %d",
+ dout("crush decode tunable chooseleaf_descend_once = %d\n",
      c->chooseleaf_descend_once);
 
  ceph_decode_need(p, end, sizeof(u8), done);
  c->chooseleaf_vary_r = ceph_decode_8(p);
- dout("crush decode tunable chooseleaf_vary_r = %d",
+ dout("crush decode tunable chooseleaf_vary_r = %d\n",
      c->chooseleaf_vary_r);
 
+ /* skip straw_calc_version, allowed_bucket_algs */
+ ceph_decode_need(p, end, sizeof(u8) + sizeof(u32), done);
+ *p += sizeof(u8) + sizeof(u32);
+
+ ceph_decode_need(p, end, sizeof(u8), done);
+ c->chooseleaf_stable = ceph_decode_8(p);
+ dout("crush decode tunable chooseleaf_stable = %d\n",
+     c->chooseleaf_stable);
+
 done:
  dout("crush_decode success\n");
  return c;
--
2.14.1



[SRU][Xenial][PATCH 5/6] libceph: advertise support for TUNABLES5

Billy Olsen
From: Ilya Dryomov <[hidden email]>

BugLink: https://bugs.launchpad.net/bugs/1728739

Add TUNABLES5 feature (chooseleaf_stable tunable) to a set of features
supported by default.

Signed-off-by: Ilya Dryomov <[hidden email]>
Reviewed-by: Sage Weil <[hidden email]>
(cherry picked from commit 97db9a88186e3a7d3a1942370c836bf221d3ab90)
Signed-off-by: Billy Olsen <[hidden email]>
---
 include/linux/ceph/ceph_features.h | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/include/linux/ceph/ceph_features.h b/include/linux/ceph/ceph_features.h
index f89b31d45cc8..c3b211c9fe83 100644
--- a/include/linux/ceph/ceph_features.h
+++ b/include/linux/ceph/ceph_features.h
@@ -63,6 +63,16 @@
 #define CEPH_FEATURE_OSD_MIN_SIZE_RECOVERY (1ULL<<49)
 // duplicated since it was introduced at the same time as MIN_SIZE_RECOVERY
 #define CEPH_FEATURE_OSD_PROXY_FEATURES (1ULL<<49)  /* overlap w/ above */
+#define CEPH_FEATURE_MON_METADATA (1ULL<<50)
+#define CEPH_FEATURE_OSD_BITWISE_HOBJ_SORT (1ULL<<51) /* can sort objs bitwise */
+#define CEPH_FEATURE_OSD_PROXY_WRITE_FEATURES (1ULL<<52)
+#define CEPH_FEATURE_ERASURE_CODE_PLUGINS_V3 (1ULL<<53)
+#define CEPH_FEATURE_OSD_HITSET_GMT (1ULL<<54)
+#define CEPH_FEATURE_HAMMER_0_94_4 (1ULL<<55)
+#define CEPH_FEATURE_NEW_OSDOP_ENCODING   (1ULL<<56) /* New, v7 encoding */
+#define CEPH_FEATURE_MON_STATEFUL_SUB (1ULL<<57) /* stateful mon subscription */
+#define CEPH_FEATURE_MON_ROUTE_OSDMAP (1ULL<<57) /* peon sends osdmaps */
+#define CEPH_FEATURE_CRUSH_TUNABLES5 (1ULL<<58) /* chooseleaf stable mode */
 
 /*
  * The introduction of CEPH_FEATURE_OSD_SNAPMAPPER caused the feature
@@ -108,7 +118,8 @@ static inline u64 ceph_sanitize_features(u64 features)
  CEPH_FEATURE_CRUSH_TUNABLES3 | \
  CEPH_FEATURE_OSD_PRIMARY_AFFINITY | \
  CEPH_FEATURE_MSGR_KEEPALIVE2 | \
- CEPH_FEATURE_CRUSH_V4)
+ CEPH_FEATURE_CRUSH_V4 | \
+ CEPH_FEATURE_CRUSH_TUNABLES5)
 
 #define CEPH_FEATURES_REQUIRED_DEFAULT   \
  (CEPH_FEATURE_NOSRCADDR | \
--
2.14.1



[SRU][Xenial][PATCH 6/6] libceph: MOSDOpReply v7 encoding

Billy Olsen
From: Ilya Dryomov <[hidden email]>

BugLink: https://bugs.launchpad.net/bugs/1728739

Empty request_redirect_t (struct ceph_request_redirect in the kernel
client) is now encoded with a bool.  NEW_OSDOPREPLY_ENCODING feature
bit overlaps with already supported CRUSH_TUNABLES5.
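
In wire-format terms (an illustrative summary derived from the diff
below):

    /*
     * MOSDOpReply redirect field, by message header version:
     *   hdr.version <  4 : no redirect data     (decode_redir = 0)
     *   hdr.version 4..6 : redirect blob always present
     *   hdr.version >= 7 : u8 bool first; the blob follows only if
     *                      non-zero, so an empty redirect costs 1 byte
     */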

Signed-off-by: Ilya Dryomov <[hidden email]>
Reviewed-by: Sage Weil <[hidden email]>
(cherry picked from commit b0b31a8ffe54abf0a455bcaee54dd92f08817164)
Signed-off-by: Billy Olsen <[hidden email]>
---
 include/linux/ceph/ceph_features.h |  5 ++++-
 net/ceph/osd_client.c              | 10 ++++++++++
 2 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/include/linux/ceph/ceph_features.h b/include/linux/ceph/ceph_features.h
index c3b211c9fe83..c1ef6f14e7be 100644
--- a/include/linux/ceph/ceph_features.h
+++ b/include/linux/ceph/ceph_features.h
@@ -73,6 +73,8 @@
 #define CEPH_FEATURE_MON_STATEFUL_SUB (1ULL<<57) /* stateful mon subscription */
 #define CEPH_FEATURE_MON_ROUTE_OSDMAP (1ULL<<57) /* peon sends osdmaps */
 #define CEPH_FEATURE_CRUSH_TUNABLES5 (1ULL<<58) /* chooseleaf stable mode */
+// duplicated since it was introduced at the same time as CEPH_FEATURE_CRUSH_TUNABLES5
+#define CEPH_FEATURE_NEW_OSDOPREPLY_ENCODING   (1ULL<<58) /* New, v7 encoding */
 
 /*
  * The introduction of CEPH_FEATURE_OSD_SNAPMAPPER caused the feature
@@ -119,7 +121,8 @@ static inline u64 ceph_sanitize_features(u64 features)
  CEPH_FEATURE_OSD_PRIMARY_AFFINITY | \
  CEPH_FEATURE_MSGR_KEEPALIVE2 | \
  CEPH_FEATURE_CRUSH_V4 | \
- CEPH_FEATURE_CRUSH_TUNABLES5)
+ CEPH_FEATURE_CRUSH_TUNABLES5 | \
+ CEPH_FEATURE_NEW_OSDOPREPLY_ENCODING)
 
 #define CEPH_FEATURES_REQUIRED_DEFAULT   \
  (CEPH_FEATURE_NOSRCADDR | \
diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c
index a28e47ff1b1b..5bc053778fed 100644
--- a/net/ceph/osd_client.c
+++ b/net/ceph/osd_client.c
@@ -1770,6 +1770,7 @@ static void handle_reply(struct ceph_osd_client *osdc, struct ceph_msg *msg)
  u32 osdmap_epoch;
  int already_completed;
  u32 bytes;
+ u8 decode_redir;
  unsigned int i;
 
  tid = le64_to_cpu(msg->hdr.tid);
@@ -1841,6 +1842,15 @@ static void handle_reply(struct ceph_osd_client *osdc, struct ceph_msg *msg)
  p += 8 + 4; /* skip replay_version */
  p += 8; /* skip user_version */
 
+ if (le16_to_cpu(msg->hdr.version) >= 7)
+ ceph_decode_8_safe(&p, end, decode_redir, bad_put);
+ else
+ decode_redir = 1;
+ } else {
+ decode_redir = 0;
+ }
+
+ if (decode_redir) {
  err = ceph_redirect_decode(&p, end, &redir);
  if (err)
  goto bad_put;
--
2.14.1



ACK/cmnt: [SRU][Xenial][PATCH 0/6] Backport Ceph CRUSH_TUNABLES5 support

Stefan Bader
On 01.11.2017 21:37, Billy Olsen wrote:

> [...]
The changes appear to be rather isolated and limited by feature flags,
which should lower the risk of pulling them in. For verification testing
it would be good to add the results of mapping an rbd device from servers
using a 4.4 kernel client, to show backwards compatibility as well as the
fix for the failure case.

Acked-by: Stefan Bader <[hidden email]>




ACK: [SRU][Xenial][PATCH 0/6] Backport Ceph CRUSH_TUNABLES5 support

Kleber Souza
On 11/01/17 21:37, Billy Olsen wrote:

> [...]

Acked-by: Kleber Sacilotto de Souza <[hidden email]>


APPLIED: [SRU][Xenial][PATCH 0/6] Backport Ceph CRUSH_TUNABLES5 support

Thadeu Lima de Souza Cascardo
Applied to xenial master-next branch.

Thanks.
Cascardo.

Applied-to: xenial/master-next
