[SRU][B/D] Ensure /proc/sys/net/bridge folders (dis)appear appropriately

classic Classic list List threaded Threaded
18 messages Options
Reply | Threaded
Open this post in threaded view
|

[SRU][B/D] Ensure /proc/sys/net/bridge folders (dis)appear appropriately

Connor Kuehl
Note: Bionic required two additional patches in order for these to apply cleanly, one
of which required minor backporting to use the updated wrappers/symbols.

BugLink: https://bugs.launchpad.net/bugs/1836910

Justification taken from the link above ^

SRU Justification

Impact: Currently, the /proc/sys/net/bridge folder is only created in the initial network namespace.
This blocks use-cases where users would like to e.g. not do bridge filtering for bridges in a specific
network namespace while doing so for bridges located in another network namespace.

Fix: The patches linked below ensure that the /proc/sys/net/bridge folder is available in each network
namespace if the module is loaded and disappears from all network namespaces when the module is unloaded.

In doing so the patch makes the sysctls:

bridge-nf-call-arptables
bridge-nf-call-ip6tables
bridge-nf-call-iptables
bridge-nf-filter-pppoe-tagged
bridge-nf-filter-vlan-tagged
bridge-nf-pass-vlan-input-dev

apply per network namespace.

Regression Potential: None, since this didn't use to work before. Otherwise limited to the br_netfilter module.
The netfilter rules are afaict already per network namespace so it should be safe for users to specify whether
bridge devices inside a network namespace are supposed to go through iptables et al. or not. Also, this can
already be done per-bridge by setting an option for each individual bridge via Netlink. It should also be
possible to do this for all bridges in a network namespace via sysctls.

Test Case: Tested with LXD on a kernel with the patches applied and per-network namespace iptables.

--
kernel-team mailing list
[hidden email]
https://lists.ubuntu.com/mailman/listinfo/kernel-team
Reply | Threaded
Open this post in threaded view
|

[SRU][B][PATCH 1/5] net: bridge: add bitfield for options and convert vlan opts

Connor Kuehl
From: Nikolay Aleksandrov <[hidden email]>

BugLink: https://bugs.launchpad.net/bugs/1836910

Bridge options have usually been added as separate fields all over the
net_bridge struct taking up space and ending up in different cache lines.
Let's move them to a single bitfield to save up space and speedup lookups.
This patch adds a simple API for option modifying and retrieving using
bitops and converts the first user of the API - the bridge vlan options
(vlan_enabled and vlan_stats_enabled).

Signed-off-by: Nikolay Aleksandrov <[hidden email]>
Reviewed-by: Stephen Hemminger <[hidden email]>
Signed-off-by: David S. Miller <[hidden email]>
(cherry picked from commit ae75767ec206c6f445973e5e6c5af8a865016e15)
Signed-off-by: Connor Kuehl <[hidden email]>
---
 net/bridge/br.c          | 16 ++++++++++++++++
 net/bridge/br_netlink.c  |  3 ++-
 net/bridge/br_private.h  | 16 ++++++++++++++--
 net/bridge/br_sysfs_br.c |  4 ++--
 net/bridge/br_vlan.c     | 28 +++++++++++++++-------------
 5 files changed, 49 insertions(+), 18 deletions(-)

diff --git a/net/bridge/br.c b/net/bridge/br.c
index 6bf06e756df2..ee05f91baf0d 100644
--- a/net/bridge/br.c
+++ b/net/bridge/br.c
@@ -171,6 +171,22 @@ static struct notifier_block br_switchdev_notifier = {
  .notifier_call = br_switchdev_event,
 };
 
+void br_opt_toggle(struct net_bridge *br, enum net_bridge_opts opt, bool on)
+{
+ bool cur = !!br_opt_get(br, opt);
+
+ br_debug(br, "toggle option: %d state: %d -> %d\n",
+ opt, cur, on);
+
+ if (cur == on)
+ return;
+
+ if (on)
+ set_bit(opt, &br->options);
+ else
+ clear_bit(opt, &br->options);
+}
+
 static void __net_exit br_net_exit(struct net *net)
 {
  struct net_device *dev;
diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c
index 015f465c514b..cdc298e3ab9f 100644
--- a/net/bridge/br_netlink.c
+++ b/net/bridge/br_netlink.c
@@ -1381,7 +1381,8 @@ static int br_fill_info(struct sk_buff *skb, const struct net_device *brdev)
 #ifdef CONFIG_BRIDGE_VLAN_FILTERING
  if (nla_put_be16(skb, IFLA_BR_VLAN_PROTOCOL, br->vlan_proto) ||
     nla_put_u16(skb, IFLA_BR_VLAN_DEFAULT_PVID, br->default_pvid) ||
-    nla_put_u8(skb, IFLA_BR_VLAN_STATS_ENABLED, br->vlan_stats_enabled))
+    nla_put_u8(skb, IFLA_BR_VLAN_STATS_ENABLED,
+       br_opt_get(br, BROPT_VLAN_STATS_ENABLED)))
  return -EMSGSIZE;
 #endif
 #ifdef CONFIG_BRIDGE_IGMP_SNOOPING
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index 14224384e4d5..bfd6541de1dc 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -302,16 +302,20 @@ static inline struct net_bridge_port *br_port_get_rtnl_rcu(const struct net_devi
  rcu_dereference_rtnl(dev->rx_handler_data) : NULL;
 }
 
+enum net_bridge_opts {
+ BROPT_VLAN_ENABLED,
+ BROPT_VLAN_STATS_ENABLED,
+};
+
 struct net_bridge {
  spinlock_t lock;
  spinlock_t hash_lock;
  struct list_head port_list;
  struct net_device *dev;
  struct pcpu_sw_netstats __percpu *stats;
+ unsigned long options;
  /* These fields are accessed on each packet */
 #ifdef CONFIG_BRIDGE_VLAN_FILTERING
- u8 vlan_enabled;
- u8 vlan_stats_enabled;
  __be16 vlan_proto;
  u16 default_pvid;
  struct net_bridge_vlan_group __rcu *vlgrp;
@@ -482,6 +486,14 @@ static inline bool br_vlan_should_use(const struct net_bridge_vlan *v)
  return true;
 }
 
+static inline int br_opt_get(const struct net_bridge *br,
+     enum net_bridge_opts opt)
+{
+ return test_bit(opt, &br->options);
+}
+
+void br_opt_toggle(struct net_bridge *br, enum net_bridge_opts opt, bool on);
+
 /* br_device.c */
 void br_dev_setup(struct net_device *dev);
 void br_dev_delete(struct net_device *dev, struct list_head *list);
diff --git a/net/bridge/br_sysfs_br.c b/net/bridge/br_sysfs_br.c
index 723f25eed8ea..8912c7c645ed 100644
--- a/net/bridge/br_sysfs_br.c
+++ b/net/bridge/br_sysfs_br.c
@@ -750,7 +750,7 @@ static ssize_t vlan_filtering_show(struct device *d,
    char *buf)
 {
  struct net_bridge *br = to_bridge(d);
- return sprintf(buf, "%d\n", br->vlan_enabled);
+ return sprintf(buf, "%d\n", br_opt_get(br, BROPT_VLAN_ENABLED));
 }
 
 static ssize_t vlan_filtering_store(struct device *d,
@@ -798,7 +798,7 @@ static ssize_t vlan_stats_enabled_show(struct device *d,
        char *buf)
 {
  struct net_bridge *br = to_bridge(d);
- return sprintf(buf, "%u\n", br->vlan_stats_enabled);
+ return sprintf(buf, "%u\n", br_opt_get(br, BROPT_VLAN_STATS_ENABLED));
 }
 
 static ssize_t vlan_stats_enabled_store(struct device *d,
diff --git a/net/bridge/br_vlan.c b/net/bridge/br_vlan.c
index 9896f4975353..7466e0273435 100644
--- a/net/bridge/br_vlan.c
+++ b/net/bridge/br_vlan.c
@@ -388,7 +388,7 @@ struct sk_buff *br_handle_vlan(struct net_bridge *br,
  return NULL;
  }
  }
- if (br->vlan_stats_enabled) {
+ if (br_opt_get(br, BROPT_VLAN_STATS_ENABLED)) {
  stats = this_cpu_ptr(v->stats);
  u64_stats_update_begin(&stats->syncp);
  stats->tx_bytes += skb->len;
@@ -477,14 +477,14 @@ static bool __allowed_ingress(const struct net_bridge *br,
  skb->vlan_tci |= pvid;
 
  /* if stats are disabled we can avoid the lookup */
- if (!br->vlan_stats_enabled)
+ if (!br_opt_get(br, BROPT_VLAN_STATS_ENABLED))
  return true;
  }
  v = br_vlan_find(vg, *vid);
  if (!v || !br_vlan_should_use(v))
  goto drop;
 
- if (br->vlan_stats_enabled) {
+ if (br_opt_get(br, BROPT_VLAN_STATS_ENABLED)) {
  stats = this_cpu_ptr(v->stats);
  u64_stats_update_begin(&stats->syncp);
  stats->rx_bytes += skb->len;
@@ -506,7 +506,7 @@ bool br_allowed_ingress(const struct net_bridge *br,
  /* If VLAN filtering is disabled on the bridge, all packets are
  * permitted.
  */
- if (!br->vlan_enabled) {
+ if (!br_opt_get(br, BROPT_VLAN_ENABLED)) {
  BR_INPUT_SKB_CB(skb)->vlan_filtered = false;
  return true;
  }
@@ -540,7 +540,7 @@ bool br_should_learn(struct net_bridge_port *p, struct sk_buff *skb, u16 *vid)
  struct net_bridge *br = p->br;
 
  /* If filtering was disabled at input, let it pass. */
- if (!br->vlan_enabled)
+ if (!br_opt_get(br, BROPT_VLAN_ENABLED))
  return true;
 
  vg = nbp_vlan_group_rcu(p);
@@ -679,7 +679,8 @@ static void recalculate_group_addr(struct net_bridge *br)
  return;
 
  spin_lock_bh(&br->lock);
- if (!br->vlan_enabled || br->vlan_proto == htons(ETH_P_8021Q)) {
+ if (!br_opt_get(br, BROPT_VLAN_ENABLED) ||
+    br->vlan_proto == htons(ETH_P_8021Q)) {
  /* Bridge Group Address */
  br->group_addr[5] = 0x00;
  } else { /* vlan_enabled && ETH_P_8021AD */
@@ -692,7 +693,8 @@ static void recalculate_group_addr(struct net_bridge *br)
 /* Must be protected by RTNL. */
 void br_recalculate_fwd_mask(struct net_bridge *br)
 {
- if (!br->vlan_enabled || br->vlan_proto == htons(ETH_P_8021Q))
+ if (!br_opt_get(br, BROPT_VLAN_ENABLED) ||
+    br->vlan_proto == htons(ETH_P_8021Q))
  br->group_fwd_mask_required = BR_GROUPFWD_DEFAULT;
  else /* vlan_enabled && ETH_P_8021AD */
  br->group_fwd_mask_required = BR_GROUPFWD_8021AD &
@@ -709,14 +711,14 @@ int __br_vlan_filter_toggle(struct net_bridge *br, unsigned long val)
  };
  int err;
 
- if (br->vlan_enabled == val)
+ if (br_opt_get(br, BROPT_VLAN_ENABLED) == !!val)
  return 0;
 
  err = switchdev_port_attr_set(br->dev, &attr);
  if (err && err != -EOPNOTSUPP)
  return err;
 
- br->vlan_enabled = val;
+ br_opt_toggle(br, BROPT_VLAN_ENABLED, !!val);
  br_manage_promisc(br);
  recalculate_group_addr(br);
  br_recalculate_fwd_mask(br);
@@ -733,7 +735,7 @@ bool br_vlan_enabled(const struct net_device *dev)
 {
  struct net_bridge *br = netdev_priv(dev);
 
- return !!br->vlan_enabled;
+ return br_opt_get(br, BROPT_VLAN_ENABLED);
 }
 EXPORT_SYMBOL_GPL(br_vlan_enabled);
 
@@ -799,7 +801,7 @@ int br_vlan_set_stats(struct net_bridge *br, unsigned long val)
  switch (val) {
  case 0:
  case 1:
- br->vlan_stats_enabled = val;
+ br_opt_toggle(br, BROPT_VLAN_STATS_ENABLED, !!val);
  break;
  default:
  return -EINVAL;
@@ -945,7 +947,7 @@ int br_vlan_set_default_pvid(struct net_bridge *br, unsigned long val)
  goto out;
 
  /* Only allow default pvid change when filtering is disabled */
- if (br->vlan_enabled) {
+ if (br_opt_get(br, BROPT_VLAN_ENABLED)) {
  pr_info_once("Please disable vlan filtering to change default_pvid\n");
  err = -EPERM;
  goto out;
@@ -999,7 +1001,7 @@ int nbp_vlan_init(struct net_bridge_port *p)
  .orig_dev = p->br->dev,
  .id = SWITCHDEV_ATTR_ID_BRIDGE_VLAN_FILTERING,
  .flags = SWITCHDEV_F_SKIP_EOPNOTSUPP,
- .u.vlan_filtering = p->br->vlan_enabled,
+ .u.vlan_filtering = br_opt_get(p->br, BROPT_VLAN_ENABLED),
  };
  struct net_bridge_vlan_group *vg;
  int ret = -ENOMEM;
--
2.20.1


--
kernel-team mailing list
[hidden email]
https://lists.ubuntu.com/mailman/listinfo/kernel-team
Reply | Threaded
Open this post in threaded view
|

[SRU][B][PATCH 2/5] net: bridge: convert nf call options to bits

Connor Kuehl
In reply to this post by Connor Kuehl
From: Nikolay Aleksandrov <[hidden email]>

BugLink: https://bugs.launchpad.net/bugs/1836910

No functional change, convert of nf_call_[ip|ip6|arp]tables to bits.

Signed-off-by: Nikolay Aleksandrov <[hidden email]>
Reviewed-by: Stephen Hemminger <[hidden email]>
Signed-off-by: David S. Miller <[hidden email]>
(backported from commit 8df3510f28e5cba2e94fecb40585e9b7e8a0c6ec)
[ Connor Kuehl: required minor adjustments to use updated symbols
  due to their changes in capitalization/relocation to wrappers. ]
Signed-off-by: Connor Kuehl <[hidden email]>
---
 net/bridge/br_netfilter_hooks.c |  7 ++++---
 net/bridge/br_netlink.c         | 12 ++++++------
 net/bridge/br_private.h         |  6 +++---
 net/bridge/br_sysfs_br.c        | 12 ++++++------
 4 files changed, 19 insertions(+), 18 deletions(-)

diff --git a/net/bridge/br_netfilter_hooks.c b/net/bridge/br_netfilter_hooks.c
index 7582f28ab306..d52db64832e2 100644
--- a/net/bridge/br_netfilter_hooks.c
+++ b/net/bridge/br_netfilter_hooks.c
@@ -484,14 +484,15 @@ static unsigned int br_nf_pre_routing(void *priv,
  br = p->br;
 
  if (IS_IPV6(skb) || IS_VLAN_IPV6(skb) || IS_PPPOE_IPV6(skb)) {
- if (!brnf_call_ip6tables && !br->nf_call_ip6tables)
+ if (!brnf_call_ip6tables &&
+    !br_opt_get(br, BROPT_NF_CALL_IP6TABLES))
  return NF_ACCEPT;
 
  nf_bridge_pull_encap_header_rcsum(skb);
  return br_nf_pre_routing_ipv6(priv, skb, state);
  }
 
- if (!brnf_call_iptables && !br->nf_call_iptables)
+ if (!brnf_call_iptables && !br_opt_get(br, BROPT_NF_CALL_IPTABLES))
  return NF_ACCEPT;
 
  if (!IS_IP(skb) && !IS_VLAN_IP(skb) && !IS_PPPOE_IP(skb))
@@ -633,7 +634,7 @@ static unsigned int br_nf_forward_arp(void *priv,
  return NF_ACCEPT;
  br = p->br;
 
- if (!brnf_call_arptables && !br->nf_call_arptables)
+ if (!brnf_call_arptables && !br_opt_get(br, BROPT_NF_CALL_ARPTABLES))
  return NF_ACCEPT;
 
  if (!IS_ARP(skb)) {
diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c
index cdc298e3ab9f..e21a741e1e72 100644
--- a/net/bridge/br_netlink.c
+++ b/net/bridge/br_netlink.c
@@ -1236,19 +1236,19 @@ static int br_changelink(struct net_device *brdev, struct nlattr *tb[],
  if (data[IFLA_BR_NF_CALL_IPTABLES]) {
  u8 val = nla_get_u8(data[IFLA_BR_NF_CALL_IPTABLES]);
 
- br->nf_call_iptables = val ? true : false;
+ br_opt_toggle(br, BROPT_NF_CALL_IPTABLES, !!val);
  }
 
  if (data[IFLA_BR_NF_CALL_IP6TABLES]) {
  u8 val = nla_get_u8(data[IFLA_BR_NF_CALL_IP6TABLES]);
 
- br->nf_call_ip6tables = val ? true : false;
+ br_opt_toggle(br, BROPT_NF_CALL_IP6TABLES, !!val);
  }
 
  if (data[IFLA_BR_NF_CALL_ARPTABLES]) {
  u8 val = nla_get_u8(data[IFLA_BR_NF_CALL_ARPTABLES]);
 
- br->nf_call_arptables = val ? true : false;
+ br_opt_toggle(br, BROPT_NF_CALL_ARPTABLES, !!val);
  }
 #endif
 
@@ -1435,11 +1435,11 @@ static int br_fill_info(struct sk_buff *skb, const struct net_device *brdev)
 #endif
 #if IS_ENABLED(CONFIG_BRIDGE_NETFILTER)
  if (nla_put_u8(skb, IFLA_BR_NF_CALL_IPTABLES,
-       br->nf_call_iptables ? 1 : 0) ||
+       br_opt_get(br, BROPT_NF_CALL_IPTABLES) ? 1 : 0) ||
     nla_put_u8(skb, IFLA_BR_NF_CALL_IP6TABLES,
-       br->nf_call_ip6tables ? 1 : 0) ||
+       br_opt_get(br, BROPT_NF_CALL_IP6TABLES) ? 1 : 0) ||
     nla_put_u8(skb, IFLA_BR_NF_CALL_ARPTABLES,
-       br->nf_call_arptables ? 1 : 0))
+       br_opt_get(br, BROPT_NF_CALL_ARPTABLES) ? 1 : 0))
  return -EMSGSIZE;
 #endif
 
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index bfd6541de1dc..e02092be096f 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -305,6 +305,9 @@ static inline struct net_bridge_port *br_port_get_rtnl_rcu(const struct net_devi
 enum net_bridge_opts {
  BROPT_VLAN_ENABLED,
  BROPT_VLAN_STATS_ENABLED,
+ BROPT_NF_CALL_IPTABLES,
+ BROPT_NF_CALL_IP6TABLES,
+ BROPT_NF_CALL_ARPTABLES,
 };
 
 struct net_bridge {
@@ -327,9 +330,6 @@ struct net_bridge {
  struct rtable fake_rtable;
  struct rt6_info fake_rt6_info;
  };
- bool nf_call_iptables;
- bool nf_call_ip6tables;
- bool nf_call_arptables;
 #endif
  u16 group_fwd_mask;
  u16 group_fwd_mask_required;
diff --git a/net/bridge/br_sysfs_br.c b/net/bridge/br_sysfs_br.c
index 8912c7c645ed..00718dc6ec3d 100644
--- a/net/bridge/br_sysfs_br.c
+++ b/net/bridge/br_sysfs_br.c
@@ -685,12 +685,12 @@ static ssize_t nf_call_iptables_show(
  struct device *d, struct device_attribute *attr, char *buf)
 {
  struct net_bridge *br = to_bridge(d);
- return sprintf(buf, "%u\n", br->nf_call_iptables);
+ return sprintf(buf, "%u\n", br_opt_get(br, BROPT_NF_CALL_IPTABLES));
 }
 
 static int set_nf_call_iptables(struct net_bridge *br, unsigned long val)
 {
- br->nf_call_iptables = val ? true : false;
+ br_opt_toggle(br, BROPT_NF_CALL_IPTABLES, !!val);
  return 0;
 }
 
@@ -706,12 +706,12 @@ static ssize_t nf_call_ip6tables_show(
  struct device *d, struct device_attribute *attr, char *buf)
 {
  struct net_bridge *br = to_bridge(d);
- return sprintf(buf, "%u\n", br->nf_call_ip6tables);
+ return sprintf(buf, "%u\n", br_opt_get(br, BROPT_NF_CALL_IP6TABLES));
 }
 
 static int set_nf_call_ip6tables(struct net_bridge *br, unsigned long val)
 {
- br->nf_call_ip6tables = val ? true : false;
+ br_opt_toggle(br, BROPT_NF_CALL_IP6TABLES, !!val);
  return 0;
 }
 
@@ -727,12 +727,12 @@ static ssize_t nf_call_arptables_show(
  struct device *d, struct device_attribute *attr, char *buf)
 {
  struct net_bridge *br = to_bridge(d);
- return sprintf(buf, "%u\n", br->nf_call_arptables);
+ return sprintf(buf, "%u\n", br_opt_get(br, BROPT_NF_CALL_ARPTABLES));
 }
 
 static int set_nf_call_arptables(struct net_bridge *br, unsigned long val)
 {
- br->nf_call_arptables = val ? true : false;
+ br_opt_toggle(br, BROPT_NF_CALL_ARPTABLES, !!val);
  return 0;
 }
 
--
2.20.1


--
kernel-team mailing list
[hidden email]
https://lists.ubuntu.com/mailman/listinfo/kernel-team
Reply | Threaded
Open this post in threaded view
|

[SRU][B][PATCH 3/5] netfilter: bridge: port sysctls to use brnf_net

Connor Kuehl
In reply to this post by Connor Kuehl
From: Christian Brauner <[hidden email]>

BugLink: https://bugs.launchpad.net/bugs/1836910

This ports the sysctls to use struct brnf_net.

With this patch we make it possible to namespace the br_netfilter module in
the following patch.

Signed-off-by: Christian Brauner <[hidden email]>
Signed-off-by: Pablo Neira Ayuso <[hidden email]>
(cherry picked from commit ff6d090d0db41425aef0cfe5dc58bb3cc12514a2)
Signed-off-by: Connor Kuehl <[hidden email]>
---
 include/net/netfilter/br_netfilter.h |   3 +-
 net/bridge/br_netfilter_hooks.c      | 162 +++++++++++++++++----------
 net/bridge/br_netfilter_ipv6.c       |   2 +-
 3 files changed, 107 insertions(+), 60 deletions(-)

diff --git a/include/net/netfilter/br_netfilter.h b/include/net/netfilter/br_netfilter.h
index 74af19c3a8f7..e51f5961272b 100644
--- a/include/net/netfilter/br_netfilter.h
+++ b/include/net/netfilter/br_netfilter.h
@@ -48,7 +48,8 @@ static inline struct rtable *bridge_parent_rtable(const struct net_device *dev)
  return port ? &port->br->fake_rtable : NULL;
 }
 
-struct net_device *setup_pre_routing(struct sk_buff *skb);
+struct net_device *setup_pre_routing(struct sk_buff *skb,
+     const struct net *net);
 void br_netfilter_enable(void);
 
 #if IS_ENABLED(CONFIG_IPV6)
diff --git a/net/bridge/br_netfilter_hooks.c b/net/bridge/br_netfilter_hooks.c
index d52db64832e2..b8a365ade2d4 100644
--- a/net/bridge/br_netfilter_hooks.c
+++ b/net/bridge/br_netfilter_hooks.c
@@ -48,27 +48,24 @@
 
 static unsigned int brnf_net_id __read_mostly;
 
-struct brnf_net {
- bool enabled;
-};
-
 #ifdef CONFIG_SYSCTL
 static struct ctl_table_header *brnf_sysctl_header;
-static int brnf_call_iptables __read_mostly = 1;
-static int brnf_call_ip6tables __read_mostly = 1;
-static int brnf_call_arptables __read_mostly = 1;
-static int brnf_filter_vlan_tagged __read_mostly;
-static int brnf_filter_pppoe_tagged __read_mostly;
-static int brnf_pass_vlan_indev __read_mostly;
-#else
-#define brnf_call_iptables 1
-#define brnf_call_ip6tables 1
-#define brnf_call_arptables 1
-#define brnf_filter_vlan_tagged 0
-#define brnf_filter_pppoe_tagged 0
-#define brnf_pass_vlan_indev 0
 #endif
 
+struct brnf_net {
+ bool enabled;
+
+ /* default value is 1 */
+ int call_iptables;
+ int call_ip6tables;
+ int call_arptables;
+
+ /* default value is 0 */
+ int filter_vlan_tagged;
+ int filter_pppoe_tagged;
+ int pass_vlan_indev;
+};
+
 #define IS_IP(skb) \
  (!skb_vlan_tag_present(skb) && skb->protocol == htons(ETH_P_IP))
 
@@ -88,17 +85,28 @@ static inline __be16 vlan_proto(const struct sk_buff *skb)
  return 0;
 }
 
-#define IS_VLAN_IP(skb) \
- (vlan_proto(skb) == htons(ETH_P_IP) && \
- brnf_filter_vlan_tagged)
+static inline bool is_vlan_ip(const struct sk_buff *skb, const struct net *net)
+{
+ struct brnf_net *brnet = net_generic(net, brnf_net_id);
+
+ return vlan_proto(skb) == htons(ETH_P_IP) && brnet->filter_vlan_tagged;
+}
+
+static inline bool is_vlan_ipv6(const struct sk_buff *skb,
+ const struct net *net)
+{
+ struct brnf_net *brnet = net_generic(net, brnf_net_id);
 
-#define IS_VLAN_IPV6(skb) \
- (vlan_proto(skb) == htons(ETH_P_IPV6) && \
- brnf_filter_vlan_tagged)
+ return vlan_proto(skb) == htons(ETH_P_IPV6) &&
+       brnet->filter_vlan_tagged;
+}
 
-#define IS_VLAN_ARP(skb) \
- (vlan_proto(skb) == htons(ETH_P_ARP) && \
- brnf_filter_vlan_tagged)
+static inline bool is_vlan_arp(const struct sk_buff *skb, const struct net *net)
+{
+ struct brnf_net *brnet = net_generic(net, brnf_net_id);
+
+ return vlan_proto(skb) == htons(ETH_P_ARP) && brnet->filter_vlan_tagged;
+}
 
 static inline __be16 pppoe_proto(const struct sk_buff *skb)
 {
@@ -106,15 +114,23 @@ static inline __be16 pppoe_proto(const struct sk_buff *skb)
     sizeof(struct pppoe_hdr)));
 }
 
-#define IS_PPPOE_IP(skb) \
- (skb->protocol == htons(ETH_P_PPP_SES) && \
- pppoe_proto(skb) == htons(PPP_IP) && \
- brnf_filter_pppoe_tagged)
+static inline bool is_pppoe_ip(const struct sk_buff *skb, const struct net *net)
+{
+ struct brnf_net *brnet = net_generic(net, brnf_net_id);
+
+ return skb->protocol == htons(ETH_P_PPP_SES) &&
+       pppoe_proto(skb) == htons(PPP_IP) && brnet->filter_pppoe_tagged;
+}
+
+static inline bool is_pppoe_ipv6(const struct sk_buff *skb,
+ const struct net *net)
+{
+ struct brnf_net *brnet = net_generic(net, brnf_net_id);
 
-#define IS_PPPOE_IPV6(skb) \
- (skb->protocol == htons(ETH_P_PPP_SES) && \
- pppoe_proto(skb) == htons(PPP_IPV6) && \
- brnf_filter_pppoe_tagged)
+ return skb->protocol == htons(ETH_P_PPP_SES) &&
+       pppoe_proto(skb) == htons(PPP_IPV6) &&
+       brnet->filter_pppoe_tagged;
+}
 
 /* largest possible L2 header, see br_nf_dev_queue_xmit() */
 #define NF_BRIDGE_MAX_MAC_HEADER_LENGTH (PPPOE_SES_HLEN + ETH_HLEN)
@@ -422,12 +438,16 @@ static int br_nf_pre_routing_finish(struct net *net, struct sock *sk, struct sk_
  return 0;
 }
 
-static struct net_device *brnf_get_logical_dev(struct sk_buff *skb, const struct net_device *dev)
+static struct net_device *brnf_get_logical_dev(struct sk_buff *skb,
+       const struct net_device *dev,
+       const struct net *net)
 {
  struct net_device *vlan, *br;
+ struct brnf_net *brnet = net_generic(net, brnf_net_id);
 
  br = bridge_parent(dev);
- if (brnf_pass_vlan_indev == 0 || !skb_vlan_tag_present(skb))
+
+ if (brnet->pass_vlan_indev == 0 || !skb_vlan_tag_present(skb))
  return br;
 
  vlan = __vlan_find_dev_deep_rcu(br, skb->vlan_proto,
@@ -437,7 +457,7 @@ static struct net_device *brnf_get_logical_dev(struct sk_buff *skb, const struct
 }
 
 /* Some common code for IPv4/IPv6 */
-struct net_device *setup_pre_routing(struct sk_buff *skb)
+struct net_device *setup_pre_routing(struct sk_buff *skb, const struct net *net)
 {
  struct nf_bridge_info *nf_bridge = nf_bridge_info_get(skb);
 
@@ -448,7 +468,7 @@ struct net_device *setup_pre_routing(struct sk_buff *skb)
 
  nf_bridge->in_prerouting = 1;
  nf_bridge->physindev = skb->dev;
- skb->dev = brnf_get_logical_dev(skb, skb->dev);
+ skb->dev = brnf_get_logical_dev(skb, skb->dev, net);
 
  if (skb->protocol == htons(ETH_P_8021Q))
  nf_bridge->orig_proto = BRNF_PROTO_8021Q;
@@ -474,6 +494,7 @@ static unsigned int br_nf_pre_routing(void *priv,
  struct net_bridge_port *p;
  struct net_bridge *br;
  __u32 len = nf_bridge_encap_header_len(skb);
+ struct brnf_net *brnet;
 
  if (unlikely(!pskb_may_pull(skb, len)))
  return NF_DROP;
@@ -483,8 +504,10 @@ static unsigned int br_nf_pre_routing(void *priv,
  return NF_DROP;
  br = p->br;
 
- if (IS_IPV6(skb) || IS_VLAN_IPV6(skb) || IS_PPPOE_IPV6(skb)) {
- if (!brnf_call_ip6tables &&
+ brnet = net_generic(state->net, brnf_net_id);
+ if (IS_IPV6(skb) || is_vlan_ipv6(skb, state->net) ||
+    is_pppoe_ipv6(skb, state->net)) {
+ if (!brnet->call_ip6tables &&
     !br_opt_get(br, BROPT_NF_CALL_IP6TABLES))
  return NF_ACCEPT;
 
@@ -492,10 +515,11 @@ static unsigned int br_nf_pre_routing(void *priv,
  return br_nf_pre_routing_ipv6(priv, skb, state);
  }
 
- if (!brnf_call_iptables && !br_opt_get(br, BROPT_NF_CALL_IPTABLES))
+ if (!brnet->call_iptables && !br_opt_get(br, BROPT_NF_CALL_IPTABLES))
  return NF_ACCEPT;
 
- if (!IS_IP(skb) && !IS_VLAN_IP(skb) && !IS_PPPOE_IP(skb))
+ if (!IS_IP(skb) && !is_vlan_ip(skb, state->net) &&
+    !is_pppoe_ip(skb, state->net))
  return NF_ACCEPT;
 
  nf_bridge_pull_encap_header_rcsum(skb);
@@ -506,7 +530,7 @@ static unsigned int br_nf_pre_routing(void *priv,
  nf_bridge_put(skb->nf_bridge);
  if (!nf_bridge_alloc(skb))
  return NF_DROP;
- if (!setup_pre_routing(skb))
+ if (!setup_pre_routing(skb, state->net))
  return NF_DROP;
 
  nf_bridge = nf_bridge_info_get(skb);
@@ -528,7 +552,7 @@ static int br_nf_forward_finish(struct net *net, struct sock *sk, struct sk_buff
  struct nf_bridge_info *nf_bridge = nf_bridge_info_get(skb);
  struct net_device *in;
 
- if (!IS_ARP(skb) && !IS_VLAN_ARP(skb)) {
+ if (!IS_ARP(skb) && !is_vlan_arp(skb, net)) {
 
  if (skb->protocol == htons(ETH_P_IP))
  nf_bridge->frag_max_size = IPCB(skb)->frag_max_size;
@@ -582,9 +606,11 @@ static unsigned int br_nf_forward_ip(void *priv,
  if (!parent)
  return NF_DROP;
 
- if (IS_IP(skb) || IS_VLAN_IP(skb) || IS_PPPOE_IP(skb))
+ if (IS_IP(skb) || is_vlan_ip(skb, state->net) ||
+    is_pppoe_ip(skb, state->net))
  pf = NFPROTO_IPV4;
- else if (IS_IPV6(skb) || IS_VLAN_IPV6(skb) || IS_PPPOE_IPV6(skb))
+ else if (IS_IPV6(skb) || is_vlan_ipv6(skb, state->net) ||
+ is_pppoe_ipv6(skb, state->net))
  pf = NFPROTO_IPV6;
  else
  return NF_ACCEPT;
@@ -615,7 +641,7 @@ static unsigned int br_nf_forward_ip(void *priv,
  skb->protocol = htons(ETH_P_IPV6);
 
  NF_HOOK(pf, NF_INET_FORWARD, state->net, NULL, skb,
- brnf_get_logical_dev(skb, state->in),
+ brnf_get_logical_dev(skb, state->in, state->net),
  parent, br_nf_forward_finish);
 
  return NF_STOLEN;
@@ -628,23 +654,25 @@ static unsigned int br_nf_forward_arp(void *priv,
  struct net_bridge_port *p;
  struct net_bridge *br;
  struct net_device **d = (struct net_device **)(skb->cb);
+ struct brnf_net *brnet;
 
  p = br_port_get_rcu(state->out);
  if (p == NULL)
  return NF_ACCEPT;
  br = p->br;
 
- if (!brnf_call_arptables && !br_opt_get(br, BROPT_NF_CALL_ARPTABLES))
+ brnet = net_generic(state->net, brnf_net_id);
+ if (!brnet->call_arptables && !br_opt_get(br, BROPT_NF_CALL_ARPTABLES))
  return NF_ACCEPT;
 
  if (!IS_ARP(skb)) {
- if (!IS_VLAN_ARP(skb))
+ if (!is_vlan_arp(skb, state->net))
  return NF_ACCEPT;
  nf_bridge_pull_encap_header(skb);
  }
 
  if (arp_hdr(skb)->ar_pln != 4) {
- if (IS_VLAN_ARP(skb))
+ if (is_vlan_arp(skb, state->net))
  nf_bridge_push_encap_header(skb);
  return NF_ACCEPT;
  }
@@ -799,9 +827,11 @@ static unsigned int br_nf_post_routing(void *priv,
  if (!realoutdev)
  return NF_DROP;
 
- if (IS_IP(skb) || IS_VLAN_IP(skb) || IS_PPPOE_IP(skb))
+ if (IS_IP(skb) || is_vlan_ip(skb, state->net) ||
+    is_pppoe_ip(skb, state->net))
  pf = NFPROTO_IPV4;
- else if (IS_IPV6(skb) || IS_VLAN_IPV6(skb) || IS_PPPOE_IPV6(skb))
+ else if (IS_IPV6(skb) || is_vlan_ipv6(skb, state->net) ||
+ is_pppoe_ipv6(skb, state->net))
  pf = NFPROTO_IPV6;
  else
  return NF_ACCEPT;
@@ -1029,53 +1059,59 @@ int brnf_sysctl_call_tables(struct ctl_table *ctl, int write,
 static struct ctl_table brnf_table[] = {
  {
  .procname = "bridge-nf-call-arptables",
- .data = &brnf_call_arptables,
  .maxlen = sizeof(int),
  .mode = 0644,
  .proc_handler = brnf_sysctl_call_tables,
  },
  {
  .procname = "bridge-nf-call-iptables",
- .data = &brnf_call_iptables,
  .maxlen = sizeof(int),
  .mode = 0644,
  .proc_handler = brnf_sysctl_call_tables,
  },
  {
  .procname = "bridge-nf-call-ip6tables",
- .data = &brnf_call_ip6tables,
  .maxlen = sizeof(int),
  .mode = 0644,
  .proc_handler = brnf_sysctl_call_tables,
  },
  {
  .procname = "bridge-nf-filter-vlan-tagged",
- .data = &brnf_filter_vlan_tagged,
  .maxlen = sizeof(int),
  .mode = 0644,
  .proc_handler = brnf_sysctl_call_tables,
  },
  {
  .procname = "bridge-nf-filter-pppoe-tagged",
- .data = &brnf_filter_pppoe_tagged,
  .maxlen = sizeof(int),
  .mode = 0644,
  .proc_handler = brnf_sysctl_call_tables,
  },
  {
  .procname = "bridge-nf-pass-vlan-input-dev",
- .data = &brnf_pass_vlan_indev,
  .maxlen = sizeof(int),
  .mode = 0644,
  .proc_handler = brnf_sysctl_call_tables,
  },
  { }
 };
+
+static inline void br_netfilter_sysctl_default(struct brnf_net *brnf)
+{
+ brnf->call_iptables = 1;
+ brnf->call_ip6tables = 1;
+ brnf->call_arptables = 1;
+ brnf->filter_vlan_tagged = 0;
+ brnf->filter_pppoe_tagged = 0;
+ brnf->pass_vlan_indev = 0;
+}
+
 #endif
 
 static int __init br_netfilter_init(void)
 {
  int ret;
+ struct brnf_net *brnet;
 
  ret = register_pernet_subsys(&brnf_net_ops);
  if (ret < 0)
@@ -1088,6 +1124,16 @@ static int __init br_netfilter_init(void)
  }
 
 #ifdef CONFIG_SYSCTL
+ brnet = net_generic(&init_net, brnf_net_id);
+ brnf_table[0].data = &brnet->call_arptables;
+ brnf_table[1].data = &brnet->call_iptables;
+ brnf_table[2].data = &brnet->call_ip6tables;
+ brnf_table[3].data = &brnet->filter_vlan_tagged;
+ brnf_table[4].data = &brnet->filter_pppoe_tagged;
+ brnf_table[5].data = &brnet->pass_vlan_indev;
+
+ br_netfilter_sysctl_default(brnet);
+
  brnf_sysctl_header = register_net_sysctl(&init_net, "net/bridge", brnf_table);
  if (brnf_sysctl_header == NULL) {
  printk(KERN_WARNING
diff --git a/net/bridge/br_netfilter_ipv6.c b/net/bridge/br_netfilter_ipv6.c
index 96c072e71ea2..d2220e502b6f 100644
--- a/net/bridge/br_netfilter_ipv6.c
+++ b/net/bridge/br_netfilter_ipv6.c
@@ -227,7 +227,7 @@ unsigned int br_nf_pre_routing_ipv6(void *priv,
  nf_bridge_put(skb->nf_bridge);
  if (!nf_bridge_alloc(skb))
  return NF_DROP;
- if (!setup_pre_routing(skb))
+ if (!setup_pre_routing(skb, state->net))
  return NF_DROP;
 
  nf_bridge = nf_bridge_info_get(skb);
--
2.20.1


--
kernel-team mailing list
[hidden email]
https://lists.ubuntu.com/mailman/listinfo/kernel-team
Reply | Threaded
Open this post in threaded view
|

[SRU][B][PATCH 4/5] netfilter: bridge: namespace bridge netfilter sysctls

Connor Kuehl
In reply to this post by Connor Kuehl
From: Christian Brauner <[hidden email]>

BugLink: https://bugs.launchpad.net/bugs/1836910

Currently, the /proc/sys/net/bridge folder is only created in the initial
network namespace. This patch ensures that the /proc/sys/net/bridge folder
is available in each network namespace if the module is loaded and
disappears from all network namespaces when the module is unloaded.

In doing so the patch makes the sysctls:

bridge-nf-call-arptables
bridge-nf-call-ip6tables
bridge-nf-call-iptables
bridge-nf-filter-pppoe-tagged
bridge-nf-filter-vlan-tagged
bridge-nf-pass-vlan-input-dev

apply per network namespace. This unblocks some use-cases where users would
like to e.g. not do bridge filtering for bridges in a specific network
namespace while doing so for bridges located in another network namespace.

The netfilter rules are afaict already per network namespace so it should
be safe for users to specify whether bridge devices inside a network
namespace are supposed to go through iptables et al. or not. Also, this can
already be done per-bridge by setting an option for each individual bridge
via Netlink. It should also be possible to do this for all bridges in a
network namespace via sysctls.

Cc: Tyler Hicks <[hidden email]>
Signed-off-by: Christian Brauner <[hidden email]>
Signed-off-by: Pablo Neira Ayuso <[hidden email]>
(cherry picked from commit 22567590b2e634247931b3d2351384ba45720ebe)
Signed-off-by: Connor Kuehl <[hidden email]>
---
 net/bridge/br_netfilter_hooks.c | 117 ++++++++++++++++++++------------
 1 file changed, 72 insertions(+), 45 deletions(-)

diff --git a/net/bridge/br_netfilter_hooks.c b/net/bridge/br_netfilter_hooks.c
index b8a365ade2d4..f12904ab50e5 100644
--- a/net/bridge/br_netfilter_hooks.c
+++ b/net/bridge/br_netfilter_hooks.c
@@ -48,13 +48,13 @@
 
 static unsigned int brnf_net_id __read_mostly;
 
-#ifdef CONFIG_SYSCTL
-static struct ctl_table_header *brnf_sysctl_header;
-#endif
-
 struct brnf_net {
  bool enabled;
 
+#ifdef CONFIG_SYSCTL
+ struct ctl_table_header *ctl_hdr;
+#endif
+
  /* default value is 1 */
  int call_iptables;
  int call_ip6tables;
@@ -984,23 +984,6 @@ static int brnf_device_event(struct notifier_block *unused, unsigned long event,
  return NOTIFY_OK;
 }
 
-static void __net_exit brnf_exit_net(struct net *net)
-{
- struct brnf_net *brnet = net_generic(net, brnf_net_id);
-
- if (!brnet->enabled)
- return;
-
- nf_unregister_net_hooks(net, br_nf_ops, ARRAY_SIZE(br_nf_ops));
- brnet->enabled = false;
-}
-
-static struct pernet_operations brnf_net_ops __read_mostly = {
- .exit = brnf_exit_net,
- .id   = &brnf_net_id,
- .size = sizeof(struct brnf_net),
-};
-
 static struct notifier_block brnf_notifier __read_mostly = {
  .notifier_call = brnf_device_event,
 };
@@ -1106,12 +1089,79 @@ static inline void br_netfilter_sysctl_default(struct brnf_net *brnf)
  brnf->pass_vlan_indev = 0;
 }
 
+static int br_netfilter_sysctl_init_net(struct net *net)
+{
+ struct ctl_table *table = brnf_table;
+ struct brnf_net *brnet;
+
+ if (!net_eq(net, &init_net)) {
+ table = kmemdup(table, sizeof(brnf_table), GFP_KERNEL);
+ if (!table)
+ return -ENOMEM;
+ }
+
+ brnet = net_generic(net, brnf_net_id);
+ table[0].data = &brnet->call_arptables;
+ table[1].data = &brnet->call_iptables;
+ table[2].data = &brnet->call_ip6tables;
+ table[3].data = &brnet->filter_vlan_tagged;
+ table[4].data = &brnet->filter_pppoe_tagged;
+ table[5].data = &brnet->pass_vlan_indev;
+
+ br_netfilter_sysctl_default(brnet);
+
+ brnet->ctl_hdr = register_net_sysctl(net, "net/bridge", table);
+ if (!brnet->ctl_hdr) {
+ if (!net_eq(net, &init_net))
+ kfree(table);
+
+ return -ENOMEM;
+ }
+
+ return 0;
+}
+
+static void br_netfilter_sysctl_exit_net(struct net *net,
+ struct brnf_net *brnet)
+{
+ unregister_net_sysctl_table(brnet->ctl_hdr);
+ if (!net_eq(net, &init_net))
+ kfree(brnet->ctl_hdr->ctl_table_arg);
+}
+
+static int __net_init brnf_init_net(struct net *net)
+{
+ return br_netfilter_sysctl_init_net(net);
+}
+#endif
+
+static void __net_exit brnf_exit_net(struct net *net)
+{
+ struct brnf_net *brnet;
+
+ brnet = net_generic(net, brnf_net_id);
+ if (brnet->enabled) {
+ nf_unregister_net_hooks(net, br_nf_ops, ARRAY_SIZE(br_nf_ops));
+ brnet->enabled = false;
+ }
+
+#ifdef CONFIG_SYSCTL
+ br_netfilter_sysctl_exit_net(net, brnet);
 #endif
+}
+
+static struct pernet_operations brnf_net_ops __read_mostly = {
+#ifdef CONFIG_SYSCTL
+ .init = brnf_init_net,
+#endif
+ .exit = brnf_exit_net,
+ .id   = &brnf_net_id,
+ .size = sizeof(struct brnf_net),
+};
 
 static int __init br_netfilter_init(void)
 {
  int ret;
- struct brnf_net *brnet;
 
  ret = register_pernet_subsys(&brnf_net_ops);
  if (ret < 0)
@@ -1123,26 +1173,6 @@ static int __init br_netfilter_init(void)
  return ret;
  }
 
-#ifdef CONFIG_SYSCTL
- brnet = net_generic(&init_net, brnf_net_id);
- brnf_table[0].data = &brnet->call_arptables;
- brnf_table[1].data = &brnet->call_iptables;
- brnf_table[2].data = &brnet->call_ip6tables;
- brnf_table[3].data = &brnet->filter_vlan_tagged;
- brnf_table[4].data = &brnet->filter_pppoe_tagged;
- brnf_table[5].data = &brnet->pass_vlan_indev;
-
- br_netfilter_sysctl_default(brnet);
-
- brnf_sysctl_header = register_net_sysctl(&init_net, "net/bridge", brnf_table);
- if (brnf_sysctl_header == NULL) {
- printk(KERN_WARNING
-       "br_netfilter: can't register to sysctl.\n");
- unregister_netdevice_notifier(&brnf_notifier);
- unregister_pernet_subsys(&brnf_net_ops);
- return -ENOMEM;
- }
-#endif
  RCU_INIT_POINTER(nf_br_ops, &br_ops);
  printk(KERN_NOTICE "Bridge firewalling registered\n");
  return 0;
@@ -1153,9 +1183,6 @@ static void __exit br_netfilter_fini(void)
  RCU_INIT_POINTER(nf_br_ops, NULL);
  unregister_netdevice_notifier(&brnf_notifier);
  unregister_pernet_subsys(&brnf_net_ops);
-#ifdef CONFIG_SYSCTL
- unregister_net_sysctl_table(brnf_sysctl_header);
-#endif
 }
 
 module_init(br_netfilter_init);
--
2.20.1


--
kernel-team mailing list
[hidden email]
https://lists.ubuntu.com/mailman/listinfo/kernel-team
Reply | Threaded
Open this post in threaded view
|

[SRU][B][PATCH 5/5] netfilter: bridge: prevent UAF in brnf_exit_net()

Connor Kuehl
In reply to this post by Connor Kuehl
From: Christian Brauner <[hidden email]>

BugLink: https://bugs.launchpad.net/bugs/1836910

Prevent a UAF in brnf_exit_net().

When unregister_net_sysctl_table() is called the ctl_hdr pointer will
obviously be freed and so accessing it righter after is invalid. Fix
this by stashing a pointer to the table we want to free before we
unregister the sysctl header.

Note that syzkaller falsely chased this down to the drm tree so the
Fixes tag that syzkaller requested would be wrong. This commit uses a
different but the correct Fixes tag.

/* Splat */

BUG: KASAN: use-after-free in br_netfilter_sysctl_exit_net
net/bridge/br_netfilter_hooks.c:1121 [inline]
BUG: KASAN: use-after-free in brnf_exit_net+0x38c/0x3a0
net/bridge/br_netfilter_hooks.c:1141
Read of size 8 at addr ffff8880a4078d60 by task kworker/u4:4/8749

CPU: 0 PID: 8749 Comm: kworker/u4:4 Not tainted 5.2.0-rc5-next-20190618 #17
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google
01/01/2011
Workqueue: netns cleanup_net
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x172/0x1f0 lib/dump_stack.c:113
 print_address_description.cold+0xd4/0x306 mm/kasan/report.c:351
 __kasan_report.cold+0x1b/0x36 mm/kasan/report.c:482
 kasan_report+0x12/0x20 mm/kasan/common.c:614
 __asan_report_load8_noabort+0x14/0x20 mm/kasan/generic_report.c:132
 br_netfilter_sysctl_exit_net net/bridge/br_netfilter_hooks.c:1121 [inline]
 brnf_exit_net+0x38c/0x3a0 net/bridge/br_netfilter_hooks.c:1141
 ops_exit_list.isra.0+0xaa/0x150 net/core/net_namespace.c:154
 cleanup_net+0x3fb/0x960 net/core/net_namespace.c:553
 process_one_work+0x989/0x1790 kernel/workqueue.c:2269
 worker_thread+0x98/0xe40 kernel/workqueue.c:2415
 kthread+0x354/0x420 kernel/kthread.c:255
 ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352

Allocated by task 11374:
 save_stack+0x23/0x90 mm/kasan/common.c:71
 set_track mm/kasan/common.c:79 [inline]
 __kasan_kmalloc mm/kasan/common.c:489 [inline]
 __kasan_kmalloc.constprop.0+0xcf/0xe0 mm/kasan/common.c:462
 kasan_kmalloc+0x9/0x10 mm/kasan/common.c:503
 __do_kmalloc mm/slab.c:3645 [inline]
 __kmalloc+0x15c/0x740 mm/slab.c:3654
 kmalloc include/linux/slab.h:552 [inline]
 kzalloc include/linux/slab.h:743 [inline]
 __register_sysctl_table+0xc7/0xef0 fs/proc/proc_sysctl.c:1327
 register_net_sysctl+0x29/0x30 net/sysctl_net.c:121
 br_netfilter_sysctl_init_net net/bridge/br_netfilter_hooks.c:1105 [inline]
 brnf_init_net+0x379/0x6a0 net/bridge/br_netfilter_hooks.c:1126
 ops_init+0xb3/0x410 net/core/net_namespace.c:130
 setup_net+0x2d3/0x740 net/core/net_namespace.c:316
 copy_net_ns+0x1df/0x340 net/core/net_namespace.c:439
 create_new_namespaces+0x400/0x7b0 kernel/nsproxy.c:103
 unshare_nsproxy_namespaces+0xc2/0x200 kernel/nsproxy.c:202
 ksys_unshare+0x444/0x980 kernel/fork.c:2822
 __do_sys_unshare kernel/fork.c:2890 [inline]
 __se_sys_unshare kernel/fork.c:2888 [inline]
 __x64_sys_unshare+0x31/0x40 kernel/fork.c:2888
 do_syscall_64+0xfd/0x680 arch/x86/entry/common.c:301
 entry_SYSCALL_64_after_hwframe+0x49/0xbe

Freed by task 9:
 save_stack+0x23/0x90 mm/kasan/common.c:71
 set_track mm/kasan/common.c:79 [inline]
 __kasan_slab_free+0x102/0x150 mm/kasan/common.c:451
 kasan_slab_free+0xe/0x10 mm/kasan/common.c:459
 __cache_free mm/slab.c:3417 [inline]
 kfree+0x10a/0x2c0 mm/slab.c:3746
 __rcu_reclaim kernel/rcu/rcu.h:215 [inline]
 rcu_do_batch kernel/rcu/tree.c:2092 [inline]
 invoke_rcu_callbacks kernel/rcu/tree.c:2310 [inline]
 rcu_core+0xcc7/0x1500 kernel/rcu/tree.c:2291
 __do_softirq+0x25c/0x94c kernel/softirq.c:292

The buggy address belongs to the object at ffff8880a4078d40
 which belongs to the cache kmalloc-512 of size 512
The buggy address is located 32 bytes inside of
 512-byte region [ffff8880a4078d40, ffff8880a4078f40)
The buggy address belongs to the page:
page:ffffea0002901e00 refcount:1 mapcount:0 mapping:ffff8880aa400a80
index:0xffff8880a40785c0
flags: 0x1fffc0000000200(slab)
raw: 01fffc0000000200 ffffea0001d636c8 ffffea0001b07308 ffff8880aa400a80
raw: ffff8880a40785c0 ffff8880a40780c0 0000000100000004 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff8880a4078c00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff8880a4078c80: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
> ffff8880a4078d00: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb
                                                       ^
 ffff8880a4078d80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff8880a4078e00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb

Reported-by: [hidden email]
Fixes: 22567590b2e6 ("netfilter: bridge: namespace bridge netfilter sysctls")
Signed-off-by: Christian Brauner <[hidden email]>
Signed-off-by: Pablo Neira Ayuso <[hidden email]>
(cherry picked from commit 7e6daf50e1f4ea0ecd56406beb64ffc66e1e94db)
Signed-off-by: Connor Kuehl <[hidden email]>
---
 net/bridge/br_netfilter_hooks.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/net/bridge/br_netfilter_hooks.c b/net/bridge/br_netfilter_hooks.c
index f12904ab50e5..c22f1bcb394f 100644
--- a/net/bridge/br_netfilter_hooks.c
+++ b/net/bridge/br_netfilter_hooks.c
@@ -1124,9 +1124,11 @@ static int br_netfilter_sysctl_init_net(struct net *net)
 static void br_netfilter_sysctl_exit_net(struct net *net,
  struct brnf_net *brnet)
 {
+ struct ctl_table *table = brnet->ctl_hdr->ctl_table_arg;
+
  unregister_net_sysctl_table(brnet->ctl_hdr);
  if (!net_eq(net, &init_net))
- kfree(brnet->ctl_hdr->ctl_table_arg);
+ kfree(table);
 }
 
 static int __net_init brnf_init_net(struct net *net)
--
2.20.1


--
kernel-team mailing list
[hidden email]
https://lists.ubuntu.com/mailman/listinfo/kernel-team
Reply | Threaded
Open this post in threaded view
|

[SRU][D][PATCH 1/3] netfilter: bridge: port sysctls to use brnf_net

Connor Kuehl
In reply to this post by Connor Kuehl
From: Christian Brauner <[hidden email]>

BugLink: https://bugs.launchpad.net/bugs/1836910

This ports the sysctls to use struct brnf_net.

With this patch we make it possible to namespace the br_netfilter module in
the following patch.

Signed-off-by: Christian Brauner <[hidden email]>
Signed-off-by: Pablo Neira Ayuso <[hidden email]>
(cherry picked from commit ff6d090d0db41425aef0cfe5dc58bb3cc12514a2)
Signed-off-by: Connor Kuehl <[hidden email]>
---
 include/net/netfilter/br_netfilter.h |   3 +-
 net/bridge/br_netfilter_hooks.c      | 162 +++++++++++++++++----------
 net/bridge/br_netfilter_ipv6.c       |   2 +-
 3 files changed, 107 insertions(+), 60 deletions(-)

diff --git a/include/net/netfilter/br_netfilter.h b/include/net/netfilter/br_netfilter.h
index 89808ce293c4..302fcd3aade2 100644
--- a/include/net/netfilter/br_netfilter.h
+++ b/include/net/netfilter/br_netfilter.h
@@ -42,7 +42,8 @@ static inline struct rtable *bridge_parent_rtable(const struct net_device *dev)
  return port ? &port->br->fake_rtable : NULL;
 }
 
-struct net_device *setup_pre_routing(struct sk_buff *skb);
+struct net_device *setup_pre_routing(struct sk_buff *skb,
+     const struct net *net);
 
 #if IS_ENABLED(CONFIG_IPV6)
 int br_validate_ipv6(struct net *net, struct sk_buff *skb);
diff --git a/net/bridge/br_netfilter_hooks.c b/net/bridge/br_netfilter_hooks.c
index fc605758323b..484c37b762e7 100644
--- a/net/bridge/br_netfilter_hooks.c
+++ b/net/bridge/br_netfilter_hooks.c
@@ -49,27 +49,24 @@
 
 static unsigned int brnf_net_id __read_mostly;
 
-struct brnf_net {
- bool enabled;
-};
-
 #ifdef CONFIG_SYSCTL
 static struct ctl_table_header *brnf_sysctl_header;
-static int brnf_call_iptables __read_mostly = 1;
-static int brnf_call_ip6tables __read_mostly = 1;
-static int brnf_call_arptables __read_mostly = 1;
-static int brnf_filter_vlan_tagged __read_mostly;
-static int brnf_filter_pppoe_tagged __read_mostly;
-static int brnf_pass_vlan_indev __read_mostly;
-#else
-#define brnf_call_iptables 1
-#define brnf_call_ip6tables 1
-#define brnf_call_arptables 1
-#define brnf_filter_vlan_tagged 0
-#define brnf_filter_pppoe_tagged 0
-#define brnf_pass_vlan_indev 0
 #endif
 
+struct brnf_net {
+ bool enabled;
+
+ /* default value is 1 */
+ int call_iptables;
+ int call_ip6tables;
+ int call_arptables;
+
+ /* default value is 0 */
+ int filter_vlan_tagged;
+ int filter_pppoe_tagged;
+ int pass_vlan_indev;
+};
+
 #define IS_IP(skb) \
  (!skb_vlan_tag_present(skb) && skb->protocol == htons(ETH_P_IP))
 
@@ -89,17 +86,28 @@ static inline __be16 vlan_proto(const struct sk_buff *skb)
  return 0;
 }
 
-#define IS_VLAN_IP(skb) \
- (vlan_proto(skb) == htons(ETH_P_IP) && \
- brnf_filter_vlan_tagged)
+static inline bool is_vlan_ip(const struct sk_buff *skb, const struct net *net)
+{
+ struct brnf_net *brnet = net_generic(net, brnf_net_id);
+
+ return vlan_proto(skb) == htons(ETH_P_IP) && brnet->filter_vlan_tagged;
+}
+
+static inline bool is_vlan_ipv6(const struct sk_buff *skb,
+ const struct net *net)
+{
+ struct brnf_net *brnet = net_generic(net, brnf_net_id);
 
-#define IS_VLAN_IPV6(skb) \
- (vlan_proto(skb) == htons(ETH_P_IPV6) && \
- brnf_filter_vlan_tagged)
+ return vlan_proto(skb) == htons(ETH_P_IPV6) &&
+       brnet->filter_vlan_tagged;
+}
 
-#define IS_VLAN_ARP(skb) \
- (vlan_proto(skb) == htons(ETH_P_ARP) && \
- brnf_filter_vlan_tagged)
+static inline bool is_vlan_arp(const struct sk_buff *skb, const struct net *net)
+{
+ struct brnf_net *brnet = net_generic(net, brnf_net_id);
+
+ return vlan_proto(skb) == htons(ETH_P_ARP) && brnet->filter_vlan_tagged;
+}
 
 static inline __be16 pppoe_proto(const struct sk_buff *skb)
 {
@@ -107,15 +115,23 @@ static inline __be16 pppoe_proto(const struct sk_buff *skb)
     sizeof(struct pppoe_hdr)));
 }
 
-#define IS_PPPOE_IP(skb) \
- (skb->protocol == htons(ETH_P_PPP_SES) && \
- pppoe_proto(skb) == htons(PPP_IP) && \
- brnf_filter_pppoe_tagged)
+static inline bool is_pppoe_ip(const struct sk_buff *skb, const struct net *net)
+{
+ struct brnf_net *brnet = net_generic(net, brnf_net_id);
+
+ return skb->protocol == htons(ETH_P_PPP_SES) &&
+       pppoe_proto(skb) == htons(PPP_IP) && brnet->filter_pppoe_tagged;
+}
+
+static inline bool is_pppoe_ipv6(const struct sk_buff *skb,
+ const struct net *net)
+{
+ struct brnf_net *brnet = net_generic(net, brnf_net_id);
 
-#define IS_PPPOE_IPV6(skb) \
- (skb->protocol == htons(ETH_P_PPP_SES) && \
- pppoe_proto(skb) == htons(PPP_IPV6) && \
- brnf_filter_pppoe_tagged)
+ return skb->protocol == htons(ETH_P_PPP_SES) &&
+       pppoe_proto(skb) == htons(PPP_IPV6) &&
+       brnet->filter_pppoe_tagged;
+}
 
 /* largest possible L2 header, see br_nf_dev_queue_xmit() */
 #define NF_BRIDGE_MAX_MAC_HEADER_LENGTH (PPPOE_SES_HLEN + ETH_HLEN)
@@ -412,12 +428,16 @@ static int br_nf_pre_routing_finish(struct net *net, struct sock *sk, struct sk_
  return 0;
 }
 
-static struct net_device *brnf_get_logical_dev(struct sk_buff *skb, const struct net_device *dev)
+static struct net_device *brnf_get_logical_dev(struct sk_buff *skb,
+       const struct net_device *dev,
+       const struct net *net)
 {
  struct net_device *vlan, *br;
+ struct brnf_net *brnet = net_generic(net, brnf_net_id);
 
  br = bridge_parent(dev);
- if (brnf_pass_vlan_indev == 0 || !skb_vlan_tag_present(skb))
+
+ if (brnet->pass_vlan_indev == 0 || !skb_vlan_tag_present(skb))
  return br;
 
  vlan = __vlan_find_dev_deep_rcu(br, skb->vlan_proto,
@@ -427,7 +447,7 @@ static struct net_device *brnf_get_logical_dev(struct sk_buff *skb, const struct
 }
 
 /* Some common code for IPv4/IPv6 */
-struct net_device *setup_pre_routing(struct sk_buff *skb)
+struct net_device *setup_pre_routing(struct sk_buff *skb, const struct net *net)
 {
  struct nf_bridge_info *nf_bridge = nf_bridge_info_get(skb);
 
@@ -438,7 +458,7 @@ struct net_device *setup_pre_routing(struct sk_buff *skb)
 
  nf_bridge->in_prerouting = 1;
  nf_bridge->physindev = skb->dev;
- skb->dev = brnf_get_logical_dev(skb, skb->dev);
+ skb->dev = brnf_get_logical_dev(skb, skb->dev, net);
 
  if (skb->protocol == htons(ETH_P_8021Q))
  nf_bridge->orig_proto = BRNF_PROTO_8021Q;
@@ -464,6 +484,7 @@ static unsigned int br_nf_pre_routing(void *priv,
  struct net_bridge_port *p;
  struct net_bridge *br;
  __u32 len = nf_bridge_encap_header_len(skb);
+ struct brnf_net *brnet;
 
  if (unlikely(!pskb_may_pull(skb, len)))
  return NF_DROP;
@@ -473,8 +494,10 @@ static unsigned int br_nf_pre_routing(void *priv,
  return NF_DROP;
  br = p->br;
 
- if (IS_IPV6(skb) || IS_VLAN_IPV6(skb) || IS_PPPOE_IPV6(skb)) {
- if (!brnf_call_ip6tables &&
+ brnet = net_generic(state->net, brnf_net_id);
+ if (IS_IPV6(skb) || is_vlan_ipv6(skb, state->net) ||
+    is_pppoe_ipv6(skb, state->net)) {
+ if (!brnet->call_ip6tables &&
     !br_opt_get(br, BROPT_NF_CALL_IP6TABLES))
  return NF_ACCEPT;
 
@@ -482,10 +505,11 @@ static unsigned int br_nf_pre_routing(void *priv,
  return br_nf_pre_routing_ipv6(priv, skb, state);
  }
 
- if (!brnf_call_iptables && !br_opt_get(br, BROPT_NF_CALL_IPTABLES))
+ if (!brnet->call_iptables && !br_opt_get(br, BROPT_NF_CALL_IPTABLES))
  return NF_ACCEPT;
 
- if (!IS_IP(skb) && !IS_VLAN_IP(skb) && !IS_PPPOE_IP(skb))
+ if (!IS_IP(skb) && !is_vlan_ip(skb, state->net) &&
+    !is_pppoe_ip(skb, state->net))
  return NF_ACCEPT;
 
  nf_bridge_pull_encap_header_rcsum(skb);
@@ -495,7 +519,7 @@ static unsigned int br_nf_pre_routing(void *priv,
 
  if (!nf_bridge_alloc(skb))
  return NF_DROP;
- if (!setup_pre_routing(skb))
+ if (!setup_pre_routing(skb, state->net))
  return NF_DROP;
 
  nf_bridge = nf_bridge_info_get(skb);
@@ -518,7 +542,7 @@ static int br_nf_forward_finish(struct net *net, struct sock *sk, struct sk_buff
  struct nf_bridge_info *nf_bridge = nf_bridge_info_get(skb);
  struct net_device *in;
 
- if (!IS_ARP(skb) && !IS_VLAN_ARP(skb)) {
+ if (!IS_ARP(skb) && !is_vlan_arp(skb, net)) {
 
  if (skb->protocol == htons(ETH_P_IP))
  nf_bridge->frag_max_size = IPCB(skb)->frag_max_size;
@@ -573,9 +597,11 @@ static unsigned int br_nf_forward_ip(void *priv,
  if (!parent)
  return NF_DROP;
 
- if (IS_IP(skb) || IS_VLAN_IP(skb) || IS_PPPOE_IP(skb))
+ if (IS_IP(skb) || is_vlan_ip(skb, state->net) ||
+    is_pppoe_ip(skb, state->net))
  pf = NFPROTO_IPV4;
- else if (IS_IPV6(skb) || IS_VLAN_IPV6(skb) || IS_PPPOE_IPV6(skb))
+ else if (IS_IPV6(skb) || is_vlan_ipv6(skb, state->net) ||
+ is_pppoe_ipv6(skb, state->net))
  pf = NFPROTO_IPV6;
  else
  return NF_ACCEPT;
@@ -606,7 +632,7 @@ static unsigned int br_nf_forward_ip(void *priv,
  skb->protocol = htons(ETH_P_IPV6);
 
  NF_HOOK(pf, NF_INET_FORWARD, state->net, NULL, skb,
- brnf_get_logical_dev(skb, state->in),
+ brnf_get_logical_dev(skb, state->in, state->net),
  parent, br_nf_forward_finish);
 
  return NF_STOLEN;
@@ -619,23 +645,25 @@ static unsigned int br_nf_forward_arp(void *priv,
  struct net_bridge_port *p;
  struct net_bridge *br;
  struct net_device **d = (struct net_device **)(skb->cb);
+ struct brnf_net *brnet;
 
  p = br_port_get_rcu(state->out);
  if (p == NULL)
  return NF_ACCEPT;
  br = p->br;
 
- if (!brnf_call_arptables && !br_opt_get(br, BROPT_NF_CALL_ARPTABLES))
+ brnet = net_generic(state->net, brnf_net_id);
+ if (!brnet->call_arptables && !br_opt_get(br, BROPT_NF_CALL_ARPTABLES))
  return NF_ACCEPT;
 
  if (!IS_ARP(skb)) {
- if (!IS_VLAN_ARP(skb))
+ if (!is_vlan_arp(skb, state->net))
  return NF_ACCEPT;
  nf_bridge_pull_encap_header(skb);
  }
 
  if (arp_hdr(skb)->ar_pln != 4) {
- if (IS_VLAN_ARP(skb))
+ if (is_vlan_arp(skb, state->net))
  nf_bridge_push_encap_header(skb);
  return NF_ACCEPT;
  }
@@ -795,9 +823,11 @@ static unsigned int br_nf_post_routing(void *priv,
  if (!realoutdev)
  return NF_DROP;
 
- if (IS_IP(skb) || IS_VLAN_IP(skb) || IS_PPPOE_IP(skb))
+ if (IS_IP(skb) || is_vlan_ip(skb, state->net) ||
+    is_pppoe_ip(skb, state->net))
  pf = NFPROTO_IPV4;
- else if (IS_IPV6(skb) || IS_VLAN_IPV6(skb) || IS_PPPOE_IPV6(skb))
+ else if (IS_IPV6(skb) || is_vlan_ipv6(skb, state->net) ||
+ is_pppoe_ipv6(skb, state->net))
  pf = NFPROTO_IPV6;
  else
  return NF_ACCEPT;
@@ -1024,53 +1054,59 @@ int brnf_sysctl_call_tables(struct ctl_table *ctl, int write,
 static struct ctl_table brnf_table[] = {
  {
  .procname = "bridge-nf-call-arptables",
- .data = &brnf_call_arptables,
  .maxlen = sizeof(int),
  .mode = 0644,
  .proc_handler = brnf_sysctl_call_tables,
  },
  {
  .procname = "bridge-nf-call-iptables",
- .data = &brnf_call_iptables,
  .maxlen = sizeof(int),
  .mode = 0644,
  .proc_handler = brnf_sysctl_call_tables,
  },
  {
  .procname = "bridge-nf-call-ip6tables",
- .data = &brnf_call_ip6tables,
  .maxlen = sizeof(int),
  .mode = 0644,
  .proc_handler = brnf_sysctl_call_tables,
  },
  {
  .procname = "bridge-nf-filter-vlan-tagged",
- .data = &brnf_filter_vlan_tagged,
  .maxlen = sizeof(int),
  .mode = 0644,
  .proc_handler = brnf_sysctl_call_tables,
  },
  {
  .procname = "bridge-nf-filter-pppoe-tagged",
- .data = &brnf_filter_pppoe_tagged,
  .maxlen = sizeof(int),
  .mode = 0644,
  .proc_handler = brnf_sysctl_call_tables,
  },
  {
  .procname = "bridge-nf-pass-vlan-input-dev",
- .data = &brnf_pass_vlan_indev,
  .maxlen = sizeof(int),
  .mode = 0644,
  .proc_handler = brnf_sysctl_call_tables,
  },
  { }
 };
+
+static inline void br_netfilter_sysctl_default(struct brnf_net *brnf)
+{
+ brnf->call_iptables = 1;
+ brnf->call_ip6tables = 1;
+ brnf->call_arptables = 1;
+ brnf->filter_vlan_tagged = 0;
+ brnf->filter_pppoe_tagged = 0;
+ brnf->pass_vlan_indev = 0;
+}
+
 #endif
 
 static int __init br_netfilter_init(void)
 {
  int ret;
+ struct brnf_net *brnet;
 
  ret = register_pernet_subsys(&brnf_net_ops);
  if (ret < 0)
@@ -1083,6 +1119,16 @@ static int __init br_netfilter_init(void)
  }
 
 #ifdef CONFIG_SYSCTL
+ brnet = net_generic(&init_net, brnf_net_id);
+ brnf_table[0].data = &brnet->call_arptables;
+ brnf_table[1].data = &brnet->call_iptables;
+ brnf_table[2].data = &brnet->call_ip6tables;
+ brnf_table[3].data = &brnet->filter_vlan_tagged;
+ brnf_table[4].data = &brnet->filter_pppoe_tagged;
+ brnf_table[5].data = &brnet->pass_vlan_indev;
+
+ br_netfilter_sysctl_default(brnet);
+
  brnf_sysctl_header = register_net_sysctl(&init_net, "net/bridge", brnf_table);
  if (brnf_sysctl_header == NULL) {
  printk(KERN_WARNING
diff --git a/net/bridge/br_netfilter_ipv6.c b/net/bridge/br_netfilter_ipv6.c
index e88d6641647b..d77304e4e31a 100644
--- a/net/bridge/br_netfilter_ipv6.c
+++ b/net/bridge/br_netfilter_ipv6.c
@@ -228,7 +228,7 @@ unsigned int br_nf_pre_routing_ipv6(void *priv,
  nf_bridge = nf_bridge_alloc(skb);
  if (!nf_bridge)
  return NF_DROP;
- if (!setup_pre_routing(skb))
+ if (!setup_pre_routing(skb, state->net))
  return NF_DROP;
 
  nf_bridge = nf_bridge_info_get(skb);
--
2.20.1


--
kernel-team mailing list
[hidden email]
https://lists.ubuntu.com/mailman/listinfo/kernel-team
Reply | Threaded
Open this post in threaded view
|

[SRU][D][PATCH 2/3] netfilter: bridge: namespace bridge netfilter sysctls

Connor Kuehl
In reply to this post by Connor Kuehl
From: Christian Brauner <[hidden email]>

BugLink: https://bugs.launchpad.net/bugs/1836910

Currently, the /proc/sys/net/bridge folder is only created in the initial
network namespace. This patch ensures that the /proc/sys/net/bridge folder
is available in each network namespace if the module is loaded and
disappears from all network namespaces when the module is unloaded.

In doing so the patch makes the sysctls:

bridge-nf-call-arptables
bridge-nf-call-ip6tables
bridge-nf-call-iptables
bridge-nf-filter-pppoe-tagged
bridge-nf-filter-vlan-tagged
bridge-nf-pass-vlan-input-dev

apply per network namespace. This unblocks some use-cases where users would
like to e.g. not do bridge filtering for bridges in a specific network
namespace while doing so for bridges located in another network namespace.

The netfilter rules are afaict already per network namespace so it should
be safe for users to specify whether bridge devices inside a network
namespace are supposed to go through iptables et al. or not. Also, this can
already be done per-bridge by setting an option for each individual bridge
via Netlink. It should also be possible to do this for all bridges in a
network namespace via sysctls.

Cc: Tyler Hicks <[hidden email]>
Signed-off-by: Christian Brauner <[hidden email]>
Signed-off-by: Pablo Neira Ayuso <[hidden email]>
(cherry picked from commit 22567590b2e634247931b3d2351384ba45720ebe)
Signed-off-by: Connor Kuehl <[hidden email]>
---
 net/bridge/br_netfilter_hooks.c | 117 ++++++++++++++++++++------------
 1 file changed, 72 insertions(+), 45 deletions(-)

diff --git a/net/bridge/br_netfilter_hooks.c b/net/bridge/br_netfilter_hooks.c
index 484c37b762e7..794834d343fb 100644
--- a/net/bridge/br_netfilter_hooks.c
+++ b/net/bridge/br_netfilter_hooks.c
@@ -49,13 +49,13 @@
 
 static unsigned int brnf_net_id __read_mostly;
 
-#ifdef CONFIG_SYSCTL
-static struct ctl_table_header *brnf_sysctl_header;
-#endif
-
 struct brnf_net {
  bool enabled;
 
+#ifdef CONFIG_SYSCTL
+ struct ctl_table_header *ctl_hdr;
+#endif
+
  /* default value is 1 */
  int call_iptables;
  int call_ip6tables;
@@ -979,23 +979,6 @@ static int brnf_device_event(struct notifier_block *unused, unsigned long event,
  return NOTIFY_OK;
 }
 
-static void __net_exit brnf_exit_net(struct net *net)
-{
- struct brnf_net *brnet = net_generic(net, brnf_net_id);
-
- if (!brnet->enabled)
- return;
-
- nf_unregister_net_hooks(net, br_nf_ops, ARRAY_SIZE(br_nf_ops));
- brnet->enabled = false;
-}
-
-static struct pernet_operations brnf_net_ops __read_mostly = {
- .exit = brnf_exit_net,
- .id   = &brnf_net_id,
- .size = sizeof(struct brnf_net),
-};
-
 static struct notifier_block brnf_notifier __read_mostly = {
  .notifier_call = brnf_device_event,
 };
@@ -1101,12 +1084,79 @@ static inline void br_netfilter_sysctl_default(struct brnf_net *brnf)
  brnf->pass_vlan_indev = 0;
 }
 
+static int br_netfilter_sysctl_init_net(struct net *net)
+{
+ struct ctl_table *table = brnf_table;
+ struct brnf_net *brnet;
+
+ if (!net_eq(net, &init_net)) {
+ table = kmemdup(table, sizeof(brnf_table), GFP_KERNEL);
+ if (!table)
+ return -ENOMEM;
+ }
+
+ brnet = net_generic(net, brnf_net_id);
+ table[0].data = &brnet->call_arptables;
+ table[1].data = &brnet->call_iptables;
+ table[2].data = &brnet->call_ip6tables;
+ table[3].data = &brnet->filter_vlan_tagged;
+ table[4].data = &brnet->filter_pppoe_tagged;
+ table[5].data = &brnet->pass_vlan_indev;
+
+ br_netfilter_sysctl_default(brnet);
+
+ brnet->ctl_hdr = register_net_sysctl(net, "net/bridge", table);
+ if (!brnet->ctl_hdr) {
+ if (!net_eq(net, &init_net))
+ kfree(table);
+
+ return -ENOMEM;
+ }
+
+ return 0;
+}
+
+static void br_netfilter_sysctl_exit_net(struct net *net,
+ struct brnf_net *brnet)
+{
+ unregister_net_sysctl_table(brnet->ctl_hdr);
+ if (!net_eq(net, &init_net))
+ kfree(brnet->ctl_hdr->ctl_table_arg);
+}
+
+static int __net_init brnf_init_net(struct net *net)
+{
+ return br_netfilter_sysctl_init_net(net);
+}
+#endif
+
+static void __net_exit brnf_exit_net(struct net *net)
+{
+ struct brnf_net *brnet;
+
+ brnet = net_generic(net, brnf_net_id);
+ if (brnet->enabled) {
+ nf_unregister_net_hooks(net, br_nf_ops, ARRAY_SIZE(br_nf_ops));
+ brnet->enabled = false;
+ }
+
+#ifdef CONFIG_SYSCTL
+ br_netfilter_sysctl_exit_net(net, brnet);
 #endif
+}
+
+static struct pernet_operations brnf_net_ops __read_mostly = {
+#ifdef CONFIG_SYSCTL
+ .init = brnf_init_net,
+#endif
+ .exit = brnf_exit_net,
+ .id   = &brnf_net_id,
+ .size = sizeof(struct brnf_net),
+};
 
 static int __init br_netfilter_init(void)
 {
  int ret;
- struct brnf_net *brnet;
 
  ret = register_pernet_subsys(&brnf_net_ops);
  if (ret < 0)
@@ -1118,26 +1168,6 @@ static int __init br_netfilter_init(void)
  return ret;
  }
 
-#ifdef CONFIG_SYSCTL
- brnet = net_generic(&init_net, brnf_net_id);
- brnf_table[0].data = &brnet->call_arptables;
- brnf_table[1].data = &brnet->call_iptables;
- brnf_table[2].data = &brnet->call_ip6tables;
- brnf_table[3].data = &brnet->filter_vlan_tagged;
- brnf_table[4].data = &brnet->filter_pppoe_tagged;
- brnf_table[5].data = &brnet->pass_vlan_indev;
-
- br_netfilter_sysctl_default(brnet);
-
- brnf_sysctl_header = register_net_sysctl(&init_net, "net/bridge", brnf_table);
- if (brnf_sysctl_header == NULL) {
- printk(KERN_WARNING
-       "br_netfilter: can't register to sysctl.\n");
- unregister_netdevice_notifier(&brnf_notifier);
- unregister_pernet_subsys(&brnf_net_ops);
- return -ENOMEM;
- }
-#endif
  RCU_INIT_POINTER(nf_br_ops, &br_ops);
  printk(KERN_NOTICE "Bridge firewalling registered\n");
  return 0;
@@ -1148,9 +1178,6 @@ static void __exit br_netfilter_fini(void)
  RCU_INIT_POINTER(nf_br_ops, NULL);
  unregister_netdevice_notifier(&brnf_notifier);
  unregister_pernet_subsys(&brnf_net_ops);
-#ifdef CONFIG_SYSCTL
- unregister_net_sysctl_table(brnf_sysctl_header);
-#endif
 }
 
 module_init(br_netfilter_init);
--
2.20.1


--
kernel-team mailing list
[hidden email]
https://lists.ubuntu.com/mailman/listinfo/kernel-team
Reply | Threaded
Open this post in threaded view
|

[SRU][D][PATCH 3/3] netfilter: bridge: prevent UAF in brnf_exit_net()

Connor Kuehl
In reply to this post by Connor Kuehl
From: Christian Brauner <[hidden email]>

BugLink: https://bugs.launchpad.net/bugs/1836910

Prevent a UAF in brnf_exit_net().

When unregister_net_sysctl_table() is called the ctl_hdr pointer will
obviously be freed and so accessing it righter after is invalid. Fix
this by stashing a pointer to the table we want to free before we
unregister the sysctl header.

Note that syzkaller falsely chased this down to the drm tree so the
Fixes tag that syzkaller requested would be wrong. This commit uses a
different but the correct Fixes tag.

/* Splat */

BUG: KASAN: use-after-free in br_netfilter_sysctl_exit_net
net/bridge/br_netfilter_hooks.c:1121 [inline]
BUG: KASAN: use-after-free in brnf_exit_net+0x38c/0x3a0
net/bridge/br_netfilter_hooks.c:1141
Read of size 8 at addr ffff8880a4078d60 by task kworker/u4:4/8749

CPU: 0 PID: 8749 Comm: kworker/u4:4 Not tainted 5.2.0-rc5-next-20190618 #17
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google
01/01/2011
Workqueue: netns cleanup_net
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x172/0x1f0 lib/dump_stack.c:113
 print_address_description.cold+0xd4/0x306 mm/kasan/report.c:351
 __kasan_report.cold+0x1b/0x36 mm/kasan/report.c:482
 kasan_report+0x12/0x20 mm/kasan/common.c:614
 __asan_report_load8_noabort+0x14/0x20 mm/kasan/generic_report.c:132
 br_netfilter_sysctl_exit_net net/bridge/br_netfilter_hooks.c:1121 [inline]
 brnf_exit_net+0x38c/0x3a0 net/bridge/br_netfilter_hooks.c:1141
 ops_exit_list.isra.0+0xaa/0x150 net/core/net_namespace.c:154
 cleanup_net+0x3fb/0x960 net/core/net_namespace.c:553
 process_one_work+0x989/0x1790 kernel/workqueue.c:2269
 worker_thread+0x98/0xe40 kernel/workqueue.c:2415
 kthread+0x354/0x420 kernel/kthread.c:255
 ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352

Allocated by task 11374:
 save_stack+0x23/0x90 mm/kasan/common.c:71
 set_track mm/kasan/common.c:79 [inline]
 __kasan_kmalloc mm/kasan/common.c:489 [inline]
 __kasan_kmalloc.constprop.0+0xcf/0xe0 mm/kasan/common.c:462
 kasan_kmalloc+0x9/0x10 mm/kasan/common.c:503
 __do_kmalloc mm/slab.c:3645 [inline]
 __kmalloc+0x15c/0x740 mm/slab.c:3654
 kmalloc include/linux/slab.h:552 [inline]
 kzalloc include/linux/slab.h:743 [inline]
 __register_sysctl_table+0xc7/0xef0 fs/proc/proc_sysctl.c:1327
 register_net_sysctl+0x29/0x30 net/sysctl_net.c:121
 br_netfilter_sysctl_init_net net/bridge/br_netfilter_hooks.c:1105 [inline]
 brnf_init_net+0x379/0x6a0 net/bridge/br_netfilter_hooks.c:1126
 ops_init+0xb3/0x410 net/core/net_namespace.c:130
 setup_net+0x2d3/0x740 net/core/net_namespace.c:316
 copy_net_ns+0x1df/0x340 net/core/net_namespace.c:439
 create_new_namespaces+0x400/0x7b0 kernel/nsproxy.c:103
 unshare_nsproxy_namespaces+0xc2/0x200 kernel/nsproxy.c:202
 ksys_unshare+0x444/0x980 kernel/fork.c:2822
 __do_sys_unshare kernel/fork.c:2890 [inline]
 __se_sys_unshare kernel/fork.c:2888 [inline]
 __x64_sys_unshare+0x31/0x40 kernel/fork.c:2888
 do_syscall_64+0xfd/0x680 arch/x86/entry/common.c:301
 entry_SYSCALL_64_after_hwframe+0x49/0xbe

Freed by task 9:
 save_stack+0x23/0x90 mm/kasan/common.c:71
 set_track mm/kasan/common.c:79 [inline]
 __kasan_slab_free+0x102/0x150 mm/kasan/common.c:451
 kasan_slab_free+0xe/0x10 mm/kasan/common.c:459
 __cache_free mm/slab.c:3417 [inline]
 kfree+0x10a/0x2c0 mm/slab.c:3746
 __rcu_reclaim kernel/rcu/rcu.h:215 [inline]
 rcu_do_batch kernel/rcu/tree.c:2092 [inline]
 invoke_rcu_callbacks kernel/rcu/tree.c:2310 [inline]
 rcu_core+0xcc7/0x1500 kernel/rcu/tree.c:2291
 __do_softirq+0x25c/0x94c kernel/softirq.c:292

The buggy address belongs to the object at ffff8880a4078d40
 which belongs to the cache kmalloc-512 of size 512
The buggy address is located 32 bytes inside of
 512-byte region [ffff8880a4078d40, ffff8880a4078f40)
The buggy address belongs to the page:
page:ffffea0002901e00 refcount:1 mapcount:0 mapping:ffff8880aa400a80
index:0xffff8880a40785c0
flags: 0x1fffc0000000200(slab)
raw: 01fffc0000000200 ffffea0001d636c8 ffffea0001b07308 ffff8880aa400a80
raw: ffff8880a40785c0 ffff8880a40780c0 0000000100000004 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff8880a4078c00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff8880a4078c80: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
> ffff8880a4078d00: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb
                                                       ^
 ffff8880a4078d80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff8880a4078e00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb

Reported-by: [hidden email]
Fixes: 22567590b2e6 ("netfilter: bridge: namespace bridge netfilter sysctls")
Signed-off-by: Christian Brauner <[hidden email]>
Signed-off-by: Pablo Neira Ayuso <[hidden email]>
(cherry picked from commit 7e6daf50e1f4ea0ecd56406beb64ffc66e1e94db)
Signed-off-by: Connor Kuehl <[hidden email]>
---
 net/bridge/br_netfilter_hooks.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/net/bridge/br_netfilter_hooks.c b/net/bridge/br_netfilter_hooks.c
index 794834d343fb..e7e65c57a927 100644
--- a/net/bridge/br_netfilter_hooks.c
+++ b/net/bridge/br_netfilter_hooks.c
@@ -1119,9 +1119,11 @@ static int br_netfilter_sysctl_init_net(struct net *net)
 static void br_netfilter_sysctl_exit_net(struct net *net,
  struct brnf_net *brnet)
 {
+ struct ctl_table *table = brnet->ctl_hdr->ctl_table_arg;
+
  unregister_net_sysctl_table(brnet->ctl_hdr);
  if (!net_eq(net, &init_net))
- kfree(brnet->ctl_hdr->ctl_table_arg);
+ kfree(table);
 }
 
 static int __net_init brnf_init_net(struct net *net)
--
2.20.1


--
kernel-team mailing list
[hidden email]
https://lists.ubuntu.com/mailman/listinfo/kernel-team
Reply | Threaded
Open this post in threaded view
|

APPLIED[E]/cmt: [SRU][B/D] Ensure /proc/sys/net/bridge folders (dis)appear appropriately

Seth Forshee
In reply to this post by Connor Kuehl
+Cc Christian.

On Thu, Jul 25, 2019 at 05:20:46PM -0700, Connor Kuehl wrote:

> Note: Bionic required two additional patches in order for these to apply cleanly, one
> of which required minor backporting to use the updated wrappers/symbols.
>
> BugLink: https://bugs.launchpad.net/bugs/1836910
>
> Justification taken from the link above ^
>
> SRU Justification
>
> Impact: Currently, the /proc/sys/net/bridge folder is only created in the initial network namespace.
> This blocks use-cases where users would like to e.g. not do bridge filtering for bridges in a specific
> network namespace while doing so for bridges located in another network namespace.
>
> Fix: The patches linked below ensure that the /proc/sys/net/bridge folder is available in each network
> namespace if the module is loaded and disappears from all network namespaces when the module is unloaded.
>
> In doing so the patch makes the sysctls:
>
> bridge-nf-call-arptables
> bridge-nf-call-ip6tables
> bridge-nf-call-iptables
> bridge-nf-filter-pppoe-tagged
> bridge-nf-filter-vlan-tagged
> bridge-nf-pass-vlan-input-dev
>
> apply per network namespace.
>
> Regression Potential: None, since this didn't use to work before. Otherwise limited to the br_netfilter module.
> The netfilter rules are afaict already per network namespace so it should be safe for users to specify whether
> bridge devices inside a network namespace are supposed to go through iptables et al. or not. Also, this can
> already be done per-bridge by setting an option for each individual bridge via Netlink. It should also be
> possible to do this for all bridges in a network namespace via sysctls.
>
> Test Case: Tested with LXD on a kernel with the patches applied and per-network namespace iptables.

I don't agree with the regression potential here, which is particularly
ironic given that the final patch fixes a user-after-free introduced by
the patch before it (I'm sitting next to Christian as I type this and
have already given him a hard time about that one). They are also fairly
new, being upstream as of 5.3-rc1. So I think we need a better statement
of regression potential and what kind of regression testing as been done
before considering them for SRU.

Applied to eoan/master-next.

Thanks,
Seth

--
kernel-team mailing list
[hidden email]
https://lists.ubuntu.com/mailman/listinfo/kernel-team
Reply | Threaded
Open this post in threaded view
|

Re: APPLIED[E]/cmt: [SRU][B/D] Ensure /proc/sys/net/bridge folders (dis)appear appropriately

Christian Brauner-2
On Tue, Jul 30, 2019 at 12:01:08PM -0400, Seth Forshee wrote:

> +Cc Christian.
>
> On Thu, Jul 25, 2019 at 05:20:46PM -0700, Connor Kuehl wrote:
> > Note: Bionic required two additional patches in order for these to apply cleanly, one
> > of which required minor backporting to use the updated wrappers/symbols.
> >
> > BugLink: https://bugs.launchpad.net/bugs/1836910
> >
> > Justification taken from the link above ^
> >
> > SRU Justification
> >
> > Impact: Currently, the /proc/sys/net/bridge folder is only created in the initial network namespace.
> > This blocks use-cases where users would like to e.g. not do bridge filtering for bridges in a specific
> > network namespace while doing so for bridges located in another network namespace.
> >
> > Fix: The patches linked below ensure that the /proc/sys/net/bridge folder is available in each network
> > namespace if the module is loaded and disappears from all network namespaces when the module is unloaded.
> >
> > In doing so the patch makes the sysctls:
> >
> > bridge-nf-call-arptables
> > bridge-nf-call-ip6tables
> > bridge-nf-call-iptables
> > bridge-nf-filter-pppoe-tagged
> > bridge-nf-filter-vlan-tagged
> > bridge-nf-pass-vlan-input-dev
> >
> > apply per network namespace.
> >
> > Regression Potential: None, since this didn't use to work before. Otherwise limited to the br_netfilter module.
> > The netfilter rules are afaict already per network namespace so it should be safe for users to specify whether
> > bridge devices inside a network namespace are supposed to go through iptables et al. or not. Also, this can
> > already be done per-bridge by setting an option for each individual bridge via Netlink. It should also be
> > possible to do this for all bridges in a network namespace via sysctls.
> >
> > Test Case: Tested with LXD on a kernel with the patches applied and per-network namespace iptables.
>
> I don't agree with the regression potential here, which is particularly
> ironic given that the final patch fixes a user-after-free introduced by
> the patch before it (I'm sitting next to Christian as I type this and
> have already given him a hard time about that one). They are also fairly

I'm very happy that I only received a Not-Yet-Nacked-by which I vote
should be a new standard DCO-type.

> new, being upstream as of 5.3-rc1. So I think we need a better statement
> of regression potential and what kind of regression testing as been done
> before considering them for SRU.

I expanded the regression potential section and hopefully that'll turn
this into a Not-Nacked-by at least. ;)

>
> Applied to eoan/master-next.

Thanks!
Christian

--
kernel-team mailing list
[hidden email]
https://lists.ubuntu.com/mailman/listinfo/kernel-team
Reply | Threaded
Open this post in threaded view
|

ACK: APPLIED[E]/cmt: [SRU][B/D] Ensure /proc/sys/net/bridge folders (dis)appear appropriately

Seth Forshee
On Wed, Jul 31, 2019 at 08:18:50PM +0200, Christian Brauner wrote:

> On Tue, Jul 30, 2019 at 12:01:08PM -0400, Seth Forshee wrote:
> > +Cc Christian.
> >
> > On Thu, Jul 25, 2019 at 05:20:46PM -0700, Connor Kuehl wrote:
> > > Note: Bionic required two additional patches in order for these to apply cleanly, one
> > > of which required minor backporting to use the updated wrappers/symbols.
> > >
> > > BugLink: https://bugs.launchpad.net/bugs/1836910
> > >
> > > Justification taken from the link above ^
> > >
> > > SRU Justification
> > >
> > > Impact: Currently, the /proc/sys/net/bridge folder is only created in the initial network namespace.
> > > This blocks use-cases where users would like to e.g. not do bridge filtering for bridges in a specific
> > > network namespace while doing so for bridges located in another network namespace.
> > >
> > > Fix: The patches linked below ensure that the /proc/sys/net/bridge folder is available in each network
> > > namespace if the module is loaded and disappears from all network namespaces when the module is unloaded.
> > >
> > > In doing so the patch makes the sysctls:
> > >
> > > bridge-nf-call-arptables
> > > bridge-nf-call-ip6tables
> > > bridge-nf-call-iptables
> > > bridge-nf-filter-pppoe-tagged
> > > bridge-nf-filter-vlan-tagged
> > > bridge-nf-pass-vlan-input-dev
> > >
> > > apply per network namespace.
> > >
> > > Regression Potential: None, since this didn't use to work before. Otherwise limited to the br_netfilter module.
> > > The netfilter rules are afaict already per network namespace so it should be safe for users to specify whether
> > > bridge devices inside a network namespace are supposed to go through iptables et al. or not. Also, this can
> > > already be done per-bridge by setting an option for each individual bridge via Netlink. It should also be
> > > possible to do this for all bridges in a network namespace via sysctls.
> > >
> > > Test Case: Tested with LXD on a kernel with the patches applied and per-network namespace iptables.
> >
> > I don't agree with the regression potential here, which is particularly
> > ironic given that the final patch fixes a user-after-free introduced by
> > the patch before it (I'm sitting next to Christian as I type this and
> > have already given him a hard time about that one). They are also fairly
>
> I'm very happy that I only received a Not-Yet-Nacked-by which I vote
> should be a new standard DCO-type.
>
> > new, being upstream as of 5.3-rc1. So I think we need a better statement
> > of regression potential and what kind of regression testing as been done
> > before considering them for SRU.
>
> I expanded the regression potential section and hopefully that'll turn
> this into a Not-Nacked-by at least. ;)

I can do better than that ;-)

Thanks for improving the SRU justification. Based on the testing that's
been done and the fact that I've seen no test regressions in eoan:

Acked-by: Seth Forshee <[hidden email]>

--
kernel-team mailing list
[hidden email]
https://lists.ubuntu.com/mailman/listinfo/kernel-team
Reply | Threaded
Open this post in threaded view
|

ACK/Cmnt: [SRU][D] Ensure /proc/sys/net/bridge folders (dis)appear appropriately

Stefan Bader-2
In reply to this post by Connor Kuehl
On 26.07.19 02:20, Connor Kuehl wrote:

> Note: Bionic required two additional patches in order for these to apply cleanly, one
> of which required minor backporting to use the updated wrappers/symbols.
>
> BugLink: https://bugs.launchpad.net/bugs/1836910
>
> Justification taken from the link above ^
>
> SRU Justification
>
> Impact: Currently, the /proc/sys/net/bridge folder is only created in the initial network namespace.
> This blocks use-cases where users would like to e.g. not do bridge filtering for bridges in a specific
> network namespace while doing so for bridges located in another network namespace.
>
> Fix: The patches linked below ensure that the /proc/sys/net/bridge folder is available in each network
> namespace if the module is loaded and disappears from all network namespaces when the module is unloaded.
>
> In doing so the patch makes the sysctls:
>
> bridge-nf-call-arptables
> bridge-nf-call-ip6tables
> bridge-nf-call-iptables
> bridge-nf-filter-pppoe-tagged
> bridge-nf-filter-vlan-tagged
> bridge-nf-pass-vlan-input-dev
>
> apply per network namespace.
>
> Regression Potential: None, since this didn't use to work before. Otherwise limited to the br_netfilter module.
> The netfilter rules are afaict already per network namespace so it should be safe for users to specify whether
> bridge devices inside a network namespace are supposed to go through iptables et al. or not. Also, this can
> already be done per-bridge by setting an option for each individual bridge via Netlink. It should also be
> possible to do this for all bridges in a network namespace via sysctls.
>
> Test Case: Tested with LXD on a kernel with the patches applied and per-network namespace iptables.
>

Since this is quite a bit of change and I agree with Seth disagreeing about the
regression potential. Especially since in the Bionic case there is change to the
generic bridge code. Can this cause issues with user-space tools? Like iproute2?

So for now only for Disco:

Acked-by: Stefan Bader <[hidden email]>

--
kernel-team mailing list
[hidden email]
https://lists.ubuntu.com/mailman/listinfo/kernel-team
Reply | Threaded
Open this post in threaded view
|

Re: ACK/Cmnt: [SRU][D] Ensure /proc/sys/net/bridge folders (dis)appear appropriately

Christian Brauner-2
The whole regression potential section has been completely reworked as requested by Seth.
That should be pretty obvious from the updated launchpad BugLink.

On August 12, 2019 3:48:44 PM GMT+02:00, Stefan Bader <[hidden email]> wrote:
On 26.07.19 02:20, Connor Kuehl wrote:
Note: Bionic required two additional patches in order for these to apply cleanly, one
of which required minor backporting to use the updated wrappers/symbols.

BugLink: https://bugs.launchpad.net/bugs/1836910

Justification taken from the link above ^

SRU Justification

Impact: Currently, the /proc/sys/net/bridge folder is only created in the initial network namespace.
This blocks use-cases where users would like to e.g. not do bridge filtering for bridges in a specific
network namespace while doing so for bridges located in another network namespace.

Fix: The patches linked below ensure that the /proc/sys/net/bridge folder is available in each network
namespace if the module is loaded and disappears from all network namespaces when the module is unloaded.

In doing so the patch makes the sysctls:

bridge-nf-call-arptables
bridge-nf-call-ip6tables
bridge-nf-call-iptables
bridge-nf-filter-pppoe-tagged
bridge-nf-filter-vlan-tagged
bridge-nf-pass-vlan-input-dev

apply per network namespace.

Regression Potential: None, since this didn't use to work before. Otherwise limited to the br_netfilter module.
The netfilter rules are afaict already per network namespace so it should be safe for users to specify whether
bridge devices inside a network namespace are supposed to go through iptables et al. or not. Also, this can
already be done per-bridge by setting an option for each individual bridge via Netlink. It should also be
possible to do this for all bridges in a network namespace via sysctls.

Test Case: Tested with LXD on a kernel with the patches applied and per-network namespace iptables.


Since this is quite a bit of change and I agree with Seth disagreeing about the
regression potential. Especially since in the Bionic case there is change to the
generic bridge code. Can this cause issues with user-space tools? Like iproute2?

So for now only for Disco:

Acked-by: Stefan Bader <[hidden email]>

--
Sent from my Android device with K-9 Mail. Please excuse my brevity.
--
kernel-team mailing list
[hidden email]
https://lists.ubuntu.com/mailman/listinfo/kernel-team
Reply | Threaded
Open this post in threaded view
|

APPLIED(D)/cmt: [SRU][B/D] Ensure /proc/sys/net/bridge folders (dis)appear appropriately

Khaled Elmously
In reply to this post by Connor Kuehl
Applied to D only based on Stefan's cmt



On 2019-07-25 17:20:46 , Connor Kuehl wrote:

> Note: Bionic required two additional patches in order for these to apply cleanly, one
> of which required minor backporting to use the updated wrappers/symbols.
>
> BugLink: https://bugs.launchpad.net/bugs/1836910
>
> Justification taken from the link above ^
>
> SRU Justification
>
> Impact: Currently, the /proc/sys/net/bridge folder is only created in the initial network namespace.
> This blocks use-cases where users would like to e.g. not do bridge filtering for bridges in a specific
> network namespace while doing so for bridges located in another network namespace.
>
> Fix: The patches linked below ensure that the /proc/sys/net/bridge folder is available in each network
> namespace if the module is loaded and disappears from all network namespaces when the module is unloaded.
>
> In doing so the patch makes the sysctls:
>
> bridge-nf-call-arptables
> bridge-nf-call-ip6tables
> bridge-nf-call-iptables
> bridge-nf-filter-pppoe-tagged
> bridge-nf-filter-vlan-tagged
> bridge-nf-pass-vlan-input-dev
>
> apply per network namespace.
>
> Regression Potential: None, since this didn't use to work before. Otherwise limited to the br_netfilter module.
> The netfilter rules are afaict already per network namespace so it should be safe for users to specify whether
> bridge devices inside a network namespace are supposed to go through iptables et al. or not. Also, this can
> already be done per-bridge by setting an option for each individual bridge via Netlink. It should also be
> possible to do this for all bridges in a network namespace via sysctls.
>
> Test Case: Tested with LXD on a kernel with the patches applied and per-network namespace iptables.
>
> --
> kernel-team mailing list
> [hidden email]
> https://lists.ubuntu.com/mailman/listinfo/kernel-team

--
kernel-team mailing list
[hidden email]
https://lists.ubuntu.com/mailman/listinfo/kernel-team
Reply | Threaded
Open this post in threaded view
|

Re: [SRU][B/D] Ensure /proc/sys/net/bridge folders (dis)appear appropriately

Connor Kuehl
In reply to this post by Connor Kuehl
On 7/25/19 5:20 PM, Connor Kuehl wrote:

> Note: Bionic required two additional patches in order for these to apply cleanly, one
> of which required minor backporting to use the updated wrappers/symbols.
>
> BugLink: https://bugs.launchpad.net/bugs/1836910
>
> Justification taken from the link above ^
>
> SRU Justification
>
> Impact: Currently, the /proc/sys/net/bridge folder is only created in the initial network namespace.
> This blocks use-cases where users would like to e.g. not do bridge filtering for bridges in a specific
> network namespace while doing so for bridges located in another network namespace.
>
> Fix: The patches linked below ensure that the /proc/sys/net/bridge folder is available in each network
> namespace if the module is loaded and disappears from all network namespaces when the module is unloaded.
>
> In doing so the patch makes the sysctls:
>
> bridge-nf-call-arptables
> bridge-nf-call-ip6tables
> bridge-nf-call-iptables
> bridge-nf-filter-pppoe-tagged
> bridge-nf-filter-vlan-tagged
> bridge-nf-pass-vlan-input-dev
>
> apply per network namespace.
>
> Regression Potential: None, since this didn't use to work before. Otherwise limited to the br_netfilter module.
> The netfilter rules are afaict already per network namespace so it should be safe for users to specify whether
> bridge devices inside a network namespace are supposed to go through iptables et al. or not. Also, this can
> already be done per-bridge by setting an option for each individual bridge via Netlink. It should also be
> possible to do this for all bridges in a network namespace via sysctls.
>
> Test Case: Tested with LXD on a kernel with the patches applied and per-network namespace iptables.
>

Ping for SRU consideration into Bionic. Updated regression potential
from the bug report below:

"Regression Potential: Low since it is limited to the br_netfilter
module. I tested the patchset extensively by compiling a kernel with the
patches applied. I loaded and unloaded the module and verified that it
works correctly for the container usecase and does not crash. The Google
ChromeOS team has also backported this patchset to their kernel and has
not seen any issues so far:
https://bugs.chromium.org/p/chromium/issues/detail?id=878034
Security considerations around netfilter rules are also low. The
netfilter rules are already per network namespace so it should be safe
for users to specify whether bridge devices inside a network namespace
are supposed to go through iptables et al. or not. Also, this can
already be done per-bridge by setting an option for each individual
bridge via Netlink. It should also be possible to do this for all
bridges in a network namespace via sysctls."

--
kernel-team mailing list
[hidden email]
https://lists.ubuntu.com/mailman/listinfo/kernel-team
Reply | Threaded
Open this post in threaded view
|

ACK/cmnt: [SRU][B/D] Ensure /proc/sys/net/bridge folders (dis)appear appropriately

Kleber Souza
In reply to this post by Connor Kuehl
On 7/26/19 2:20 AM, Connor Kuehl wrote:

> Note: Bionic required two additional patches in order for these to apply cleanly, one
> of which required minor backporting to use the updated wrappers/symbols.
>
> BugLink: https://bugs.launchpad.net/bugs/1836910
>
> Justification taken from the link above ^
>
> SRU Justification
>
> Impact: Currently, the /proc/sys/net/bridge folder is only created in the initial network namespace.
> This blocks use-cases where users would like to e.g. not do bridge filtering for bridges in a specific
> network namespace while doing so for bridges located in another network namespace.
>
> Fix: The patches linked below ensure that the /proc/sys/net/bridge folder is available in each network
> namespace if the module is loaded and disappears from all network namespaces when the module is unloaded.
>
> In doing so the patch makes the sysctls:
>
> bridge-nf-call-arptables
> bridge-nf-call-ip6tables
> bridge-nf-call-iptables
> bridge-nf-filter-pppoe-tagged
> bridge-nf-filter-vlan-tagged
> bridge-nf-pass-vlan-input-dev
>
> apply per network namespace.
>
> Regression Potential: None, since this didn't use to work before. Otherwise limited to the br_netfilter module.
> The netfilter rules are afaict already per network namespace so it should be safe for users to specify whether
> bridge devices inside a network namespace are supposed to go through iptables et al. or not. Also, this can
> already be done per-bridge by setting an option for each individual bridge via Netlink. It should also be
> possible to do this for all bridges in a network namespace via sysctls.
>
> Test Case: Tested with LXD on a kernel with the patches applied and per-network namespace iptables.
>

Based on the additional information provided regarding the regression
potential and testcases, and that the patches have been applied to Disco
and no bug has been reported, I believe it's OK to apply these patches
for Bionic now.

Acked-by: Kleber Sacilotto de Souza <[hidden email]>

--
kernel-team mailing list
[hidden email]
https://lists.ubuntu.com/mailman/listinfo/kernel-team
Reply | Threaded
Open this post in threaded view
|

APPLIED[B]/cmnt: [SRU][B/D] Ensure /proc/sys/net/bridge folders (dis)appear appropriately

Kleber Souza
In reply to this post by Connor Kuehl
On 7/26/19 2:20 AM, Connor Kuehl wrote:

> Note: Bionic required two additional patches in order for these to apply cleanly, one
> of which required minor backporting to use the updated wrappers/symbols.
>
> BugLink: https://bugs.launchpad.net/bugs/1836910
>
> Justification taken from the link above ^
>
> SRU Justification
>
> Impact: Currently, the /proc/sys/net/bridge folder is only created in the initial network namespace.
> This blocks use-cases where users would like to e.g. not do bridge filtering for bridges in a specific
> network namespace while doing so for bridges located in another network namespace.
>
> Fix: The patches linked below ensure that the /proc/sys/net/bridge folder is available in each network
> namespace if the module is loaded and disappears from all network namespaces when the module is unloaded.
>
> In doing so the patch makes the sysctls:
>
> bridge-nf-call-arptables
> bridge-nf-call-ip6tables
> bridge-nf-call-iptables
> bridge-nf-filter-pppoe-tagged
> bridge-nf-filter-vlan-tagged
> bridge-nf-pass-vlan-input-dev
>
> apply per network namespace.
>
> Regression Potential: None, since this didn't use to work before. Otherwise limited to the br_netfilter module.
> The netfilter rules are afaict already per network namespace so it should be safe for users to specify whether
> bridge devices inside a network namespace are supposed to go through iptables et al. or not. Also, this can
> already be done per-bridge by setting an option for each individual bridge via Netlink. It should also be
> possible to do this for all bridges in a network namespace via sysctls.
>
> Test Case: Tested with LXD on a kernel with the patches applied and per-network namespace iptables.
>

Applied to bionic/master-next branch, with some context adjustments to the
following patch due do changes applied in the meantime:

- netfilter: bridge: port sysctls to use brnf_net


Thanks,
Kleber

--
kernel-team mailing list
[hidden email]
https://lists.ubuntu.com/mailman/listinfo/kernel-team