-
Notifications
You must be signed in to change notification settings - Fork 91
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add bypass logic for kmesh #297
Conversation
@@ -190,6 +293,16 @@ func checkSidecar(client kubernetes.Interface, pod *corev1.Pod) (bool, error) { | |||
return false, nil | |||
} | |||
|
|||
func checkKmesh(pod *corev1.Pod) (bool, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the error here a redundant return value?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the error here a redundant return value?
done
bpf/kmesh/cgroup_sock.c
Outdated
static inline bool conn_from_bypass_sim_add(struct bpf_sock_addr *ctx) | ||
{ | ||
// daemon sim connect 0.0.0.0:931(0x3a3) | ||
// 0x3a3 is the specific port handled by the daemon for enable Kmesh |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Annotation Error.
It's not to mark enable kmesh but enable bypass.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Annotation Error. It's not to mark enable kmesh but enable bypass.
I've changed it.
bpf/kmesh/cgroup_sock.c
Outdated
static inline bool conn_from_bypass_sim_delete(struct bpf_sock_addr *ctx) | ||
{ | ||
// daemon sim connect 0.0.0.1:932(0x3a4) | ||
// 0x3a4 is the specific port handled by the daemon for disable Kmesh |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above.
done
bpf/kmesh/cgroup_sock.c
Outdated
static inline int sock4_traffic_control(struct bpf_sock_addr *ctx) | ||
{ | ||
int ret; | ||
|
||
Listener__Listener *listener = NULL; | ||
|
||
if (check_bypass_enabled(ctx)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does the system check whether the bypass function is enabled before checking whether the kmesh is enabled? Most connections in the system are not manager by kmesh. Check whether bypass reduces the rate of traffic that is not taken over by kmesh.
bpf/kmesh/workload/cgroup_sock.c
Outdated
return bpf_map_lookup_elem(&map_of_bypass, &cookie); | ||
} | ||
|
||
static inline void record_bypass_netns_cookie(struct bpf_sock_addr *ctx) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The logic seems to be consistent with record_netns_cookie? Can they be combined into one function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The logic seems to be consistent with record_netns_cookie? Can they be combined into one function?
I've extract the function
bpf/kmesh/workload/cgroup_sock.c
Outdated
BPF_LOG(ERR, KMESH, "record netcookie failed!, err is %d\n", err); | ||
} | ||
|
||
static inline void remove_bypass_netns_cookie(struct bpf_sock_addr *ctx) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the same as above
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the same as above
done
bpf/kmesh/workload/cgroup_sock.c
Outdated
static inline bool conn_from_bypass_sim_add(struct bpf_sock_addr *ctx) | ||
{ | ||
// daemon sim connect 0.0.0.0:931(0x3a3) | ||
// 0x3a3 is the specific port handled by the daemon for enable Kmesh |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Annotation Error
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Annotation Error
done
bpf/kmesh/workload/cgroup_sock.c
Outdated
static inline bool conn_from_bypass_sim_delete(struct bpf_sock_addr *ctx) | ||
{ | ||
// daemon sim connect 0.0.0.1:932(0x3a4) | ||
// 0x3a4 is the specific port handled by the daemon for disable Kmesh |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Annotation Error
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Annotation Error
done
bpf/kmesh/workload/cgroup_sock.c
Outdated
@@ -128,6 +171,14 @@ int cgroup_connect4_prog(struct bpf_sock_addr *ctx) | |||
remove_netns_cookie(ctx); | |||
return CGROUP_SOCK_OK; | |||
} | |||
if (conn_from_bypass_sim_add(ctx)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same problem, can it be merged?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same problem, can it be merged?
yes, done
return nil | ||
} | ||
|
||
func enableKmeshControl(ns string) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Incorrect function name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Incorrect function name.
I've changed it to bypassControlFunc
* 0.0.0.1:932(0x3a4) is "cipher key" for cgroup/connect4 | ||
* ebpf program. | ||
*/ | ||
simip := net.ParseIP("0.0.0.1") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
execFunc considers whether it is possible to extract
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
execFunc considers whether it is possible to extract
done
c7f50b6
to
f9574e6
Compare
@@ -103,16 +115,28 @@ func StartByPassController(client kubernetes.Interface) error { | |||
log.Debugf("%s/%s: Pod is being deleted, skipping further processing", pod.GetNamespace(), pod.GetName()) | |||
return | |||
} | |||
|
|||
log.Infof("%s/%s: DELETED", pod.GetNamespace(), pod.GetName()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This message is misleading, this pod is only being relabeld
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This message is misleading, this pod is only being relabeld
I have modified it
@@ -124,6 +148,46 @@ func StartByPassController(client kubernetes.Interface) error { | |||
return nil | |||
} | |||
|
|||
func bypassControlFunc(ns string, port int) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please give a more explicit name
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please give a more explicit name
done
bpf/kmesh/cgroup_sock.c
Outdated
__type(value, __u32); | ||
__uint(max_entries, MAP_SIZE_OF_BYPASS); | ||
__uint(map_flags, 0); | ||
} map_of_bypass SEC(".maps"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we use the map_of_manager, when bypass set, we remove the key for the particular pod
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good idea~ map_of_manager
table can be reused, which reduces the memory required by Kmesh. However, it is not recommended that delete record when bypass set. Otherwise, whether the Pod is bypassed or not managed cannot be determined.
FYI, we can add the value semantics of the records in the map_of_manager
table.
- If the Pod record exists, the pod is managed by Kmesh.
- The first byte of the value indicates whether to be bypassed.
- The remaining part of the value can be used for scenario extension in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have reused map_of_manager .
- If the Pod record exists, the pod is managed by Kmesh.
- 1: Pod has been bypassed
- 0: default value, pod is not been bypassed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW, for the ingress, we do some check in xdp, it is missing here
bpf/kmesh/cgroup_sock.c
Outdated
{ | ||
// daemon sim connect 0.0.0.0:931(0x3a3) | ||
// 0x3a3 is the specific port handled by the daemon for enable bypass | ||
return ((bpf_ntohl(ctx->user_ip4) == 1) && (bpf_ntohl(ctx->user_port) == 0x3a30000)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
adapt for openEuler 23.03
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2303 adaptation is required for sockops. Currently, no adaptation is required for cgroup/connect4.
5aa0f24
to
fd8cd9c
Compare
bpf/kmesh/cgroup_sock.c
Outdated
* status. Whether it is managed by kmesh is unrelated | ||
* to the value. The only determining factor is whether | ||
* there is cookie information for this pod in the map. | ||
*/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pls add design description to PR description and update the test report
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pls add design description to PR description and update the test report
done
febce81
to
3188327
Compare
bpf/kmesh/cgroup_sock.c
Outdated
if (err) | ||
BPF_LOG(ERR, KMESH, "record netcookie failed!, err is %d\n", err); | ||
} | ||
|
||
static inline void remove_netns_cookie(struct bpf_sock_addr *ctx) | ||
void record_kmesh_netns_cookie(struct bpf_sock_addr *ctx) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
void record_kmesh_netns_cookie(struct bpf_sock_addr *ctx) | |
void record_manager_netns_cookie(struct bpf_sock_addr *ctx) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
eeda3b1
to
1553f7d
Compare
bpf/kmesh/cgroup_sock.c
Outdated
static inline bool is_kmesh_enabled(struct bpf_sock_addr *ctx) | ||
{ | ||
__u64 cookie = bpf_get_netns_cookie(ctx); | ||
return bpf_map_lookup_elem(&map_of_manager, &cookie); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we make it align with is_by_pass_enabled
Check the
return value of bpf_map_lookup_elem !=NULL
bpf/kmesh/workload/cgroup_sock.c
Outdated
@@ -89,20 +147,49 @@ static inline bool conn_from_cni_sim_delete(struct bpf_sock_addr *ctx) | |||
return ((bpf_ntohl(ctx->user_ip4) == 1) && (bpf_ntohl(ctx->user_port) == 0x3a20000)); | |||
} | |||
|
|||
SEC("cgroup/connect4") | |||
int cgroup_connect4_prog(struct bpf_sock_addr *ctx) | |||
static inline bool handle_kmesh_manage_process(struct bpf_sock_addr *ctx) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please add some comments, what the return value mean
Signed-off-by: weli-l <1289113577@qq.com>
@@ -124,6 +146,46 @@ func StartByPassController(client kubernetes.Interface) error { | |||
return nil | |||
} | |||
|
|||
func handleKmeshBypass(ns string, port int) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
port int -> oper int(add:0, del:1)
if enableKmesh { | ||
nspath, _ := getnspath(pod) | ||
if err := handleKmeshBypass(nspath, 932); err != nil { | ||
log.Errorf("failed to enable kmesh control") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The log information may cause misunderstanding. Pls update the log description.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The log information may cause misunderstanding. Pls update the log description.
modified
* to determine whether the netns is been bypass. | ||
* 0.0.0.1:<port> is "cipher key" for cgroup/connect4 | ||
* ebpf program. | ||
*/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The meanings of the special ports(931/932) need to described in the comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The meanings of the special ports(931/932) need to described in the comments.
Added
bpf/kmesh/workload/sockops_tuple.c
Outdated
{ | ||
// bypass sim connect 0.0.0.1:931(0x3a3) | ||
// 0x3a3 is the specific port handled by the cni for enable bypass | ||
return conn_from_sim(skops, 1, 0x3a3); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is recommended to define marcos instead of hard-coding numbers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is recommended to define marcos instead of hard-coding numbers.
I have defined in bpf_common.h
|
||
static inline bool conn_from_sim(struct bpf_sock_ops *skops, __u32 ip, __u32 port) | ||
{ | ||
__u32 rev_port = bpf_ntohl(skops->remote_port); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO, in oe_23.03 os scenario, it is better to correct the value of rev_port
instead port
Codecov ReportAttention: Patch coverage is ❗ Your organization needs to install the Codecov GitHub app to enable full functionality.
Flags with carried forward coverage won't be shown. Click here to find out more.
|
b893d61
to
098f56d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW, we have no test coverage for bypass and also lack of design docs
#define ENABLE_KMESH_PORT 0x3a10000 | ||
#define DISABLE_KMESH_PORT 0x3a20000 | ||
#define ENABLE_BYPASS_PORT 0x3a30000 | ||
#define DISABLE_BYPASS_PORT 0x3a40000 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please add comment what are they? 829 ~ 831
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please add comment what are they? 829 ~ 831
Added
__u64 cookie = bpf_get_netns_cookie(ctx); | ||
err = bpf_map_update_elem(&map_of_manager, &cookie, &value, BPF_NOEXIST); | ||
err = bpf_map_update_elem(map, &cookie, &value, BPF_NOEXIST); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we change BPF_NOEXIST to ANY
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we change BPF_NOEXIST to ANY
I think the data is only stored if there is no data in the map. NO_EXIST fits the scenario.
bpf/include/bpf_common.h
Outdated
__uint(max_entries, MAP_SIZE_OF_MANAGER); | ||
__uint(map_flags, 0); | ||
} map_of_manager SEC(".maps"); | ||
|
||
static inline void record_netns_cookie(struct bpf_sock_addr *ctx) | ||
static inline void record_netns_cookie(struct bpf_map *map, struct bpf_sock_addr *ctx) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we just rename it to record_manager_netns_cookie
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we just rename it to
record_manager_netns_cookie
done
bpf/include/bpf_common.h
Outdated
return value->is_bypassed; | ||
} | ||
|
||
static inline void remove_netns_cookie(struct bpf_map *map, struct bpf_sock_addr *ctx) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto: should we just rename to remove_manager_netns_cookie
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto: should we just rename to
remove_manager_netns_cookie
yes,modified
bpf/kmesh/workload/sockops_tuple.c
Outdated
static inline void skops_handle_kmesh_managed_process(struct bpf_sock_ops *skops) | ||
{ | ||
if (skops_conn_from_cni_sim_add(skops)) | ||
record_ip(skops->local_ip4); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we also rename record_ip
, we cannot get the meaning of it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we also rename
record_ip
, we cannot get the meaning of it
done
bpf/kmesh/workload/sockops_tuple.c
Outdated
if (skops_conn_from_cni_sim_add(skops)) | ||
record_ip(skops->local_ip4); | ||
if (skops_conn_from_cni_sim_delete(skops)) | ||
remove_ip(skops->local_ip4); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same as above
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same as above
done
} | ||
|
||
if err := netns.WithNetNSPath(ns, execFunc); err != nil { | ||
err = fmt.Errorf("enter ns path :%v, run execFunc failed: %v", ns, err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we add more info abou which pod is it handling for
f089bd8
to
8be90de
Compare
Signed-off-by: weli-l <1289113577@qq.com>
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall lgtm.
Just one cent:
should we define the port macros with right value like 0x3A1 instead of 0x3a10000
It's not just this place. The code is a bit messed up with |
/lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: nlgwcy The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
What type of PR is this?
What this PR does / why we need it:
This PR introduces the implementation of the bypass kmesh function and reuses the map_of_manager table.
If the map contains pod data and the value of the corresponding record is 0, the traffic of the current pod is managed by kmesh.
If the map contains pod data and the value of the corresponding record is 1, the traffic of the current pod is bypassed.
Which issue(s) this PR fixes:
Fixes #
Special notes for your reviewer:
Self-verification report
ads module
enable bypass control
disable bypass control
workload module
enable bypass control
disable bypass control
Does this PR introduce a user-facing change?: