Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kmesh support restart by reload old bpf map and prog #475

Merged
merged 8 commits into from
Jul 26, 2024

Conversation

lec-bit
Copy link
Contributor

@lec-bit lec-bit commented Jul 2, 2024

What type of PR is this?
/kind feature

What this PR does / why we need it:

Which issue(s) this PR fixes:
Fixes #

Special notes for your reviewer:

Does this PR introduce a user-facing change?:


@@ -101,17 +101,23 @@ func Execute(configs *options.BootstrapConfigs) error {
log.Info("command Start cni successful")
defer cniInstaller.Stop()

setupCloseHandler()
*configs.Status = setupCloseHandler()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assign an int to a pointer?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because I want the data in configs.Status and configs.BpfConfig.Status to be the same at the same time

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can change status from pointer to int

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I add a KmeshStartStatus in bpf

@@ -101,17 +101,23 @@ func Execute(configs *options.BootstrapConfigs) error {
log.Info("command Start cni successful")
defer cniInstaller.Stop()

setupCloseHandler()
*configs.Status = setupCloseHandler()
log.Printf("bpfLoader.Restart %v", *configs.Status)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
log.Printf("bpfLoader.Restart %v", *configs.Status)
log.Printf("Kmesh Restart %d", *configs.Status)

pkg/bpf/bpf.go Outdated
@@ -30,6 +30,7 @@ import (

"kmesh.net/kmesh/daemon/options"
"kmesh.net/kmesh/pkg/logger"
pkgversion "kmesh.net/kmesh/pkg/version"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not use version directly?

Suggested change
pkgversion "kmesh.net/kmesh/pkg/version"
"kmesh.net/kmesh/pkg/version"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

pkg/bpf/bpf.go Outdated
@@ -49,11 +50,15 @@ type BpfLoader struct {

obj *BpfKmesh
workloadObj *BpfKmeshWorkload
Status *int
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Status *int
Status int

}

if err := sc.Link.Pin(sc.Info.BpfFsPath + "sockconn_prog"); err != nil {
log.Printf("file exit %v", sc.Info.BpfFsPath+"sockconn_prog")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
log.Printf("file exit %v", sc.Info.BpfFsPath+"sockconn_prog")
log.Printf("file exited %s", sc.Info.BpfFsPath+"sockconn_prog")

@@ -0,0 +1,57 @@
/*
* Copyright 2023 The Kmesh Authors.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* Copyright 2023 The Kmesh Authors.
* Copyright 2024 The Kmesh Authors.


err = m.Pin(versionPath + "kmesh_version")
if err != nil {
log.Errorf("kmesh_version failed to pin: %v", err)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
log.Errorf("kmesh_version failed to pin: %v", err)
log.Errorf("kmesh_version pin failed: %v", err)

@hzxuzhonghu
Copy link
Member

@lec-bit Can you fix and prevent force pushing? So it will be firendly to reviewers

Copy link
Member

@hzxuzhonghu hzxuzhonghu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we make version map as a global config used by kmesh, in future we could expand it.

In addition, can we store it in the files instead of bpf map, which seems unnecessary.

umount -t cgroup2 /mnt/kmesh_cgroup2/
rm -rf /mnt/kmesh_cgroup2
rm -rf /sys/fs/bpf/bpf_kmesh
echo "kmesh close"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
echo "kmesh close"
echo "kmesh exit"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK

@@ -101,17 +101,23 @@ func Execute(configs *options.BootstrapConfigs) error {
log.Info("command Start cni successful")
defer cniInstaller.Stop()

setupCloseHandler()
*configs.Status = setupCloseHandler()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can change status from pointer to int

daemon/manager/manager.go Show resolved Hide resolved
log.Printf("daemonset err:%v", err)
return false
}
return true
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not right, you should check the kmesh name.

if config.AdsEnabled() {
versionPath = config.BpfFsPath + "/bpf_kmesh/map/kmesh_version"
} else if config.WdsEnabled() {
versionPath = config.BpfFsPath + "/bpf_kmesh_workload/map/kmesh_version"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yoiu can remove the duplicate path construct within NewVersionMap

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

move the logic of map operation out to bpf loader of bpf.go

NewStart = iota
Restart
Update
Reload
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what does reload mean

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mean we need to reload bpf map, it is restart

const (
NewStart = iota
Restart
Update
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and update means what?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the future update scenario, if the two gitversions are different after the kmesh is restarted, it indicates that the update scenario is used and the previous configuration is not load.

return nil
}

func setupCloseHandler() {
func setupCloseHandler() int {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function seems to be used to handle os signal and it's weired to get status and return here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it doesn‘t matter, i can move func GetDaemonset out

ch := make(chan os.Signal, 1)
signal.Notify(ch, syscall.SIGINT, syscall.SIGTERM, syscall.SIGQUIT, syscall.SIGHUP, syscall.SIGABRT, syscall.SIGTSTP)

<-ch

log.Warn("exiting...")

if bpf.GetDaemonset() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why if we get Kmesh daemonset object, we could make sure it's restart?

When Kmesh is create for the first time, it always creates the Kmesh DaemonSet object first and then creates the Kmesh pods.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, always creates the Kmesh DaemonSet object first and then creates the Kmesh pods.
and by end, there is some confusion,
although the current I can judge if restart by daemonset in bpfLoader.Stop, there is still a lack of theoretical basis

return true
}

func cleanupMountPath() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is not used?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry, forget
now in bpfLoader.Stop

NewStart = iota
Restart
Update
Reload
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These constants also defined in pkg/version, can they be defined in one place?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

moved

Copy link

codecov bot commented Jul 12, 2024

Codecov Report

Attention: Patch coverage is 65.06849% with 51 lines in your changes missing coverage. Please review.

Project coverage is 48.88%. Comparing base (cc9a829) to head (481937a).
Report is 16 commits behind head on main.

Files Coverage Δ
pkg/utils/test/bpf_map.go 43.58% <20.00%> (+0.73%) ⬆️
pkg/bpf/bpf_kmesh_workload.go 63.22% <66.66%> (+0.21%) ⬆️
pkg/bpf/bpf_restart.go 65.11% <65.11%> (ø)
pkg/bpf/bpf.go 41.17% <67.50%> (+20.90%) ⬆️

... and 5 files with indirect coverage changes


Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 064614b...481937a. Read the comment docs.

@hzxuzhonghu
Copy link
Member

@lec-bit ping me when it is ready

@lec-bit lec-bit force-pushed the kmesh-build branch 3 times, most recently from 5b9a91a to e72518c Compare July 22, 2024 07:55
@lec-bit
Copy link
Contributor Author

lec-bit commented Jul 22, 2024

@hzxuzhonghu I have added test cases and fully verified this part of the function, and it can be merged now.

@hzxuzhonghu
Copy link
Member

Will take a look later

@@ -8,3 +8,6 @@ rules:
- apiGroups: [""]
resources: ["pods","services","namespaces"]
verbs: ["get", "update", "patch", "list", "watch"]
- apiGroups: ["apps"]
resources: ["daemonsets"]
verbs: ["list", "get"]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to align with above, i think you donot need list permission

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, i will delete

pkg/bpf/bpf.go Outdated
}

func NewBpfLoader(config *options.BpfConfig) *BpfLoader {
return &BpfLoader{
config: config,
config: config,
StatusMap: NewVersionMap(config),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

StatusMap or VersionMap

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, VersionMap


const (
NewStart = iota
Restart
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add some comments

@@ -0,0 +1,100 @@
/*
* Copyright 2024 The Kmesh Authors.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove year

}

func CleanupBpfMap() {
err := syscall.Unmount("/mnt/kmesh_cgroup2", 0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

prevent hardcoding

if GetDaemonset() {
SetKmeshStartStatus(Restart)
} else {
SetKmeshStartStatus(NewStart)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure i understand the intention here, since it is closing

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i will check the name

kmeshStartStatus = Status
}

func GetDaemonset() bool {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be more clear, we can move this implement detail into a function InferRestartStatus


var kmeshStartStatus int

func GetKmeshStartStatus() int {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Confusing with GetStartStatus

lec-bit added 4 commits July 23, 2024 14:04
Signed-off-by: lec-bit <glfhmzmy@126.com>
Signed-off-by: let-bit <glfhmzmy@126.com>
Signed-off-by: let-bit <glfhmzmy@126.com>
Signed-off-by: let-bit <glfhmzmy@126.com>
Signed-off-by: let-bit <glfhmzmy@126.com>
Comment on lines 35 to 37
// Normal: a normal new start
// Restart: reusing the previous kmesh configuration
// Update: upgrading kmesh and reusing part of previous kmesh configuration
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

better add some indents

// Restart: reusing the previous kmesh configuration
// Update: upgrading kmesh and reusing part of previous kmesh configuration
// Close:
// Normal: normal close
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clean up all the bpf progs and maps

Signed-off-by: let-bit <glfhmzmy@126.com>
return kmeshStatus
}

func SetKmeshStatus(Status int) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we remove it, i get confused by it with SetStartStatus


const (
daemonSetName = "kmesh"
namespace = "kmesh-system"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be changed, we can get it by
podNamespace := env.Register("POD_NAMESPACE", "", "").Get()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

// Normal: normal close
// Restart: not clean kmesh configuration, for next launch
// Update: not clean part of kmesh configuration, for next launch
const (
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we may define a
type StartType uint8

and

const ( 
    Normal StartType = iota

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

kmeshStatus = Status
}

func InferRestartStatus() bool {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make it return StartType

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

kmeshStatus = Status
}

func InferRestartStatus() bool {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and use small cased functionname to keep it private

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

pkg/bpf/bpf.go Outdated
}
}

func GetMap(m *ebpf.Map, key uint32) [16]byte {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getOldVersionFromMap

pkg/bpf/bpf.go Outdated

versionMap, err := ebpf.LoadPinnedMap(versionPath+"kmesh_version", opts)
if err != nil {
log.Debugf("kmesh version map LoadPinnedMap failed: %v, Start normal", err)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
log.Debugf("kmesh version map LoadPinnedMap failed: %v, Start normal", err)
log.Debugf("kmesh version map loadfailed: %v, start normally", err)

pkg/bpf/bpf.go Outdated
return valueBytes
}

func RecoverVersionMap(config *options.BpfConfig, versionPath string) *ebpf.Map {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
func RecoverVersionMap(config *options.BpfConfig, versionPath string) *ebpf.Map {
func recoverVersionMap(config *options.BpfConfig, versionPath string) *ebpf.Map {

pkg/bpf/bpf.go Outdated
return nil
}

err = m.Pin(filepath.Join(versionPath + "kmesh_version"))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If L222 returns err, wouldn't m.Pin panic?

Copy link
Contributor Author

@lec-bit lec-bit Jul 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes,will Panic;
I will add an error handling, generally no error will be returned here unless there are insufficient resources

}

// Test Kmesh Restart Normal
func runTestRestart(t *testing.T) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there a update case?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Restart, no Update now, there is currently no Update scenario

Signed-off-by: let-bit <glfhmzmy@126.com>
@@ -38,6 +39,10 @@ import (

var log = logger.NewLoggerField("pkg/bpf")

var (
hash = fnv.New32a()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

move to the function

pkg/bpf/bpf.go Outdated
if m != nil {
SetStartStatus(m)
CompareOldGitVersion(m)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function name does not reflect the functionality.

return kmeshStatus
}

func SetKmeshStatus(Status int) {
func SetKmeshStatus(Status StartType) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SetStartType


func GetKmeshStatus() int {
func GetKmeshStatus() StartType {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GetStartType

Signed-off-by: let-bit <glfhmzmy@126.com>
@hzxuzhonghu
Copy link
Member

/lgtm

/approve

@kmesh-bot
Copy link
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: hzxuzhonghu

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@kmesh-bot kmesh-bot merged commit 7069d8c into kmesh-net:main Jul 26, 2024
8 checks passed
@lec-bit lec-bit deleted the kmesh-build branch October 19, 2024 08:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants