kubevirt zedbox service zedkube #3827

naiming-zededa · 2024-03-20T01:39:57Z

A zedbox service for edge-node kubevirt.
It currently handles pod-related functions:
- container direct-connected Ethernet
- remote console access with VNC for VMI
- app logs from the pods
- may add more functionality for edge-node clustering

codecov · 2024-03-20T01:55:24Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 17.51%. Comparing base (e4f2710) to head (9826f96).
Report is 66 commits behind head on master.

❗ Current head 9826f96 differs from pull request most recent head 84c7fb3. Consider uploading reports for the commit 84c7fb3 to get more accurate results

Additional details and impacted files

@@           Coverage Diff           @@
##           master    #3827   +/-   ##
=======================================
  Coverage   17.51%   17.51%           
=======================================
  Files           3        3           
  Lines         805      805           
=======================================
  Hits          141      141           
  Misses        629      629           
  Partials       35       35


pkg/pillar/cmd/zedkube/aitoapiserver.go

deitch · 2024-03-26T08:39:57Z

pkg/pillar/scripts/device-steps.sh

@@ -19,6 +19,9 @@ DPCDIR=$ZTMPDIR/DevicePortConfig
 FIRSTBOOTFILE=$ZTMPDIR/first-boot
 FIRSTBOOT=
 AGENTS="diag zedagent ledmanager nim nodeagent domainmgr loguploader tpmmgr vaultmgr zedmanager zedrouter downloader verifier baseosmgr wstunnelclient volumemgr watcher zfsmanager usbmanager"
+if grep -q kubevirt /run/eve-hv-type; then


Could we just add zedkube, and the zedkube service itself would recognize, "oh no, I do not run under kubevirt, so I will do the touch and exit"?

I am concerned with polluting this with more specific logic. Everything else in device-steps.sh is generic. We have so many if/then and properties all over the place, keeping this clean would be nice.

Yes, we could do that in pkg/pillar/cmd/zedkube/nokube.go. But note that we would have to update the touch file periodically and could not just exit (otherwise the watchdog would be triggered). Or we do this, but then we just have another if/then.

I didn't think about that. It isn't just the wait_for_touch, but the ongoing watchdog, isn't it? What happens with diag and watchdog?

Or we do this, but then we just have another if/then

Better in one place than scattered about. The best solution would be to update this generically, so that we have some way of knowing without if/then scattered everywhere

Yes, below wait_for_touch there is touch "$WATCHDOG_FILE/$AGENT.touch", which enables periodic watchdog monitoring.
I actually do not know why we do not enable watchdog for diag. Probably because it is not a critical microservice so we do not care if it will get stuck.

But that makes it different than zedkube, which we do care about in watchdog.

I know it is "just one more if/then", but all of our areas across eve where we have too many of them and too much complexity started that way. It would be good to have a cleaner solution.

//go:build !kubevirt

This is the file that is built when not kubevirt variant? So this will turn the service into a perpetual loop?

Does this mean that all services always start and run. However, their behavior may vary:

kubevirt: normal service running

non-kubevirt: just an infinite loop, so that the service runs, but nothing happens

So from device-steps, it just looks like, "I have a list of services, I start them all, I run them all, I watch them all."

And since these all are threads, there is no overhead from launching an additional process with go runtime overhead, right?

That's quite nice and elegant.

The question though is if we need to introduce a new microservice in the first place.
We are already enhancing the existing microservices, like zedrouter, volumemgr, zedmanager and domainmgr, to handle the kubevirt-specific workflow. So why put these few things into a separate new microservice? Will there be more applications for zedkube in the cluster scenario?

That is a much better question than I asked. Do we need an additional µservice? I worked under the assumption that we do. If we do, then this approach is good. But best to answer Milan's question first.

And since these all are threads, there is no overhead from launching an additional process with go runtime overhead, right?

The only overhead is the thread running and touching the file about every 15 seconds.

The question though is if we need to introduce a new microservice in the first place. We are already enhancing the existing microservices, like zedrouter, volumemgr, zedmanager and domainmgr, to handle the kubevirt-specific workflow. So why put these few things into a separate new microservice? Will there be more applications for zedkube in the cluster scenario?

There should be more to come for the new service in the cluster scenario. We'll have it publish things like EdgeNodeClusterStatus. Even the provisioning of the cluster prefix can be done there. Among the nodes in the cluster we'll elect a leader to report k3s cluster stats and info to the controller, and there may be a lot more interesting things it needs to do with kubernetes.

deitch

I left one comment about device-steps.sh

Beyond that, mostly looks good.

I don't understand the vnc part. Why do we need remote vnc access to enable zedkube?

milan-zededa · 2024-03-26T11:39:35Z

pkg/pillar/cmd/zedkube/aitoapiserver.go

+			nadname := "host-" + io.Name
+			_, ok := ctx.networkInstanceStatusMap.Load(nadname)
+			if !ok {
+				bringupInterface(io.Name)


If we fail to prepare interface for assignment, shouldn't we propagate the error to AppInstanceStatus (and potentially stop the application from being deployed)? Seems we only log errors here.

The problem is how to bubble this error up into AppInstanceStatus. When we launch the pod in domainmgr, it will get an error there, since the net-attach-def for this interface does not exist. The error will reach the user that way.

Yes, but the error message will be something else and not useful for troubleshooting.

OK, I can have zedkube publish an IOAdapter status error in the error condition, with zedmanager listening for it and including it in the AppInstanceStatus error portion. Does this sound reasonable?

Actually, I don't need to do that. I can do this in domainmgr: during the pod configuration setup stage, query for the net-attach-def for this Ethernet port, and if the kubernetes side does not have it, generate this specific error condition for the UI.

Since the domainmgr side of the kubevirt patch has not made it into the mainline yet, we'll add that error condition there: check the io-ethernet NAD from k3s and report the error from domainmgr.

naiming-zededa · 2024-03-26T18:04:21Z

I left one comment about device-steps.sh

Beyond that, mostly looks good.

I don't understand the vnc part. Why do we need remote vnc access to enable zedkube?

@deitch since we use kubevirt to launch the VMI, the way to access the VNC console is to use the 'virtctl vnc' tool to have the VMI pass the VNC port number over to us, so that the regular 'remote-console' from the controller works as usual. Since this 'extra' piece is only related to the kubevirt type of operation, and 'zedkube' is currently the place for that, this is the only reason 'vnc.go' is on the zedkube side.

deitch · 2024-03-26T18:12:03Z

So this is for enabling remote console in kubevirt eve? Got it.

naiming-zededa · 2024-03-26T18:13:22Z

So this is for enabling remote console in kubevirt eve? Got it.

yes, allow remote-console into the VMI

milan-zededa · 2024-03-27T08:51:38Z

pkg/pillar/cmd/zedkube/aitoapiserver.go

@@ -0,0 +1,187 @@
+// Copyright (c) 2024 Zededa, Inc.


What does the file name aitoapiserver.go mean (in particular the aito part)?
What about creating two separate go files, such as ethpassthrough.go and applogs.go?

naiming-zededa · 2024-04-01T19:02:35Z

updated the PR:

added the 'still-running' touch in nokube.go for the dummy Run(), and removed the condition in device-steps.sh
split aitoapiserver.go into two files as suggested
will add a check on the domainmgr side to verify that the ether-passthrough NAD exists before launching the app (not in this PR)
code such as:

			// Check if the NAD is created in the cluster, return error if not
			err := kubeapi.CheckEtherPassThroughNAD(nadName)
			if err != nil {
				logrus.Errorf("CreatePodConfig: check ether NAD failed, %v", err)
				return err
			}

modified nokube.go to remove the 'select' due to a yetus warning.

eriknordmark

An old comment from me which was stuck in a pending review :-(

eriknordmark · 2024-04-07T12:59:45Z

pkg/kube/cluster-init.sh

+        esac
+    done < "$VMICONFIG_FILENAME"
+
+    # Check if vminame and vncport were found and assign default values if not


This comment doesn't match the code which returns an error

eriknordmark

LGTM except the comment nit.

christoph-zededa · 2024-04-19T09:34:24Z

pkg/pillar/cmd/zedkube/etherpassthrough.go

+				bringupInterface(io.Name)
+				err := ioEtherCreate(ctx, &io)
+				if err != nil {
+					log.Errorf("checkIoAdapterEthernet: create io adapter error %v", err)


Suggested change

-					log.Errorf("checkIoAdapterEthernet: create io adapter error %v", err)
+					log.Errorf("checkIoAdapterEthernet: create io adapter error %v", err)
+					continue

if it is an error, why store the nadname?

I think there is a case here where the NAD already exists on the kubernetes side; we want to count this as created but log an error.

In another outstanding PR, #3841, we have added the function 'func CheckEtherPassThroughNAD(nadName string)'. We can later add a check for the already-exists condition and, if that is not the case, do the 'continue' as suggested.

christoph-zededa · 2024-04-19T09:39:39Z

pkg/pillar/cmd/zedkube/applogs.go

+		if scanner.Err() != nil {
+			if scanner.Err() == io.EOF {


Suggested change

-		if scanner.Err() != nil {
-			if scanner.Err() == io.EOF {
+		if scanner.Err() == io.EOF {

Hmm, shouldn't it also log error and break when there is an error? (It currently doesn't do that.)

Sure, let me add an error here.

pkg/pillar/cmd/zedkube/etherpassthrough.go

Signed-off-by: Naiming Shen <naiming@Admins-MacBook-Pro-3.local> Signed-off-by: Naiming Shen <naiming@admins-mbp-3.lan> Signed-off-by: Naiming Shen <naiming@Admins-MacBook-Pro-3.local> Signed-off-by: Naiming Shen <naiming@admins-mbp-3.lan>

eriknordmark

LGTM

naiming-zededa requested review from zedi-pramodh and eriknordmark as code owners March 20, 2024 01:39

naiming-zededa force-pushed the naiming-zedkube branch 2 times, most recently from 7547502 to 0012210 Compare March 20, 2024 05:10

zedi-pramodh reviewed Mar 26, 2024

zedi-pramodh reviewed Mar 26, 2024

zedi-pramodh requested a review from deitch March 26, 2024 00:05

naiming-zededa force-pushed the naiming-zedkube branch from 0012210 to 0226a39 Compare March 26, 2024 00:26

github-actions bot requested a review from zedi-pramodh March 26, 2024 00:26

deitch reviewed Mar 26, 2024

eriknordmark requested a review from milan-zededa March 26, 2024 11:30

milan-zededa reviewed Mar 26, 2024

milan-zededa reviewed Mar 27, 2024

naiming-zededa force-pushed the naiming-zedkube branch from 0226a39 to f6ec861 Compare April 1, 2024 18:55

naiming-zededa force-pushed the naiming-zedkube branch from f6ec861 to 9826f96 Compare April 1, 2024 19:47

eriknordmark reviewed Apr 19, 2024

eriknordmark approved these changes Apr 19, 2024

christoph-zededa reviewed Apr 19, 2024

christoph-zededa reviewed Apr 22, 2024

kubevirt zedbox service zedkube

84c7fb3

naiming-zededa force-pushed the naiming-zedkube branch from 9826f96 to 84c7fb3 Compare April 22, 2024 19:47

github-actions bot requested a review from eriknordmark April 22, 2024 19:48

eriknordmark approved these changes Apr 24, 2024

eriknordmark merged commit 74e3d47 into lf-edge:master Apr 25, 2024
29 of 32 checks passed
