Including generated Kubernetes definition in tooling layer causes very long processing time #391

Closed
cueckoo opened this issue Jul 3, 2021 · 3 comments

Comments

@cueckoo
Collaborator

cueckoo commented Jul 3, 2021

Originally opened by @seh in cuelang/cue#391

What version of CUE are you using (cue version)?

cue version devel darwin/amd64

Built via go get on 16 May 2020 against commit 8fcefc8.

Does this issue reproduce with the latest release?

Yes, insofar as this version is the latest, though not actually released.

What did you do?

Run cue get go k8s.io/api/storage/v1 to generate CUE definitions for the Kubernetes "storage/v1" API group and version.

Write a few objects that embed the v1.#StorageClass definition, and serialize just one of them as YAML using the encoding/yaml package in the tooling layer.

See the following txtar file summarizing a small set of files that set up this arrangement:

CUE module using Kubernetes #StorageClass definition

NOTE: CUE-generated Kubernetes definitions are omitted here for brevity.

-- cue.mod/module.cue --
module: "example.com"
-- kubernetes/common/storage.cue --
package common

import (
	"example.com/kubernetes"
)

#storage: {
	objects: [
		kubernetes.#StorageClass & {
			metadata: name: "sc1"
			provisioner: "something"
		},
		kubernetes.#StorageClass & {
			metadata: name: "sc2"
			provisioner: "something"
		},
		kubernetes.#StorageClass & {
			metadata: name: "sc3"
			provisioner: "something"
		},
		kubernetes.#StorageClass & {
			metadata: name: "sc4"
			provisioner: "something"
		},
		kubernetes.#StorageClass & {
			metadata: name: "sc5"
			provisioner: "something"
		},
	]
}
-- kubernetes/common/storage_tool.cue --
package common

import (
	"encoding/yaml"
	"tool/cli"
)

let sc = #storage.objects[0]

command: "test": {
	task: print: cli.Print & {
		text: yaml.Marshal(sc)
	}
}
-- kubernetes/storage.cue --
package kubernetes

import (
	storagev1 "k8s.io/api/storage/v1"
)

let baseObject = {
	apiVersion: "storage.k8s.io/v1"
}

#StorageClass: baseObject & {
	storagev1.#StorageClass

	kind: "StorageClass"
	volumeBindingMode: "WaitForFirstConsumer"
}
-- with-kubernetes-def.txtar --

What did you expect to see?

Running cue test 'example.com/kubernetes/common' would emit a single YAML document quickly, in a fraction of a second.

What did you see instead?

Running cue test 'example.com/kubernetes/common' takes approximately 70 seconds per CUE struct that embeds the v1.#StorageClass definition. For each of the runs below, I changed the number of entries in the #storage.objects list in file kubernetes/common/storage.cue.

| Object Count | Elapsed Wall Time (seconds) |
|---|---|
| 1 | 67.15 |
| 2 | 140.20 |
| 3 | 212.71 |
| 4 | 277.97 |
| 5 | 346.80 |
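
The "approximately 70 seconds per struct" figure can be checked against the measurements with a least-squares line fit. This is a minimal sketch in Python, not part of the original report; the variable names are illustrative, and the only inputs are the five timings quoted above.

```python
# Least-squares fit of wall time against object count, using the
# five measurements reported in the table above.
counts = [1, 2, 3, 4, 5]
seconds = [67.15, 140.20, 212.71, 277.97, 346.80]

n = len(counts)
mean_x = sum(counts) / n
mean_y = sum(seconds) / n

# slope = covariance(x, y) / variance(x)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(counts, seconds))
sxx = sum((x - mean_x) ** 2 for x in counts)
slope = sxy / sxx                      # seconds added per extra object
intercept = mean_y - slope * mean_x    # fixed cost independent of object count

print(f"per-object cost: {slope:.1f} s, fixed overhead: {intercept:.1f} s")
```

The fitted slope comes out just under 70 seconds per object with an intercept close to zero, which is consistent with the cost being paid once per embedding struct rather than once per run.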

If instead we omit the embedding of the v1.#StorageClass definition from file kubernetes/storage.cue, and define a few of the lost fields ourselves, we see the same output, but with dramatically lower processing time. Here is an amended txtar file showing the files without that generated dependency:

CUE module omitting use of Kubernetes #StorageClass definition
-- cue.mod/module.cue --
module: "example.com"
-- kubernetes/common/storage.cue --
package common

import (
	"example.com/kubernetes"
)

#storage: {
	objects: [
		kubernetes.#StorageClass & {
			metadata: name: "sc1"
			provisioner: "something"
		},
		kubernetes.#StorageClass & {
			metadata: name: "sc2"
			provisioner: "something"
		},
		kubernetes.#StorageClass & {
			metadata: name: "sc3"
			provisioner: "something"
		},
		kubernetes.#StorageClass & {
			metadata: name: "sc4"
			provisioner: "something"
		},
		kubernetes.#StorageClass & {
			metadata: name: "sc5"
			provisioner: "something"
		},
	]
}
-- kubernetes/common/storage_tool.cue --
package common

import (
	"encoding/yaml"
	"tool/cli"
)

let sc = #storage.objects[0]

command: "test": {
	task: print: cli.Print & {
		text: yaml.Marshal(sc)
	}
}
-- kubernetes/storage.cue --
package kubernetes

let baseObject = {
	apiVersion: "storage.k8s.io/v1"
}

#StorageClass: baseObject & {
	kind:              "StorageClass"
	volumeBindingMode: "WaitForFirstConsumer"

	metadata: name: !=""
	provisioner: !=""
}
-- sans-kubernetes-def.txtar --

If we now run cue test 'example.com/kubernetes/common' against these modified files, it takes the same amount of time regardless of whether the #storage.objects list defines one object or five.

| Object Count | Elapsed Wall Time (seconds) |
|---|---|
| 1 | 0.04 |
| 2 | 0.04 |
| 3 | 0.04 |
| 4 | 0.04 |
| 5 | 0.04 |
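
Either timing table could be reproduced by regenerating kubernetes/common/storage.cue with a varying number of list entries and timing the command each time. The following is a rough Python sketch of that procedure, not the script used for the measurements above; the helper names `storage_cue` and `time_run` are hypothetical, and it assumes the `cue` binary is on the PATH and that the rest of the module from the txtar file is already laid out under `module_root`.

```python
import subprocess
import time
from pathlib import Path


def storage_cue(count: int) -> str:
    """Return the source of kubernetes/common/storage.cue with `count` objects."""
    entries = "".join(
        "\t\tkubernetes.#StorageClass & {\n"
        f'\t\t\tmetadata: name: "sc{i}"\n'
        '\t\t\tprovisioner: "something"\n'
        "\t\t},\n"
        for i in range(1, count + 1)
    )
    return (
        "package common\n\n"
        'import (\n\t"example.com/kubernetes"\n)\n\n'
        "#storage: {\n\tobjects: [\n" + entries + "\t]\n}\n"
    )


def time_run(module_root: Path, count: int) -> float:
    """Rewrite storage.cue with `count` objects and time one run of the command."""
    path = module_root / "kubernetes" / "common" / "storage.cue"
    path.write_text(storage_cue(count))
    start = time.perf_counter()
    # Command line copied from the report; assumes `cue` is installed.
    subprocess.run(
        ["cue", "test", "example.com/kubernetes/common"],
        cwd=module_root,
        check=True,
        stdout=subprocess.DEVNULL,
    )
    return time.perf_counter() - start
```

Calling `time_run(Path("."), n)` from the module root for `n` from 1 to 5 would, under these assumptions, give one table row per call.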

This example is whittled down from a larger code base. Trying to emit just five objects there takes at least five minutes; I've rarely let it run long enough to see it complete. During that time, CUE burns around four CPUs and consumes approximately 9 GB of memory.

Note that I'm able to run commands like cue eval, cue vet, and cue export against the non-tool files, and they complete almost immediately. Something about the split into the tooling layer for the YAML serialization triggers this behavior—even though in each of these tests we're only serializing one of the objects (per the "sc" let declaration in file kubernetes/common/storage_tool.cue), regardless of how many we define.

Why does my use of this generated definition perturb the processing time like this? Why does this processing time increase linearly with each object that embeds such a definition?

@cueckoo
Collaborator Author

cueckoo commented Jul 3, 2021

Original reply by @mpvl in cuelang/cue#391 (comment)

The overlay layer is gone in v0.3.0-alpha1. I was not able to reproduce this with that version.

As a result there may be name clashes now, see release notes.

@cueckoo
Collaborator Author

cueckoo commented Jul 3, 2021

Original reply by @mpvl in cuelang/cue#391 (comment)

This seems to have been fixed in v0.3.0:
time cue test 'example.com/kubernetes/common'
...
real 0m1.849s
user 0m3.928s
sys 0m0.118s

Up to v0.2.2: very long running times.

@cueckoo
Collaborator Author

cueckoo commented Jul 3, 2021

Original reply by @seh in cuelang/cue#391 (comment)

I confirmed similar timing with version 0.3.0-alpha3 on macOS: 3.64s user 0.15s system 235% cpu 1.615 total.
Thank you!
