Including generated Kubernetes definition in tooling layer causes very long processing time #391

cueckoo · 2021-07-03T10:26:59Z

Originally opened by @seh in cuelang/cue#391

What version of CUE are you using (`cue version`)?

cue version devel darwin/amd64

Built via go get on 16 May 2020 against commit 8fcefc8.

Does this issue reproduce with the latest release?

Yes, insofar as this version is the latest, though not actually released.

What did you do?

Run cue get go k8s.io/api/storage/v1 to generate CUE definitions for the Kubernetes "storage/v1" API group and version.

Write a few objects that embed the v1.#StorageClass definition, and serialize just one of them as YAML using the encoding/yaml package in the tooling layer.

See the following txtar file summarizing a small set of files that set up this arrangement:

CUE module using Kubernetes #StorageClass definition

NOTE: CUE-generated Kubernetes definitions are omitted here for brevity.

-- cue.mod/module.cue --
module: "example.com"
-- kubernetes/common/storage.cue --
package common

import (
	"example.com/kubernetes"
)

#storage: {
	objects: [
		kubernetes.#StorageClass & {
			metadata: name: "sc1"
			provisioner: "something"
		},
		kubernetes.#StorageClass & {
			metadata: name: "sc2"
			provisioner: "something"
		},
		kubernetes.#StorageClass & {
			metadata: name: "sc3"
			provisioner: "something"
		},
		kubernetes.#StorageClass & {
			metadata: name: "sc4"
			provisioner: "something"
		},
		kubernetes.#StorageClass & {
			metadata: name: "sc5"
			provisioner: "something"
		},
	]
}
-- kubernetes/common/storage_tool.cue --
package common

import (
	"encoding/yaml"
	"tool/cli"
)

let sc = #storage.objects[0]

command: "test": {
	task: print: cli.Print & {
		text: yaml.Marshal(sc)
	}
}
-- kubernetes/storage.cue --
package kubernetes

import (
	storagev1 "k8s.io/api/storage/v1"
)

let baseObject = {
	apiVersion: "storage.k8s.io/v1"
}

#StorageClass: baseObject & {
	storagev1.#StorageClass

	kind: "StorageClass"
	volumeBindingMode: "WaitForFirstConsumer"
}
-- with-kubernetes-def.txtar --

What did you expect to see?

Running cue test 'example.com/kubernetes/common' would emit a single YAML document quickly, in a fraction of a second.

What did you see instead?

Running cue test 'example.com/kubernetes/common' takes approximately 70 seconds per CUE struct that embeds the v1.#StorageClass definition. For each of the runs below, I changed the number of entries in the #storage.objects list in file kubernetes/common/storage.cue.

Object Count	Elapsed Wall Time (seconds)
1	67.15
2	140.20
3	212.71
4	277.97
5	346.80
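As a quick check of the "approximately 70 seconds per object" figure, the sketch below fits an ordinary least-squares line through the five measurements in the table. The helper name `least_squares` is made up for this illustration; only the numbers come from the report.

```python
# Wall-clock measurements copied from the table above: (object count, seconds).
measurements = [(1, 67.15), (2, 140.20), (3, 212.71), (4, 277.97), (5, 346.80)]


def least_squares(points):
    """Return (slope, intercept) of the least-squares line through points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# slope: marginal seconds per additional object; intercept: fixed cost.
slope, intercept = least_squares(measurements)
print(f"slope={slope:.1f} s/object, intercept={intercept:.1f} s")
```

The slope comes out at roughly 69.7 seconds per object with an intercept close to zero, which matches the observation that the cost scales with the number of objects defined rather than with a fixed startup overhead.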

If instead we omit the embedding of the v1.#StorageClass definition from file kubernetes/storage.cue, and define a few of the lost fields ourselves, we see the same output, but with dramatically lower processing time. Here is an amended txtar file showing the files without that generated dependency:

CUE module omitting use of Kubernetes #StorageClass definition

-- cue.mod/module.cue --
module: "example.com"
-- kubernetes/common/storage.cue --
package common

import (
	"example.com/kubernetes"
)

#storage: {
	objects: [
		kubernetes.#StorageClass & {
			metadata: name: "sc1"
			provisioner: "something"
		},
		kubernetes.#StorageClass & {
			metadata: name: "sc2"
			provisioner: "something"
		},
		kubernetes.#StorageClass & {
			metadata: name: "sc3"
			provisioner: "something"
		},
		kubernetes.#StorageClass & {
			metadata: name: "sc4"
			provisioner: "something"
		},
		kubernetes.#StorageClass & {
			metadata: name: "sc5"
			provisioner: "something"
		},
	]
}
-- kubernetes/common/storage_tool.cue --
package common

import (
	"encoding/yaml"
	"tool/cli"
)

let sc = #storage.objects[0]

command: "test": {
	task: print: cli.Print & {
		text: yaml.Marshal(sc)
	}
}
-- kubernetes/storage.cue --
package kubernetes

let baseObject = {
	apiVersion: "storage.k8s.io/v1"
}

#StorageClass: baseObject & {
	kind:              "StorageClass"
	volumeBindingMode: "WaitForFirstConsumer"

	metadata: name: !=""
	provisioner: !=""
}
-- sans-kubernetes-def.txtar --

If we now run cue test 'example.com/kubernetes/common' against these modified files, it takes the same amount of time regardless of whether we define one or five objects.

Object Count	Elapsed Wall Time (seconds)
1	0.04
2	0.04
3	0.04
4	0.04
5	0.04
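The report does not say how the wall times in the two tables were collected. One way to take comparable measurements is a small helper like the following sketch, in which `time_command` is a hypothetical name. It assumes the `cue` binary is on PATH and that the command is run from the root of the example module.

```python
import subprocess
import sys
import time


def time_command(argv):
    """Run argv to completion and return elapsed wall-clock seconds."""
    start = time.perf_counter()
    # check=True raises CalledProcessError if the command exits non-zero;
    # the command's own output is discarded so only the timing is reported.
    subprocess.run(argv, check=True, stdout=subprocess.DEVNULL)
    return time.perf_counter() - start


# Example (assumes `cue` is installed; command line taken from the report):
#   elapsed = time_command(["cue", "test", "example.com/kubernetes/common"])
#   print(f"{elapsed:.2f} s")
```

Re-running it after each change to the #storage.objects list would give one row of the table per run.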

This example is whittled down from a larger code base. Trying to emit just five objects there takes at least five minutes; I've rarely let it run long enough to see it complete. During that time, CUE burns around four CPUs and consumes approximately 9 GB of memory.

Note that I'm able to run commands like cue eval, cue vet, and cue export against the non-tool files, and they complete almost immediately. Something about the split into the tooling layer for the YAML serialization triggers this behavior—even though in each of these tests we're only serializing one of the objects (per the "sc" let declaration in file kubernetes/common/storage_tool.cue), regardless of how many we define.

Why does my use of this generated definition perturb the processing time like this? Why does this processing time increase linearly with each object that embeds such a definition?


cueckoo · 2021-07-03T10:50:19Z

Original reply by @mpvl in cuelang/cue#391 (comment)

The overlay layer is gone in v0.3.0-alpha1. I was not able to reproduce this with that version.

As a result there may be name clashes now, see release notes.

cueckoo · 2021-07-03T10:52:16Z

Original reply by @mpvl in cuelang/cue#391 (comment)

This seems to have been fixed in v0.3.0:
time cue test 'example.com/kubernetes/common'
...
real 0m1.849s
user 0m3.928s
sys 0m0.118s

Up to v0.2.2: very long running times.

cueckoo · 2021-07-03T10:52:17Z

Original reply by @seh in cuelang/cue#391 (comment)

I confirmed similar timing with version 0.3.0-alpha3 on macOS: 3.64s user 0.15s system 235% cpu 1.615 total.
Thank you!

cueckoo added NeedsInvestigation NeedsVerification labels Jul 3, 2021

cueckoo closed this as completed Jul 3, 2021

