-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix issue with "$$" in Script blocks #3888
Conversation
The following is the coverage report on the affected files.
|
TODO:
|
First of all:
👏👏👏 I love it. There's even already a test that incidentally exercises it reaching Is there a limitation on the max size of an annotation, such that users who previously had a working kubernetes/kubernetes@b501cc7 (now here) makes me think this is 256KB but I don't see any other official documentation. That seems to be the max size of all annotation keys and values combined, not just one specific annotation. Granted, it probably wasn't a good idea to have hundreds of lines of bash in your YAML anyway (or anywhere for that matter! 🙃 ), but it's probably worth understanding and documenting and including in release notes just in case. Can we include a YAML test that exercises this newly functional behavior, that would have failed without this change? |
Great find. My guess here is that previously we were limited by Edit: we were building the script with heredoc before so maybe there were no limits? But we were inside an argument to |
The following is the coverage report on the affected files.
|
The following is the coverage report on the affected files.
|
I've added to the existing script example and confirmed that it explodes nicely prior to this commit: λ k create -f ./examples/v1beta1/taskruns/step-script.yaml
taskrun.tekton.dev/step-script-djp6g created
λ tkn tr logs -f step-script-djp6g
[noshebang] no shebang
[noshebang] + echo no shebang
# ... snip logs ...
[double-dollar-allowed] double dollar signs ($) are not passed through as expected :(
container step-double-dollar-allowed has failed Also updated release note to reflect 256kB limit. |
The following is the coverage report on the affected files.
|
The following is the coverage report on the affected files.
|
/kind bug |
/test check-pr-has-kind-label |
TODO:
|
I went back and tested the existing standard_init_linux.go:219: exec user process caused: argument list too long |
Using the annotations approach a single script field can go up to around 260,000 bytes. The trade-off is that this pool of bytes is shared amongst all the script blocks in your Task. And, of course, anything else that wants to write to the Pod's annotations. Here's what the failure condition looks like for the annotations approach when you use too many bytes: conditions:
- lastTransitionTime: "2021-04-26T19:20:52Z"
message: 'failed to create task run pod "script-262144-tm26f": Pod "script-262144-tm26f-pod-vtrz7"
is invalid: metadata.annotations: Too long: must have at most 262144 bytes.
Maybe invalid TaskSpec'
reason: CouldntGetTask |
@imjasonh looking at these two error conditions and size limits, do you feel strongly one way or the other about moving ahead on this PR? I think I've convinced myself at this point that it's OK to take the new annotations-based approach. Anyone with scripts this massive should probably bake them in to a container image rather than inflate a Task with them. |
The following is the coverage report on the affected files.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think we might want to call this out as a potentially backwards incompatible change in our release notes (technically id say fixing any bug is a backwards incompatible change X'D but that's maybe a convo for another day) - specifically b/c some folks may have run into this and worked around it (and i wonder if it's possible that using args
the way we were may have been doing some other replacements? or maybe $$
was definitely the only one - sounds like that might be the case...)
or maybe
$$
was definitely the only one - sounds like that might be the case...
if this IS really only an issue with $$
, what do you think about the alternative where we special case that? i.e. we look for $$
in the script block and replace it with $$$$
? that seems kind hacky but if this really is the only sequence that was being replaced, it seems like it avoids adding the complexity of the annotation + downward API solution
runAsUser: 99 | ||
script: | | ||
#!/usr/bin/env python3 | ||
if '$$' != '\u0024\u0024': |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice test :D
Oooh yeah I like that much much more from a backwards-compatibility angle. Great point, will have a go at implementing it that way instead. |
/hold |
@bobcatfish I'm pretty sure $$ is the only case. I personally thought it was a bug in Kubernetes itself and offered a fix, but they will probably choose not to follow it up because it will break existing clients. But as a result I dove in the code and found the expansion parts. See also the testcases for it. Replacing all |
/hold cancel @bobcatfish that solves the immediate problem with much less blast radius. Thanks! |
Moved the creds-init volume naming test fix over to its own pr: #3907 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
other than my nit pick about the security context looks great!!
if you DO get a chance it might be worth at least mentioning something about this change in the commit? (just let me know if you did and im missing it....)
/approve
// on these instances in our args and Kubernetes reduces them back down | ||
// to the expected number of dollar signs. This is a workaround for | ||
// https://github.com/kubernetes/kubernetes/issues/101137 | ||
script = strings.Replace(script, "$$", "$$$$", -1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wooooot :D
# are able to run scripts regardless of any specific | ||
# Step's UID. | ||
securityContext: | ||
runAsUser: 2000 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit pick but i dont wanna block anythign! are these changes related to the $$ change? if not i feel like it would be clearer to have them in another commit (and/or PR)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call, I've removed these and moved them to a new PR: #3914
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: bobcatfish The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/test tekton-pipeline-unit-tests |
Prior to this commit Steps with Scripts that include two dollar signs resulted in only one dollar sign ending up in the final executed script. This is because the script is passed into an init container via its "args" field and Kubernetes replaces instances of the literal string "$$" with "$". This is a documented behaviour of the "args" field (sort of) but it doesn't make sense to have those replacements happen for our "script" blocks. "$$" in bash scripts is the current process id so replacing this character sequence can be problematic. This commit replaces instances of two dollar signs in a script with four dollar signs. Kubernetes then converts these back to two dollar signs. This replacement behaviour only exists for dollar symbols so hard-coding this replacement feels like an acceptable workaround that will have the least impact on backwards compatibility.
/test tekton-pipeline-unit-tests |
/test pull-tekton-pipeline-integration-tests |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for this and nice test!
It's good to fix this asap before too many people starts relying on the broken behaviour :)
/lgtm
Kubernetes replaces instances of "$$" in container args fields with "$". This can muck up the contents of script fields because scripts are passed into a TaskRun Pod as an arg to an init container. Prior to this commit we [tried to prevent the replacement](#3888) from happening by: 1. putting scripts into annotations on a pod and projecting them using downward API - con: the max size of the annotations map is capped to ~250kB. The aggregate size of all scripts in a single Task therefore becomes constrained by this. Any other systems using annotations will reduce the available headroom. Backwards incompatible. 2. replacing instances of "$$" in scripts with "$$$$" for k8s to then process back to "$$" - con: k8s doesn't actually process _all_ instances of "$$". For example, if you write an arg with format "echo $(eval \$$foo)" then k8s will see the first "$(", assume it's a variable reference, and pass it through verbatim. So user's scripts with bash variable become broken by tekton's new replacement. Backwards incompatible. This commit takes a third approach, proposed by @MartinKanters, encoding scripts as base64 in the controller and then having them decoded in the init container. This bypasses Kubernetes' args processing completely because dollar signs aren't used in base64 encodings. It also doesn't introduce a backwards-incompatible limit to the aggregate script size. And it doesn't mangle existing bash scripts with variable replacements. The most noticeable trade-offs we now make are: 1. Tiny scripts can be up to 300% bigger, but as scripts get longer the max increase gets closer to 133%. 2. Also the TaskRun's `initContainer` YAML is a bit less human readable: ``` initContainers: - args: - -c - | tmpfile="/tekton/scripts/script-0-f8fmf" touch ${tmpfile} && chmod +x ${tmpfile} cat > ${tmpfile} << '_EOF_' IyEvYmluL3NoCnNldCAteGUKZWNobyAibm8gc2hlYmFuZyI= _EOF_ /tekton/tools/entrypoint decode-script "${tmpfile}" ``` The entrypoint is extended to decode base64 files so that the `shellImage` (which is used to write scripts to disk for Step containers) is not required to package a `base64` binary.
Changes
Fix issue with "$$" in Script blocks
Prior to this commit Steps with Scripts that include twodollar signs resulted in only one dollar sign ending upin the final executed script. This is because the scriptis passed into an init container via its "args" field andKubernetes replaces instances of the literal string"$$" with "$". This is a documented behaviour of the"args" field (sort of) but it doesn't make sense to havethose replacements happen for our "script" blocks. "$$"in bash scripts is the current process id so replacing thischaracter sequence can be problematic.
This commit replaces instances of two dollar signs in a scriptwith four dollar signs. Kubernetes then converts these backto two dollar signs. This replacement behaviour only existsfor dollar symbols so hard-coding this replacement feels likean acceptable workaround that will have the least impacton backwards compatibility.
Submitter Checklist
As the author of this PR, please check off the items in this checklist:
functionality, content, code)
Release Notes
/kind bug