Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix security issue between buildkite agent and Bash 5.2 #1781

Merged
merged 10 commits into from
Oct 14, 2022
Merged

Conversation

moskyb
Copy link
Contributor

@moskyb moskyb commented Oct 13, 2022

Bash 5.2 and the Buildkite Agent have a compatibility issue. This issue may reveal the values of environment variables exported by hooks that contain multiple lines.

This new version of Bash was released 16 days ago, and includes an update that changes how environment variables are exported. When variables contain multiple lines Bash now exports them using $'...', or "ANSI-C" style quoting.

The Buildkite Agent allows using hooks to customise how jobs are run. These hooks are Bash scripts. Hooks can change environment variables, and those changes are propagated into later hooks and commands. This is done by exporting the variables from the Bash script and parsing that output. It doesn't yet understand this new style of quoting, so environment variables with newlines are currently parsing incorrectly, and so are being lost between hooks and commands.

The Agent will print the names of environment variables which are changed by hooks. This new style of quote is mis-parsed, however, and the value may be considered part of the name. This can cause an escaped version of the value to be printed within the job log by mistake:

image

This PR: Updates the parsing process we use to parse hook environments. On unix-y operating systems, we entirely replace the old system with one that uses a new command, buildkite-agent env. This new command simply outputs all of the process environment as a JSON-encoded map, making the job of parsing process environments much more straightforward, and sidestepping the bash issue entirely.

Issue: #1780

Many, many thanks to @morxa for reporting this issue.

@moskyb moskyb requested a review from a team October 13, 2022 02:12
Comment on lines 23 to 35
buildkite-agent env > "{{.BeforeEnvFileName}}"
CALL "{{.PathToHook}}"
SET BUILDKITE_HOOK_EXIT_STATUS=!ERRORLEVEL!
SET BUILDKITE_HOOK_WORKING_DIR=%CD%
SET > "{{.AfterEnvFileName}}"
buildkite-agent env > "{{.AfterEnvFileName}}"
EXIT %BUILDKITE_HOOK_EXIT_STATUS%`

powershellScript = `$ErrorActionPreference = "STOP"
Get-ChildItem Env: | Foreach-Object {"$($_.Name)=$($_.Value)"} | Set-Content "{{.BeforeEnvFileName}}"
buildkite-agent env | Set-Content "{{.BeforeEnvFileName}}"
{{.PathToHook}}
if ($LASTEXITCODE -eq $null) {$Env:BUILDKITE_HOOK_EXIT_STATUS = 0} else {$Env:BUILDKITE_HOOK_EXIT_STATUS = $LASTEXITCODE}
$Env:BUILDKITE_HOOK_WORKING_DIR = $PWD | Select-Object -ExpandProperty Path
Get-ChildItem Env: | Foreach-Object {"$($_.Name)=$($_.Value)"} | Set-Content "{{.AfterEnvFileName}}"
buildkite-agent env | Set-Content "{{.AfterEnvFileName}}"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The windows scripts are untested as yet; smoke testing the agent on windows is a bit of a pain when i only have a mac dev box, but we'll get this sorted prior to merge

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we avoid making a change to windows if this is only known to be a bash issue so far?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea.

It's worth noting that some of our buildkite/agent usage on Windows does use bash, via cygwin e.g. as installed by Git for Windows. But your point stands; powershell and BAT files probably aren't affected.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'm kinda torn on this. i agree that making a minimal change (ie not messing with the windows scripts) will make this change easier, but it would also be really nice to have one really consistent way of getting the environment - otherwise we're waiting for powershell to pull a fast one on us in the same way that bash has.

the other thing is that not changing the windows implementation to be the same makes deleting the existing code harder - it's all currently tied up in the env.FromExport method, which does a bunch of malarkey to figure out what kind of output it's getting, and then tries to parse it. if we could delete that whole method, it'll be a much simpler change to make.

i'm happy to smoke test this change on a windows box in AWS if that makes things easier

Copy link
Contributor Author

@moskyb moskyb Oct 14, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for future travellers: we decided (out of band) to prioritise the fix for the security issue, as that's more pressing than having consistent code, and the issue shouldn't affect windows machines in most cases, so it's probably fine to leave them as is for now.

We'll backport the buildkite-agent env stuff into windows very soon though, as it's definitely nice to have consistent code, and it lets us delete a whole bunch of code. yay!

@pda
Copy link
Member

pda commented Oct 13, 2022

I wonder about putting map[string]string between the os.Environ() output and the JSON representation 🤔

Go's non-deterministic ordering of map makes the output unstable; harder to test / diff / etc. I think at a minimum we should make the ordering stable, somehow. Unless I'm misremembering, and Go maps or JSON serialization thereof is already stable, but I don't think so.

More subtle: buildkite-agent is run in diverse environments (pun unintended) by people doing diverse things. AFAIK Unix/POSIX/computerey “environment” is loosely defined. At a code level, Go's os.Environ(), Linux's environ() call, and other implementations all agree that environment is a list of strings. But what can we assume about the strings?

Linux says:

By convention, the strings in environ have the form "name=value".
The name is case-sensitive and may not contain the character "=".
The value can be anything that can be represented as a string.
The name and the value may not contain an embedded null byte
('\0'), since this is assumed to terminate the string.

POSIX.1-2017 says

These strings have the form name=value; names shall not contain the character '='. For values to be portable across systems conforming to POSIX.1-2017, the value shall be composed of characters from the portable character set (except NUL and as indicated below). There is no meaning associated with the order of strings in the environment. If more than one string in an environment of a process has the same name, the consequences are undefined.

So Linux suggests that strings are only name=value by convention, implying they might be something different.
And neither Linux nor POSIX are specific about what happens if there's duplicate names.

I guess I'm wondering if buildkite-agent env should serialize to:

  • an object like {"name": "value", …} object, implicitly deduped from os.Environ()
    • in random order via Go's map[string]string,
    • in defined order e.g. sorted by name,
    • in the order they came out of os.Environ(),
  • an array of (name, value) tuples like [[name, value], …] after splitting each line by first =
    • in defined order e.g. sorted by name,
    • in the order they came out of os.Environ(),
  • an array like ["name=value", …]
    • verbatim from os.Environ()

I would lean towards either:

  • [[name, value], …] in the order that os.Environ() provided.
  • {"name": "value", …} sorted by name.

The former gives fairly useful uncontroversial/unopinionated lossless representation of the environment as provided by the OS. The latter gives a slightly more useful shape but is slightly more opinionated re. deduping and ordering.

exit $Env:BUILDKITE_HOOK_EXIT_STATUS`

bashScript = `export -p > "{{.BeforeEnvFileName}}"
bashScript = `buildkite-agent env > "{{.BeforeEnvFileName}}"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this introduce a dependency on (the correct / currently-used version of) buildkite-agent being in $PATH? Or do we already do that elsewhere? Might need to provide an absolute path or something?

I'm wondering what happens if I go build -o buildkite-agent && buildkite-agent start on a system that has an older version installed in $PATH… I think it'll be surprising if the running version calls the older version, especially if the older version doesn't have the env subcommand.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So looking at other parts of the bootstrap, we definitely do depend on the agent being on the PATH:
https://github.com/buildkite/agent/blob/main/bootstrap/bootstrap.go#L1866-L1868

but i think you're right to question this, and we should probably stay away from assuming the agent is on the path in the future, it'll only cause trouble when testing things locally.

We should be able to get away with using os.Executable() for this with no worries

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh weird! i thought we could use os.Executable() here, but for some reason it doesn't appear to play nicely with bintest, and causes every(?) test with bintest to fail with errors like

hooks_integration_test.go:450: fork/exec /opt/homebrew/bin/git: resource temporarily unavailable
# or
ssh_test.go:88: Compile of _bintest_06ec6d25081df5c7f7c59c7397831c7eae5da454/main.go failed: go: error obtaining buildID for go tool compile: fork/exec /opt/homebrew/Cellar/go/1.19.2/libexec/pkg/tool/darwin_arm64/compile: resource temporarily unavailable
# or
shell_test.go:265: Error running `/bin/pwd`: fork/exec /bin/pwd: resource temporarily unavailable

... and then the test hangs for 10 minutes, then times out, and everything is terrible forever. This only happens if i add an os.Executable() call to the scriptwrapper.

I think i'm going to close this can of worms by just calling buildkite-agent env directly, as there's prior art there, but i might make a note of it in a comment somewhere?

beforeEnvContents, err := ioutil.ReadFile(wrap.beforeEnvFile.Name())
beforeEnvContents, err := os.ReadFile(wrap.beforeEnvFile.Name())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I see ioutil.ReadFile is deprecated 👍🏼

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

all of ioutil is apparently! i learned this yesterday. we should probably go through and replace things, but the compatibility guarantee means that we probably won't have to until *checks watch* ever

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was it golangci-lint that let you know?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it might have been 😅

@moskyb
Copy link
Contributor Author

moskyb commented Oct 13, 2022

re: ordering in the environment, when serializing maps, json.Marshal() sorts the maps by key alphabetically: https://pkg.go.dev/encoding/json#Marshal:

Map values encode as JSON objects. The map's key type must either be a string, an integer type, or implement encoding.TextMarshaler. The map keys are sorted and used as JSON object keys

So we get ordering for free! that's nice. I verified this myself:

$ AARDVARK="yep!" ZEBRA="nope!" buildkite-agent env --pretty # Bloody good of them to have animals at each end of the alphabet i reckon
{
  "AARDVARK": "yep!",
  "BUILDKITE_API_TOKEN": "nice try!",
  # ...
  "ZEBRA": "nope!",
  "__CFBundleIdentifier": "com.googlecode.iterm2",
  "__CF_USER_TEXT_ENCODING": "0x1F5:0:15"
}

part of the reason that i decided on "big ol json map" as the encoding mechanism here is because the env.Environment type is just a map[string]string with a bit of sauce on it, so it makes the encode/decode cycle of this nice and easy.

@moskyb moskyb force-pushed the better-env-dumping branch from a5c79cd to 74fbd1b Compare October 13, 2022 22:14
Ideally, i think we'd probably like to test these directly, in band, but it's tricky to ensure that we always have an agent binary on the path, so a mock will have to do
@moskyb moskyb force-pushed the better-env-dumping branch 2 times, most recently from 4b7ab71 to c2f48b9 Compare October 14, 2022 03:00
@moskyb moskyb force-pushed the better-env-dumping branch from c2f48b9 to d61a0a3 Compare October 14, 2022 03:06
pda added 2 commits October 14, 2022 15:03
I have it on reliabable authority that bintest can't be used
concurrently.
Otherwise we get an error like this:

> Error: Failed to get environment: failed to unmarshal after env file: unexpected end of JSON input, file contents: ""

Note that previously, a hook deliberately unsetting _all_ environment
vars was indistinguishable from the early-exit condition.

With `buildkite-agent env` JSON environment, the two are now separate;
an empty JSON object is a non-empty file.

So now deliberly (or accidentally?) cleared environment will be
correctly propagated as unsetting all vars, vs previously where all vars
were left set. This should be an improvement?
@triarius triarius self-requested a review October 14, 2022 04:41
return HookScriptChanges{}, fmt.Errorf("failed to unmarshal after env file: %w, file contents: %q", err, string(afterEnvContents))
}
}

diff := afterEnv.Diff(beforeEnv)

// Pluck the after wd from the diff before removing the key from the diff
afterWd := ""
if afterWd == "" {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lol

Copy link
Contributor

@triarius triarius left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@pda pda merged commit 647756c into main Oct 14, 2022
@pda pda deleted the better-env-dumping branch October 14, 2022 05:57
@triarius triarius changed the title [WIP] Replace export -p with a dedicated agent command for printing env Replace export -p with a dedicated agent command for printing env Oct 14, 2022
triarius pushed a commit that referenced this pull request Oct 14, 2022
[WIP] Replace export -p with a dedicated agent command for printing env
@triarius triarius mentioned this pull request Oct 14, 2022
triarius added a commit that referenced this pull request Oct 14, 2022
@moskyb moskyb changed the title Replace export -p with a dedicated agent command for printing env Fix security issue between buildkite agent and Bash 5.2 Oct 14, 2022
triarius added a commit to buildkite/elastic-ci-stack-for-aws that referenced this pull request Oct 17, 2022
[Full Changelog](v5.11.1...v5.11.2)

- Fix log collector date command [#1048](#1048) (@jeremybumsted)
- Move "Upload" header in publish.sh to before all uploads [#1046](#1046) (@dabarrell)

- Bump buildkite-agent to v3.39.1 [#1054](#1054) (@triarius)
- Bump buildkite-agent to v3.39.0 [#1049](#1049) (@dabarrell)

- buildkite-agent v3.39.1 contains a security update. [buildkite/agent #1781](buildkite/agent#1781)
moskyb added a commit that referenced this pull request Oct 18, 2022
This uses the new buildkite-agent env command introduced in [#1781](#1781), and uses it for hooks on windows machines as well. This change was scoped out of the initial fix, as windows machines weren't affected by the bug, and the effort of testing them would add time between the issue being reported and a fix being issued.

This PR ports the behaviour that we were previously using for hooks running on unix machines to windows machines as well, making the code a bunch nicer and allowing us to delete a whole bunch of useless env processing code.
moskyb added a commit that referenced this pull request Oct 18, 2022
This uses the new buildkite-agent env command introduced in [#1781](#1781), and uses it for hooks on windows machines as well. This change was scoped out of the initial fix, as windows machines weren't affected by the bug, and the effort of testing them would add time between the issue being reported and a fix being issued.

This PR ports the behaviour that we were previously using for hooks running on unix machines to windows machines as well, making the code a bunch nicer and allowing us to delete a whole bunch of useless env processing code.
moskyb added a commit that referenced this pull request Nov 2, 2022
This uses the new buildkite-agent env command introduced in [#1781](#1781), and uses it for hooks on windows machines as well. This change was scoped out of the initial fix, as windows machines weren't affected by the bug, and the effort of testing them would add time between the issue being reported and a fix being issued.

This PR ports the behaviour that we were previously using for hooks running on unix machines to windows machines as well, making the code a bunch nicer and allowing us to delete a whole bunch of useless env processing code.
DrJosh9000 pushed a commit that referenced this pull request Nov 7, 2022
This uses the new buildkite-agent env command introduced in [#1781](#1781), and uses it for hooks on windows machines as well. This change was scoped out of the initial fix, as windows machines weren't affected by the bug, and the effort of testing them would add time between the issue being reported and a fix being issued.

This PR ports the behaviour that we were previously using for hooks running on unix machines to windows machines as well, making the code a bunch nicer and allowing us to delete a whole bunch of useless env processing code.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants