Merged
3 changes: 2 additions & 1 deletion containers/agent/entrypoint.sh
@@ -406,7 +406,8 @@ else
# 2. gosu switches to awfuser (drops root privileges)
# 3. exec replaces the current process with the user command
#
# Enable one-shot token protection to prevent tokens from being read multiple times
# Enable one-shot token protection - tokens are cached in memory and
# unset from the environment so /proc/self/environ is cleared
export LD_PRELOAD=/usr/local/lib/one-shot-token.so
exec capsh --drop=$CAPS_TO_DROP -- -c "exec gosu awfuser $(printf '%q ' "$@")"
fi
42 changes: 22 additions & 20 deletions containers/agent/one-shot-token/README.md
@@ -2,9 +2,9 @@

## Overview

The one-shot token library is an `LD_PRELOAD` shared library that provides **single-use access** to sensitive environment variables containing GitHub, OpenAI, Anthropic/Claude, and Codex API tokens. When a process reads a protected token via `getenv()`, the library returns the value once and immediately unsets the environment variable, preventing subsequent reads.
The one-shot token library is an `LD_PRELOAD` shared library that provides **cached access** to sensitive environment variables containing GitHub, OpenAI, Anthropic/Claude, and Codex API tokens. When a process reads a protected token via `getenv()`, the library caches the value in memory and immediately unsets the environment variable. Subsequent `getenv()` calls return the cached value, allowing the process to read tokens multiple times while `/proc/self/environ` is cleared.

This protects against malicious code that might attempt to exfiltrate tokens after the legitimate application has already consumed them.
This protects against exfiltration via `/proc/self/environ` inspection while allowing legitimate multi-read access patterns that programs like the Copilot CLI require.
Comment on lines +5 to +7
Copilot AI Feb 10, 2026

PR title/description mention "skip-unset mode", but the documented behavior here (and in the code) is in-memory caching + unsetenv. Consider updating the PR title/description to match the implemented caching approach to avoid confusion during release notes and audits.


## Configuration

@@ -78,7 +78,7 @@ Linux's dynamic linker (`ld.so`) supports an environment variable called `LD_PRE
│ Application calls getenv("GITHUB_TOKEN"): │
│ 1. Resolves to one-shot-token.so's getenv() │
│ 2. We check if it's a sensitive token │
│ 3. If yes: call real getenv(), copy value, unsetenv(), return │
│ 3. If yes: cache value, unsetenv(), return cached value
│ 4. If no: pass through to real getenv() │
└─────────────────────────────────────────────────────────────────┘
```
@@ -100,7 +100,7 @@ Second getenv("GITHUB_TOKEN") call:
┌─────────────┐ ┌──────────────────┐
│ Application │────→│ one-shot-token.so │
│ │ │ │
│ │←────│ Returns: NULL │ (token already accessed)
│ │←────│ Returns: "ghp_..." │ (from in-memory cache)
└─────────────┘ └──────────────────────┘
```

@@ -118,16 +118,17 @@ When `LD_PRELOAD=/usr/local/lib/one-shot-token.so` is set, the dynamic linker lo

We use `dlsym(RTLD_NEXT, "getenv")` to get a pointer to the **next** `getenv` in the symbol search order (libc's implementation). This allows us to:
- Call the real `getenv()` to retrieve the actual value
- Return that value to the caller
- Then call `unsetenv()` to remove it from the environment
- Cache the value in an in-memory array
- Call `unsetenv()` to remove it from the environment (clears `/proc/self/environ`)
- Return the cached value to the caller

### 3. State Tracking
### 3. State Tracking and Caching

We maintain an array of flags (`token_accessed[]`) to track which tokens have been read. Once a token is marked as accessed, subsequent calls return `NULL` without consulting the environment.
We maintain an array of flags (`token_accessed[]`) and a parallel cache array (`token_cache[]`). On first access, the token value is cached and the environment variable is unset. Subsequent calls return the cached value directly.

### 4. Memory Management

When we retrieve a token value, we `strdup()` it before calling `unsetenv()`. This is necessary because:
When we retrieve a token value, we `strdup()` it into the cache before calling `unsetenv()`. This is necessary because:
- `getenv()` returns a pointer to memory owned by the environment
- `unsetenv()` invalidates that pointer
- The caller expects a valid string, so we must copy it first
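The caching and copy-before-unset steps described above can be sketched in isolation. This is a minimal illustration, not the library's code: it uses a single hypothetical slot (`demo_cache`, `demo_accessed`) instead of the per-token arrays, calls libc's `getenv()` directly rather than interposing it, and omits the mutex.

```c
#define _POSIX_C_SOURCE 200809L  /* for strdup, setenv, unsetenv */
#include <stdlib.h>
#include <string.h>

/* Hypothetical single-slot state; the library keeps one entry per token. */
static char *demo_cache = NULL;
static int demo_accessed = 0;

/* Returns the token on every call, but removes it from the
 * environment on the first successful read. */
static char *demo_read_token(const char *name) {
    if (!demo_accessed) {
        char *live = getenv(name);          /* memory owned by the environment */
        if (live != NULL) {
            demo_cache = strdup(live);      /* copy BEFORE unsetenv */
            if (demo_cache != NULL) {
                unsetenv(name);             /* 'live' must not be used after this */
            }
        }
        demo_accessed = 1;
    }
    return demo_cache;                      /* NULL if the variable was never set */
}
```

The copy is intentionally never freed, matching the source's note that the cached string must persist for the lifetime of the process.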
@@ -209,9 +210,9 @@ LD_PRELOAD=./one-shot-token.so ./test_getenv
Expected output:
```
[one-shot-token] Initialized with 11 default token(s)
[one-shot-token] Token GITHUB_TOKEN accessed and cleared
[one-shot-token] Token GITHUB_TOKEN accessed and cached (value: test...)
First read: test-token-12345
Second read:
Second read: test-token-12345
```

### Custom Token Test
@@ -236,12 +237,12 @@ LD_PRELOAD=./one-shot-token.so bash -c '
Expected output:
```
[one-shot-token] Initialized with 2 custom token(s) from AWF_ONE_SHOT_TOKENS
[one-shot-token] Token MY_API_KEY accessed and cleared
[one-shot-token] Token MY_API_KEY accessed and cached (value: secr...)
First MY_API_KEY: secret-value-123
Second MY_API_KEY:
[one-shot-token] Token SECRET_TOKEN accessed and cleared
Second MY_API_KEY: secret-value-123
[one-shot-token] Token SECRET_TOKEN accessed and cached (value: anot...)
First SECRET_TOKEN: another-secret
Second SECRET_TOKEN:
Second SECRET_TOKEN: another-secret
```
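For readers wondering how a comma-separated `AWF_ONE_SHOT_TOKENS` value might become a list of names, here is a minimal sketch. The function name `parse_token_list` is hypothetical, and the library's actual parsing (whitespace trimming, validation, limits) is not shown in this diff, so treat the details as assumptions.

```c
#define _POSIX_C_SOURCE 200809L  /* for strdup, strtok_r */
#include <stdlib.h>
#include <string.h>

/* Splits "A,B,C" into heap-allocated names; returns how many were stored.
 * Empty fields (e.g. "A,,B") are skipped because strtok_r collapses
 * consecutive delimiters. No whitespace trimming is attempted. */
static int parse_token_list(const char *spec, char *names[], int max) {
    int count = 0;
    char *copy = strdup(spec);   /* strtok_r modifies its input */
    if (copy == NULL) {
        return 0;
    }
    char *save = NULL;
    for (char *tok = strtok_r(copy, ",", &save);
         tok != NULL && count < max;
         tok = strtok_r(NULL, ",", &save)) {
        names[count] = strdup(tok);
        if (names[count] == NULL) {
            break;               /* out of memory: keep what we have */
        }
        count++;
    }
    free(copy);
    return count;
}
```

With the value used in the test above, `parse_token_list("MY_API_KEY,SECRET_TOKEN", names, 4)` stores two names, which lines up with the "2 custom token(s)" log line.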

### Integration with AWF
@@ -263,13 +264,14 @@ Note: The `AWF_ONE_SHOT_TOKENS` variable must be exported before running `awf` s

### What This Protects Against

- **Token reuse by injected code**: If malicious code runs after the legitimate application has read its token, it cannot retrieve the token again
- **Token leakage via environment inspection**: Tools like `printenv` or reading `/proc/self/environ` will not show the token after first access
- **Token leakage via environment inspection**: `/proc/self/environ` and tools like `printenv` (in the same process) will not show the token after first access — the environment variable is unset
Copilot AI Feb 10, 2026

The phrase "printenv (in the same process)" is misleading: `printenv` runs as a separate process. If you want to describe the effect, it's more accurate to say child processes spawned after the token is unset won't inherit it, and `/proc/<pid>/environ` for the current process won't include it.

Suggested change
- **Token leakage via environment inspection**: `/proc/self/environ` and tools like `printenv` (in the same process) will not show the token after first access — the environment variable is unset
- **Token leakage via environment inspection**: After the first access, the environment variable is unset, so `/proc/self/environ` no longer contains the token and child processes spawned afterward (including those running tools like `printenv`) do not inherit it
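The inheritance point can be checked with a small sketch: after `unsetenv()`, a child shell started via `system()` no longer sees the variable. The helper name `child_sees_var` and the variable name in the usage note are hypothetical, and the sketch assumes a POSIX `sh` is available.

```c
#define _POSIX_C_SOURCE 200809L  /* for setenv, unsetenv */
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>

/* Returns 1 if a child shell sees NAME as set, 0 if not, -1 on error.
 * NAME is interpolated into a shell command, so it must be a plain
 * identifier (never pass untrusted input here). */
static int child_sees_var(const char *name) {
    char cmd[256];
    /* ${NAME+set} expands to "set" only when NAME exists in the child's
     * environment, even if its value is empty. */
    int n = snprintf(cmd, sizeof cmd, "[ -n \"${%s+set}\" ]", name);
    if (n < 0 || (size_t)n >= sizeof cmd) {
        return -1;
    }
    int status = system(cmd);
    if (status == -1 || !WIFEXITED(status)) {
        return -1;
    }
    return WEXITSTATUS(status) == 0 ? 1 : 0;
}
```

Calling `setenv("AWF_DEMO_CHILD_TOKEN", "x", 1)` and then `child_sees_var("AWF_DEMO_CHILD_TOKEN")` should report the variable as visible; after `unsetenv()` the same call should report it as absent, because the child's environment is built from the parent's current `environ`.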

- **Token exfiltration via /proc**: Other processes reading `/proc/<pid>/environ` cannot see the token

### What This Does NOT Protect Against

- **Memory inspection**: The token exists in process memory (as the returned string)
- **Memory inspection**: The token exists in process memory (in the cache array)
- **Interception before first read**: If malicious code runs before the legitimate code reads the token, it gets the value
- **In-process getenv() calls**: Since values are cached, any code in the same process can still call `getenv()` and get the cached token
- **Static linking**: Programs statically linked with libc bypass LD_PRELOAD
- **Direct syscalls**: Code that reads `/proc/self/environ` directly (without getenv) bypasses this protection
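Because several bullets above depend on what `/proc/self/environ` contains after `unsetenv()`, it may be worth verifying empirically. The proc(5) manual page notes that this file reflects the environment as set at `execve(2)` and may not reflect later changes made through functions such as `putenv(3)`. The sketch below only provides the check and does not assume an outcome; the helper names are hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Returns 1 if a NUL-separated block of "NAME=value" entries contains
 * an entry for 'name', otherwise 0. */
static int env_block_has_var(const char *block, size_t len, const char *name) {
    size_t name_len = strlen(name);
    size_t pos = 0;
    while (pos < len) {
        const char *entry = block + pos;
        const char *end = memchr(entry, '\0', len - pos);
        size_t entry_len = end ? (size_t)(end - entry) : len - pos;
        if (entry_len > name_len &&
            memcmp(entry, name, name_len) == 0 &&
            entry[name_len] == '=') {
            return 1;
        }
        pos += entry_len + 1;   /* skip the entry and its terminator */
    }
    return 0;
}

/* Reads /proc/self/environ (Linux only) and checks for 'name'.
 * Returns -1 if the file cannot be opened. Environments larger than
 * the buffer are truncated in this sketch. */
static int proc_environ_has_var(const char *name) {
    static char buf[1 << 16];
    FILE *f = fopen("/proc/self/environ", "rb");
    if (f == NULL) {
        return -1;
    }
    size_t n = fread(buf, 1, sizeof buf, f);
    fclose(f);
    return env_block_has_var(buf, n, name);
}
```

Running `proc_environ_has_var("GITHUB_TOKEN")` before and after the first intercepted `getenv()` call would show whether the token is still present in the file on a given kernel and libc.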

@@ -279,13 +281,13 @@ This library is one layer in AWF's security model:
1. **Network isolation**: iptables rules redirect traffic through Squid proxy
2. **Domain allowlisting**: Squid blocks requests to non-allowed domains
3. **Capability dropping**: CAP_NET_ADMIN is dropped to prevent iptables modification
4. **One-shot tokens**: This library prevents token reuse
4. **Token environment cleanup**: This library clears tokens from `/proc/self/environ` while caching for legitimate use

## Limitations

- **x86_64 Linux only**: The library is compiled for x86_64 Ubuntu
- **glibc programs only**: Programs using musl libc or statically linked programs are not affected
- **Single process**: Child processes inherit the LD_PRELOAD but have their own token state (each can read once)
- **Single process**: Child processes inherit the LD_PRELOAD but have their own token state and cache (each starts fresh)

## Files

91 changes: 67 additions & 24 deletions containers/agent/one-shot-token/one-shot-token.c
@@ -2,8 +2,9 @@
* One-Shot Token LD_PRELOAD Library
*
* Intercepts getenv() calls for sensitive token environment variables.
* On first access, returns the real value and immediately unsets the variable.
* Subsequent calls return NULL, preventing token reuse by malicious code.
* On first access, caches the value in memory and unsets from environment.
* Subsequent calls return the cached value, so the process can read tokens
* multiple times while /proc/self/environ no longer exposes them.
*
* Configuration:
* AWF_ONE_SHOT_TOKENS - Comma-separated list of token names to protect
@@ -53,6 +54,11 @@ static int num_tokens = 0;
/* Track which tokens have been accessed (one flag per token) */
static int token_accessed[MAX_TOKENS] = {0};

/* Cached token values - stored on first access so subsequent reads succeed
* even after the variable is unset from the environment. This allows
* /proc/self/environ to be cleaned while the process can still read tokens. */
static char *token_cache[MAX_TOKENS] = {0};

/* Mutex for thread safety */
static pthread_mutex_t token_mutex = PTHREAD_MUTEX_INITIALIZER;

@@ -199,12 +205,43 @@ static int get_token_index(const char *name) {
return -1;
}

/**
* Format token value for logging: show first 4 characters + "..."
* Returns a static buffer (not thread-safe for the buffer, but safe for our use case
* since we hold token_mutex when calling this)
*/
static const char *format_token_value(const char *value) {
static char formatted[8]; /* "abcd..." + null terminator */

if (value == NULL) {
return "NULL";
}

size_t len = strlen(value);
if (len == 0) {
return "(empty)";
}

if (len <= 4) {
/* If 4 chars or less, just show it all with ... */
snprintf(formatted, sizeof(formatted), "%s...", value);
} else {
/* Show first 4 chars + ... */
snprintf(formatted, sizeof(formatted), "%.4s...", value);
}

return formatted;
}

/**
* Intercepted getenv function
*
* For sensitive tokens:
* - First call: returns the real value, then unsets the variable
* - Subsequent calls: returns NULL
* - First call: caches the value, unsets from environment, returns cached value
* - Subsequent calls: returns the cached value from memory
*
* This clears tokens from /proc/self/environ while allowing the process
* to read them multiple times via getenv().
*
* For all other variables: passes through to real getenv
*/
@@ -226,30 +263,33 @@ char *getenv(const char *name) {
return real_getenv(name);
}

/* Sensitive token - handle one-shot access (mutex already held) */
/* Sensitive token - handle cached access (mutex already held) */
char *result = NULL;

if (!token_accessed[token_idx]) {
/* First access - get the real value */
/* First access - get the real value and cache it */
result = real_getenv(name);

if (result != NULL) {
/* Make a copy since unsetenv will invalidate the pointer */
/* Cache the value so subsequent reads succeed after unsetenv */
/* Note: This memory is intentionally never freed - it must persist
* for the lifetime of the caller's use of the returned pointer */
result = strdup(result);
* for the lifetime of the process */
token_cache[token_idx] = strdup(result);

/* Unset the variable so it can't be accessed again */
/* Unset the variable from the environment so /proc/self/environ is cleared */
unsetenv(name);

fprintf(stderr, "[one-shot-token] Token %s accessed and cleared\n", name);
fprintf(stderr, "[one-shot-token] Token %s accessed and cached (value: %s)\n",
name, format_token_value(token_cache[token_idx]));

result = token_cache[token_idx];
}
Comment on lines 273 to 286
Copilot AI Feb 10, 2026

If strdup() fails, token_cache[token_idx] becomes NULL but the code still calls unsetenv() and then returns NULL to the caller even though the real getenv() succeeded. Consider handling OOM by skipping unsetenv/logging when strdup fails (or aborting consistently) to avoid unexpectedly breaking token reads.
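One way to handle the out-of-memory case is sketched below, assuming the "skip unsetenv" option: if the copy fails, the variable stays in the environment, the token is not marked as accessed, and the caller still receives the live value. The names `cache_or_passthrough`, `dup_fn`, and `failing_dup` are hypothetical; the duplicator is passed in only so the failure path can be exercised.

```c
#define _POSIX_C_SOURCE 200809L  /* for strdup, setenv, unsetenv */
#include <stdlib.h>
#include <string.h>

typedef char *(*dup_fn)(const char *);

/* Stand-in for a strdup() that fails, used to simulate OOM. */
static char *failing_dup(const char *s) {
    (void)s;
    return NULL;
}

/* Caches the token in *slot on first successful copy. On copy failure the
 * environment is left untouched so a later call can retry. */
static char *cache_or_passthrough(const char *name, char **slot,
                                  int *accessed, dup_fn dup) {
    if (*accessed) {
        return *slot;                /* cached value (or NULL if never set) */
    }
    char *live = getenv(name);
    if (live == NULL) {
        *accessed = 1;               /* nothing to protect */
        return NULL;
    }
    char *copy = dup(live);
    if (copy == NULL) {
        return live;                 /* OOM: do not unset, do not mark accessed */
    }
    *slot = copy;
    *accessed = 1;
    unsetenv(name);                  /* safe now: caller gets the copy */
    return copy;
}
```

The trade-off is that the token remains in the environment until a copy succeeds; aborting consistently, as the comment also suggests, would be the stricter alternative.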


/* Mark as accessed even if NULL (prevents repeated log messages) */
token_accessed[token_idx] = 1;
} else {
/* Already accessed - return NULL */
result = NULL;
/* Already accessed - return cached value */
result = token_cache[token_idx];
}

pthread_mutex_unlock(&token_mutex);
@@ -261,11 +301,11 @@ char *getenv(const char *name) {
* Intercepted secure_getenv function
*
* This function preserves secure_getenv semantics (returns NULL in privileged contexts)
* while applying the same one-shot token protection as getenv.
* while applying the same cached token protection as getenv.
*
* For sensitive tokens:
* - First call: returns the real value (if not in privileged context), then unsets the variable
* - Subsequent calls: returns NULL
* - First call: caches the value, unsets from environment, returns cached value
* - Subsequent calls: returns the cached value from memory
*
* For all other variables: passes through to real secure_getenv (or getenv if unavailable)
*/
@@ -285,7 +325,7 @@ char *secure_getenv(const char *name) {
return real_secure_getenv(name);
}

/* Sensitive token - handle one-shot access with secure_getenv semantics */
/* Sensitive token - handle cached access with secure_getenv semantics */
pthread_mutex_lock(&token_mutex);

char *result = NULL;
@@ -295,22 +335,25 @@ char *secure_getenv(const char *name) {
result = real_secure_getenv(name);

if (result != NULL) {
/* Make a copy since unsetenv will invalidate the pointer */
/* Cache the value so subsequent reads succeed after unsetenv */
/* Note: This memory is intentionally never freed - it must persist
* for the lifetime of the caller's use of the returned pointer */
result = strdup(result);
* for the lifetime of the process */
token_cache[token_idx] = strdup(result);

/* Unset the variable so it can't be accessed again */
/* Unset the variable from the environment so /proc/self/environ is cleared */
unsetenv(name);

fprintf(stderr, "[one-shot-token] Token %s accessed and cleared (via secure_getenv)\n", name);
fprintf(stderr, "[one-shot-token] Token %s accessed and cached (value: %s) (via secure_getenv)\n",
name, format_token_value(token_cache[token_idx]));
Comment on lines +346 to +347
Copilot AI Feb 10, 2026

Same concern as above: this secure_getenv log line includes a token value preview on stderr, which can leak sensitive info into logs. Recommend removing or guarding this preview behind an explicit opt-in debug control.

This issue also appears on line 282 of the same file.

Suggested change
fprintf(stderr, "[one-shot-token] Token %s accessed and cached (value: %s) (via secure_getenv)\n",
name, format_token_value(token_cache[token_idx]));
fprintf(stderr, "[one-shot-token] Token %s accessed and cached (via secure_getenv)\n",
name);
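If the preview is kept rather than removed, the "explicit opt-in" alternative could look roughly like the sketch below. It formats into a caller-supplied buffer so the behavior is easy to check. The helper names are hypothetical, and so is any environment variable used to drive the flag; nothing in this PR defines one.

```c
#include <stdio.h>
#include <string.h>

/* Interprets a hypothetical opt-in flag value: only the exact string "1"
 * enables value previews; unset or anything else leaves them off. */
static int preview_enabled(const char *flag) {
    return flag != NULL && strcmp(flag, "1") == 0;
}

/* Writes the access log line into buf. The 4-character preview is included
 * only when show_preview is nonzero and the value is non-empty. */
static int format_access_log(char *buf, size_t size, const char *name,
                             const char *value, int show_preview) {
    if (show_preview && value != NULL && value[0] != '\0') {
        return snprintf(buf, size,
                        "[one-shot-token] Token %s accessed and cached (value: %.4s...)",
                        name, value);
    }
    return snprintf(buf, size,
                    "[one-shot-token] Token %s accessed and cached", name);
}
```

A caller might compute the flag once at initialization, for example from a debug variable read through the real `getenv`, and pass it to every log call so the default output never contains token bytes.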


result = token_cache[token_idx];
}

/* Mark as accessed even if NULL (prevents repeated log messages) */
token_accessed[token_idx] = 1;
} else {
/* Already accessed - return NULL */
result = NULL;
/* Already accessed - return cached value */
result = token_cache[token_idx];
}

pthread_mutex_unlock(&token_mutex);