Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix sb_read and umount cache not be freed #53

Merged
merged 2 commits into from
Aug 4, 2024

Conversation

RoyWFHuang
Copy link
Collaborator

@RoyWFHuang RoyWFHuang commented May 19, 2024

Using kmemleak to detect the leak, I found the the command "ls" and "mv" will make the cache leak

here are the leak informations:

0 [<ffffffffa5d2f98e>] __alloc_pages+0x24e
1 [<ffffffffa5d2f98e>] __alloc_pages+0x24e
2 [<ffffffffa5d51d3e>] alloc_pages+0x9e
3 [<ffffffffa5cbd4de>] __page_cache_alloc+0x7e
4 [<ffffffffa5cc1342>] pagecache_get_page+0x152
5 [<ffffffffa5dedec8>] grow_dev_page+0x48
6 [<ffffffffa5deebac>] __getblk_gfp+0xbc
7 [<ffffffffa5deed01>] __bread_gfp+0x11
  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ <-- seems __bread() leak.
8 [<ffffffffc098427b>] ftrace_trampoline+0x427b
9 [<ffffffffc09845a7>] ftrace_trampoline+0x45a7
10 [<ffffffffa5dafaed>] vfs_mkdir+0xad
11 [<ffffffffa5db3368>] do_mkdirat+0x128
12 [<ffffffffa5db351c>] __x64_sys_mkdir+0x4c
13 [<ffffffffa5a04d64>] x64_sys_call+0x94
14 [<ffffffffa67c2ce6>] do_syscall_64+0x56
15 [<ffffffffa68000df>] entry_SYSCALL_64_after_hwframe+0x67

@jserv jserv requested a review from HotMercury May 19, 2024 14:14
@jserv
Copy link
Collaborator

jserv commented May 19, 2024

I defer to @HotMercury for confirmation.

@HotMercury
Copy link
Collaborator

Hi @RoyWFHuang
There is something wrong on my environment to use kmemleak, I will check later.

@HotMercury HotMercury closed this May 21, 2024
@HotMercury HotMercury reopened this May 21, 2024
@RoyWFHuang
Copy link
Collaborator Author

Hi @RoyWFHuang There is something wrong on my environment to use kmemleak, I will check later.

@HotMercury
error: attempt to use poisoned "strlcpy"
Is that the problem you encounter?

If that, you can go to bpftool/libbpf/ and update it to latest version

@HotMercury
Copy link
Collaborator

Thank you for your suggestions. I believe your logic in checking is correct, but I am using kmemleak, and it does not show any leak errors.
Please give me more about testing environment like kernel version and below information.

  BPFTOOL  bpftool/bootstrap/bpftool
...                        libbfd: [ on  ]
...               clang-bpf-co-re: [ on  ]
...                          llvm: [ on  ]
...                        libcap: [ on  ]

In my test, there is no error

$ sudo ./kmodleak simplefs -v
module 'simplefs' loaded
module 'simplefs' unloaded

3 stacks with outstanding allocations:
8192 bytes in 2 allocations from stack
	addr = 0xa33 size = 4096
	addr = 0xc2b size = 4096
	0 [<ffffffff9284e7e4>] __alloc_pages+0x264
	1 [<ffffffff9284e7e4>] __alloc_pages+0x264
	2 [<ffffffff928859b1>] alloc_pages_mpol+0x91
	3 [<ffffffff92885f14>] folio_alloc+0x64
	4 [<ffffffff927b5734>] filemap_alloc_folio+0xf4
	5 [<ffffffff927bac5b>] __filemap_get_folio+0x14b
	6 [<ffffffff929411f2>] __getblk_slow+0xb2
	7 [<ffffffff92941591>] __bread_gfp+0x81
	8 [<ffffffffc0710b59>] init_iso9660_fs+0x6b49
	9 [<ffffffff928ff674>] iterate_dir+0x114
	10 [<ffffffff928fff94>] __x64_sys_getdents64+0x84
	11 [<ffffffff92405b73>] x64_sys_call+0x1b43
	12 [<ffffffff9361b78f>] do_syscall_64+0x7f
	13 [<ffffffff9380012b>] entry_SYSCALL_64_after_hwframe+0x73
208 bytes in 2 allocations from stack
	addr = 0xffff898c35e6c8f0 size = 104
	addr = 0xffff898c1d3add00 size = 104
	0 [<ffffffff9285c8a3>] kmem_cache_alloc+0x253
	1 [<ffffffff9285c8a3>] kmem_cache_alloc+0x253
	2 [<ffffffff9293d6be>] alloc_buffer_head+0x1e
	3 [<ffffffff9293ebb5>] folio_alloc_buffers+0x95
	4 [<ffffffff92941238>] __getblk_slow+0xf8
	5 [<ffffffff92941591>] __bread_gfp+0x81
	6 [<ffffffffc0710b59>] init_iso9660_fs+0x6b49
	7 [<ffffffff928ff674>] iterate_dir+0x114
	8 [<ffffffff928fff94>] __x64_sys_getdents64+0x84
	9 [<ffffffff92405b73>] x64_sys_call+0x1b43
	10 [<ffffffff9361b78f>] do_syscall_64+0x7f
	11 [<ffffffff9380012b>] entry_SYSCALL_64_after_hwframe+0x73
192 bytes in 1 allocations from stack
	addr = 0xffff898c015ef180 size = 192
	0 [<ffffffff9285c55b>] kmem_cache_alloc_lru+0x25b
	1 [<ffffffff9285c55b>] kmem_cache_alloc_lru+0x25b
	2 [<ffffffff929046e4>] __d_alloc+0x34
	3 [<ffffffff9290493a>] d_alloc+0x1a
	4 [<ffffffff92907c7a>] d_alloc_parallel+0x5a
	5 [<ffffffff928f2b0c>] __lookup_slow+0x5c
	6 [<ffffffff928f2e0c>] lookup_one_unlocked+0x9c
	7 [<ffffffff928f2ebd>] lookup_positive_unlocked+0x1d
	8 [<ffffffff92aa673d>] debugfs_lookup+0x5d
	9 [<ffffffff92aa679f>] debugfs_lookup_and_remove+0xf
	10 [<ffffffff9285fd69>] debugfs_slab_release+0x19
	11 [<ffffffff927fee5c>] kmem_cache_destroy+0x11c
	12 [<ffffffffc070c625>] init_iso9660_fs+0x2615
	13 [<ffffffffc0710fe9>] init_iso9660_fs+0x6fd9
	14 [<ffffffff925f6943>] __do_sys_delete_module.isra.0+0x1a3
	15 [<ffffffff925f6af2>] __x64_sys_delete_module+0x12
	16 [<ffffffff92406320>] x64_sys_call+0x22f0
	17 [<ffffffff9361b78f>] do_syscall_64+0x7f
	18 [<ffffffff9380012b>] entry_SYSCALL_64_after_hwframe+0x73
done

@RoyWFHuang
Copy link
Collaborator Author

RoyWFHuang commented May 23, 2024

You can test the example in kmodleak, you will see that

1 stacks with outstanding allocations:
128 bytes in 1 allocations from stack
addr = 0xffff8c4204ae7300 size = 128
0 [<ffffffffa5d61b11>] kmem_cache_alloc_trace+0x261
1 [<ffffffffa5d61b11>] kmem_cache_alloc_trace+0x261
2 [<ffffffffc09ae02b>] 
bpf_prog_cc9504c91d44b81c_kmodleak__mm_page_free+0x127f
3 [<ffffffffa5a03a06>] do_one_initcall+0x46
4 [<ffffffffa5b9ec12>] do_init_module+0x52
5 [<ffffffffa5ba062b>] load_module+0xb2b
6 [<ffffffffa5a955b0>] __kretprobe_trampoline+0x0
7 [<ffffffffa5ba0a08>] __x64_sys_finit_module+0x18
8 [<ffffffffa5a06793>] x64_sys_call+0x1ac3
9 [<ffffffffa67c2ce6>] do_syscall_64+0x56
10 [<ffffffffa68000df>] entry_SYSCALL_64_after_hwframe+0x67
done

The checker will not show "error" or "leak" (like valgrind), it only shows the allocating path (that's the eBpf actually doing)
So if you see that something show out, that means some leaking in the program

And as your comments, there are 3 leaking in it

7 [<ffffffff92941591>] __bread_gfp+0x81
5 [<ffffffff92941591>] __bread_gfp+0x81
11 [<ffffffff927fee5c>] kmem_cache_destroy+0x11c

Another "11 [] kmem_cache_destroy+0x11c", The problem bothers me, Could you give me you mail that we can discuss this?

@sysprog21 sysprog21 deleted a comment from HotMercury May 28, 2024
@jserv
Copy link
Collaborator

jserv commented Jun 5, 2024

@HotMercury, can you confirm the effectiveness of this proposed change?

@HotMercury
Copy link
Collaborator

Hi @RoyWFHuang
You can use rcu_barrier() or a kernel thread to solve the problem where simplefs_destroy_inode_cache does not release in time before unmounting simplefs.

@jserv
Copy link
Collaborator

jserv commented Jun 17, 2024

You can use rcu_barrier() or a kernel thread to solve the problem where simplefs_destroy_inode_cache does not release in time before unmounting simplefs.

You should paste the proposed changes for dedicated discussions.

Copy link
Collaborator

@jserv jserv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rebase the latest master branch and validate memory leaks.

@RoyWFHuang RoyWFHuang changed the title Fix sb_read cache not be freed Fix sb_read and umount cache not be freed Jun 27, 2024
@RoyWFHuang RoyWFHuang requested a review from jserv June 27, 2024 08:30
Copy link
Collaborator

@jserv jserv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make the git commit messages informative. In particular, mention the way how you figure out the potential leaks.

@jserv
Copy link
Collaborator

jserv commented Jul 12, 2024

Make the git commit messages informative. In particular, mention the way how you figure out the potential leaks.

@RoyWFHuang, can you improve the git commit messages?

@RoyWFHuang
Copy link
Collaborator Author

Make the git commit messages informative. In particular, mention the way how you figure out the potential leaks.

@RoyWFHuang, can you improve the git commit messages?

Yes, just wait a few days, I need some time to conclude it.

1. When the cache be destoryed, we should call rcu_barrier() to prevent
call_rcu() still works and this also prevent the cache be reused.

2. After simplefs_destroy_inode_cache() function, we need rcu_barrier()
to make sure all memory have be freed.
@RoyWFHuang RoyWFHuang requested a review from jserv August 4, 2024 15:19
@jserv jserv merged commit 56d2ca0 into sysprog21:master Aug 4, 2024
2 checks passed
@jserv
Copy link
Collaborator

jserv commented Aug 4, 2024

Thank @RoyWFHuang for contributing!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants