OOM on zfs scrub #13546

Closed
dglidden opened this issue Jun 10, 2022 · 8 comments

Labels: Status: Stale (No recent activity for issue) · Type: Defect (Incorrect behavior, e.g. crash, hang)

dglidden commented Jun 10, 2022

System information

Type                  Version/Name
Distribution Name     Ubuntu
Distribution Version  20.04.4
Kernel Version        5.13.0-44-generic
Architecture          x86_64
OpenZFS Version       zfs-2.0.7-1

Describe the problem you're observing

Running zpool scrub causes a system-wide OOM after several hours, killing the machine.

Describe how to reproduce the problem

1. zpool scrub [pool name]
2. Wait anywhere from an hour to many hours.
3. Check the console: the system has gone OOM and died.

Include any warning/errors/backtraces from the system logs

Unfortunately I don't have any kind of core dump or logs, because the whole system bombs when it goes OOM and I can't do anything. If there is a way I can grab a kernel dump or any kind of logging beforehand, I am more than willing to give it a go.
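
(A rough sketch of one way to capture evidence before the box dies, assuming Ubuntu's stock linux-crashdump/kdump-tools packages; the exact sysctl values are only illustrative:)

$ sudo apt install linux-crashdump                          # kdump-tools: write a kernel dump on panic
$ echo 'vm.panic_on_oom = 1' | sudo tee /etc/sysctl.d/99-oom-debug.conf
$ echo 'kernel.panic = 30' | sudo tee -a /etc/sysctl.d/99-oom-debug.conf
$ sudo sysctl --system                                      # panic (and therefore dump) instead of limping on after the OOM
$ sudo mkdir -p /var/log/journal && sudo systemctl restart systemd-journald   # keep the journal across reboots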

System Information

Component   Type
CPU         i3-4160 @ 3.6GHz
Memory      16GB RAM
Swap        16GB swap
Bus Type    External USB3 "PROBOX" 4-drive enclosure (I know, I know, bear with me here, I'll explain in a minute)
Drives      4x 6TB Seagate NAS drives (NOT SHINGLED)
Format      RAIDZ1

SIZE   ALLOC  FREE
21.8T  15.3T  6.46T

The pool was originally created with the Ubuntu 20.04 default ZFS version (0.8.4 or thereabouts). I have not yet run "zpool upgrade" on it, in case I want to go back to the Ubuntu default version. 2.0.7 was built from git source and installed as debs.
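
(Side note: you can see what an upgrade would enable without committing to it; a quick sketch, assuming the pool is named "tank":)

$ zpool upgrade                          # lists pools that have supported features not yet enabled
$ zpool get all tank | grep feature@     # shows each feature flag as disabled/enabled/active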

Encryption is enabled on the "main" volume, with sub-volumes created under that, e.g.:

tank/
tank/encrypted/
tank/encrypted/backup_1/
tank/encrypted/backup_2/
tank/encrypted/urbackup/

etc.

I know USB isn't a "recommended" bus type for ZFS, but this is my nearline/backup server, the storage for which exists in an easy-to-access external cabinet that is part of my "bug out" kit. I'm in FL, we get storms and hurricanes and probably plagues of frogs eventually, if I have to leave for any emergency I want to be able to easily grab my backup and bring everything with me without carrying a 40lb server. (Yes I also have offsite and other backups, but I'm super paranoid.)

If I try to scrub the external array, it will eventually OOM. It may happen in an hour, it may happen in a day. Unfortunately, I have never "caught" it going OOM. Everything seems nominal until it's not, in terms of memory, CPU, and disk I/O. It doesn't matter how long I leave the scrub running, if I stop the scrub before the system OOMs, it's fine.
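
(One way to at least catch the memory state on its way down is to log it continuously to disk; a minimal sketch, with the log path, interval, and choice of arcstats fields picked arbitrarily:)

$ while true; do date; free -m; grep -E '^(size|arc_meta_used|arc_meta_limit) ' /proc/spl/kstat/zfs/arcstats; sleep 60; done >> /root/scrub-mem.log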

I'm currently running a scrub with zfs_scan_legacy=1, as suggested in issue #11574.
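
(For reference, the tunable can be flipped at runtime; the modprobe.d line is only needed if it should persist across reboots:)

$ echo 1 | sudo tee /sys/module/zfs/parameters/zfs_scan_legacy
$ echo 'options zfs zfs_scan_legacy=1' | sudo tee -a /etc/modprobe.d/zfs.conf   # optional: persist across reboots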

It's been going for a few days now without an OOM, with several days left to go. (Legacy scan is considerably slower than the current scan mode: the default scan would typically get past at least 10% within a couple of hours before eventually dying, whereas after 3+ days of legacy scanning it's only at 20%.)

I have yet to try connecting the enclosure to another machine, or connecting the drives directly to the SATA bus on the motherboard. That's my next step, once the current scrub either completes or dies, to see if I can complete a scrub without resorting to legacy mode.

The machine periodically runs rsync against the "main" file server to back it up, as well as urbackup to back up the Windows machine I have. I can stop all rsync/urbackupsrv tasks and let it scrub and it will still OOM after a while. It has been suggested that, because urbackup creates so many files in its directories, ZFS is dying trying to read all the metadata and I need to add more RAM. I'd prefer to find a solution that fixes rather than bandaids the problem.

Running multiple rsyncs and urbackup jobs simultaneously taxes the system as expected, but it handles it with no issues. A scrub seems to be the only thing that will take it down.

dglidden added the Type: Defect (Incorrect behavior, e.g. crash, hang) label on Jun 10, 2022
behlendorf (Contributor) commented

I'd prefer to find a solution that fixes rather than bandaids the problem.

As would I, and we may in fact have just merged a fix for this. The patch in #13537 resolves a sequential scan memory accounting bug which would cause ZFS to underestimate the amount of memory in use. This could potentially lead to an OOM if memory was tight on the system when scrubbing.

Once the legacy scrub completes, would you mind testing out the small patch in the PR (commit 87b46d6)?
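
If it helps, here's a rough sketch of one way to pull that commit onto a 2.0.7 base and rebuild the debs (the cherry-pick may need minor adjustment on the 2.0 branch, and your build flow may differ):

$ git clone https://github.com/openzfs/zfs.git && cd zfs
$ git checkout zfs-2.0.7            # tag matching the installed release
$ git cherry-pick 87b46d6           # the accounting fix from #13537
$ sh autogen.sh && ./configure
$ make -j$(nproc) deb               # builds the utils and kmod packages
$ sudo dpkg -i ./*.deb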


dglidden commented Jun 12, 2022

Brian,
Thanks for the reply. I gave up on the slow scrub since it had been at "6 days left" for the last two days. I've applied that PR and made sure everything is up to date on the system. I believe I pulled in the right code, as dsl_scan.c was the only file modified and I checked that the change was present. I rebuilt the ZFS utils/kmods and have started a scrub. I will follow up with results when it either completes or dies.

$ zfs version
zfs-2.0.7-1_gd84d6b905
zfs-kmod-2.0.7-1_gd84d6b905

Edit: I have stopped the urbackupsrv process during the scrub, just to give it as many resources as possible to complete.


dglidden commented Jun 16, 2022

Good news: it got 38% through a scrub without OOMing.

Bad news: it took several days to get to 38%, as it's only scanning at ~18MB/s. I'm not sure why it is going so slowly; I have not tweaked any of the ZFS settings for legacy scan or anything else since rebuilding the ZFS utils/kmods. Also, after several days and 38%, a power outage forced a reboot, and the scrub resumed afterwards. I'm not sure whether either of these affects the overall testing.

Just to double-check:

/sys/module/zfs/parameters$ cat zfs_scan_legacy
0

dglidden commented

After rebooting and resuming the scrub, it's going at ~80MB/s, considerably faster than before. It's 73% done at the moment, which is well beyond the point it would have OOMed in the past. I think the patch can tentatively be considered working. I will reply once more when the scrub completes, assuming it doesn't fail some time in the next 16 hours.

scan: scrub in progress since Sun Jun 12 18:39:43 2022
12.3T scanned at 82.2M/s, 11.3T issued at 70.3M/s, 15.4T total
0B repaired, 73.84% done, 16:38:21 to go

The scrub had reached ~38% between Sunday and yesterday; the rest has been scrubbed in the ~22 hours since the reboot after the power failure.

$ free
              total        used        free      shared  buff/cache   available
Mem:       16255844    10666684      516564        2960     5072596     5246096
Swap:      16777212        1024    16776188

$ w
15:39:48 up 22:44, 1 user, load average: 5.14, 5.04, 4.96

Only up 22h since the reboot; load is about what I'd expect while doing a scrub. Latency is fine, although disk I/O is somewhat degraded, as expected.

$ zfs version
zfs-2.0.7-1_gd84d6b905
zfs-kmod-2.0.7-1_gd84d6b905

Making sure I'm still running the right patch.

dglidden commented

Success, it completed with no OOM!

behlendorf (Contributor) commented

That's great news. The fix is already in the master branch and will be included in the planned 2.1.5 release, so I think we can close this out.


stale bot commented Jun 23, 2023

This issue has been automatically marked as "stale" because it has not had any activity for a while. It will be closed in 90 days if no further activity occurs. Thank you for your contributions.

The stale bot added the Status: Stale (No recent activity for issue) label on Jun 23, 2023
dglidden commented

Confirming close. No OOMs since the original problem was fixed.
