
ARC Data cache been evict when hitting Metadata limit #5537

Closed
AndCycle opened this issue Dec 29, 2016 · 12 comments
Labels: Component: Memory Management · Status: Feedback requested · Status: Inactive · Status: Stale · Type: Performance

Comments

@AndCycle

AndCycle commented Dec 29, 2016

This is a recent observation on my system; I noticed the behavior while working on my Munin setup to visualize ARC usage.

Gentoo
kernel ver 4.4.39
spl-zfs-0.6.5.8
zfs-0.6.5.8

with patch #4850

As you can see, the data size drops to almost zero when metadata usage hits the limit.

[Graph: ARC size breakdown (day)]

[Graph: ARC size (day)]

[Graph: physical memory (day)]

@perfinion
Contributor

Related to #5418 ?

@AndCycle
Author

AndCycle commented Dec 29, 2016

@perfinion maybe; it's hard to tell what happened, because there is currently no tool to monitor arcstats continuously.

Here is my current work in progress for Munin to monitor it:

[Screenshot: Munin zfs_stats plugins]

@dweeezil
Contributor

@AndCycle I have a hunch this is due to the balanced-mode adjuster. Try setting zfs_arc_meta_strategy=0.
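For readers landing here later, a minimal sketch of applying that suggestion (paths assume ZFS on Linux 0.6.x, where `zfs_arc_meta_strategy` selects between the metadata-only adjuster at 0 and the balanced adjuster at 1; verify the parameter exists on your build first):

```shell
# Runtime change: takes effect immediately, lost on module reload/reboot.
PARAM=/sys/module/zfs/parameters/zfs_arc_meta_strategy
[ -w "$PARAM" ] && echo 0 > "$PARAM"

# Persistent change: module option picked up at the next zfs module load.
echo 'options zfs zfs_arc_meta_strategy=0' >> /etc/modprobe.d/zfs.conf
```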

@richardelling
Contributor

FYI, both the telegraf and collectd open-source aggregators have agents that collect ARC stats. In the commercial world, collectors have been available for a very long time; I'd recommend Circonus.
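As a zero-dependency alternative to a full aggregator, here is a small sketch that pulls a few fields straight out of the kstat table. The function name is made up for illustration; on a live ZFS on Linux system the file to pass is /proc/spl/kstat/zfs/arcstats, and the snippet can be wrapped in `watch` or a cron job to log over time:

```shell
# arcstats is a three-column "name type data" table; extract the fields
# relevant to this issue (total ARC size, metadata used, metadata limit).
arc_summary() {
    awk '$1 == "size" || $1 == "arc_meta_used" || $1 == "arc_meta_limit" {
        printf "%s=%d\n", $1, $3
    }' "$1"
}

# On a real system:
#   arc_summary /proc/spl/kstat/zfs/arcstats
```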

@AndCycle
Author

@kpande I think he's got me. I haven't tried that many monitoring tools, since most of them lack a built-in visualization tool, and as this is a personal server I only looked at free solutions.

Munin is one that is easy enough to use, although its base is pretty buggy, and many contributed plugins do the calculations incorrectly, which forced me to write my own.

@AndCycle
Author

AndCycle commented Dec 29, 2016

@dweeezil you got it.

[Graph: ARC size (day)]

[Graph: ARC size breakdown (day)]

[Graph: cache efficiency % (day)]

@kernelOfTruth
Contributor

referencing #5128 (comment) (Poor cache performance) and #5418 (ARC efficiency regression) again

@kernelOfTruth
Contributor

kernelOfTruth commented Dec 29, 2016

@dweeezil could also be memcg

Here are some notes I collected while investigating the matter, which landed in my /etc/modprobe.d/zfs.conf some time ago:

# Your system is having trouble keeping the metadata under the limit, and it's not showing much evictable memory.
# Try setting the tunable zfs_arc_meta_strategy to zero and see if the traditional metadata-only adjuster doesn't work better.
#
# The problem appears to be the continuing evolution of memory cgroups (memcg). 
# If you boot with cgroup_disable=memory the reclaiming should start working again. I've not worked up a patch yet.
#
# options zfs zfs_arc_meta_strategy=0

I've seen people over at Ubuntu running into that kind of issue, and appending

cgroup_disable=memory

to the boot command line seemed to help.
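As a sketch of how to test that theory (GRUB paths and the regeneration command vary by distro; treat this as an example, not a recipe):

```shell
# 1. Add the parameter to the kernel command line in /etc/default/grub:
#      GRUB_CMDLINE_LINUX_DEFAULT="... cgroup_disable=memory"
# 2. Regenerate the GRUB config (Debian/Ubuntu use `update-grub` instead):
grub-mkconfig -o /boot/grub/grub.cfg

# 3. After rebooting, confirm the memory controller is disabled:
grep memory /proc/cgroups   # the "enabled" column should read 0
```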

@AndCycle
Author

For reference: I do have the Memory Resource Controller for Control Groups and lots of cgroup-related options enabled in my kernel.

@dweeezil
Contributor

This was added to help deal with the memcg issue. There didn't seem to be any way to coax the normal SB shrinker into doing the Right Thing.

@kernelOfTruth
Contributor

kernelOfTruth commented Dec 29, 2016

referencing #3303 (comment): arc_adapt left spinning after rsync with lots of small files

It might also be worth taking a look at

/sys/module/zfs/parameters/zfs_arc_meta_adjust_restarts

and

/sys/module/zfs/parameters/zfs_arc_meta_prune

In that issue NUMA is also mentioned, which should already be addressed.
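A quick way to inspect the two tunables mentioned above; the helper name is made up for illustration, and the directory argument would be /sys/module/zfs/parameters on a live system:

```shell
# Print the current values of the ARC metadata eviction-retry tunables,
# skipping any that this particular module build does not expose.
show_meta_tunables() {
    dir="$1"
    for p in zfs_arc_meta_adjust_restarts zfs_arc_meta_prune; do
        [ -r "$dir/$p" ] && printf '%s=%s\n' "$p" "$(cat "$dir/$p")"
    done
}

# On a real system:
#   show_meta_tunables /sys/module/zfs/parameters
```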

@behlendorf behlendorf added the Type: Performance Performance improvement or performance problem label Jan 1, 2017
@stale

stale bot commented Aug 24, 2020

This issue has been automatically marked as "stale" because it has not had any activity for a while. It will be closed in 90 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the Status: Stale No recent activity for issue label Aug 24, 2020