This repository has been archived by the owner on Feb 26, 2020. It is now read-only.

Changes needed by newer Illumos code #300

Closed
ryao wants to merge 7 commits


@ryao (Contributor) commented Oct 9, 2013

These are all of the commits needed to support newer Illumos code. It also supersedes #286. I am opening this pull request for comments.

nedbass and others added 7 commits September 4, 2013 15:09
These kstat interfaces are required to port
"Illumos #3537 want pool io kstats" to ZFS on Linux.

kstat_waitq_enter()
kstat_waitq_exit()
kstat_runq_enter()
kstat_runq_exit()

Additionally, zero out the ks_data buffer in __kstat_create() so
that the kstat_io_t counters are initialized to zero.
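
The queue accounting that these interfaces perform can be sketched in user space as follows. This is a simplified model, not the SPL code: the `toy_` names are hypothetical, only the wait-queue half is shown, and the timestamp is passed in explicitly instead of being read from a clock. It also shows why the counters need to start at zero.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical user-space model of the wait-queue half of kstat_io_t. */
typedef struct toy_kstat_io {
    uint64_t wtime;        /* cumulative time the wait queue was non-empty */
    uint64_t wlentime;     /* cumulative (queue length * time) product */
    uint64_t wlastupdate;  /* timestamp of the last queue state change */
    uint32_t wcnt;         /* requests currently waiting */
} toy_kstat_io_t;

/* Fold the interval since the last update into the running totals. */
void toy_waitq_account(toy_kstat_io_t *kio, uint64_t now)
{
    uint64_t delta = now - kio->wlastupdate;

    kio->wlastupdate = now;
    if (kio->wcnt != 0) {
        kio->wlentime += delta * kio->wcnt;
        kio->wtime += delta;
    }
}

/* A request joins the wait queue at time `now`. */
void toy_waitq_enter(toy_kstat_io_t *kio, uint64_t now)
{
    toy_waitq_account(kio, now);
    kio->wcnt++;
}

/* A request leaves the wait queue at time `now`. */
void toy_waitq_exit(toy_kstat_io_t *kio, uint64_t now)
{
    assert(kio->wcnt > 0);  /* an exit must pair with an earlier enter */
    toy_waitq_account(kio, now);
    kio->wcnt--;
}
```

If the structure were not zeroed before first use, `wcnt` and `wlastupdate` would hold garbage and the first interval would be charged incorrectly.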
While porting Illumos #3537 I found that the ks_lock member of the
kstat_t structure differs between Illumos and SPL. It is a pointer to
a kmutex_t in Illumos, but the mutex itself in SPL.
Apparently the Illumos kstat API allows a consumer to override the lock
if required. With the SPL implementation this is not possible.

Things were alright until the first attempt to actually override
the lock. Porting of Illumos #3537 introduced such code for the
first time.

In order to provide the Solaris/Illumos like functionality we:
  1. convert ks_lock to "kmutex_t *ks_lock"
  2. create a new field "kmutex_t ks_private_lock"
  3. On kstat_create() ks_lock = &ks_private_lock

Thus, if the consumer doesn't care, our internal lock is still used.
If, however, the consumer does care, she has a chance to set ks_lock to
anything else before calling kstat_install().

The rest of the code will use ks_lock regardless of its origin.
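
The three steps above can be sketched in user space as follows. This is only an illustration under stated assumptions: `toy_mutex_t` is a hypothetical stand-in for kmutex_t that merely counts acquisitions, and the `toy_kstat_*` functions are invented for the example, not the SPL implementations.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for kmutex_t; it only counts acquisitions. */
typedef struct toy_mutex {
    unsigned acquisitions;
} toy_mutex_t;

typedef struct toy_kstat {
    toy_mutex_t  ks_private_lock;  /* default lock owned by the kstat */
    toy_mutex_t *ks_lock;          /* lock actually used; may be overridden */
    int          installed;
} toy_kstat_t;

/* Step 3: a freshly created kstat points ks_lock at its private lock. */
void toy_kstat_create(toy_kstat_t *ksp)
{
    ksp->ks_private_lock.acquisitions = 0;
    ksp->ks_lock = &ksp->ks_private_lock;
    ksp->installed = 0;
}

/* The consumer may reassign ks_lock between create and install. */
void toy_kstat_install(toy_kstat_t *ksp)
{
    assert(ksp->ks_lock != NULL);
    ksp->installed = 1;
}

/* Later code uses whatever ks_lock points at, regardless of origin. */
void toy_kstat_update(toy_kstat_t *ksp)
{
    ksp->ks_lock->acquisitions++;  /* models taking the lock */
    /* ... read or modify the statistics here ... */
}
```

A consumer that does not touch `ks_lock` ends up using `ks_private_lock`; one that assigns its own lock before install has every later update go through that lock instead.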
Needed for Illumos #3852. This interface is supposed to support a
variable-resolution timeout with nanosecond granularity.  This
implementation just rounds up to jiffy resolution, as this was the most
expedient solution, and nanosecond resolution is rarely needed for
real-world performance tuning.  Add flags from sys/callo.h as these are
used to control the behavior of cv_timedwait_hires().  Specifically,

CALLOUT_FLAG_ABSOLUTE
    Normally, the expiration passed to the timeout API functions is
    an expiration interval. If this flag is specified, then it is
    interpreted as the expiration time itself.

CALLOUT_FLAG_ROUNDUP
    Round up the expiration time to the next resolution boundary. If this
    flag is not specified, the expiration time is rounded down.

References:
    https://www.illumos.org/issues/3582
    illumos/illumos-gate@0689f76
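
The flag semantics described above, together with a round-up conversion to jiffies, can be sketched as follows. This is a user-space illustration, not the code in this commit: the `toy_` names and the flag values are assumptions made for the example, and the jiffy conversion assumes that the tick rate divides one second evenly (for example 100, 250 or 1000).

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative flag values; the names mirror the CALLOUT_FLAG_* flags. */
#define TOY_FLAG_ROUNDUP   0x1
#define TOY_FLAG_ABSOLUTE  0x2

/*
 * Return the absolute expiration time in nanoseconds, aligned to a
 * boundary of `res` nanoseconds.  `tim` is an interval relative to
 * `now` unless TOY_FLAG_ABSOLUTE is set.
 */
int64_t toy_expiration_ns(int64_t now, int64_t tim, int64_t res, int flag)
{
    int64_t expire, rem;

    assert(res > 0 && now >= 0 && tim >= 0);
    expire = (flag & TOY_FLAG_ABSOLUTE) ? tim : now + tim;
    rem = expire % res;
    if (rem != 0) {
        expire -= rem;                 /* default: round down */
        if (flag & TOY_FLAG_ROUNDUP)
            expire += res;             /* move to the next boundary */
    }
    return expire;
}

/* Convert a relative interval in nanoseconds to whole ticks, rounding up. */
int64_t toy_ns_to_jiffies_roundup(int64_t ns, int64_t hz)
{
    const int64_t nsec_per_sec = 1000000000LL;
    int64_t ns_per_jiffy;

    assert(ns >= 0 && hz > 0 && nsec_per_sec % hz == 0);
    ns_per_jiffy = nsec_per_sec / hz;
    return (ns + ns_per_jiffy - 1) / ns_per_jiffy;
}
```

Rounding up means a waiter never wakes before the requested time, at the cost of sleeping up to one tick longer than asked.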
This is needed for the Illumos #4045 write throttle patch.  It is used
in the arc eviction code to avoid blocking all arc activity by sitting on
arcs_mtx too long.
The OpenSolaris man page states that KM_NOSLEEP will "Return NULL
immediately if memory is not available". However, we currently do not
honor this behavior. We map KM_NOSLEEP to a Linux equivalent that
contains __GFP_NORETRY, which is meant to prevent retrying, yet our
allocation paths keep looping. We correct this by checking for
__GFP_NORETRY in spl_kmem_cache_alloc() and spl_cache_refill(). If it is
set, we exit the busy loop and return with whatever we have done.

Signed-off-by: Richard Yao <ryao@gentoo.org>
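
The control flow described in the commit message above can be sketched in user space as follows. This is a toy model, not the SPL allocator: `TOY_NORETRY` is a stand-in for the __GFP_NORETRY bit, and the backing allocator with its failure counter is invented so the retry behavior can be observed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_SLEEP    0x0  /* keep retrying until memory is available */
#define TOY_NORETRY  0x1  /* stand-in for the __GFP_NORETRY bit */

/* Hypothetical backing allocator that fails a configurable number of times. */
int toy_failures_left;
int toy_attempts;

void *toy_try_alloc(size_t size)
{
    toy_attempts++;
    if (toy_failures_left > 0) {
        toy_failures_left--;
        return NULL;
    }
    return malloc(size);
}

/* Retry loop that gives up immediately when the caller asked not to retry. */
void *toy_cache_alloc(size_t size, int flags)
{
    for (;;) {
        void *obj = toy_try_alloc(size);

        if (obj != NULL)
            return obj;
        if (flags & TOY_NORETRY)
            return NULL;  /* leave the loop instead of spinning */
        /* A sleeping caller would wait for reclaim here, then retry. */
    }
}
```

With `TOY_NORETRY` the caller sees NULL after a single failed attempt, which matches the man page wording quoted above; without it the loop keeps going until an attempt succeeds.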
The SPL's SLAB code was written with an optimization to preconstruct
objects before they are used. This differs semantically from the
Solaris behavior where objects are constructed and deconstructed on
demand. A consequence of preallocation is that buffers that should be
zeroed by constructors will contain stale data whenever the buffer is
reused. I discovered this when experimenting with the conversion of
important ARC memory allocations to SLAB allocation.

We switch to on-demand construction and deconstruction to correct this.

Signed-off-by: Richard Yao <ryao@gentoo.org>
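
On-demand construction, as described in the commit message above, can be sketched in user space as follows. This is a toy object cache with a single-slot free list, not the SPL SLAB: every `toy_` name is hypothetical, and the zeroing constructor is just one example of a constructor whose effect would be lost under preconstruction.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Example object whose constructor is expected to hand out zeroed memory. */
typedef struct toy_buf {
    int  refcount;
    char data[32];
} toy_buf_t;

void toy_buf_ctor(void *obj)
{
    memset(obj, 0, sizeof (toy_buf_t));
}

/* Hypothetical object cache with a single-slot free list. */
typedef struct toy_cache {
    size_t size;
    void (*ctor)(void *obj);
    void (*dtor)(void *obj);
    void  *free_obj;
} toy_cache_t;

void *toy_slab_alloc(toy_cache_t *c)
{
    void *obj = c->free_obj;

    if (obj != NULL)
        c->free_obj = NULL;      /* reuse the cached buffer */
    else
        obj = malloc(c->size);
    if (obj != NULL && c->ctor != NULL)
        c->ctor(obj);            /* construct on every allocation */
    return obj;
}

void toy_slab_free(toy_cache_t *c, void *obj)
{
    if (c->dtor != NULL)
        c->dtor(obj);            /* destruct on every free */
    if (c->free_obj == NULL)
        c->free_obj = obj;       /* keep one buffer for reuse */
    else
        free(obj);
}
```

Because the constructor runs at allocation time, a reused buffer comes back zeroed even though the previous owner wrote to it. Had the constructor run only once, when the buffer was first created, the second allocation would see the stale contents.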
Signed-off-by: Richard Yao <ryao@gentoo.org>
@behlendorf (Contributor)

Most of these changes exist in other pull requests which I hope to get merged fairly soon.

@behlendorf (Contributor)

@ryao Several of these changes have been merged. Can you refresh this branch against master when you get a chance?

@behlendorf (Contributor)

The only changes remaining unmerged in this pull request relate to memory management. It would be best to open a new pull request with those patches reworked to include the feedback above.

@behlendorf behlendorf closed this Nov 4, 2013