Madvise jnr #109
Changes from all commits
708e489
edfb2ef
79be5de
8978fd5
5392a9b
8947719
98b53ec
@@ -1,2 +1 @@
rootProject.name = 'uppend'

@@ -0,0 +1,61 @@
package com.upserve.uppend.blobs;

import jnr.ffi.*;
import jnr.ffi.types.size_t;
import org.slf4j.Logger;
import com.kenai.jffi.MemoryIO;

import java.io.IOException;
import java.lang.invoke.MethodHandles;
import java.nio.*;

public class NativeIO {
    private static final Logger log = org.slf4j.LoggerFactory.getLogger(MethodHandles.lookup().lookupClass());

    private static final NativeC nativeC = LibraryLoader.create(NativeC.class).load("c");
    public static final int pageSize = nativeC.getpagesize(); // 4096 on most Linux

    public enum Advice {
        // These seem to be fairly stable https://github.com/torvalds/linux
        // TODO add to https://github.com/jnr/jnr-constants
        Normal(0), Random(1), Sequential(2), WillNeed(3), DontNeed(4);
        private final int value;
        Advice(int val) {
            this.value = val;
        }
    }

    public interface NativeC {
        int madvise(@size_t long address, @size_t long size, int advice);
        int getpagesize();
    }

    static long alignedAddress(long address) {
        return address & (- pageSize);
Review comment: The return value of this function will always be <= address. This means that for an unaligned buffer, [...]

    }

    static long alignedSize(long address, int capacity) {
        long end = address + capacity;
Review comment: Lucene had this as [...]. At this point I am not inclined to license this file under the Lucene Apache license, but I am open to suggestions.

Reply: I wouldn't worry about the license. IANAL, but it doesn't look like there's a single line copied verbatim. I'm skeptical about their approach in general. It makes a lot of assumptions about alignment that don't seem necessary and could break in the future if the underlying kernel implementation changes.

        end = (end + pageSize - 1) & (-pageSize);
Review comment: I don't understand why the size needs to be page aligned. The Linux man page for [...]

Reply: I am actually inclined to be aggressive and assert that all three pages should be advised. I think a good solution might be to enforce that the file headers, which are WILLNEED, and the file content, which may be RANDOM, are aligned at 4096. I think this is probably already true, but it would be easy to enforce.

Reply: -1 on the feature flag. This isn't something we want to tune. I would rather add a comment that the policy is to apply [...]

Reply: Just thought of this: what if A is WILLNEED and B is RANDOM, or vice versa?

        return end - alignedAddress(address);
    }

    public static void madvise(MappedByteBuffer buffer, Advice advice) throws IOException {

        final long address = MemoryIO.getInstance().getDirectBufferAddress(buffer);
        final int capacity = buffer.capacity();

        long alignedAddress = alignedAddress(address);
        long alignedSize = alignedSize(alignedAddress, capacity);

        log.debug(
                "Page size {}; Address: raw - {}, aligned - {}; Size: raw - {}, aligned - {}",
                pageSize, address, alignedAddress, capacity, alignedSize
        );
        int val = nativeC.madvise(alignedAddress, alignedSize, advice.value);

        if (val != 0) {
            throw new IOException(String.format("System call madvise failed with code: %d", val));
        }
    }
}
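
Note on the alignment arithmetic discussed in the review comments: the following is a minimal, standalone sketch of the rounding logic, not code from this PR. The class `PageAlign` and its method names are hypothetical, and the page size is assumed to be a power of two. It illustrates one point visible in the diff: `madvise` passes the already-aligned address into `alignedSize`, so for a buffer that does not start on a page boundary the computed range may end before the buffer does. Computing the end from the raw address avoids that.

```java
// Sketch only: PageAlign and its methods are hypothetical names, not part of the PR.
final class PageAlign {
    private PageAlign() {}

    /** Rounds address down to a multiple of pageSize (assumed to be a power of two). */
    static long alignDown(long address, long pageSize) {
        return address & -pageSize;
    }

    /** Rounds address up to a multiple of pageSize (assumed to be a power of two). */
    static long alignUp(long address, long pageSize) {
        return (address + pageSize - 1) & -pageSize;
    }

    /**
     * Length of the page-aligned range that covers [address, address + capacity).
     * The end is derived from the address passed in, so callers should pass the
     * raw (unaligned) buffer address here.
     */
    static long alignedLength(long address, long capacity, long pageSize) {
        long end = alignUp(address + capacity, pageSize);
        return end - alignDown(address, pageSize);
    }

    public static void main(String[] args) {
        long pageSize = 4096;
        long address = 4096 + 100; // buffer starts 100 bytes into the second page
        long capacity = 4096;      // so it spills 100 bytes into the third page

        // Raw address in: the range covers both touched pages.
        System.out.println(alignedLength(address, capacity, pageSize)); // 8192

        // Aligned address in (as in the diff): the last page is left out.
        long aligned = alignDown(address, pageSize);
        System.out.println(alignedLength(aligned, capacity, pageSize)); // 4096
    }
}
```

Whether the tail page matters in practice depends on how the buffers are laid out; if every mapping starts on a page boundary, as suggested in the replies, the two results coincide.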
Review comment: Why are you no longer flushing pages inside this method called flush?

Reply: Because flush guarantees durability (written to disk), but I now understand that is not required for multiple processes on the same machine to share state through the page cache.
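
To illustrate the point in the reply above, here is a small sketch using only the JDK. It is not code from this PR: `SharedMappingDemo` is a hypothetical name, and two mappings inside one JVM stand in for two processes. A value written through one shared mapping is read back through a second mapping of the same file without ever calling `MappedByteBuffer.force()`, since both mappings are backed by the same page cache pages. It does not show anything about durability, which is what `force()` is for.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Sketch only: SharedMappingDemo is a hypothetical name, not part of the PR.
final class SharedMappingDemo {
    private SharedMappingDemo() {}

    /** Writes through one mapping and reads through another, never calling force(). */
    static long writeThenReadWithoutForce(long value) {
        try {
            Path file = Files.createTempFile("shared-mapping", ".bin");
            file.toFile().deleteOnExit();
            try (FileChannel writerCh = FileChannel.open(
                         file, StandardOpenOption.READ, StandardOpenOption.WRITE);
                 FileChannel readerCh = FileChannel.open(file, StandardOpenOption.READ)) {
                // Mapping READ_WRITE past the end of the file grows it to Long.BYTES.
                MappedByteBuffer writer =
                        writerCh.map(FileChannel.MapMode.READ_WRITE, 0, Long.BYTES);
                MappedByteBuffer reader =
                        readerCh.map(FileChannel.MapMode.READ_ONLY, 0, Long.BYTES);

                writer.putLong(0, value); // no force(): the data may not be on disk yet
                return reader.getLong(0); // still visible through the second mapping
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(writeThenReadWithoutForce(42L));
    }
}
```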