Memory<T> and large memory mapped files #24805
Comments
We will be soon adding ReadOnlyBuffer. See https://github.com/dotnet/corefxlab/blob/master/src/System.Buffers.Primitives/System/Buffers/ReadOnlyBuffer.cs We would be interested in your feedback on this type. Would it support your scenarios? |
I took some time to look into this new type and I think I could make it work (I haven't had the time to try it yet, though). In my current case, the file is approximately 3.5 GB, so I could create 4 As soon as I have the time, I'll try creating a small benchmark for this use case and compare possible implementations. |
@pakrym, @davidfowl I think we could solve the O(log N) seek problem if |
N is the number of segments. So I don't see how this has a big impact if buffers are large. As I said before I don't like two sources of Positions (ROB and IML) |
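For readers following the O(log N) point, here is a minimal sketch of what such a seek could look like: a binary search over the starting offset of each segment. The `SegmentLookup` class and the `startOffsets` array are hypothetical names made up for this illustration; this is not the corefxlab implementation.

```csharp
using System;

static class SegmentLookup
{
    // startOffsets[i] is the absolute position of the first byte of segment i.
    // The array must be sorted ascending and start at 0 (assumption of this sketch).
    public static int FindSegment(long[] startOffsets, long position)
    {
        int i = Array.BinarySearch(startOffsets, position);
        // A negative result is the bitwise complement of the index of the next
        // larger element, so the containing segment is the one just before it.
        return i >= 0 ? i : ~i - 1;
    }
}
```

With N segments this costs O(log N) comparisons per seek, which is why the cost stays small when segments are large and therefore few.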
If IMemoryList extends ISequence, there would not be two sources of position. There would only be APIs on ISequence (Start, TryGet, Seek) |
What about ReadOnlyBuffer? It edits Index to put bits into it, how would it know that |
I created a benchmark comparing approaches for accessing a large memory block: I tried to get it as close as possible to my real use-case:
Assuming I didn't make any mistakes in the benchmark code, the numbers tell me that using ReadOnlyBuffer would be ~1.95 times slower than implementing a custom slice type:
BenchmarkDotNet=v0.10.12, OS=Windows 10 Redstone 3 [1709, Fall Creators Update] (10.0.16299.192)
Intel Core i7-4578U CPU 3.00GHz (Haswell), 1 CPU, 4 logical cores and 2 physical cores
Frequency=2929690 Hz, Resolution=341.3330 ns, Timer=TSC
.NET Core SDK=2.1.4
[Host] : .NET Core 2.0.5 (Framework 4.6.26020.03), 64bit RyuJIT
DefaultJob : .NET Core 2.0.5 (Framework 4.6.26020.03), 64bit RyuJIT
I'm not sure how much implementing |
FYI: We are adding IMemoryList.GetPosition(long). It will enable O(1) random access on some IMemoryList implementations (implementations with uniform size segments). cc: @pakrym |
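To illustrate why uniform segment sizes allow O(1) random access, here is a rough sketch of the arithmetic. `UniformSegments.Locate` is a hypothetical helper written for this example; it is not the `IMemoryList.GetPosition(long)` API itself, whose exact shape is not shown in this thread.

```csharp
using System;

static class UniformSegments
{
    // Maps an absolute position to (segment index, offset inside that segment),
    // assuming every segment holds exactly segmentSize bytes.
    public static (int Segment, int Offset) Locate(long position, int segmentSize)
    {
        long segment = Math.DivRem(position, segmentSize, out long offset);
        return ((int)segment, (int)offset);
    }
}
```

For example, with 1 GiB segments (`1 << 30`), position 3,500,000,000 falls in segment 3 at offset 278,774,528. One division replaces the search over segments.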
Using PR dotnet/corefx#27499
Improved to x1.43 off the local span. Code changes to benchmark to test hexawyz/MemoryLookupBenchmark#1 Bear in mind that Also *edit updated with tweaks |
Update to benchmarks: PR dotnet/corefx#27499 doesn't scale badly for 100-1000 segments, as shown below
|
You're right about that… I just tried adding bounds checking before the creation of
I may have made a mistake somewhere, or maybe it simply plays well with the JIT inlining, but I don't know what to conclude. Anyway, good job with the improvements. The new results are great 🙂 |
Latest in dotnet/corefx#27499 is much closer still
|
Nice! These results are so close that I doubt the differences will matter outside of microbenchmarks, i.e. once the program starts doing something interesting with the data in the buffers. |
I am going to close this. If there is data showing that ROS still cannot support real apps with multi-segmented buffers, we can think how to improve the perf further. @GoldenCrystal thanks for bringing this scenario to our attention. |
Copying conversation over from https://github.com/dotnet/coreclr/issues/5851#issuecomment-370276484
From @kstewart83:
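using System;
using System.IO.MemoryMappedFiles;

var dbPath = "test.txt";
var initialSize = 1024;
// Map the existing file and open a view over its first 1024 bytes.
var mmf = MemoryMappedFile.CreateFromFile(dbPath);
var mma = mmf.CreateViewAccessor(0, initialSize).SafeMemoryMappedViewHandle;
Span<byte> bytes;
unsafe
{
    // AcquirePointer takes a reference on the handle; nothing here calls ReleasePointer.
    byte* ptrMemMap = (byte*)0;
    mma.AcquirePointer(ref ptrMemMap);
    // Span<T> takes an int length, so this cast is only valid for views up to int.MaxValue bytes.
    bytes = new Span<byte>(ptrMemMap, (int)mma.ByteLength);
}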
var dbPath = "test.txt";
var initialSize = 1024;
var mmf = MemoryMappedFile.CreateFromFile(dbPath);
var mma = mmf.CreateViewAccessor(0, initialSize).SafeMemoryMappedViewHandle;
var mem = new Memory(mma);
var span = mem.Span.Slice(0, 512);
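As a point of comparison with the first snippet above, here is a sketch of the same pointer-based approach with the acquire/release pairing made explicit. `MappedSpanDemo.SumBytes` is a hypothetical helper invented for this example; it assumes the file exists and holds at least `length` bytes, and it needs to be compiled with unsafe code enabled. The `Memory` constructor taking a view handle in the second snippet is a proposal, not an existing API, so it is not used here.

```csharp
using System;
using System.IO;
using System.IO.MemoryMappedFiles;

static class MappedSpanDemo
{
    // Sums the first `length` bytes of the file through a Span<byte> over the mapped view.
    public static unsafe long SumBytes(string path, int length)
    {
        using var mmf = MemoryMappedFile.CreateFromFile(path, FileMode.Open);
        using var view = mmf.CreateViewAccessor(0, length);
        var handle = view.SafeMemoryMappedViewHandle;
        byte* ptr = null;
        try
        {
            handle.AcquirePointer(ref ptr);
            // PointerOffset accounts for views that do not start on a page boundary.
            var span = new Span<byte>(ptr + view.PointerOffset, length);
            long sum = 0;
            foreach (byte b in span) sum += b;
            return sum;
        }
        finally
        {
            // Release only if the pointer was actually acquired.
            if (ptr != null) handle.ReleasePointer();
        }
    }
}
```

The span is deliberately not returned: once the pointer is released and the view disposed, the memory it points to is no longer valid.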
From @kstewart83:
From @davidfowl:
|
It is not generally possible to slice large files into 1GB span segments. For example, a file could contain a large stream of small serialized items. Then, it's not possible to know where to cut the file. Slicing it could lead to torn items. So it's no longer possible to create a span and pass it to some API of the form It would be really good if span supported But I assume the |
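One way to cope with items torn across a segment boundary is sketched below, using `ReadOnlySequence<byte>` from System.Buffers: slice the item out of the sequence and copy it to a small contiguous buffer only when it actually straddles two segments. The `Chunk` and `TornItems` types are hypothetical names for this illustration, and a 4-byte little-endian integer stands in for a "small serialized item".

```csharp
using System;
using System.Buffers;
using System.Buffers.Binary;

// Hypothetical segment type: one node per chunk of the file.
sealed class Chunk : ReadOnlySequenceSegment<byte>
{
    public Chunk(ReadOnlyMemory<byte> memory) => Memory = memory;

    // Links a new chunk after this one and returns it.
    public Chunk Append(ReadOnlyMemory<byte> memory)
    {
        var next = new Chunk(memory) { RunningIndex = RunningIndex + Memory.Length };
        Next = next;
        return next;
    }
}

static class TornItems
{
    // Builds a two-segment sequence from two arrays.
    public static ReadOnlySequence<byte> Concat(byte[] first, byte[] second)
    {
        var head = new Chunk(first);
        var tail = head.Append(second);
        return new ReadOnlySequence<byte>(head, 0, tail, tail.Memory.Length);
    }

    // Reads a little-endian Int32 at any offset, even across a segment boundary.
    public static int ReadInt32(ReadOnlySequence<byte> data, long offset)
    {
        ReadOnlySequence<byte> item = data.Slice(offset, sizeof(int));
        if (item.IsSingleSegment)
            return BinaryPrimitives.ReadInt32LittleEndian(item.First.Span); // no copy needed

        Span<byte> scratch = stackalloc byte[sizeof(int)];
        item.CopyTo(scratch); // gather the torn item into contiguous memory
        return BinaryPrimitives.ReadInt32LittleEndian(scratch);
    }
}
```

This does not help APIs that only accept a single span over the whole data, which is the limitation raised above; it only shows that individual items can still be read without knowing where the cuts are.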
Wouldn't it be better for the API to be built to handle chunks and therefore work with streaming scenarios as well? |
If I understand correctly, the problem here would be more with The current version of It is true that in the case I presented, |
I suspect though that since A compelling use case I see with combining memory mapped files with |
Memory is not allocated on the heap (necessarily). It's a struct. |
@KrzysztofCwalina, there is no API proposal for MMF Memory/Span overloads, should this issue be converted to |
@kasper3, please open a separate issue for adding span support to MMF. This issue was about Memory's length property not being able to deal with large files. |
@kasper3 @KrzysztofCwalina is there a separate issue for MMF/Span? I was not able to find it and is not linked here. |
I am not aware. |
@attilah, related https://github.com/dotnet/corefx/issues/29562#issuecomment-388182098 and overarching idea https://github.com/dotnet/corefx/issues/30174. |
Sorry to be late to this, but it is not very clear to me from the above what is currently the recommended way to turn a |
@miloush, you can use a third-party library. ReadOnlySequenceAccessor is probably what you need. |
I'm currently experimenting with OwnedMemory<T> and Memory<T> in an existing project that I'm trying to improve, and I ran into an issue with OwnedMemory<T> and Memory<T> being limited to int.MaxValue.

Scenario
I have a relatively big (> 2GB) data file that I want to fully map in memory (i.e. a database). My API exposes methods that return subsets of this big memory mapped file, e.g.
Wrapping the MemoryMappedFile and associated MemoryMappedViewAccessor into an OwnedMemory<byte> seemed to be a good idea, since most of the tricky logic would then be handled by the framework.

Problem
The memory block that I want to wrap is bigger than 2GB and cannot currently be represented by a single Memory instance.
Since Memory can only work with T[], string, or OwnedMemory<T>, it seems that having to give up on the straightforward OwnedMemory<T> implementation also means that I have to give up on using Memory<T> at all.
(In this specific case, Span<T> being limited to 2GB would not be a problem, because the sliced memory blocks that my API would return would always be much smaller than that.)

Possible solutions with the currently proposed API
Memory<T> at all and implementing a much simplified version of OwnedMemory<T>/Memory<T> that would fit my use case
OwnedMemory<T> around and use the one that best fits the current case

Question
Would it be possible to improve the framework to make it easy to work with such large memory blocks? (Maybe implementing something like a BigMemory<T>?)
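The wrapping idea from the Scenario section can be sketched with `MemoryManager<T>`, the name under which `OwnedMemory<T>` later shipped in System.Buffers. This is a minimal illustration under stated assumptions, not the framework's own solution: `MappedViewMemoryManager` is a hypothetical class, it covers a single view of at most int.MaxValue bytes, and it requires compiling with unsafe code enabled. A file larger than 2GB would still need several such views.

```csharp
using System;
using System.Buffers;
using System.IO.MemoryMappedFiles;

// Hypothetical wrapper exposing one memory-mapped view as Memory<byte>.
sealed unsafe class MappedViewMemoryManager : MemoryManager<byte>
{
    private readonly MemoryMappedViewAccessor _view;
    private readonly byte* _pointer;
    private readonly int _length;
    private bool _disposed;

    public MappedViewMemoryManager(MemoryMappedFile file, long offset, int length)
    {
        _view = file.CreateViewAccessor(offset, length);
        byte* p = null;
        _view.SafeMemoryMappedViewHandle.AcquirePointer(ref p);
        // PointerOffset accounts for views that do not start on a page boundary.
        _pointer = p + _view.PointerOffset;
        _length = length;
    }

    public override Span<byte> GetSpan() => new Span<byte>(_pointer, _length);

    // Mapped memory does not move, so pinning only has to hand out the address.
    public override MemoryHandle Pin(int elementIndex = 0)
    {
        if ((uint)elementIndex > (uint)_length)
            throw new ArgumentOutOfRangeException(nameof(elementIndex));
        return new MemoryHandle(_pointer + elementIndex);
    }

    public override void Unpin() { }

    protected override void Dispose(bool disposing)
    {
        if (_disposed) return;
        _disposed = true;
        _view.SafeMemoryMappedViewHandle.ReleasePointer();
        _view.Dispose();
    }
}
```

The `Memory` property inherited from `MemoryManager<byte>` then yields a `Memory<byte>` over the view, which callers can slice as usual. Any `Memory<byte>` or span obtained this way must not be used after the manager is disposed.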