
#1991 - ObjectPool approach #3702

Closed
wants to merge 1 commit

Conversation

redknightlois

Allow DeflateStream and OutputWindow to pool internal resources that cause allocation stress when DeflateStream is used for high-throughput, small data packages. Those commonly happen in web servers which accept gzipped incoming connections and must respond accordingly. DeflateStream will allocate for reuse up to 256Kb of memory (or up to 32 concurrent streams without allocations), and this will apply to both the ZLib and Managed versions. On the managed implementation (InflaterManaged), there will be a reservation of up to 512Kb of memory (or up to 16 concurrent compression streams without allocations). The rationale is that in the web server scenario messages are paired, one incoming and one outgoing, therefore there are half as many InflaterManaged instances as total DeflateStreams.

Reservations happen lazily; therefore, in low-throughput scenarios the probability of allocating the whole 768Kb is pretty low.

Fix: #1991
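
For context, the slot-based pooling pattern borrowed here (see the ObjectPool<T> file referenced in the review discussion below) looks roughly like the following sketch; the type and member names are illustrative, not the PR's actual code. A small fixed array of slots is filled lazily, so an idle process reserves nothing and a busy one reuses instances instead of allocating:

using System;
using System.Threading;

internal sealed class SimpleObjectPool<T> where T : class
{
    private readonly T[] _items;        // lazily populated slots
    private readonly Func<T> _factory;  // creates a new instance on a pool miss

    internal SimpleObjectPool(Func<T> factory, int size)
    {
        _factory = factory;
        _items = new T[size];
    }

    internal T Allocate()
    {
        T[] items = _items;
        for (int i = 0; i < items.Length; i++)
        {
            // Try to take ownership of a pooled instance; if the slot is empty
            // or another thread wins the race, keep scanning.
            T item = items[i];
            if (item != null && Interlocked.CompareExchange(ref items[i], null, item) == item)
                return item;
        }
        return _factory();  // pool exhausted: fall back to a fresh allocation
    }

    internal void Free(T item)
    {
        T[] items = _items;
        for (int i = 0; i < items.Length; i++)
        {
            // Return the instance to the first free slot; if every slot is
            // occupied the instance is simply dropped for the GC to collect.
            if (items[i] == null)
            {
                items[i] = item;
                return;
            }
        }
    }
}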

@dnfclas

dnfclas commented Oct 7, 2015

Hi @redknightlois, I'm your friendly neighborhood .NET Foundation Pull Request Bot (You can call me DNFBOT). Thanks for your contribution!

In order for us to evaluate and accept your PR, we ask that you sign a contribution license agreement. It's all electronic and will take just minutes. I promise there's no faxing. https://cla2.dotnetfoundation.org.

TTYL, DNFBOT;

/// </remarks>
internal T Allocate()
{
var items = _items;
Contributor

❗ coding style violation: rule # 10

Author

@Maxwe11 I actually took this verbatim from: https://github.com/dotnet/corefx/blob/master/src/System.Reflection.Metadata/src/System/Reflection/Internal/Utilities/ObjectPool%601.cs

I was signing the CLA and then opening a new ticket to make ObjectPool available throughout all of CoreFX to avoid redundant code-copying.

Contributor

In that case I'd rather apply git mv on this file and move it to src/Common folder for reuse 😄

I was signing the CLA and then opening a new ticket to make...

Apparently the first part didn't go through. 😄

Author

No, it didn't. I sent an email to the foundation requesting support. Not even using the Access documents option worked.

Author

@stephentoub I didn't get any response from either the ticket I opened with DocuSign or the email I sent to the .NET Foundation address specified in that email. Do you know who I could contact to be able to sign the CLA? It is not working; I tried in Chrome and Edge, no luck.

Member

@redknightlois, I spoke with @martinwoodward at the foundation, and he's concerned because he didn't see any email from you nor any issues with signing. He asked that you resend your email and include him directly: martin@dotnetfoundation.org. Sorry for the hassle, and thanks for persisting.

Author

Done. Forwarded the original email to Martin.

@VSadov
Member

VSadov commented Oct 7, 2015

CC @stephentoub

@stephentoub
Member

cc: @KrzysztofCwalina, @vancem

I applaud the goal, but I have concerns about adding such an object pool under DeflateStream. We could end up significantly increasing the amount of space taken up by the process; in some apps that maintain a steady state of use, that's probably fine, but in other apps that use a DeflateStream only here and there, or worse use a bunch in a quick burst and then never again, this could effectively be a big leak. We'd probably need to invest in much better object pooling before being comfortable having such a pool hidden beneath such a common stream.

Personally, I think I'd prefer to see a different approach taken:

  • Introduce some sort of buffer pool interface to the framework (e.g. IBufferPool), with some good default implementation that's also exposed (e.g. BufferPool)
  • Add ctors to streams like DeflateStream and FileStream, to methods like Stream.CopyToAsync, etc. that take an IBufferPool instead of or in addition to a buffer size
  • Leave it up to the consumer of the class/API whether to pass in a pool and, if it does, which instance of which kind to use

We'd want to design such a solution holistically across the set of types that would use it rather than one-off for an individual stream. @KrzysztofCwalina has already started experimenting with buffer pools in corefxlab at https://github.com/dotnet/corefxlab/tree/master/src/System.Buffers.
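
A rough sketch of the shape described above follows; IBufferPool and the constructor overloads shown in the comments are hypothetical illustrations of the idea, not an existing corefx API:

public interface IBufferPool
{
    // Hands out a buffer of at least the requested length; the caller
    // returns it to the pool when finished with it.
    byte[] Rent(int minimumLength);
    void Return(byte[] buffer);
}

// Consumption would then be opt-in at the call site, for example (hypothetical overloads):
//   var deflate = new DeflateStream(output, CompressionMode.Compress, pool);
//   await source.CopyToAsync(destination, pool);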

@redknightlois
Author

@stephentoub While I agree that a more holistic solution is needed, it would not improve the lives of those who already have code out there (not a pit of success, as @friedroman commented). This solution (and we should think about how big a footprint makes sense) is a stopgap. That's essentially the reason behind limiting the memory footprint to under 1Mb.

We don't know where CoreCLR is going to be used, but it seems a small price to pay to cover everything from occasional usage up to mid-range and heavy-range scenarios, with an immediate improvement for all users of the class without code changes. In the case of single usage, and even many uses in short intervals (non-multithreaded apps), there is no reason the pool would grow beyond a few instances (the number of simultaneously used DeflateStream instances), so the effective footprint for all quick-burst and fire-and-forget cases should be low enough (except in a few degenerate cases), while the improvement for heavy usage and existing code is still retained.

@stephentoub
Member

stopgap

I understand. But the problem with a "stopgap" like this is that we very likely won't be able to remove it, once folks rely on its being there.

@redknightlois
Author

I meant "stopgap" as in: it works as it is; if you are not in the 90% (I am making this number up, but we should aim for at least that coverage), then you should use a different solution. Not as in something transitory.

Currently, if you rely on GZipStream to do anything outside of unzipping a few files, the GC pressure is not negligible. Every time we create a GZipStream we are allocating up to 40Kb just to do nothing (and 8Kb if you are using the zlib version). Then there are internal allocations in the managed provider in the range of 32Kb, plus other allocations in the process of compressing and decompressing itself (which I didn't consider in this PR).

This type of usage is pretty common: https://gist.github.com/chrisnicola/1147568
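
The linked gist is along the lines of the following sketch (method names are illustrative): every call news up a GZipStream and MemoryStream, so a busy server pays the full internal-buffer allocation on every message:

using System.IO;
using System.IO.Compression;

public static class GZipHelper
{
    public static byte[] Compress(byte[] data)
    {
        using (var output = new MemoryStream())
        {
            using (var gzip = new GZipStream(output, CompressionMode.Compress))
            {
                gzip.Write(data, 0, data.Length);
            }
            return output.ToArray();   // GZipStream must be closed before reading the result
        }
    }

    public static byte[] Decompress(byte[] data)
    {
        using (var input = new MemoryStream(data))
        using (var gzip = new GZipStream(input, CompressionMode.Decompress))
        using (var output = new MemoryStream())
        {
            gzip.CopyTo(output);
            return output.ToArray();
        }
    }
}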

@stephentoub
Member

Every time we create a GZipStream we are allocating up to 40kb just to do nothing (and 8K if you are using the zlib version)

The zlib version should be what's basically used everywhere. In what situation with corefx are you not using the zlib implementation? As I've stated multiple times, I fundamentally agree there's a problem that needs to be solved here, e.g. an 8KB buffer being allocated for each stream. I just want to make sure we're not overstating the problem.

I meant "stopgap" as in

This is a very slippery slope. If every stream type maintained its own buffer pool, that could get very expensive very quickly. What about FileStream? BufferedStream? What about methods that do a copy and allocate a buffer, e.g. Stream.CopyTo? Etc. They're all over the place, and if we tackled them each independently, I really fear we're going to create a mess.

@vancem, opinions?

@KrzysztofCwalina
Member

We need to get and use a shared buffer pool. We are getting very close to having cycles to look into implementing such a shared pool.

@redknightlois
Author

@stephentoub They are not nearly as bad; most do not even allocate at all. The only ones that allocate heavily are MemoryStream (and there is nothing that makes sense to do there) and BufferedStream. BufferedStream is nowhere near as bad: it uses 4K by default and also lets you choose a smaller buffer in cases where you know you are not going to gain from a bigger one (not as great, but it can help in some scenarios, like the one I discovered through profiling). That being said, the holistic solution would be the best of both worlds.

@stephentoub
Member

They are not nearly close, most do not even allocate at all

I don't understand this comment.
FileStream by default allocates a 4K buffer.
BufferedStream by default allocates a 4K buffer.
MemoryStream obviously allocates as much as is necessary to store all the data.
StreamReader/StreamWriter by default allocate a 1K buffer.
Stream.CopyTo by default allocates an 80K buffer.
Etc.

Even just looking at the 4K ones, I don't understand why 4K is so different from 8K that the 8K one would warrant a solution and the 4K one wouldn't. Sure, with a stream like FileStream you can lower the size of the buffer, but in doing so you sacrifice other performance benefits.

@redknightlois
Author

For most high-performance scenarios, 4K is bad too, but it is half as bad as 8K (sorry, I couldn't resist the attempt at humor :D).

Stream.CopyTo, as you noted, is indeed very bad. I didn't look up the code for that one, but I would never have thought about it, as for ages I have been using custom implementations for it. As you noted, the stock one is quite bad if you are calling it too many times. https://github.com/dotnet/coreclr/issues/1494#issuecomment-138495504
Our implementation uses an internal memory pool to keep allocations in check :).

Even in the case of FileStream (as with BufferedStream), the constructor allows you to control the buffer size and therefore the allocations. https://github.com/dotnet/coreclr/blob/master/src/mscorlib/src/System/IO/FileStream.cs#L493

Most of the time it is a game of trying to figure out how big a buffer makes sense, trying to get the most out of the allocation vs. read-performance trade-off (time consuming, brittle, and requiring endless tweaking).
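
To illustrate the point above, both constructors accept an explicit bufferSize today, so allocations can already be tuned per call site; the sizes below are illustrative only, not recommendations:

using System.IO;

static class BufferSizeExample
{
    static Stream OpenForSequentialRead(string path)
    {
        // A 1-byte buffer effectively disables FileStream's internal buffering...
        var raw = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.Read, bufferSize: 1);
        // ...and a single, larger BufferedStream buffer is layered on top instead.
        return new BufferedStream(raw, bufferSize: 64 * 1024);
    }
}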

@KrzysztofCwalina
Member

@sokket, could you look into this and see if the buffer pool you are working on could be used here?

@martinwoodward
Member

@stephentoub - we have a signed CLA back from @redknightlois now (thanks for bearing with us; still not sure what was going wrong there, but we did it manually to unblock this one). Any new PRs created by him in the future should come up as cla-signed automatically. I manually changed this one to avoid it having to be closed and re-opened.

@stephentoub
Member

Thanks, @martinwoodward and @redknightlois!

@jonmill

jonmill commented Oct 19, 2015

@KrzysztofCwalina it looks like it could, since this is using a byte[] buffer pool; either of my implementations could work here. I'll sync up with you offline to figure out next steps.

@stephentoub
Member

@redknightlois, thanks for the example PR. I'm going to close this now. We still have https://github.com/dotnet/corefx/issues/1991 tracking improvements around this, and we have https://github.com/dotnet/corefx/issues/4547 tracking adding a shared buffer pool to corefx, which we'd want to use instead of a custom implementation unique to this assembly. Part of @sokket's work on https://github.com/dotnet/corefx/issues/4547 will be in validating the pool in use by assemblies like this and others in corefx.

@stephentoub closed this Dec 4, 2015
@KrzysztofCwalina
Member

@sokket, @Priya91, @redknightlois, who will do a PR to use the shared pool in compression?

@jonmill

jonmill commented Dec 7, 2015

@KrzysztofCwalina I opened issue #4875 to track this for RC2, assigned to myself

@kerryjiang

Hello guys, it's pretty good to have a buffer pool in the System.Buffers assembly.
But an ObjectPool is required in lots of situations, and I have also found many different internal ObjectPool implementations in CoreCLR and CoreFX.

So, do you have a plan to make one of the ObjectPools public and move it to a common place?
