
#1991 - ObjectPool approach #3702

Closed
wants to merge 1 commit

Conversation

redknightlois

Allow DeflateStream and OutputWindow to pool internal resources that cause allocation stress when DeflateStream is used for high-throughput, small data packages. Those commonly happen in web servers which accept gzipped incoming connections and must respond accordingly. DeflateStream will allocate for reuse up to 256Kb of memory (or up to 32 concurrent streams without allocations), and this will apply to both the ZLib and Managed versions. On the managed implementation (InflaterManaged), there will be a reservation of up to 512Kb of memory (or up to 16 concurrent compression streams without allocations). The rationale is that in the web server scenario messages are paired, one incoming and one outgoing, therefore there are half as many InflaterManaged instances as total DeflateStreams.

Reservations happen lazily; therefore, in low-throughput scenarios the probability of allocating the whole 768Kb is pretty low.

Fix: #1991
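
For context, the slot-based pooling pattern borrowed here (see the ObjectPool<T> file referenced in the review discussion below) looks roughly like the following sketch; the type and member names are illustrative, not the PR's actual code. A small fixed array of slots is filled lazily, so an idle process reserves nothing and a busy one reuses instances instead of allocating:

using System;
using System.Threading;

internal sealed class SimpleObjectPool<T> where T : class
{
    private readonly T[] _items;        // lazily populated slots
    private readonly Func<T> _factory;  // creates a new instance on a pool miss

    internal SimpleObjectPool(Func<T> factory, int size)
    {
        _factory = factory;
        _items = new T[size];
    }

    internal T Allocate()
    {
        T[] items = _items;
        for (int i = 0; i < items.Length; i++)
        {
            // Try to take ownership of a pooled instance; if the slot is empty
            // or another thread wins the race, keep scanning.
            T item = items[i];
            if (item != null && Interlocked.CompareExchange(ref items[i], null, item) == item)
                return item;
        }
        return _factory();  // pool exhausted: fall back to a fresh allocation
    }

    internal void Free(T item)
    {
        T[] items = _items;
        for (int i = 0; i < items.Length; i++)
        {
            // Return the instance to the first free slot; if every slot is
            // occupied the instance is simply dropped for the GC to collect.
            if (items[i] == null)
            {
                items[i] = item;
                return;
            }
        }
    }
}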

@dnfclas

dnfclas commented Oct 7, 2015

Hi @redknightlois, I'm your friendly neighborhood .NET Foundation Pull Request Bot (You can call me DNFBOT). Thanks for your contribution!

In order for us to evaluate and accept your PR, we ask that you sign a contribution license agreement. It's all electronic and will take just minutes. I promise there's no faxing. https://cla2.dotnetfoundation.org.

TTYL, DNFBOT;

/// </remarks>
internal T Allocate()
{
var items = _items;
Contributor

❗ coding style violation: rule # 10

Author

@Maxwe11 I actually took this verbatim from: https://github.com/dotnet/corefx/blob/master/src/System.Reflection.Metadata/src/System/Reflection/Internal/Utilities/ObjectPool%601.cs

I was signing the CLA and then opening a new ticket to make ObjectPool available throughout all of CoreFX to avoid redundant code-copying.

Contributor

In that case I'd rather apply git mv on this file and move it to src/Common folder for reuse 😄

I was signing the CLA and then opening a new ticket to make...

Apparently the first part didn't go through. 😄

Author

No, it didn't. I sent an email to the foundation requesting support. Not even using the Access documents option worked.

Author

@stephentoub I didn't get any response from either the ticket I opened with DocuSign or the email I sent to the .NET Foundation address specified in that email. Do you know who I could contact to be able to sign the CLA? It is not working; I tried in Chrome and Edge, no luck.

Member

@redknightlois, I spoke with @martinwoodward at the foundation, and he's concerned because he didn't see any email from you nor any issues with signing. He asked that you resend your email and include him directly: martin@dotnetfoundation.org. Sorry for the hassle, and thanks for persisting.

Author

Done. Forwarded the original email to Martin.

@VSadov
Member

VSadov commented Oct 7, 2015

CC @stephentoub

@stephentoub
Member

cc: @KrzysztofCwalina, @vancem

I applaud the goal, but I have concerns about adding such an object pool under DeflateStream. We could end up significantly increasing the amount of space taken up by the process; in some apps that maintain a steady state of use, that's probably fine, but in other apps that use a DeflateStream only here and there, or worse use a bunch in a quick burst and then never again, this could effectively be a big leak. We'd probably need to invest in much better object pooling before being comfortable having such a pool hidden beneath such a common stream.

Personally, I think I'd prefer to see a different approach taken:

  • Introduce some sort of buffer pool interface to the framework (e.g. IBufferPool), with some good default implementation that's also exposed (e.g. BufferPool)
  • Add ctors to streams like DeflateStream and FileStream, to methods like Stream.CopyToAsync, etc. that take an IBufferPool instead of or in addition to a buffer size
  • Leave it up to the consumer of the class/API whether to pass in a pool and, if it does, which instance of which kind to use

We'd want to design such a solution holistically across the set of types that would use it rather than one-off for an individual stream. @KrzysztofCwalina has already started experimenting with buffer pools in corefxlab at https://github.com/dotnet/corefxlab/tree/master/src/System.Buffers.
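
A rough sketch of the shape described above follows; IBufferPool and the constructor overloads shown in the comments are hypothetical illustrations of the idea, not an existing corefx API:

public interface IBufferPool
{
    // Hands out a buffer of at least the requested length; the caller
    // returns it to the pool when finished with it.
    byte[] Rent(int minimumLength);
    void Return(byte[] buffer);
}

// Consumption would then be opt-in at the call site, for example (hypothetical overloads):
//   var deflate = new DeflateStream(output, CompressionMode.Compress, pool);
//   await source.CopyToAsync(destination, pool);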

@redknightlois
Author

@stephentoub While I agree that a more holistic solution is needed, it would not improve the lives of those who already have code out there (not a pit of success, as @friedroman commented). This solution (and we should think about how big a footprint makes sense) is a stopgap. That's essentially the reason behind limiting the memory footprint to under 1Mb.

We don't know where CoreCLR is going to be used, but it seems a small price to pay to cover everything from occasional usage up to mid-range and heavy-range scenarios, with an immediate improvement for all users of the class without code changes. In the case of single usage, and even many uses in short intervals (non-multithreaded apps), there is no reason the pool would grow beyond a few instances (the number of simultaneously used DeflateStream instances), so the effective footprint for all quick-burst and fire-and-forget cases should be low enough (except in a few degenerate cases), while the improvement for heavy usage and existing code is still retained.

@stephentoub
Member

stopgap

I understand. But the problem with a "stopgap" like this is that we very likely won't be able to remove it, once folks rely on its being there.

@redknightlois
Author

I meant "stopgap" as in: it works as it is; if you are not in the 90% (I am making this number up, but we should aim for at least that coverage), then you should use a different solution. Not as in something transitory.

Currently, if you rely on GZipStream to do anything outside of unzipping a few files, the GC pressure is not negligible. Every time we create a GZipStream we are allocating up to 40Kb just to do nothing (and 8Kb if you are using the zlib version). Then there are internal allocations in the managed provider in the range of 32Kb, plus other allocations in the process of compressing and decompressing itself (which I didn't consider in this PR).

This type of usage is pretty common: https://gist.github.com/chrisnicola/1147568
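
The linked gist is along the lines of the following sketch (method names are illustrative): every call news up a GZipStream and MemoryStream, so a busy server pays the full internal-buffer allocation on every message:

using System.IO;
using System.IO.Compression;

public static class GZipHelper
{
    public static byte[] Compress(byte[] data)
    {
        using (var output = new MemoryStream())
        {
            using (var gzip = new GZipStream(output, CompressionMode.Compress))
            {
                gzip.Write(data, 0, data.Length);
            }
            return output.ToArray();   // GZipStream must be closed before reading the result
        }
    }

    public static byte[] Decompress(byte[] data)
    {
        using (var input = new MemoryStream(data))
        using (var gzip = new GZipStream(input, CompressionMode.Decompress))
        using (var output = new MemoryStream())
        {
            gzip.CopyTo(output);
            return output.ToArray();
        }
    }
}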

@stephentoub
Member

Every time we create a GZipStream we are allocating up to 40kb just to do nothing (and 8K if you are using the zlib version)

The zlib version should be what's basically used everywhere. In what situation with corefx are you not using the zlib implementation? As I've stated multiple times, I fundamentally agree there's a problem that needs to be solved here, e.g. an 8KB buffer being allocated for each stream. I just want to make sure we're not overstating the problem.

I meant "stopgap" as in

This is a very slippery slope. If every stream type maintained its own buffer pool, that could get very expensive very quickly. What about FileStream? BufferedStream? What about methods that do a copy and allocate a buffer, e.g. Stream.CopyTo? Etc. They're all over the place, and if we tackled them each independently, I really fear we're going to create a mess.

@vancem, opinions?

@KrzysztofCwalina
Member

We need to get and use a shared buffer pool. We are getting very close to having cycles to look into implementing such a shared pool.

@redknightlois
Author

@stephentoub They are not nearly as bad; most do not even allocate at all. The only ones that allocate heavily are MemoryStream (and there is nothing that makes sense to do there) and BufferedStream. BufferedStream is nowhere near as bad: it uses 4K by default and also lets you choose a smaller buffer in cases where you know you are not going to gain from a bigger one (not as great, but it can help in some scenarios, like the one I discovered through profiling). That being said, the holistic solution would be the best of both worlds.

@stephentoub
Member

They are not nearly close, most do not even allocate at all

I don't understand this comment.
FileStream by default allocates a 4K buffer.
BufferedStream by default allocates a 4K buffer.
MemoryStream obviously allocates as much as is necessary to store all the data.
StreamReader/StreamWriter by default allocate a 1K buffer.
Stream.CopyTo by default allocates an 80K buffer.
Etc.

Even just looking at the 4K ones, I don't understand why 4K is so different from 8K that the 8K one would warrant a solution and the 4K one wouldn't. Sure, with a stream like FileStream you can lower the size of the buffer, but in doing so you sacrifice other performance benefits.

@redknightlois
Author

For most high-performance scenarios, 4K is bad too, but it is half as bad as 8K (sorry, I couldn't resist the attempt at humor :D).

Stream.CopyTo, as you noted, is indeed very bad. I didn't look up the code for that one, but I would never have thought about it, as for ages I have been using custom implementations for it. As you noted, the stock one is quite bad if you are calling it too many times. https://github.com/dotnet/coreclr/issues/1494#issuecomment-138495504
Our implementation uses an internal memory pool to keep allocations in check :).

Even in the case of FileStream (as with BufferedStream), the constructor allows you to control the buffer size and therefore the allocations. https://github.com/dotnet/coreclr/blob/master/src/mscorlib/src/System/IO/FileStream.cs#L493

Most of the time it is a game of trying to figure out how big a buffer makes sense, trying to get the most out of the allocation vs. read-performance trade-off (time consuming, brittle, and requiring endless tweaking).
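
To illustrate the point above, both constructors accept an explicit bufferSize today, so allocations can already be tuned per call site; the sizes below are illustrative only, not recommendations:

using System.IO;

static class BufferSizeExample
{
    static Stream OpenForSequentialRead(string path)
    {
        // A 1-byte buffer effectively disables FileStream's internal buffering...
        var raw = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.Read, bufferSize: 1);
        // ...and a single, larger BufferedStream buffer is layered on top instead.
        return new BufferedStream(raw, bufferSize: 64 * 1024);
    }
}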

@KrzysztofCwalina
Member

@sokket, could you look into this and see if the buffer pool you are working on could be used here?

@martinwoodward
Member

@stephentoub - we have a signed CLA back from @redknightlois now (thanks for bearing with us; still not sure what was going wrong there, but we did it manually to unblock this one). Any new PRs created by him in the future should come up as cla-signed automatically. I manually changed this one to avoid it having to be closed and re-opened.

@stephentoub
Member

Thanks, @martinwoodward and @redknightlois!

@jonmill

jonmill commented Oct 19, 2015

@KrzysztofCwalina it looks like it could, since this is using a byte[] buffer pool; either of my implementations could work here. I'll sync up with you offline to figure out next steps.

@stephentoub
Member

@redknightlois, thanks for the example PR. I'm going to close this now. We still have https://github.com/dotnet/corefx/issues/1991 tracking improvements around this, and we have https://github.com/dotnet/corefx/issues/4547 tracking adding a shared buffer pool to corefx, which we'd want to use instead of a custom implementation unique to this assembly. Part of @sokket's work on https://github.com/dotnet/corefx/issues/4547 will be in validating the pool in use by assemblies like this and others in corefx.

@stephentoub closed this Dec 4, 2015
@KrzysztofCwalina
Member

@sokket, @Priya91, @redknightlois, who will do a PR to use the shared pool in compression?

@jonmill

jonmill commented Dec 7, 2015

@KrzysztofCwalina I opened issue #4875 to track this for RC2, assigned to myself

@kerryjiang

Hello guys, it's pretty good to have a buffer pool in the System.Buffers assembly.
But an ObjectPool is required in lots of situations, and I have also found many different internal ObjectPool implementations in CoreCLR and CoreFX.

So, do you have a plan to make one of the ObjectPools public and move it to a common place?
