-
Notifications
You must be signed in to change notification settings - Fork 345
Experimental: Deferred string copies for StringFormatter.Append #976
Conversation
my 2 cents: Also, possibly provide different API paths if multiple clearly different optimization scenarios emerge (we have generic BCL for optimizing the "general case") |
@jamesqo, this works ok, if you are combining strings. But if somebody appends bunch of ints, date times, etc. or does composite formatting, it ends up being worse, i.e. you have copies and you need to allocate additional strings. |
Where does this PR cause additional copying/string allocations for those scenarios? |
@dotnet-bot test this please |
@jamesqo I think a performance test with both short and long copy queues is necessary to determine the benefit of this change. |
ping @jamesqo |
@ahsonkhan Thanks for pinging me. I haven't had time to work on this in the months I've submitted it; I'll close this now and reopen when I get a chance to run perf tests. BTW, I submitted a similar change that actually implemented this concept in LINQ a while ago (after I opened this PR): dotnet/corefx#14675 It yielded about a 50% reduction in allocations. |
Currently,
StringFormatter.Append(string)
eagerly copies the string into its internal buffer. This is inefficient since we do 2 copies: one to the internal buffer, and one to the result string duringToString
.There is a way to improve on this. Instead of copying the string up-front when
Append
is called, we record the current index & add the string to a queue to be copied later. Then, duringToString
, we copy from our buffer up to that index, copy the contents of our string, then continue copying from our buffer. This way, only 1 copy of the string is ever made.Visualization for a test case I added in the PR:
Please note that this should not be merged yet. The code is a little sloppy-- I used
List<T>
for queueing the strings instead of a struct, I usedUnsafe.CopyBlock
instead ofEncoding.GetChars
which will undoubtedly mess up some inputs with special characters, etc. This is more of a proof-of-concept PR intended to garner initial feedback.@KrzysztofCwalina What do you think?