Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FromInputStreamPublisher: avoid extra allocation of a buffer #2965

Merged
merged 3 commits into from
Jun 14, 2024

Conversation

idelpivnitskiy
Copy link
Member

@idelpivnitskiy idelpivnitskiy commented Jun 12, 2024

Motivation:

In #2949 we optimized a case when available() is not implemented and always returns 0. However, we de-optimized a use-case when it's implemented because the last call to available() always returns 0, but we still allocate a buffer of size readChunkSize that won't be used.

Modifications:

  • Enhance doNotFailOnInputStreamWithBrokenAvailableCall(...) test before any changes for better test coverage.
  • Remove byte[] buffer from a class variable. It can be a local variable because it's never reused in practice. Only the last buffer won't be nullified, but we don't need it after that.
  • When available() returns 0, try reading a single byte and then recheck availability instead of always falling back to
    readChunkSize.
  • Adjust doNotFailOnInputStreamWithBrokenAvailableCall() test to account for the 2nd call to available();
  • Add singleReadTriggersMoreAvailability() test to simulate when the 2nd call to available() returns positive value;

Result:

  1. No allocation of a buffer that won't be used at the EOF.
  2. Account for new availability if it appears after a read().

Benchmark results:

image

The effect is visible for 256B and 16Kb payloads (less than 1 readChunkSize) because for them an allocation of an extra buffer is a bigger contributor.
available() of 0 means it always returns 0, true - it always returns a real value. Used in-memory ByteArrayInputStream with overridden available() method.

new byte[]{31, 32, 33, 34, 35},
new byte[]{36},
// available < readChunkSize
new byte[]{0, 1, 2},
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes for this test are made in the first commit to demonstrate that they only enhance test coverage and are not driven by the actual change. Consider reviewing this PR by every commit.

Motivation:

In apple#2949 we optimized a case when `available()` is not implemented and
always returns `0`. However, we de-optimized a use-case when it's
implemented because after that change the last call to `available()`
always returns 0, but we still allocate a buffer of size `readChunkSize`
that won't be used at all.

Modifications:
- Enhance `doNotFailOnInputStreamWithBrokenAvailableCall(...)` test
before any changes to have better test coverage.
- Remove `byte[] buffer` from a class variable. It can be a local
variable because it's never reused in practice. Only the last `buffer`
won't be used nullified, but we don't need it after that.
- When `available()` returns `0`, try reading a single byte and then
check availability again instead of always falling back to
`readChunkSize`.
- Adjust `doNotFailOnInputStreamWithBrokenAvailableCall()` test to
account for the 2nd call to `available()`;
- Add `singleReadTriggersMoreAvailability()` test to simulate when the
2nd call to `available()` returns positive value;

Result:

1. No allocation of a `buffer` that won't be used at the EOF.
2. Account for new availability if it appears after a `read()`.
@@ -176,14 +173,27 @@ public void cancel() {
private void readAndDeliver(final Subscriber<? super byte[]> subscriber) {
try {
do {
int readByte = -1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ffti

Suggested change
int readByte = -1;
int readByte = END_OF_FILE;

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While technically it's the same value, it has a different meaning as "didn't read a byte" vs "read EOF". I would prefer to not confuse it, but ready to change it to Integer.MIN_VALUE instead. WDYT?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe an inline javadoc would help to clear up confusion and we can keep the -1?

Copy link
Contributor

@bryce-anderson bryce-anderson Jun 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inline comment, or maybe a new constant (could have same or different value) to signal the different meaning would be fine. tbh I'm also fine with just merging as is, thus the ffti.

@@ -176,14 +173,27 @@ public void cancel() {
private void readAndDeliver(final Subscriber<? super byte[]> subscriber) {
try {
do {
int readByte = -1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe an inline javadoc would help to clear up confusion and we can keep the -1?

@idelpivnitskiy idelpivnitskiy merged commit 82e256e into apple:main Jun 14, 2024
11 checks passed
@idelpivnitskiy idelpivnitskiy deleted the readNoAvailable branch June 14, 2024 22:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants