Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC-3734: Buffered reader #3734

Merged
merged 5 commits into from
Dec 18, 2023
Merged

RFC-3734: Buffered reader #3734

merged 5 commits into from
Dec 18, 2023

Conversation

WenyXu
Copy link
Member

@WenyXu WenyXu commented Dec 10, 2023

This RFC proposes a buffered reader to amortize the overhead of IO. The concurrent fetching feature will be introduced in another RFC soon.

Related

@WenyXu WenyXu requested a review from Xuanwo as a code owner December 10, 2023 08:32
@WenyXu
Copy link
Member Author

WenyXu commented Dec 10, 2023

I'm going to invite my workmate to review this RFC, too. cc @evenyag @zhongzc

@WenyXu WenyXu force-pushed the docs/buffer-reader-rfc branch from 268b142 to 83f10e8 Compare December 10, 2023 08:57
@WenyXu
Copy link
Member Author

WenyXu commented Dec 11, 2023

Another idea came to my mind: we can support multiple buffering policies for different purposes. i.e., only buffer the partial read and prefetched segments for future use(the segment prefetching will be introduced with the concurrent fetching feature). Maybe we should leave it as future work.

core/src/docs/rfcs/3734_buffered_reader.md Outdated Show resolved Hide resolved
core/src/docs/rfcs/3734_buffered_reader.md Outdated Show resolved Hide resolved
core/src/docs/rfcs/3734_buffered_reader.md Outdated Show resolved Hide resolved
core/src/docs/rfcs/3734_buffered_reader.md Outdated Show resolved Hide resolved
core/src/docs/rfcs/3734_buffered_reader.md Outdated Show resolved Hide resolved
core/src/docs/rfcs/3734_buffered_reader.md Outdated Show resolved Hide resolved
core/src/docs/rfcs/3734_buffered_reader.md Outdated Show resolved Hide resolved
core/src/docs/rfcs/3734_buffered_reader.md Outdated Show resolved Hide resolved
@WenyXu WenyXu marked this pull request as draft December 11, 2023 10:13
@WenyXu WenyXu marked this pull request as ready for review December 12, 2023 09:05
@WenyXu WenyXu mentioned this pull request Dec 12, 2023
@WenyXu WenyXu force-pushed the docs/buffer-reader-rfc branch from 1e4015f to 2dbced7 Compare December 12, 2023 14:07
@WenyXu WenyXu force-pushed the docs/buffer-reader-rfc branch from 2dbced7 to 13d7cdf Compare December 12, 2023 16:31
@WenyXu WenyXu marked this pull request as draft December 13, 2023 13:10
@WenyXu WenyXu force-pushed the docs/buffer-reader-rfc branch from bf07ef1 to 6400337 Compare December 18, 2023 11:10
@WenyXu WenyXu force-pushed the docs/buffer-reader-rfc branch from 6400337 to fa960e5 Compare December 18, 2023 11:15
@WenyXu WenyXu marked this pull request as ready for review December 18, 2023 11:15
@WenyXu
Copy link
Member Author

WenyXu commented Dec 18, 2023

Sorry for the delay. Please take a look when you have time. Thanks🥹

Copy link
Member

@Xuanwo Xuanwo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Mostly LGTM.

@Xuanwo Xuanwo changed the title RFC: Buffered reader RFC-3734: Buffered reader Dec 18, 2023
@Zheaoli
Copy link
Member

Zheaoli commented Dec 18, 2023

For me, mostly LGTM. BTW, I have a suggestion here, maybe we need to add a Capability to let each service's developer decide support the buffer read or not and check the maximum size their own.

Because some of the service like PGSQL,MySQL etc.. are dedicated to the small file.

This is probably an implementation issue, I'm not sure we need add extra description in RFC docs

@Xuanwo
Copy link
Member

Xuanwo commented Dec 18, 2023

BTW, I have a suggestion here, maybe we need to add a Capability to let each service's developer decide support the buffer read or not and check the maximum size their own.

Users don't care about buffer can just don't use this API.

And besides, buffer is useful even when file is small. For example, read(1), seek(1), read(2).

@Zheaoli
Copy link
Member

Zheaoli commented Dec 18, 2023

Users don't care about buffer can just don't use this API.

And besides, buffer is useful even when file is small. For example, read(1), seek(1), read(2).

Make sense.

Copy link
Member

@Zheaoli Zheaoli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@Xuanwo
Copy link
Member

Xuanwo commented Dec 18, 2023

We got three approvals now, merging!

@Xuanwo Xuanwo merged commit b0f759d into apache:main Dec 18, 2023
165 checks passed
@hoosin
Copy link

hoosin commented Dec 19, 2023

cool

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants