storage: retry timeout chunks for fscache ondemand path #788

Merged: 1 commit into dragonflyoss:master from fscache_fetch_chunks_fix, Oct 14, 2022

kevinXYin
Contributor

For the fscache on-demand path, if some requested chunks have been set to pending by prefetch threads and waiting for them times out, an EIO is returned to the container side.

Retry the timed-out chunks on the on-demand path to minimize EIOs.

Signed-off-by: Xin Yin <yinxin.x@bytedance.com>
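
The retry behaviour described in this PR could look roughly like the following sketch. It is not the actual nydus code: `ChunkMap`, `ChunkState`, `fetch_range`, and the `fetch` callback are hypothetical stand-ins, and the timed wait is reduced to a plain readiness check so that the example stays self-contained.

```rust
use std::collections::HashMap;
use std::io;

/// Hypothetical, simplified chunk state (not the real nydus type).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum ChunkState {
    /// Claimed by another thread (e.g. prefetch) but not downloaded yet.
    Pending,
    /// Data is available in the local cache.
    Ready,
}

/// Hypothetical stand-in for the chunk bitmap.
struct ChunkMap {
    states: HashMap<u32, ChunkState>,
}

impl ChunkMap {
    /// Assumption: the real call waits with a timeout. Here it only reports
    /// whether every chunk in the range is already ready.
    fn wait_for_range_ready(&self, start: u32, count: u32) -> io::Result<bool> {
        Ok((start..start + count).all(|i| self.states.get(&i) == Some(&ChunkState::Ready)))
    }
}

/// Returns the number of chunks that had to be retried.
fn fetch_range(
    map: &mut ChunkMap,
    start: u32,
    count: u32,
    prefetch: bool,
    mut fetch: impl FnMut(u32) -> io::Result<()>,
) -> io::Result<usize> {
    if map.wait_for_range_ready(start, count)? {
        return Ok(0);
    }
    // Prefetch is best effort, so a timeout is simply reported as an error.
    if prefetch {
        return Err(io::Error::new(
            io::ErrorKind::Other,
            "failed to read data from storage backend",
        ));
    }
    // On-demand path: fetch the chunks that are still not ready ourselves
    // instead of surfacing an EIO to the container.
    let mut retried = 0;
    for idx in start..start + count {
        if map.states.get(&idx) == Some(&ChunkState::Ready) {
            continue;
        }
        fetch(idx)?;
        map.states.insert(idx, ChunkState::Ready);
        retried += 1;
    }
    Ok(retried)
}
```

With one chunk ready and one still pending, an on-demand call to `fetch_range` invokes `fetch` only for the pending chunk, while a prefetch call over a pending range returns the error unchanged.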

@anolis-bot
Collaborator

@kevinXYin, a new test job has been submitted. Please wait patiently. The test job URL: https://tone.openanolis.cn/ws/nrh4nnio/test_result/25903

@anolis-bot
Collaborator

@kevinXYin, the CI test is complete; please check the result:

| Test Case | Test Result |
|---|---|
| merge-target-branch | ✅ SUCCESS |
| build-docker-image | ✅ SUCCESS |
| compile-nydus | ✅ SUCCESS |
| compile-ctr-remote | ✅ SUCCESS |
| compile-nydus-snapshotter | ✅ SUCCESS |
| start-nydus-snapshotter-config-containerd | ✅ SUCCESS |
| run-container-with-nydus-image | ✅ SUCCESS |

Congratulations, your test job passed!

@@ -579,7 +579,30 @@ impl FileCacheEntry {
         }

         if !bitmap.wait_for_range_ready(chunk_index, count)? {
-            Err(eio!("failed to read data from storage backend"))
+            if !prefetch {
Collaborator

Minor suggestion:

if prefetch {
    return Err(eio!("failed to read data from storage backend"));
}
...

This function is a little overcomplex; I will try to refactor it later.
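
To illustrate the suggestion, here is a minimal sketch that compares the nested form with the guard-clause form. The function names, the `retry` closure, and the `else` branch of the nested version are assumptions made for illustration; only the `if !prefetch {` line and the error message come from the diff.

```rust
use std::io;

/// Hypothetical helper that builds the EIO-style error used in this sketch.
fn backend_error() -> io::Error {
    io::Error::new(io::ErrorKind::Other, "failed to read data from storage backend")
}

/// Nested form: the retry logic sits inside `if !prefetch { ... }`.
fn on_timeout_nested(prefetch: bool, retry: impl FnOnce() -> io::Result<()>) -> io::Result<()> {
    if !prefetch {
        retry()
    } else {
        Err(backend_error())
    }
}

/// Guard-clause form: the prefetch case returns early, so the retry logic
/// stays at the outer indentation level.
fn on_timeout_guard(prefetch: bool, retry: impl FnOnce() -> io::Result<()>) -> io::Result<()> {
    if prefetch {
        return Err(backend_error());
    }
    retry()
}
```

Both functions behave the same; the guard-clause form only reduces nesting, which matters more as the retry body grows.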

Contributor Author

Thanks, updated.
Yes, this function is too long now, and I seem to remember there is still an issue in the stargz part, which may need a further check later.

For the fscache on-demand path, if some requested chunks have been set to pending by
prefetch threads and waiting for them times out, an EIO is returned to the container side.

Retry the timed-out chunks on the on-demand path to minimize EIOs.

Signed-off-by: Xin Yin <yinxin.x@bytedance.com>
@anolis-bot
Collaborator

@kevinXYin, the code has been updated, so a new test job has been submitted. Please wait patiently. The test job URL: https://tone.openanolis.cn/ws/nrh4nnio/test_result/25933

@anolis-bot
Collaborator

@kevinXYin, the CI test is complete; please check the result:

| Test Case | Test Result |
|---|---|
| merge-target-branch | ✅ SUCCESS |
| build-docker-image | ✅ SUCCESS |
| compile-nydus | ✅ SUCCESS |
| compile-ctr-remote | ✅ SUCCESS |
| compile-nydus-snapshotter | ✅ SUCCESS |
| start-nydus-snapshotter-config-containerd | ✅ SUCCESS |
| run-container-with-nydus-image | ❌ FAIL |

Sorry, your test job failed. Please see the details at the link.

@changweige
Contributor

/retest

@anolis-bot
Collaborator

@changweige, the test job has been submitted. Please wait patiently. The test job URL: https://tone.openanolis.cn/ws/nrh4nnio/test_result/26268

@anolis-bot
Collaborator

@changweige, the CI test is complete; please check the result:

| Test Case | Test Result |
|---|---|
| merge-target-branch | ✅ SUCCESS |
| build-docker-image | ✅ SUCCESS |
| compile-nydus | ✅ SUCCESS |
| compile-ctr-remote | ✅ SUCCESS |
| compile-nydus-snapshotter | ✅ SUCCESS |
| start-nydus-snapshotter-config-containerd | ✅ SUCCESS |
| run-container-with-nydus-image | ✅ SUCCESS |

Congratulations, your test job passed!

@jiangliu jiangliu merged commit 665a7b1 into dragonflyoss:master Oct 14, 2022
@kevinXYin kevinXYin deleted the fscache_fetch_chunks_fix branch October 17, 2022 01:39
@changweige
Contributor

I couldn't figure out why wait_for_range_ready is called after the chunks have been pulled synchronously. It looks like a kind of validation. But I am afraid something is wrong if wait_for_range_ready returns a timeout error, since it is called after a synchronous download; it should always return true immediately.
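
For context, here is a minimal sketch of how a timed range wait can return false. It assumes, following the PR description, that some chunks in the requested range were set to pending by another thread and so were not covered by the caller's own synchronous download. `PendingChunks` and its methods are hypothetical and do not reflect the real nydus implementation.

```rust
use std::collections::HashSet;
use std::sync::{Condvar, Mutex};
use std::time::Duration;

/// Hypothetical tracker of chunks that some thread is still downloading.
struct PendingChunks {
    pending: Mutex<HashSet<u32>>,
    ready: Condvar,
}

impl PendingChunks {
    fn new(pending: impl IntoIterator<Item = u32>) -> Self {
        Self {
            pending: Mutex::new(pending.into_iter().collect()),
            ready: Condvar::new(),
        }
    }

    /// Marks a chunk as downloaded and wakes up all waiters.
    fn mark_ready(&self, index: u32) {
        self.pending.lock().unwrap().remove(&index);
        self.ready.notify_all();
    }

    /// Returns true once no chunk in `start..start + count` is pending,
    /// or false if that does not happen within `timeout`.
    fn wait_for_range_ready(&self, start: u32, count: u32, timeout: Duration) -> bool {
        let guard = self.pending.lock().unwrap();
        let (_guard, result) = self
            .ready
            .wait_timeout_while(guard, timeout, |p| {
                (start..start + count).any(|i| p.contains(&i))
            })
            .unwrap();
        !result.timed_out()
    }
}
```

In this model the wait returns immediately when the caller downloaded every chunk itself, and it can only time out when a chunk in the range is still owned by a different thread.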
