Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: correct some bugs and add feature of backend config for optimize subcommand #1661

Open
wants to merge 9 commits into
base: master
Choose a base branch
from

Conversation

Maxing1997
Copy link

Details

There are some bugs in optimize subcommand, following bugs are fixed in this patchset:

  • read chunk from wrong blob
  • missing extended table in new bootstrap
  • hardlink is broken in new image
  • missing blobinfo updates
  • wrong chunk align for fs-version 5

By the way, implement the backend config support for optimize subcommand, thus we can read chunk on-demand during build process.

Types of changes

What types of changes does your PullRequest introduce? Put an x in all the boxes that apply:

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation Update (if none of the other choices apply)

@Maxing1997 Maxing1997 requested a review from a team as a code owner January 22, 2025 08:04
@Maxing1997 Maxing1997 requested review from bergwolf, changweige and adamqqqplay and removed request for a team January 22, 2025 08:04
Copy link

codecov bot commented Jan 22, 2025

Codecov Report

Attention: Patch coverage is 0% with 247 lines in your changes missing coverage. Please review.

Project coverage is 59.85%. Comparing base (f60e40a) to head (98d0637).

Files with missing lines Patch % Lines
builder/src/optimize_prefetch.rs 0.00% 170 Missing ⚠️
src/bin/nydus-image/main.rs 0.00% 53 Missing ⚠️
contrib/nydusify/cmd/nydusify.go 0.00% 24 Missing ⚠️
Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1661      +/-   ##
==========================================
- Coverage   60.05%   59.85%   -0.21%     
==========================================
  Files         147      147              
  Lines       49624    49791     +167     
  Branches    47078    47222     +144     
==========================================
- Hits        29802    29800       -2     
- Misses      18040    18208     +168     
- Partials     1782     1783       +1     
Files with missing lines Coverage Δ
builder/src/lib.rs 63.15% <ø> (ø)
contrib/nydusify/cmd/nydusify.go 13.09% <0.00%> (-0.43%) ⬇️
src/bin/nydus-image/main.rs 0.58% <0.00%> (-0.01%) ⬇️
builder/src/optimize_prefetch.rs 0.00% <0.00%> (ø)

... and 2 files with indirect coverage changes

builder/src/optimize_prefetch.rs Show resolved Hide resolved
builder/src/optimize_prefetch.rs Show resolved Hide resolved
builder/src/optimize_prefetch.rs Outdated Show resolved Hide resolved
}
impl PrefetchFileInfo {
fn from_input(input: &str) -> Result<Self> {
let parts: Vec<&str> = input.split_whitespace().collect();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How we can handle the path with blank char like /path/to/file foo? how about defining a JSON format like:

{
  "version": "v1",
  "files": [
    {
      "path": "/path/to/file",
      "ranges": [[100, 20], [140, 50]] // [[offset, size]]
    },
  ]
}

It's make extensible easier.

Copy link
Author

@Maxing1997 Maxing1997 Jan 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now we have not considered this case, how about making it as a to-do item and optimize it in later patches.

builder/src/optimize_prefetch.rs Outdated Show resolved Hide resolved
contrib/nydusify/cmd/nydusify.go Outdated Show resolved Hide resolved
contrib/nydusify/cmd/nydusify.go Outdated Show resolved Hide resolved
contrib/nydusify/cmd/nydusify.go Outdated Show resolved Hide resolved
.help("Config file of backend")
.conflicts_with("backend-config")
.required(false),
)
.arg(
Arg::new("blob-dir")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe the option is unnecessary if we have backend-config option.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is still needed, same in compact subcommand.
image

@imeoer
Copy link
Collaborator

imeoer commented Jan 24, 2025

Can we add a usage in docs/nydusify.md for the nydusify optimize command?

Xin Yin and others added 7 commits January 24, 2025 15:06
chunks of a file may come from multiple blobs, we should find blob file
for each chunk.

Signed-off-by: Xin Yin <yinxin.x@bytedance.com>
For nydus-image optimize subcommand, support prefetch file with ranges.

Signed-off-by: Xin Yin <yinxin.x@bytedance.com>
The bootstrap generated by optimize subcommand missed extended table,
which cause runtime use Digested ChunkMap.

Signed-off-by: Xin Yin <yinxin.x@bytedance.com>
For nydus-image optimize subcommand, skip bad prefetch file instead of
stopping whole build process, such as symlink.

Signed-off-by: Xing Ma <maxing.lan@bytedance.com>
When the inode of the prefetch files has hardlinks, there are two bugs:
1. hardlink is broken, because layer_idx is changed which is used as a
   key for identifying hardlink in inode_map;
2. prefetching of the other identical inode will fall into other blob,
   which may cause perfomance issue due to IO amplifying.

This commit fixes two above bugs.

Signed-off-by: Xing Ma <maxing.lan@bytedance.com>
Now it only updates blob_ctx during the build process without persisting
these changes to blob_info. This results in incorrect metadata which can
be shown by using nydus-image check.

Signed-off-by: maxing.lan <maxing.lan@bytedance.com>
Blob meta info is only for fs-version 6, so we move the corresponding code
in one place for better judgement branch.

Besides, fix two potential bugs:
1. for fs-version 5, we don't align the chunk offset up to 4k, wrong
   rounding up will cause coredump in runtime.
2. use BlobChunkInfoV2Ondisk instead of BlobChunkInfoV1Ondisk to keep
   consistent with generate_chunk_info() which returns
   BlobChunkInfoV2Ondisk. Theoretical risk, not fully verified.

Signed-off-by: maxing.lan <maxing.lan@bytedance.com>
@Maxing1997 Maxing1997 force-pushed the maxing_mr/optimize_fix branch 2 times, most recently from 8de409d to 2d570a0 Compare January 24, 2025 08:45
maxing.lan added 2 commits January 24, 2025 18:37
With backend configure supoorted, we can read chunk on demand by
specifying backend such as registry without downloadling whole image
during image building.

Signed-off-by: maxing.lan <maxing.lan@bytedance.com>
When localfs-backend is true (default is false), still pull whole image
to build optimized image; otherwise, use registry backend to fetch
needed chunk during building process.

Signed-off-by: maxing.lan <maxing.lan@bytedance.com>
@Maxing1997 Maxing1997 force-pushed the maxing_mr/optimize_fix branch from 2d570a0 to 98d0637 Compare January 24, 2025 10:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants