This repository has been archived by the owner on May 24, 2024. It is now read-only.

Commit

Merge pull request #10 from tech-greedy/xinaxu/work-with-single-dir
Work with single directory
xinaxu authored Feb 28, 2023
2 parents 1f5283e + fd94dee commit ffb7f9e
Showing 2 changed files with 45 additions and 11 deletions.
18 changes: 15 additions & 3 deletions README.md
@@ -15,15 +15,24 @@ COMMANDS:
help, h Shows a list of commands or help for one command

GLOBAL OPTIONS:
- --input value, -i value File to read list of files, or '-' if from stdin (default: "-")
+ --single When enabled, it indicates that the input is a single file or folder to be included in full, instead of a spec JSON (default: false)
+ --input value, -i value When --single is specified, this is the file or folder to be included in full. Otherwise this is a JSON file containing the list of files to be included in the car archive (default: "-")
--piece-size value, -s value Target piece size, default to minimum possible value (default: 0)
--out-dir value, -o value Output directory to save the car file (default: ".")
--tmp-dir value, -t value Optionally copy the files to a temporary (and much faster) directory
--parent value, -p value Parent path of the dataset
--help, -h show help (default: false)
```

-The input file can be a text file that contains a list of file information SORTED by the path. i.e.
+When `--single` is specified, the input is a single file or folder to be included in full, instead of a spec JSON.
+```shell
+# Generate car file from a single file
+$ ./generate-car --single -i test_path/test_file. -o out_dir -p test_path
+# Generate car file from a single folder
+$ ./generate-car --single -i test_path/test_folder -o out_dir -p test_path
+```
+
+For advanced user, without specifying `--single` the input file needs to be a json file that contains a list of file information SORTED by the path. This is useful if you only want to include specific files within a directory or only part of a large file. i.e.
```json
[
{
@@ -39,4 +48,7 @@ The input file can be a text file that contains a list of file information SORTE
]
```

-The tmp dir is useful when the dataset source is on slow storage such as NFS or S3FS/Goofys mount.
+The output JSON dump contains `DataCid`, `PieceCid` and `PieceSize` which can be used to make a deal with Filecoin storage providers.
+
+All files are read twice hence if the dataset source is on slow storage such as NFS or S3FS/Goofys mount, you may use tmpdir to copy the files to a fast local directory first.

38 changes: 30 additions & 8 deletions generate-car.go
@@ -10,6 +10,7 @@ import (
 	"log"
 	"os"
 	"path"
+	"path/filepath"

 	commcid "github.com/filecoin-project/go-fil-commcid"
 	"github.com/filecoin-project/go-fil-commp-hashhash"
@@ -54,12 +55,12 @@ func main() {
 Flags: []cli.Flag{
 	&cli.BoolFlag{
 		Name:  "single",
-		Usage: "When enabled, it indicates that the input is a single file to be included in full, instead of a spec JSON",
+		Usage: "When enabled, it indicates that the input is a single file or folder to be included in full, instead of a spec JSON",
 	},
 	&cli.StringFlag{
 		Name:    "input",
 		Aliases: []string{"i"},
-		Usage:   "File to read list of files, or '-' if from stdin",
+		Usage:   "When --single is specified, this is the file or folder to be included in full. Otherwise this is a JSON file containing the list of files to be included in the car archive",
 		Value:   "-",
 	},
 	&cli.Uint64Flag{
@@ -101,12 +102,33 @@ func main() {
 	if err != nil {
 		return err
 	}
-	input = append(input, util.Finfo{
-		Path:  inputFile,
-		Size:  stat.Size(),
-		Start: 0,
-		End:   stat.Size(),
-	})
+	if stat.IsDir() {
+		err := filepath.Walk(inputFile, func(path string, info os.FileInfo, err error) error {
+			if err != nil {
+				return err
+			}
+			if info.IsDir() {
+				return nil
+			}
+			input = append(input, util.Finfo{
+				Path:  path,
+				Size:  info.Size(),
+				Start: 0,
+				End:   info.Size(),
+			})
+			return nil
+		})
+		if err != nil {
+			return err
+		}
+	} else {
+		input = append(input, util.Finfo{
+			Path:  inputFile,
+			Size:  stat.Size(),
+			Start: 0,
+			End:   stat.Size(),
+		})
+	}
 } else {
 	var inputBytes []byte
 	if inputFile == "-" {
