Releases: GoogleCloudPlatform/gcsfuse
Gcsfuse v2.5.1
Bug Fixes:
Improved error handling: GCSFuse will now retry requests to hierarchical namespace bucket APIs that encounter deadline exceeded errors. This improves stability and prevents unnecessary failures, especially in high latency environments.
Gcsfuse v2.5.0
- Hierarchical Namespace Enabled Buckets: With this release, users can also mount buckets that have hierarchical namespace (HNS) enabled. HNS-enabled buckets offer several advantages over standard buckets when used with Cloud Storage FUSE:
- Directory renames are supported natively on hierarchical namespace buckets, where the rename operation is fast and atomic. This benefits workloads such as AI/ML checkpointing.
- Users don't need to specify the --implicit-dirs command-line option when using hierarchical namespace enabled buckets. HNS buckets inherently understand directories, so gcsfuse does not need to simulate directories using placeholder objects (0-byte objects ending with '/').
- Kernel lookup operations are also improved on HNS buckets: the Objects.list API call is replaced with a more efficient Folder:get API call.
- Config CLI Parity:
- All params in config-file will be available as CLI flags and vice versa going forward. Documentation links:
- Deprecated and hidden flags won't show up in the help-doc any more.
- Write Enhancements:
- Information about newly created files is added to the type cache. This helps avoid an additional GCS list call if the bucket has been mounted with the --implicit-dirs flag. (PR#2303)
Gcsfuse v2.4.1
- Bug Fixes & Improvements:
- Fix for os.RemoveAll(directory) failure with kernel list cache and remote file addition: PR#2163.
- Fixed an issue in listing during creation of files in the same directory (Issue#2220, PR#2237).
- Additional performance improvements when using Parallel downloads (PR#2287).
- Metrics:
- Categorize the approximately 150 fs_errors into a handful of categories. This lowers the upper bound on the cardinality of these error labels, which prevents them from being dropped by the Google Cloud Metrics pipeline (PR#2321, PR#2382, PR#2390).
- Change unit of ops and file-cache latency metrics to microseconds to improve the resolution of these metrics (PR#2380).
- Dependency Upgrades / CVE fixes:
Gcsfuse v2.1.1
This release is built on top of v2.1.0, with the additional security fixes that come from upgrading Golang to 1.22.4.
Gcsfuse v2.4.0
- Parallel download:
- Accelerates reads of large files by using the file cache directory as a prefetch buffer, with multiple workers downloading a large file in parallel.
- This feature is useful for single threaded read scenarios that load large (>1GiB) files, such as model serving use cases and checkpoint restores.
- This feature is disabled by default. To enable this feature:
- Enabling the file cache feature (GKE instructions) is a prerequisite for using the parallel download feature, which uses the cache directory as a prefetch buffer. Although a cache is typically associated with repeat reads, with parallel downloads even first reads of large files are accelerated.
- The file being read must fit within the file cache directory’s available capacity, which can be controlled by max-size-mb. A value of “-1” allows it to use the cache volume’s entire capacity, or you can give it a value in Megabytes.
- If the same file will be read multiple times, increase the ttl-secs value. A value of "-1" bypasses TTL expiration and serves the file from the cache if it's available.
- Set file-cache:enable-parallel-downloads:true in the config file to enable parallel downloads. The default is false.
- Additional optional parameters:
- file-cache:parallel-downloads-per-file: The maximum number of workers to spawn per file to download the object from GCS into the file cache. Default is 16.
- file-cache:max-parallel-downloads: The maximum number of workers that can be spawned at any given time across all file download jobs. The default is 2x the number of CPU cores on the machine. A value of -1 means no limit.
- file-cache:download-chunk-size-mb: The size, in MiB, of each read request that a goroutine makes to GCS when downloading the object into the file cache. Default is 50. A parallel download only triggers if the file being read is >= this value.
- Note: If your application reads with high parallelism (>8 threads), a slight performance degradation may be observed when using this feature. High read parallelism is typical of training workloads, so this feature should not be used for them. It is recommended only for model serving and checkpoint restores, which are typically single-threaded reads of large files.
- Addresses #1300
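As an illustration of the parallel download settings above, a config file might look like the following sketch. The cache-dir path is a placeholder, and the placement of max-size-mb under file-cache and ttl-secs under metadata-cache is an assumption about the gcsfuse config file layout rather than something stated in these notes. The numeric values restate the defaults listed above.

```yaml
# Hypothetical config file sketch for parallel downloads (v2.4.0 and later).
# Assumption: cache-dir is a top-level key; the path is a placeholder.
cache-dir: /tmp/gcsfuse-cache
file-cache:
  # Assumed location of max-size-mb; -1 uses the cache volume's entire capacity.
  max-size-mb: -1
  # Parallel downloads are off unless this is set to true.
  enable-parallel-downloads: true
  # Listed defaults, written out explicitly for clarity.
  parallel-downloads-per-file: 16
  download-chunk-size-mb: 50
  # max-parallel-downloads is omitted so the default (2x CPU cores) applies.
metadata-cache:
  # Assumed location of ttl-secs; -1 bypasses TTL expiration.
  ttl-secs: -1
```

With these values, only files of at least 50 MiB would be expected to trigger a parallel download.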
- Kernel-List-Cache
- List responses that happen as part of a readdir operation are cached in the kernel page cache. This can significantly speed up AI/ML training runs, which do a full directory listing first, by serving repeat ListObjects calls locally from the kernel page cache. Due to potential coherency/consistency issues, this feature is recommended only for read-only volumes, specifically for serving and training.
- This feature is disabled by default. To enable this feature:
- Control cache invalidation via the --kernel-list-cache-ttl-secs cli flag or file-system:kernel-list-cache-ttl-secs config flag, where a value of:
- 0 means disabled. This is the default value.
- valid positive - represents the ttl (in seconds) to keep the directory list response in the kernel page-cache.
- -1 to bypass a TTL expiration and serve the list response from the cache whenever it's available.
- Addresses #184
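A minimal config file sketch for the kernel list cache, using the file-system:kernel-list-cache-ttl-secs key named above. The value 300 is only an illustrative TTL.

```yaml
# Sketch: enable the kernel list cache with a 5-minute TTL.
file-system:
  # 0 = disabled (default), positive = TTL in seconds, -1 = never expire.
  kernel-list-cache-ttl-secs: 300
```

The same setting can be passed on the command line with the --kernel-list-cache-ttl-secs flag.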
- CLI flags take precedence over config (behavior change for logging flags): Going forward, command-line flags always take precedence for all settings. This changes the behavior of the logging flags (--log-file and --log-format): previously the config file took precedence, but now the CLI flags do. This only has an effect when the same setting is specified in both the CLI and the config file. - #2077
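To make the precedence change concrete, consider a config file like the sketch below. The key names logging:file-path and logging:format are assumptions about the config file layout (these notes only name the CLI flags --log-file and --log-format), and the path is a placeholder.

```yaml
# Hypothetical config file; key names under "logging" are assumed.
logging:
  file-path: /var/log/gcsfuse-from-config.log   # placeholder path
  format: json
# If the mount command also passes --log-file and --log-format, then from
# v2.4.0 onward the CLI values are used. In earlier releases, the values
# in this file would have been used instead.
```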
Dependency Upgrades / CVE fixes:
Gcsfuse v2.3.2
This release includes:
- Code changes/fixes to kernel list cache feature.
Full Changelog: v2.3.1...v2.3.2
Gcsfuse v2.3.1
Enhancements:
- Hotfix release to restore backward compatibility for the experimental KernelListCacheTtlSecs setting, which was broken in v2.3.0.
- Cache invalidation can be controlled via the --kernel-list-cache-ttl-secs cli flag or file-system:kernel-list-cache-ttl-secs config flag
Dependency Upgrades / CVE fixes:
- No Changes here.
What's Changed
- Revert "Move KernelListCacheTtlSecs to List by @kislaykishore in #2083
Full Changelog: v2.3.0...v2.3.1
Gcsfuse v2.3.0
Enhancements:
- Ignore Interrupts default true:
- The --ignore-interrupts flag is now enabled by default. For details, see the v2.1.0 release notes, where this feature was added.
- Experimental kernel list cache config change:
- This release contains a backward-incompatible change to the configuration file. Before this release, the experimental kernel-list-cache-ttl-secs option was set using file-system:kernel-list-cache-ttl-secs; in this release, it must be set using list:kernel-list-cache-ttl-secs. Passing this value through the CLI is unchanged.
- anonymous-access config change:
- This release contains a backward-incompatible change to the configuration file. Anonymous access is used to access an endpoint that does not require authentication, such as a publicly accessible bucket or a test endpoint. Before this release, it was set using auth-config:anonymous-access: true; in this release, it must be set using gcs-auth:anonymous-access: true. Passing this value through the CLI is unchanged.
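The two renamed keys, in the form this release describes, might be written as in the sketch below. The TTL value is illustrative. Note that the v2.3.1 notes above describe file-system:kernel-list-cache-ttl-secs as the config key again, so the list: form appears to apply to v2.3.0 only.

```yaml
# Sketch of a v2.3.0 config file using the renamed keys.
list:
  # Was file-system:kernel-list-cache-ttl-secs before v2.3.0.
  kernel-list-cache-ttl-secs: 60
gcs-auth:
  # Was auth-config:anonymous-access before v2.3.0.
  anonymous-access: true
```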
Dependency Upgrades / CVE fixes:
- Upgraded Golang to 1.22.4.
What's Changed
- Update direct dependencies. by @kislaykishore in #2009
- Add all config-params to params.yaml by @kislaykishore in #1999
- Convert gcloud operations in storage client - 1 by @Tulsishah in #1959
- Add documentation to params.yaml by @kislaykishore in #2010
- Make the spreadsheetId argument optional by @Tulsishah in #2012
- Delete folder api implementation by @Tulsishah in #1954
- fix e2e test for hns bucket by @Tulsishah in #2020
- Validates formatting in CI workflow by @kislaykishore in #2016
- Run all except cache tests concurrently by @kislaykishore in #2022
- Making compute crc method context cancellable by @vadlakondaswetha in #2013
- Use 1.22 as the min Go version by @kislaykishore in #2003
- Set crc32 when converting minobject to object by @vadlakondaswetha in #2028
- Add parallel downloads job by @sethiay in #2005
- Fix copyright URL by @kislaykishore in #2031
- Convert Golang compliant flags to POSIX compliant by @kislaykishore in #2030
- Automate e2e tests on TPC environment by @Tulsishah in #1943
- Make Wait for Download false for all read scenarios when parallel downloads is true by @ankitaluthra1 in #2027
- Kokoro tpc build fix by @Tulsishah in #2033
- Support fstab mount with hyphens by @ashmeenkaur in #1993
- Fix Pytorch v2 model failure by @sethiay in #2037
- Unmarshal the config object by @kislaykishore in #2035
- Implement limit on max-concurrency. by @kislaykishore in #2032
- Add max-download-parallelism validation by @kislaykishore in #2046
- Ignore interrupts by default by @ashmeenkaur in #2034
- [Config CLI Parity] add squash to config structure tags to make decoding easier by @ashmeenkaur in #2044
- [Config-CLI Parity][Fix] Populate config object when config file is not provided by @ashmeenkaur in #2042
- adds get folder method in bucket.go by @ankitaluthra1 in #2041
- Raise error if params.yaml contains invalid fields by @kislaykishore in #2049
- fix gcloud install by @Tulsishah in #2039
- Return error if the config file is invalid. by @kislaykishore in #2048
- Use atleast 1 goroutine for async job by @sethiay in #2056
- Add @kislaykishore as owners for config by @kislaykishore in #2055
- Revert "add squash to config structure tags to make decoding possible… by @ashmeenkaur in #2058
- [Config CLI Parity] temp-dir should be of type resolved path by @ashmeenkaur in #2059
- Add more tests (unit, composite) tests for parallel downloads by @sethiay in #2019
- Remove flakiness in parallel downloads unit tests by @sethiay in #2063
- Add constraints on number of args by @kislaykishore in #2066
- Rename parallel downloads flags by @sethiay in #2064
- Fix race in test by @kislaykishore in #2067
- Use default MaxDownloadParallelism based on number of CPU cores by @sethiay in #2047
- Set default max-parallel-downloads in new config by @kislaykishore in #2072
- Setting defaults for parallel downloads by @sethiay in #2073
- Use truncate and remove function while eviction from job during CRC check by @sethiay in #2015
Full Changelog: v2.2.0...v2.3.0
Gcsfuse v2.2.0
New Features:
Enhancements:
- Allow parallel lookups of files:
- Allows parallel lookup/access of files under the same directory.
- Before this release, if an application accessed two files /gcsfuse/mount/a.txt & /gcsfuse/mount/b.txt in parallel, then access was serialized (both at Kernel's FUSE driver layer and GCSFuse).
- With this release, access is parallelized, improving read performance up to 18x when reading 100K files using 50 threads.
Dependency Upgrades / CVE fixes:
- No dependency upgrades or CVE fixes.
What's Changed
- Moving auth and config packages to stretchr/testify by @sethiay in #1918
- BucketType method to locally store bucketType in variable by @Tulsishah in #1896
- [PR1] allow parallel lookups by @sethiay in #1866
- Include code-coverage badge in README by @kislaykishore in #1933
- Update .codecov.yml to use the PRs base as reference by @kislaykishore in #1934
- [PR2] Composite tests for Parallel dirops test. by @sethiay in #1906
- Move to stretchr testify in ratelimit & util packages by @sethiay in #1922
- Add a Cobra root command. by @kislaykishore in #1908
- [PR3] E2E tests for parallel dirops by @sethiay in #1907
- Add timeout to the coverage workflow by @kislaykishore in #1940
- Metadata-prefetch (aka Recursive listing or ls -R during mount) by @gargnitingoogle in #1930
- [PR1] Support to keep ReadDir response in kernel cache by @raj-prince in #1897
- Implement method to return bucket type from server by @Tulsishah in #1898
- Disable the "changes" feature of codecov by @kislaykishore in #1952
- Remove some flagset to decrease timeout for e2e tests by @Tulsishah in #1951
- Mark "metadata-prefetch-on-mount" as experimental by @gargnitingoogle in #1957
Full Changelog: v2.1.0...v2.2.0
Gcsfuse v2.1.0
New Features:
- anonymous-access:
Used to access an endpoint that does not require authentication, such as a publicly accessible bucket, or a test custom-endpoint.
Can be set via CLI using --anonymous-access while mounting, or via the config file with
auth-config:
  anonymous-access: true
Users that previously used custom endpoints without authentication must additionally pass the anonymous-access flag. See the Changes section below.
- Rocky Linux version 8.9 or later is now supported.
Enhancements:
- Interrupt Handling:
GCSFuse now offers enhanced control over how it responds to interruptions during file system operations. This addresses problems with running Git clone operations on a GCSFuse mounted directory. (PR #1863, #1860) (Issues: #1016, #562, #321)
You can configure GCSFuse to ignore interruptions during file system operations via CLI using the --ignore-interrupts flag (disabled by default) or via config-file using the following config:
file-system:
  ignore-interrupts: true
- TCP Connections:
Changed the default of max-conns-per-host from 100 to 0, which removes the limit on the maximum number of TCP connections per host. This change helps in scenarios where customers concurrently run more than 100 threads for file operations on GCSFuse mounted directories. (PR #1909) (Issues: #1844, #1040)
Changes:
- Custom-endpoint authentication:
Previously, if a custom endpoint was specified, authentication was disabled on the endpoint. Starting with this release, authentication is enabled. To use a custom endpoint without authentication, use the new --anonymous-access flag.
Dependency Upgrades / CVE fixes:
- Upgraded dependencies for better stability and CVE fixes (CVE-2023-45288). (PR #1811, #1894, #1916, #1915)
- Storage sdk upgraded to 1.41.0 including the fix to retry on connection reset.
What's Changed
- Change authentication flow for TPC by @Tulsishah in #1840
- Making read large files test package parallell by @Tulsishah in #1849
- Making rename dir limit test package parallell by @Tulsishah in #1850
- Adding info log for operation retry by @raj-prince in #1854
- Move e2e script in integration test dir by @Tulsishah in #1853
- Making implicit-explicit dir test package parallel by @Tulsishah in #1851
- Enabling gRPC related integration test by @raj-prince in #1823
- Fix failure for kokoro e2e tests for implicit dir by @Tulsishah in #1862
- Introducing new flag to enable hns flow by @Tulsishah in #1855
- [PR-1] Add flag and config to ignore interrupts by @ashmeenkaur in #1863
- moving from gsutil to gcloud in e2e tests by @Tulsishah in #1852
- remove flaky TestRangeReadsBeyondReadChunkSizeWithoutChunkDownloaded test by @ashmeenkaur in #1869
- Fix flaky list large dir e2e test by @Tulsishah in #1871
- switch from read_cache_release branch to master in readcache test script by @gargnitingoogle in #1872
- Making operations test package parallell by @Tulsishah in #1858
- Intro anonymous-access flag by @Tulsishah in #1827
- Using PATH env variable instead of copying gcsfuse package for integration tests by @ashmeenkaur in #1877
- Overriding authConfig flag by @Tulsishah in #1878
- fix flaky modification time check in integration tests by @ashmeenkaur in #1875
- [PR-2][Phase1 Implementation] Ignore GCSFuse interrupts by @ashmeenkaur in #1860
- Using Stretcher testify package in storage handle tests by @Tulsishah in #1867
- Upgrading depenencies as per dependabot suggestions by @sethiay in #1881
- Fix only dir mounting e2e test by @Tulsishah in #1880
- Upgrading go version to 1.22.2 by @sethiay in #1882
- Fix send failure logs on kokoro artifacts by @Tulsishah in #1886
- Printing flags in test logs by @Tulsishah in #1885
- Initializing new storage control client by @Tulsishah in #1865
- Upgrades dependencies by dependabot by @ankitaluthra1 in #1894
- Bump python dependabot dependencies by @ashmeenkaur in #1916
- upgrade storage sdk to v1.41.0 by @ashmeenkaur in #1910
- upgrade golang to 1.22.3 by @ashmeenkaur in #1917
- upgrade dependabot dependencies by @ashmeenkaur in #1915
- Moving logger, flag & perms package tests to stretchr testify by @sethiay in #1919
- fix ignore interrupt integration tests by @ashmeenkaur in #1925
- Remove default limit on max tcp connections by setting max-conns-per-host to 0 by @ashmeenkaur in #1909
- Implement code coverage checks by @kislaykishore in #1921
- Disabling gRPC test temporarily by @raj-prince in #1929
Full Changelog: v2.0.1...v2.1.0