3.23
Version 3.23 arrives three months after the previous one. In addition to datapath optimizations and bug fixes, most of the other changes are enumerated in the following
Table of Contents
- List Objects; Bucket Inventory
- Selecting Primary at startup; Restarting cluster when node IPs change (K8s)
- S3 (backend, frontend)
- BLOBs
- Mountpath labels
- Reading shards; Reading from shards
List Objects; Bucket Inventory
- S3 backend: S3 ListObjectsV2 may return a directory !6672
- list very large buckets using bucket inventory !6682, !6684, !6686, !6689, !6692
- list-objects: optimize for prefix; add 'dont-optimize' feature flag !6685
- list very large buckets using bucket inventory (major update, API changes) !6695, !6698
- list very large buckets using bucket inventory !6704
- list-objects: support non-recursive operation (new) !6711, !6712
- refactor and code-generate (message pack) list-objects results !6714
- bucket inventory; generic no-recursion helper !6715
- bucket inventory: support arbitrary schema; add validation !6769
- list-objects: micro-optimize setting custom properties of remote objects !6770
- list very large buckets using bucket inventory !6775, !6776, !6777, !6778
- list very large buckets using bucket inventory (major) !6810, !6811
- list very large buckets using bucket inventory !6815
- list-objects: skip virtual directories !6835
- list very large buckets using bucket inventory !6847, !6851, !6853
Selecting Primary at startup; Restarting cluster when node IPs change (K8s)
- primary role: add 'is-secondary' environment; precedence !6746
- 'original' & 'discovery' URLs (major) !6747, !6749
- cluster config: new convention for primary URL; role of the primary during: initial deployment, cluster restart !6752, !6755
- cluster restart with simultaneous change of primary (major) !6758, !6760, !6761
- primary startup: always update node net-infos !6762
- all proxies to store RMD (previously, only primary) !6764
- node join: remove duplicate IP check (is redundant) !6783
- K8s startup with proxies change their network infos !6785
- primary startup: initial version of the cluster map !6787
- non-primary startup: retry and refactor; factor in !6788
- K8s: primary startup when net-infos change !6789
S3 (backend, frontend)
- backend put-object interface; presigned S3 (refactoring & cleanup) !6662
- default AWS region (cleanup) !6679
- s3cmd: add negative testing !6681
- backend: S3 ListObjectsV2 may return a directory !6672
- backend: consolidate environment and defaults !6678
- backend: retain S3-specific error code !6688, !6691
- move presigned URLs code to backend package !6801
- multipart upload: read and send next part in parallel !6803
- backend: refactor and simplify !6819
- new feature flag to enable (older) path-style addressing !6821
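For readers unfamiliar with the two S3 addressing styles mentioned in the last item, the difference can be sketched as follows. This is a minimal illustration of the URL shapes only; the helper name `s3_object_url` and its parameters are hypothetical and are not part of AIStore or of any AWS SDK.

```python
from urllib.parse import quote


def s3_object_url(bucket: str, key: str, region: str, path_style: bool = False) -> str:
    """Build an S3 object URL in one of the two addressing styles (hypothetical helper)."""
    host = f"s3.{region}.amazonaws.com"
    quoted_key = quote(key, safe="/")  # keep '/' so that key "directories" stay readable
    if path_style:
        # older path-style addressing: the bucket is the first path segment
        return f"https://{host}/{bucket}/{quoted_key}"
    # virtual-hosted-style addressing: the bucket is part of the hostname
    return f"https://{bucket}.{host}/{quoted_key}"
```

Path-style addressing is typically what S3-compatible (non-AWS) endpoints expect, which is presumably why it is exposed as an opt-in flag.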
BLOBs
- config change: assorted feature flags now have bucket scope (major) !6664, !6666
- Python: blob-download API !6687
- Python: get and prefetch with blob-download !6708
- blob downloader (minor ref) !6793
- blob-downloader: finalize control structures; refactor !6812
- GET via blob-download !6873
- multiple blob-download jobs (fixes) !6876
- prefetch via blob-downloader !6882
Mountpath labels
- override-config, fspaths section (minor ref) !6718
- config change, API change: mountpath labels (major) !6721, !6722, !6725, !6726, !6733, !6734, !6735, !6736, !6738
- backward compatibility v3.22 and prior; bump CLI version !6740, !6742
- log: mountpath labels vs shared filesystems; memory pressure !6744
Reading shards; Reading from shards
- reading (from) shards: add read-until, read-one, and read-regex methods !6823
- reading shards: read-until, read-one, read-regex !6824
- WebDataset: add wds-key; add comments !6826
- reading .TAR, .TGZ, etc. formatted objects (a.k.a. shards) - multiple selection !6827
- GET request to select multiple archived files (feature) !6859
- GET multiple archived files in one shot (feature) !6861, !6862, !6863, !6864, !6866
- Python: GET multiple files from an archive (shard) !6860
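The idea of selecting multiple archived files from a shard in one pass can be illustrated locally with Python's standard `tarfile` module. This is only a sketch of the concept: the function names `make_shard` and `read_matching` are hypothetical, the matching rule (a regex searched against each member name) is an assumption, and none of this is the AIS API.

```python
import io
import re
import tarfile


def make_shard(files: dict[str, bytes]) -> bytes:
    """Pack name -> content pairs into an in-memory, uncompressed .tar (test helper)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def read_matching(shard: bytes, pattern: str) -> dict[str, bytes]:
    """Return every regular file in the shard whose name matches the regex."""
    rx = re.compile(pattern)
    selected = {}
    # "r:*" lets tarfile detect compression (.tar, .tgz, ...) transparently
    with tarfile.open(fileobj=io.BytesIO(shard), mode="r:*") as tar:
        for member in tar:
            if member.isfile() and rx.search(member.name):
                selected[member.name] = tar.extractfile(member).read()
    return selected
```

For example, `read_matching(shard, r"\.jpg$")` returns all JPEG members of the shard in a single pass over the archive.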
Core
- backend put-object interface (refactoring & cleanup) !6662
- get-stats API vs attach/detach mountpaths !6669
- unwrap URL errors; remove mux.unhandle; CLI: more tips !6673
- removing a node from a 2-node cluster (in re: rebalance) !6674
- POST /v1/buckets handler: add one more check to URI validation !6690
- last byte (minor ref) !6694
- project layout: move and consolidate all scripts !6699
- extend RMD to reinforce cluster integrity checking !6702
- micro-optimize fast-path fqn parsing !6707
- continued refactoring !6709, !6710
- security dependabot: fix #15 and #16 !6713
- aisnode: remove logs from conf !6727
- extract and unify cluster information; add flags !6741
- copy shared FS capacity; color high/low usage pct; up cli !6743
- node flags in a cluster map vs (node | cluster) restart; node equality !6765
- receive cluster-level metadata (minor ref) !6766
- dsort: write compressed tar !6771
- dsort: read compressed tar; add linter !6772
- backend: uniform naming, common base !6774
- remove AIS_IS_PRIMARY environment (is obsolete) !6781
- nlog: allow setting logging to STDERR flag in config !6791
- feature flags fsync-put will now have (also) bucket scope !6804
- cold GET: write locally and transmit in parallel (new) !6805, !6807
- move atomic 'stopping' (ref) !6817
- aisloader: add 's3-use-path-style' command line, to use older path-style addressing !6822
- cold GET (fast): fclose and check !6825
- speed-up batch jobs (prefetch, archive, copy/transform, multi-object evict/delete) !6830
- LOM: add open-file method !6836
- nlog: while stopping !6837
- multi-object TCB/TCO; not in-cluster objects; multi-page fix !6840, !6842
- xaction registry: when hk call is premature !6843
- add metrics: get-size and put-size !6849
- memsys/SGL: add compliant 'write-to' interface impl.; amend fast/simplified 'write-to' !6854, !6856, !6857
- stats and metrics: report cumulative GET and PUT sizes in bytes !6855
- datapath query parameters: preparse, reduce size !6858
- stats: fix Prometheus label for total size !6871
- imports (ref) !6878
- move and rename 'node-state-info' and 'node-state-flags' (ref) !6879
- new metric: node-state-flags (bitwise, gauge) !6880
- add management alerts: out-of-space & low-capacity (major) !6883
- add management alerts: out-of-memory & low-on-memory !6885
- microbench: use math/rand/v2 !6886
- transition to Go 1.22 math/rand/v2; crypto/rand reader !6887
- dsort test: use rand.v2 !6888
- transition to Go 1.22 math/rand/v2; add seeded-reader !6890
- cleanup 'cos/math' (ref) !6891
- tests: fix prefix-test for remote ais cluster !6893
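The "node-state-flags (bitwise, gauge)" metric listed above packs several boolean conditions into a single integer. A minimal sketch of that general technique follows; the flag names and bit positions are hypothetical, chosen only to echo the alerts mentioned in this list, and do not reflect the actual AIStore encoding.

```python
from enum import IntFlag


class NodeFlags(IntFlag):
    # hypothetical names and bit positions, for illustration only
    LOW_CAPACITY = 1 << 0
    OUT_OF_SPACE = 1 << 1
    LOW_MEMORY = 1 << 2
    OUT_OF_MEMORY = 1 << 3


def encode(flags) -> int:
    """Combine individual flags into one integer, suitable for a gauge value."""
    value = NodeFlags(0)
    for flag in flags:
        value |= flag
    return int(value)


def decode(value: int) -> list[str]:
    """Recover the names of the flags that are set in a gauge value."""
    return [flag.name for flag in NodeFlags if value & flag]
```

With this scheme a monitoring rule can test a single bit (for example, `value & NodeFlags.OUT_OF_SPACE`) instead of tracking one metric per condition.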
CLI
- 'more' fixes !6665
- more tips !6673
- warn when switching cluster to operate in reverse proxy mode !6703
- show feature flags symbolically !6705
- backward compatibility v3.22 and prior; bump CLI version !6740
- 'ais show cluster' to highlight nodes that are low on memory !6745
- 'ls' and 'show object' to support size units (raw, SI, IEC) !6795
- progress bar decorators; elapsed time !6797
- fix used and available capacity !6806
- fix 'show throughput' to not show throughput when !6813
- quiet 'show cluster', 'show performance'; misplaced flags !6814
- 'ais ls' help and inline examples; native GET: add query params !6816
- copying remote objects; progress bar; usability !6839
- extend 'ais gen-shards' to generate WD-formatted shards !6865
- add '--count-and-time-only' option !6868, !6869
- max-pages and limit !6870
- stopping jobs !6875
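The three size-unit modes named above (raw, SI, IEC) differ only in the base used for scaling: raw prints the byte count as is, SI scales by powers of 1000, and IEC scales by powers of 1024. A minimal sketch follows, assuming a two-decimal output format; the function name `format_size` and the exact output strings are illustrative and are not the CLI's actual formatting.

```python
def format_size(n: int, units: str = "iec") -> str:
    """Format a byte count as raw bytes, SI (base 1000), or IEC (base 1024)."""
    if units == "raw":
        return str(n)
    if units == "si":
        base, suffixes = 1000, ["B", "kB", "MB", "GB", "TB", "PB"]
    elif units == "iec":
        base, suffixes = 1024, ["B", "KiB", "MiB", "GiB", "TiB", "PiB"]
    else:
        raise ValueError(f"unknown units: {units!r}")

    value = float(n)
    i = 0
    # divide down until the value fits under the base or suffixes run out
    while value >= base and i < len(suffixes) - 1:
        value /= base
        i += 1
    if i == 0:
        return f"{n}B"  # small sizes are shown as whole bytes
    return f"{value:.2f}{suffixes[i]}"
```

The same 1536-byte object therefore reads as `1536` (raw), `1.54kB` (SI), or `1.50KiB` (IEC).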
Python
- add test for invalid bucket name !6683
- blob-download API !6687
- add timeout option to client + version bump !6693
- get and prefetch with blob-download !6708
- tests constants and refactoring !6717
- prefetch blob-download tests !6719
- cluster performance API !6724
- remote enabled tests cleanup refactored !6731
- add missing job tests !6737
- fix formatting issues !6753
- PyTorch: add Iterable-style datasets for AIS Backend !6759
- writer for image dataset !6767
- AISSource: list all objects !6779
- add example for dataset_writer !6794
- add tests for dataset writer !6799
- log missing attributes in write_dataset !6820
- update docs !6844
- add MultiShard Stream to PyTorch !6852
- GET multiple files from an archive !6860
Build, CI
- transition to Go 1.22 !6675
- upgrade OSS packages !6680, !6750, !6768
- lint: upgrade; Go 1.22 int range !6728, !6732
- CI: MacOS fix !6729
- remove HDFS backend !6773
- upgrade golang.org/x/net !6831
- lint; min/max shadow !6850
- build: transition to Go 1.22 math/rand/v2 !6892
- CI: maintenance !6838
- lint: golangci-lint !6894
Documentation
- docs: fix https getting-started !6668
- docs: amend getting started !6670
- docs: fix the broken table of contents link !6677
- blog: Very large !6874