Releases: NVIDIA/ais-k8s
v0.8
- operator: allow seamless upgrade (RBAC, version patching)
- switch to AIStore v3.7 default; bump Operator version to v0.8
- CI: install fix
- add pods/exec policy to support typed ETL communicators
v0.7
- Add Ansible playbooks for deploying AIStore on bare-metal Kubernetes
- fix Operator logic to create ClusterRole and ClusterRoleBinding in presence of multiple namespaces
v0.6
Release Highlights
- Enable AIS Prometheus exporter
- Support deployment with AWS and GCP backends
- Pin AIS target mountpaths to local PVs
- Support host networking
- Transition to Go1.16
AIStore Kubernetes - v0.5
Release Highlights
- Add Kubernetes Operator for deploying and maintaining AIS cluster in Kubernetes environment;
- Update Helm charts to support AIS v3.4;
- Add support for Terraform to spin up and manage GKE on GCP;
- Provide direct external access to cluster via LoadBalancer;
- Integrate Cilium CNI with Direct Server Return (DSR);
- Test AIS on GKE and Minikube with and without Cilium CNI and MetalLB.
Kubernetes Operator
- fix creating multiple AIS Clusters within one K8s cluster !105
- port features from helm deployment !106
- add integration test using envtest !107
- fix hostpath create directory if not present !108
- add test to check all k8s objects are created !109
- add test for deploying multiple AIS clusters in the same K8s cluster !110
- use a separate namespace for tests !111
- tests wait for daemon statefulsets to be ready !113
- update primary start-up logic; simplify init scripts !114
- implement scale up logic !115
- allow skipping test while building !116
- single state change per reconciler loop !119
- use DNS for targets and proxies !120
- enable external access !121
- using AIS APIs for testing !122
- enabling testing on GCP + detect test environments !124
- user specified storage class in tests !126
- squash all operator git commits !129
- set cluster config when CR updated !131
- skip external tests on local minikube if tunnel not running !132
- introduce event recorders !133
- scale up/down with externalLB !134
- manager include deployment type !135
- revise state management !136
- add cluster startup with external lb tests !137
- implement better error management !138
- cleanup RBAC on deleting AIS cluster !139
- fix CR cleanup !140
- general code refactoring !141
- tests - wait for the cluster to actually start !145
- update to use new config !147
- use different ConfigMap for global config !151
- rewrite wait for cluster ready in tests !152
- wait for resources to be truly removed on scale-down/cleanup !163
- retry Reconcile on CR versions conflict !165
- update code to reflect new config and cmn package !168
- simplify proxy and target ready logic !169
- don't update already correct state !171
- sync with latest AIS master !172
- add metallb makefile step !173
- cleanup PVCs when reclaim policy is Delete !174
- test ensuring data safety when cluster is deleted !175
- attempt graceful shutdown/decommission before cleaning up k8s resources !179
- configurable cluster DNS domain !180
- change default value of nodiskio to false !181
- include liveness script !182
- include readiness script !183
- decommission target before scale down !184
- shutdown cluster through primary !185
- skip internal deployment tests !186
- handle scaledown in proxies !187
- setup webhooks for validation !188
- implement upgrade strategy !189
Terraform
- Deploy AIStore on the cloud !16
- update readme + fix destroy script !17
- changes in scripts and instance configuration !18
- Remove VPC for ssh to work !20
- Remove k8s service pinging when starting initContainer !21
- minor: fix gitignore for .terraform directories !22
- add admin container !23
- remove nodes labels !24
- update admin container !25
- terraform gcp persistent storage !26
- Add state for smoother redeployment and destroy of resources !28
- deploy: update admin container repository and tag !29
- scripts: update aisnode image and minor fixes around deployment !30
- scripts: fix destroying k8s/gcp terraform volumes !31
- Remove setting daemonID in uuid_env !33
- improve docs !34
- destroy pvc order !35
- deploy: use global accessor instead of custom variable !37
- add ais labels to nodes directly in gcp !38
- correctly handle state var updates !39
- use default node values !40
- make 'machine_type' variable + minor refactor !41
- use single zone instead of region to deploy GKE !42
- use xfs with gcp disks !43
- remove gcp AIS disks when AIS is removed !44
- fix portable issue in unset_state_var !45
- trap SIGINT with information to the user !47
- add text about admin container !48
- guard gcloud command with cloud provider check !49
- add useful commands after the cluster is deployed !50
- better disks removal handling !51
- extend troubleshooting section in README !54
- make cluster name configurable !56
- add ssh key only if exists !57
- fix deploy script flag parsing !58
- add option to wait for the Pods to start !60
- ignore error when deleting GCP disks !61
- reuse persistent volumes when deleting AIS cluster !64
- fix state file error message !65
- terraform: persist disks on k8s cluster removal !72
- deploy: check if kubectl has any nodes assigned when deploying AIStore !73
- terraform: fix deleting persisted pv and pvc !74
- add deploy all gif !80
- fix output quotes bug !83
- update aistore/aisnode image tag !86
- add deploy flags for aisnode and admin containers !88
Helm
- split ais chart from main ais repo !1
- sync change from main repository !2
- rebalance.dont_run_time missing value !3
- Add missing 'allow_guest' value !4
- Change ENV variables to match AIStore !5
- aisloader - move chart from main repo; add automation; move playbooks !6
- introduce dockerhub repo !7
- initContainer must use --overwrite on pod label !8
- wrap aisloader control pod in a deployment !9
- Remove obsolete config option allow_guest !10
- add AIS_NODE_NAME env referring to k8s node !11
- aisloader to use AIS_ENDPOINT !13
- Remove rproxy from config !14
- Update K8s target env variable name !15
- deploy: make setting AIS_HOST_IP optional !32
- make graphite optional !46
- correctly remove ',' at the end of the mountpaths !53
- fix updating/downloading dependencies !59
- reduce timeout on checking for existing cluster !78
- add 'pods/log' resource for cluster role !84
- parametrize admin container image !87
- use target DNS !94
- deploy cilium CNI !95
- external access using load-balancer !96
- config: ais config rename ipv4 to hostname !97
- update pod management strategy statefulset !99
- use target DNS instead of pod IP; update readiness probe !100
- use docker images from aistore repository !101
- add support for aws buckets !102
- change cloud config section to backend !103
- deploy proxies as StatefulSet, use DNS names as hostnames !123
- role-oriented affinities, add targets anti-affinity !125
- remove non-electable proxy template, enable proxies anti-affinities (https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/) !144
- update to use new config !148
- add env variables for conditional primary wait !150
- fix config struct !158
- fix log_dir config !167
CI, docs and Misc
- docs: fix supported cloud providers !36
- Update README + Improve scripts !27
- add terraform job running cluster on GCP !62
- run CLI tests when testing terraform cluster !63
- docs: update k8s and terraform readmes !66
- docs: add MacOS installation guide !67
- deploy: remove ETL pods on destroy !68
- reduce max-host timeout !69
- deploy: change ETL label !71
- docs: revisit cloud deployment readme !75
- Remove K8S_HOST_NAME env variable !76
- docs: small improvements !77
- docs: update --wait flag description and usage !79
- ci: add possibility to run pipeline manually !82
- docs: deploy readme cleanup !85
- ci: build, use and push nightly and latest Docker images !91
- ci: incorporate changes in building aistore/aisnode repo !92
- enable lint and build stages for operator !117
- create operator test phase !118
- fix failing pipeline !130
- tests: print tests logs immediately as they happen !143
- remove dont_run_time config !146
- add Makefile !153
- organize pipeline jobs, add stub for operator GCP job !154
- run operator tests on GCP !155
- add skip GCP CI on skip-ci-gcp label !156
- update config to reflect aistore repo !157
- don't run long tests after MR is merged !160
- tests: introduce short operator tests !162
- show AIStore logs when running tests !166
- change fspath config representation !170
- Add Ansible host and config examples !176
- Minor fixes, updates, and security improvements !178