Releases: NVIDIA/ais-k8s

v0.8

31 Aug 15:35
  • operator: allow seamless upgrade (RBAC, version patching)
  • switch to AIStore v3.7 default; bump Operator version to v0.8
  • CI: install fix
  • add pods/exec policy to support typed ETL communicators

v0.7

01 Jul 21:45
  • Add Ansible playbooks for deploying AIStore on bare-metal Kubernetes
  • fix Operator logic to create ClusterRole and ClusterRoleBinding in presence of multiple namespaces

v0.6

22 Jun 02:34

Release Highlights

  • Enable AIS Prometheus exporter
  • Support deployment with AWS and GCP backends
  • Pin AIS target mountpaths to local PVs
  • Support host networking
  • Transition to Go 1.16

AIStore Kubernetes - v0.5

14 Apr 04:22

Release Highlights

Kubernetes Operator

  • fix creating multiple AIS Clusters within one K8s cluster !105
  • port features from helm deployment !106
  • add integration test using envtest !107
  • fix hostpath: create directory if not present !108
  • add test to check all k8s objects are created !109
  • add test for deploying multiple AIS clusters in the same K8s cluster !110
  • use a separate namespace for tests !111
  • tests wait for daemon statefulsets to be ready !113
  • update primary start-up logic; simplify init scripts !114
  • implement scale up logic !115
  • allow skipping test while building !116
  • single state change per reconciler loop !119
  • use DNS for targets and proxies !120
  • enable external access !121
  • using AIS APIs for testing !122
  • enabling testing on GCP + detect test environments !124
  • user specified storage class in tests !126
  • squash all operator git commits !129
  • set cluster config when CR updated !131
  • skip external tests on local minikube if tunnel not running !132
  • introduce event recorders !133
  • scale up/down with externalLB !134
  • manager include deployment type !135
  • revise state management !136
  • add cluster startup with external lb tests !137
  • implement better error management !138
  • cleanup RBAC on deleting AIS cluster !139
  • fix CR cleanup !140
  • general code refactoring !141
  • tests - wait for the cluster to actually start !145
  • update to use new config !147
  • use different ConfigMap for global config !151
  • rewrite wait for cluster ready in tests !152
  • wait for resources to be truly removed on scale-down/cleanup !163
  • retry Reconcile on CR versions conflict !165
  • update code to reflect new config and cmn package !168
  • simplify proxy and target ready logic !169
  • don't update already correct state !171
  • sync with latest AIS master !172
  • add metallb makefile step !173
  • cleanup PVCs when reclaim policy is Delete !174
  • test ensuring data safety when cluster is deleted !175
  • attempt graceful shutdown/decommission before cleaning up k8s resources !179
  • configurable cluster DNS domain !180
  • change default value of nodiskio to false !181
  • include liveness script !182
  • include readiness script !183
  • decommission target before scale down !184
  • shutdown cluster through primary !185
  • skip internal deployment tests !186
  • handle scaledown in proxies !187
  • setup webhooks for validation !188
  • implement upgrade strategy !189

Terraform

  • Deploy AIStore on the cloud !16
  • update readme + fix destroy script !17
  • changes in scripts and instance configuration !18
  • Remove VPC for ssh to work !20
  • Remove k8s service pinging when starting initContainer !21
  • minor: fix gitignore for .terraform directories !22
  • add admin container !23
  • remove nodes labels !24
  • update admin container !25
  • terraform gcp persistent storage !26
  • Add state for smoother redeployment and destroy of resources !28
  • deploy: update admin container repository and tag !29
  • scripts: update aisnode image and minor fixes around deployment !30
  • scripts: fix destroying k8s/gcp terraform volumes !31
  • Remove setting daemonID in uuid_env !33
  • improve docs !34
  • destroy pvc order !35
  • deploy: use global accessor instead of custom variable !37
  • add ais labels to nodes directly in gcp !38
  • correctly handle state var updates !39
  • use default node values !40
  • make 'machine_type' variable + minor refactor !41
  • use single zone instead of region to deploy GKE !42
  • use xfs with gcp disks !43
  • remove gcp AIS disks when AIS is removed !44
  • fix portable issue in unset_state_var !45
  • trap SIGINT with information to the user !47
  • add text about admin container !48
  • guard gcloud command with cloud provider check !49
  • add useful commands after the cluster is deployed !50
  • better disks removal handling !51
  • extend troubleshooting section in README !54
  • make cluster name configurable !56
  • add ssh key only if exists !57
  • fix deploy script flag parsing !58
  • add option to wait for the Pods to start !60
  • ignore error when deleting GCP disks !61
  • reuse persistent volumes when deleting AIS cluster !64
  • fix state file error message !65
  • terraform: persist disks on k8s cluster removal !72
  • deploy: check if kubectl has any nodes assigned when deploying AIStore !73
  • terraform: fix deleting persisted pv and pvc !74
  • add deploy all gif !80
  • fix output quotes bug !83
  • update aistore/aisnode image tag !86
  • add deploy flags for aisnode and admin containers !88

Helm

  • split ais chart from main ais repo !1
  • sync change from main repository !2
  • rebalance.dont_run_time missing value !3
  • Add missing 'allow_guest' value !4
  • Change ENV variables to match AIStore !5
  • aisloader - move chart from main repo; add automation; move playbooks !6
  • introduce dockerhub repo !7
  • initContainer must use --overwrite on pod label !8
  • wrap aisloader control pod in a deployment !9
  • Remove obsolete config option allow_guest !10
  • add AIS_NODE_NAME env referring to k8s node !11
  • aisloader to use AIS_ENDPOINT !13
  • Remove rproxy from config !14
  • Update K8s target env variable name !15
  • deploy: make setting AIS_HOST_IP optional !32
  • make graphite optional !46
  • correctly remove ',' at the end of the mountpaths !53
  • fix updating/downloading dependencies !59
  • reduce timeout on checking for existing cluster !78
  • add 'pods/log' resource for cluster role !84
  • parametrize admin container image !87
  • use target DNS !94
  • deploy cilium CNI !95
  • external access using load-balancer !96
  • config: ais config rename ipv4 to hostname !97
  • update pod management strategy statefulset !99
  • use target DNS instead of pod IP; update readiness probe !100
  • use docker images from aistore repository !101
  • add support for aws buckets !102
  • change cloud config section to backend !103
  • deploy proxies as StatefulSet, use DNS names as hostnames !123
  • role-oriented affinities, add targets anti-affinity !125
  • remove non-electable proxy template, enable proxies anti-affinities (https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/) !144
  • update to use new config !148
  • add env variables for conditional primary wait !150
  • fix config struct !158
  • fix log_dir config !167

CI, docs and Misc

  • docs: fix supported cloud providers !36
  • Update README + Improve scripts !27
  • add terraform job running cluster on GCP !62
  • run CLI tests when testing terraform cluster !63
  • docs: update k8s and terraform readmes !66
  • docs: add macOS installation guide !67
  • deploy: remove ETL pods on destroy !68
  • reduce max-host timeout !69
  • deploy: change ETL label !71
  • docs: revisit cloud deployment readme !75
  • Remove K8S_HOST_NAME env variable !76
  • docs: small improvements !77
  • docs: update --wait flag description and usage !79
  • ci: add possibility to run pipeline manually !82
  • docs: deploy readme cleanup !85
  • ci: build, use and push nightly and latest Docker images !91
  • ci: incorporate changes in building aistore/aisnode repo !92
  • enable lint and build stages for operator !117
  • create operator test phase !118
  • fix failing pipeline !130
  • tests: print tests logs immediately as they happen !143
  • remove dont_run_time config !146
  • add Makefile !153
  • organize pipeline jobs, add stub for operator GCP job !154
  • run operator tests on GCP !155
  • add skip GCP CI on skip-ci-gcp label !156
  • update config to reflect aistore repo !157
  • don't run long tests after MR is merged !160
  • tests: introduce short operator tests !162
  • show AIStore logs when running tests !166
  • change fspath config representation !170
  • Add Ansible host and config examples !176
  • Minor fixes, updates, and security improvements !178

Assets:

Docker Images: