Action required: Users should migrate the configs in values.yaml of previous chart releases to the new values.yaml of the new chart. Otherwise, the monitor pods might fail when you upgrade the monitor with the new chart.
For example, configs in the old values.yaml file:
monitor:
  ...
  initializer:
    image: pingcap/tidb-monitor-initializer:v3.0.5
    imagePullPolicy: IfNotPresent
  ...
After migration, configs in the new values.yaml file should be as follows:
monitor:
  ...
  initializer:
    image: pingcap/tidb-monitor-initializer:v3.0.5
    imagePullPolicy: Always
    config:
      K8S_PROMETHEUS_URL: http://prometheus-k8s.monitoring.svc:9090
  ...
- Refine scheduler error messages (#1373)
- Fix the compatibility issue in Kubernetes v1.17 (#1241)
- Bind the system:kube-scheduler ClusterRole to the tidb-scheduler service account (#1355)
- Fix the default tikv-importer configuration (#1415)
- Ensure pods unaffected when upgrading (#955)
- Move the release CI script from Jenkins into the tidb-operator repository (#1237)
- Adjust the release CI script for the release-1.0 branch (#1320)
There is no action required if you are upgrading from v1.0.4.
- Fix the issue that backup failed when clusterName is too long (#1229)
- It is recommended that TiDB and Pump be deployed on the same node through the affinity feature and Pump be dispersed on different nodes through the anti-affinity feature. At most one Pump instance is allowed on each node. We added a guide to the chart. (#1251)
- Fix tidb-scheduler RBAC permission in Kubernetes v1.16 (#1282)
- Do not set DNSPolicy if hostNetwork is disabled to keep backward compatibility (#1287)
- Fix e2e nil pointer dereference (#1221)
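The Pump placement recommendation above could be expressed with standard Kubernetes pod anti-affinity. The following is only a sketch: it assumes that the chart exposes an affinity field under binlog.pump and that Pump Pods carry the app.kubernetes.io/component: pump label. Check both against the values.yaml and the Pod labels of your chart version before relying on it.

```yaml
# Hypothetical values.yaml fragment. The binlog.pump.affinity path and the
# label selector are assumptions, not confirmed by these release notes.
binlog:
  pump:
    affinity:
      podAntiAffinity:
        # Hard rule: do not schedule two Pump Pods onto the same node.
        requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchLabels:
              app.kubernetes.io/component: pump
          # One Pod per distinct value of this node label, that is, per node.
          topologyKey: kubernetes.io/hostname
```

A required rule keeps a Pump Pod pending when no free node is left. A preferred rule (preferredDuringSchedulingIgnoredDuringExecution) would relax that, at the cost of possibly placing two Pump instances on one node.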
There is no action required if you are upgrading from v1.0.3.
#1202 introduced HostNetwork support, which offers better performance compared to the Pod network. Check out our benchmark report for details.
Note:
Due to this issue of Kubernetes, the Kubernetes cluster must be one of the following versions to enable HostNetwork of the TiDB cluster:
- v1.13.11 or later
- v1.14.7 or later
- v1.15.4 or later
- any version since v1.16.0
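As a rough illustration, enabling HostNetwork for a TiDB cluster might look like the fragment below. The per-component placement of the hostNetwork field is an assumption made for this sketch; consult the values.yaml shipped with the chart for the actual location and default.

```yaml
# Hypothetical tidb-cluster values.yaml fragment. The placement of hostNetwork
# under each component is an assumption.
pd:
  hostNetwork: true
tikv:
  hostNetwork: true
tidb:
  hostNetwork: true
```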
#1175 added the podSecurityContext support for TiDB cluster Pods. We recommend setting the namespaced kernel parameters for TiDB cluster Pods according to our Environment Recommendation.
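A minimal sketch of setting namespaced kernel parameters through podSecurityContext is shown below. The sysctls list with name and value entries is the standard Kubernetes PodSecurityContext format; the placement under tidb and the specific parameters and values are illustrative assumptions, not recommendations taken from these notes.

```yaml
# Hypothetical values.yaml fragment. The tidb.podSecurityContext path is an
# assumption; the sysctl names and values are examples only.
tidb:
  podSecurityContext:
    sysctls:
    - name: net.core.somaxconn
      value: "32768"
    - name: net.ipv4.tcp_keepalive_time
      value: "300"
```

Kubernetes treats most net.* parameters as unsafe sysctls, so the kubelet on each node usually has to allow them (for example through its --allowed-unsafe-sysctls flag) before Pods that request them can start.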
The new Helm chart tidb-lightning brings TiDB Lightning support for TiDB in Kubernetes. Check out the document for a detailed user guide.
Another new Helm chart, tidb-drainer, brings multiple drainers support for TiDB Binlog in Kubernetes. Check out the document for a detailed user guide.
- Support HostNetwork (#1202)
- Support configuring sysctls for Pods and enable net.* (#1175)
- Add tidb-lightning support (#1161)
- Add new helm chart tidb-drainer to support multiple drainers (#1160)
- Add e2e scripts and simplify the e2e Jenkins file (#1174)
- Fix the pump/drainer data directory to avoid data loss caused by bad configuration (#1183)
- Add init sql case to e2e (#1199)
- Keep the instance label of drainer the same as that of the TiDB cluster to facilitate monitoring (#1170)
- Set podSecurityContext to nil by default in favor of backward compatibility (#1184)
For historical reasons, v1.1.0.alpha is a hot-fix branch and got this name by mistake. All fixes in that branch are cherry-picked to v1.0.4 and the v1.1.0.alpha branch will be discarded to keep things clear.
We strongly recommend that you upgrade to v1.0.4 if you are using any version under v1.1.0.alpha.
v1.0.4 introduces the following fixes compared with v1.1.0.alpha.3:
- Support HostNetwork (#1202)
- Add the permit host option for tidb-initializer job (#779)
- Fix drainer misconfiguration in tidb-cluster chart (#945)
- Set the default externalTrafficPolicy to be Local for TiDB services (#960)
- Fix tidb-operator crash when users modify sts upgrade strategy improperly (#969)
- Add the maxFailoverCount limit to TiKV (#976)
- Fix values file customization for tidb-operator on aliyun (#983)
- Do not limit failover count when maxFailoverCount = 0 (#978)
- Suspend the ReplaceUnhealthy process for TiKV auto-scaling-group on AWS (#1027)
- Fix the issue that the create_tidb_cluster_release variable does not work (#1066)
- Add v1 to statefulset apiVersions (#1056)
- Add timezone support (#1126)
ACTION REQUIRED: This release upgrades the default TiDB version to v3.0.5, which fixed a serious bug in TiDB. So if you are using TiDB v3.0.4 or prior versions, you must upgrade to v3.0.5.
ACTION REQUIRED: This release adds the timezone support for all charts.
For existing TiDB clusters, if the timezone in tidb-cluster/values.yaml has been customized to a time zone other than the default UTC, then upgrading tidb-operator will trigger a rolling update for the related pods.
The related pods include pump, drainer, discovery, monitor, scheduled backup, tidb-initializer, and tikv-importer.
The time zone for all images maintained by tidb-operator should be UTC. If you use your own images, you need to make sure that the corresponding time zones are UTC.
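For reference, a customized time zone of the kind described above might look like the following fragment. The timezone key and its UTC default come from the note itself; the top-level placement and the example value are assumptions for illustration.

```yaml
# tidb-cluster/values.yaml (sketch; top-level placement is an assumption)
# The default is UTC. Any other value counts as customized, so upgrading
# tidb-operator would trigger a rolling update of the related pods.
timezone: Asia/Shanghai
```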
- Add the timezone support for all containers of the TiDB cluster
- Support configuring resource requests and limits for all containers of the TiDB cluster
- Upgrade default TiDB version to v3.0.5 (#1132)
- Add the timezone support for all containers of the TiDB cluster (#1122)
- Support configuring resource requests and limits for all containers of the TiDB cluster (#853)
The AWS Terraform script uses auto-scaling-group for all components (PD/TiKV/TiDB/monitor). When an ec2 instance fails the health check, the instance will be replaced. This is helpful for those applications that are stateless or use EBS volumes to store data.
But a TiKV Pod uses instance store to store its data. When an instance is replaced, all the data on its store will be lost. TiKV has to resync all data to the newly added instance. Though TiDB is a distributed database and can work when a node fails, resyncing data can be costly if the dataset is large. Besides, the ec2 instance may be recovered to a healthy state by rebooting.
So we disabled the auto-scaling-group's replacing behavior in v1.0.2.
If you are using v1.0.1 or prior versions, you can also suspend the auto-scaling-group scaling process according to its documentation.
- Suspend ReplaceUnhealthy process for AWS TiKV auto-scaling-group
- Add a new VM manager qm in stability test
- Add tikv.maxFailoverCount limit to TiKV
- Set the default externalTrafficPolicy to be Local for TiDB service in AWS/GCP/Aliyun
- Add provider and module versions for AWS
- Fix the issue that tkctl version does not work when the release name is un-wanted
- Migrate statefulsets apiVersion to apps/v1 which fixes compatibility with Kubernetes 1.16 and above versions
- Fix the issue that the create_tidb_cluster_release variable in AWS Terraform script does not work
- Fix compatibility issues by adding v1beta1 to statefulset apiVersions
- Fix the issue that TiDB Loadbalancer is empty in Terraform output
- Fix a compatibility issue of TiKV maxFailoverCount
- Fix Terraform providers version constraint issues for GCP and Aliyun
- Fix values file customization for tidb-operator on Aliyun
- Fix tidb-operator crash when users modify statefulset upgrade strategy improperly
- Fix drainer misconfiguration
- Fix the issue that tkctl version does not work when the release name is un-wanted (#1065)
- Fix the issue that the create_tidb_cluster_release variable in AWS terraform script does not work (#1062)
- Fix compatibility issues for (#1012): add v1beta1 to statefulset apiVersions (#1054)
- Enable ConfigMapRollout by default in stability test (#1036)
- Fix the issue that TiDB Loadbalancer is empty in Terraform output (#1045)
- Migrate statefulsets apiVersion to apps/v1 which fixes compatibility with Kubernetes 1.16 and above versions (#1012)
- Only expect TiDB cluster upgrade to be complete when rolling back wrong configuration in stability test (#1030)
- Suspend ReplaceUnhealthy process for AWS TiKV auto-scaling-group (#1014)
- Add a new VM manager qm in stability test (#896)
- Fix provider versions constraint issues for GCP and Aliyun (#959)
- Fix values file customization for tidb-operator on Aliyun (#971)
- Fix a compatibility issue of TiKV tikv.maxFailoverCount (#977)
- Add tikv.maxFailoverCount limit to TiKV (#965)
- Fix tidb-operator crash when users modify statefulset upgrade strategy improperly (#912)
- Set the default externalTrafficPolicy to be Local for TiDB service in AWS/GCP/Aliyun (#947)
- Add note about setting PV reclaim policy to retain (#911)
- Fix drainer misconfiguration (#939)
- Add provider and module versions for AWS (#926)
- ACTION REQUIRED: We fixed a serious bug (#878) that could cause all PD and TiKV pods to be accidentally deleted when kube-apiserver fails. This would cause TiDB service outage. So if you are using v1.0.0 or prior versions, you must upgrade to v1.0.1.
- ACTION REQUIRED: The backup tool image pingcap/tidb-cloud-backup uses a forked version of Mydumper. The current version pingcap/tidb-cloud-backup:20190610 contains a serious bug that could result in a missing column in the exported data. This is fixed in #29, and the default image now contains the fixed version. So if you are using the old version image for backup, you must upgrade to pingcap/tidb-cloud-backup:20190828 and do a new full backup to avoid potential data inconsistency.
- Modularize GCP Terraform
- Add a script to remove orphaned k8s disks
- Support binlog.pump.config, binlog.drainer.config configurations for Pump and Drainer
- Set the resource limit for the tidb-backup job
- Add affinity to Pump and Drainer configurations
- Upgrade local-volume-provisioner to v2.3.2
- Reduce e2e run time from 60m to 20m
- Prevent the Pump process from exiting with 0 if the Pump becomes offline
- Support expanding cloud storage PV dynamically by increasing PVC storage size
- Add the tikvGCLifeTime option to do backup
- Add important parameters to tikv.config and tidb.config in values.yaml
- Support restoring the TiDB cluster from a specified scheduled backup directory
- Enable cloud storage volume expansion & label local volume
- Document and improve HA algorithm
- Support specifying the permit host in the values.tidb.permitHost chart
- Add the zone label and reserved resources arguments to kubelet
- Update the default backup image to pingcap/tidb-cloud-backup:20190828
- Fix the TiKV scale-in failure in some cases after the TiKV failover
- Fix error handling for UpdateService
- Fix some orphan pods cleaner bugs
- Fix the bug of setting the StatefulSet partition
- Fix ad-hoc full backup failure due to incorrect claimName
- Fix the offline Pump: the Pump process will exit with 0 if going offline
- Fix an incorrect condition judgment
- Clean up tidb.pingcap.com/pod-scheduling annotation when the pod is scheduled (#790)
- Update tidb-cloud-backup image tag (#846)
- Add the TiDB permit host option (#779)
- Add the zone label and reserved resources for nodes (#871)
- Fix some orphan pods cleaner bugs (#878)
- Fix the bug of setting the StatefulSet partition (#830)
- Add the tikvGCLifeTime option (#835)
- Add recommendations options to Mydumper (#828)
- Fix ad-hoc full backup failure due to incorrect claimName (#836)
- Improve tkctl get command output (#822)
- Add important parameters to TiKV and TiDB configurations (#786)
- Fix the issue that binlog.drainer.config is not supported in v1.0.0 (#775)
- Support restoring the TiDB cluster from a specified scheduled backup directory (#804)
- Fix extraLabels description in values.yaml (#763)
- Fix tkctl log output exception (#797)
- Add a script to remove orphaned K8s disks (#745)
- Enable cloud storage volume expansion & label local volume (#772)
- Prevent the Pump process from exiting with 0 if the Pump becomes offline (#769)
- Modularize GCP Terraform (#717)
- Support binlog.pump.config configurations for Pump and Drainer (#693)
- Remove duplicate key values (#758)
- Fix some typos (#738)
- Extend the waiting time of the CheckManualPauseTiDB process (#752)
- Set the resource limit for the tidb-backup job (#729)
- Fix e2e test compatible with v1.0.0 (#757)
- Make incremental backup test work (#764)
- Add retry logic for LabelNodes function (#735)
- Fix the TiKV scale-in failure in some cases (#726)
- Add affinity to Pump and Drainer (#741)
- Refine cleanup logic (#719)
- Inject a failure by pod annotation (#716)
- Update README links to point to correct pingcap.com/docs URLs for English and Chinese (#732)
- Document and improve HA algorithm (#670)
- Fix an incorrect condition judgment (#718)
- Upgrade local-volume-provisioner to v2.3.2 (#696)
- Reduce e2e test run time (#713)
- Fix Terraform GKE scale-out issues (#711)
- Update wording and fix format for v1.0.0 (#709)
- Update documents (#705)
- ACTION REQUIRED: tikv.storeLabels was removed from values.yaml. You can directly set it with location-labels in pd.config.
- ACTION REQUIRED: the --features flag of tidb-scheduler has been updated to the key={true,false} format. You can enable the feature by appending =true.
- ACTION REQUIRED: you need to change the configurations in values.yaml of previous chart releases to the new values.yaml of the new chart. Otherwise, the configurations will be ignored when upgrading the TiDB cluster with the new chart.
The pd section in old values.yaml:
pd:
  logLevel: info
  maxStoreDownTime: 30m
  maxReplicas: 3
The pd section in new values.yaml:
pd:
  config: |
    [log]
    level = "info"
    [schedule]
    max-store-down-time = "30m"
    [replication]
    max-replicas = 3
The tikv section in old values.yaml:
tikv:
  logLevel: info
  syncLog: true
  readpoolStorageConcurrency: 4
  readpoolCoprocessorConcurrency: 8
  storageSchedulerWorkerPoolSize: 4
The tikv section in new values.yaml:
tikv:
  config: |
    log-level = "info"
    [server]
    status-addr = "0.0.0.0:20180"
    [raftstore]
    sync-log = true
    [readpool.storage]
    high-concurrency = 4
    normal-concurrency = 4
    low-concurrency = 4
    [readpool.coprocessor]
    high-concurrency = 8
    normal-concurrency = 8
    low-concurrency = 8
    [storage]
    scheduler-worker-pool-size = 4
The tidb section in old values.yaml:
tidb:
  logLevel: info
  preparedPlanCacheEnabled: false
  preparedPlanCacheCapacity: 100
  txnLocalLatchesEnabled: false
  txnLocalLatchesCapacity: "10240000"
  tokenLimit: "1000"
  memQuotaQuery: "34359738368"
  txnEntryCountLimit: "300000"
  txnTotalSizeLimit: "104857600"
  checkMb4ValueInUtf8: true
  treatOldVersionUtf8AsUtf8mb4: true
  lease: 45s
  maxProcs: 0
The tidb section in new values.yaml:
tidb:
  config: |
    token-limit = 1000
    mem-quota-query = 34359738368
    check-mb4-value-in-utf8 = true
    treat-old-version-utf8-as-utf8mb4 = true
    lease = "45s"
    [log]
    level = "info"
    [prepared-plan-cache]
    enabled = false
    capacity = 100
    [txn-local-latches]
    enabled = false
    capacity = 10240000
    [performance]
    txn-entry-count-limit = 300000
    txn-total-size-limit = 104857600
    max-procs = 0
The monitor section in old values.yaml:
monitor:
  create: true
  ...
The monitor section in new values.yaml:
monitor:
  create: true
  initializer:
    image: pingcap/tidb-monitor-initializer:v3.0.5
    imagePullPolicy: IfNotPresent
  reloader:
    create: true
    image: pingcap/tidb-monitor-reloader:v1.0.0
    imagePullPolicy: IfNotPresent
    service:
      type: NodePort
  ...
Please check cluster configuration for detailed configuration.
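For the removed tikv.storeLabels setting mentioned above, the equivalent location-labels entry in pd.config could look like the sketch below. The label keys are illustrative assumptions; use the topology labels that your own nodes carry.

```yaml
pd:
  config: |
    [replication]
    # Illustrative label keys (assumed for this example), ordered from the
    # widest topology level to the narrowest.
    location-labels = ["zone", "host"]
```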
- Stop all etcds and kubelets
- Simplify GKE SSD setup
- Modularization for AWS Terraform scripts
- Turn on the automatic failover feature by default
- Enable configmap rollout by default
- Enable stable scheduling by default
- Support multiple TiDB clusters management in Alibaba Cloud
- Enable AWS NLB cross zone load balancing by default
- Fix sysbench installation on bastion machine of AWS deployment
- Fix TiKV metrics monitoring in default setup
- Allow upgrading TiDB monitor along with TiDB version (#666)
- Specify the TiKV status address to fix monitoring (#695)
- Fix sysbench installation on bastion machine for AWS deployment (#688)
- Update the git add upstream command to use https in contributing document (#690)
- Stability cases: stop kubelet and etcd (#665)
- Limit test cover packages (#687)
- Enable nlb cross zone load balancing by default (#686)
- Add TiKV raftstore parameters (#681)
- Support multiple TiDB clusters management for Alibaba Cloud (#658)
- Adjust the EndEvictLeader function (#680)
- Add more logs (#676)
- Update feature gates to support key={true,false} syntax (#677)
- Fix the typo meke to make (#679)
- Enable configmap rollout by default and quote configmap digest suffix (#678)
- Turn automatic failover on (#667)
- Sets node count for default pool equal to total desired node count (#673)
- Upgrade default TiDB version to v3.0.1 (#671)
- Remove storeLabels (#663)
- Change the way to configure TiDB/TiKV/PD in charts (#638)
- Modularize for AWS terraform scripts (#650)
- Change the DeferClose function (#653)
- Increase the default storage size for Pump from 10Gi to 20Gi in response to stop-write-at-available-space (#657)
- Simplify local SSD setup (#644)
- stop kube-proxy
- upgrade tidb-operator
- get the TS first and increase the TiKV GC life time to 3 hours before the full backup
- Add endpoints list and watch permission for controller-manager
- Scheduler image is updated to use "k8s.gcr.io/kube-scheduler" which is much smaller than "gcr.io/google-containers/hyperkube". You must pre-pull the new scheduler image into your airgap environment before upgrading.
- Full backup data can be uploaded to or downloaded from Amazon S3
- The terraform scripts support managing multiple TiDB clusters in one EKS cluster.
- Add tikv.storeLabels setting
- On GKE, one can use COS for TiKV nodes with small data for faster startup
- Support force upgrade when PD cluster is unavailable.
- fix unbound variable in the backup script
- Give kube-scheduler permission to update/patch pod status
- fix tidb user of scheduled backup script
- fix scheduled backup to ceph object storage
- Fix several usability problems for AWS terraform deployment
- fix scheduled backup bug: segmentation fault when backup user's password is empty
- bugfix: segmentation fault when backup user's password is empty (#649)
- Small fixes for terraform aws (#646)
- TiKV upgrade bug fix (#626)
- improving the readability of some code (#639)
- support force upgrade when pd cluster is unavailable (#631)
- Add new terraform version requirement to AWS deployment (#636)
- GKE local ssd provisioner for COS (#612)
- remove tidb version from build (#627)
- refactor so that using the PD API avoids unnecessary imports (#618)
- add storeLabels setting (#527)
- Update google-kubernetes-tutorial.md (#622)
- Multiple clusters management in EKS (#616)
- Add Amazon S3 support to the backup/restore features (#606)
- pass TiKV upgrade case (#619)
- separate slow log with tidb server log by default (#610)
- fix the problem of unbound variable in backup script (#608)
- fix notes of tidb-backup chart (#595)
- Give kube-scheduler ability to update/patch pod status. (#611)
- Use kube-scheduler image instead of hyperkube (#596)
- fix pull request template grammar (#607)
- local SSD provision: reduce network traffic (#601)
- add operator upgrade case (#579)
- fix bug that tikv status is always upgrade (#598)
- build without debugger symbols (#592)
- improve error messages (#591)
- fix tidb user of scheduled backup script (#594)
- fix dt case bug (#571)
- GKE terraform (#585)
- fix scheduled backup to ceph object storage (#576)
- Add stop kube-scheduler/kube-controller-manager test cases (#583)
- Add endpoints list and watch permission for controller-manager (#590)
- refine fullbackup (#570)
- Make sure go modules files are always tidy and up to date (#588)
- Local SSD on GKE (#577)
- stability-test: stop kube-proxy case (#556)
- fix resource unit (#573)
- Give local-volume-provisioner pod a QoS of Guaranteed (#569)
- Check PD endpoints status when it's unhealthy. (#545)
- ACTION REQUIRED: nodeSelectorRequired was removed from values.yaml.
- ACTION REQUIRED: Comma-separated values support in nodeSelector has been dropped. Use the newly added affinity field instead, which has a more expressive syntax.
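As an illustration of the more expressive syntax, a node selector that previously listed several comma-separated values could be rewritten with standard Kubernetes node affinity and the In operator. The tikv.affinity path, the label key, and the zone names below are assumptions made for this sketch.

```yaml
# Hypothetical values.yaml fragment. The tikv.affinity path and the
# example.com/zone label with its values are illustrative assumptions.
tikv:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          # Matches nodes whose label value is any one of the listed values.
          - key: example.com/zone
            operator: In
            values:
            - zone-a
            - zone-b
```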
- ConfigMap rollout
- One PD replicas
- Stop TiDB Operator itself
- TiDB stable scheduling
- Disaster tolerance and data regions disaster tolerance
- Fix many bugs of stability test
- Introduce ConfigMap rollout management. With the feature gate open, configuration file changes will be automatically applied to the cluster via a rolling update. Currently, the scheduler and replication configurations of PD can not be changed via ConfigMap rollout. You can use pd-ctl to change these values instead, see #487 for details.
- Support stable scheduling for pods of TiDB members in tidb-scheduler.
- Support adding additional pod annotations for PD/TiKV/TiDB, e.g. fluentbit.io/parser.
- Support the affinity feature of k8s which can define the rule of assigning pods to nodes
- Allow pausing during TiDB upgrade
- GCP one-command deployment
- Refine user guides
- Improve GKE, AWS, Aliyun guide
- Upgrade default TiDB version to v3.0.0-rc.1
- fix bug in reporting assigned nodes of tidb members
- tkctl get can show cpu usage correctly now
- Adhoc backup now appends the start time to the PVC name by default.
- add the privileged option for TiKV pod
- tkctl upinfo can show nodeIP podIP port now
- get TS and use it before full backup using mydumper
- Fix capabilities issue for tkctl debug command
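Two of the features in this release, ConfigMap rollout and additional Pod annotations, are driven from values.yaml. The fragment below is a sketch only: the enableConfigMapRollout key, the tidb.annotations path, and the parser name are assumptions, so verify them against the chart before use.

```yaml
# Hypothetical tidb-cluster values.yaml fragment. Key names are assumptions.
# Opens the ConfigMap rollout feature gate so that configuration changes are
# applied through a rolling update.
enableConfigMapRollout: true
tidb:
  annotations:
    # Extra Pod annotation; the parser name is a placeholder.
    fluentbit.io/parser: example-parser
```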
- Add capabilities and privilege mode for debug container (#537)
- docs: note helm versions in deployment docs (#553)
- deploy/aws: split public and private subnets when using existing vpc (#530)
- release v1.0.0-beta.3 (#557)
- Gke terraform upgrade to 0.12 and fix bastion instance zone to be region agnostic (#554)
- get TS and use it before full backup using mydumper (#534)
- Add port podip nodeip to tkctl upinfo (#538)
- fix disaster tolerance of stability test (#543)
- add privileged option for tikv pod template (#550)
- use staticcheck instead of megacheck (#548)
- Refine backup and restore documentation (#518)
- Fix stability tidb pause case (#542)
- Fix tkctl get cpu info rendering (#536)
- Fix aliyun tf output rendering and refine documents (#511)
- make webhook configurable (#529)
- Add pods disaster tolerance and data regions disaster tolerance test cases (#497)
- Remove helm hook annotation for initializer job (#526)
- stability test: Add stable scheduling e2e test case (#524)
- upgrade tidb version in related documentations (#532)
- stable scheduling: fix bug in reporting assigned nodes of tidb members (#531)
- reduce wait time and fix stability test (#525)
- tidb-operator: fix documentation usability issues in GCP document (#519)
- stability cases added: pd replicas 1 and stop tidb-operator (#496)
- pause-upgrade stability test (#521)
- fix restore script bug (#510)
- stability: retry truncating sst files upon failure (#484)
- upgrade default tidb to v3.0.0-rc.1 (#520)
- add --namespace when create backup secret (#515)
- New stability test case for ConfigMap rollout (#499)
- docs: Fix issues found in Queeny's test (#507)
- Pause rolling-upgrade process of tidb statefulset (#470)
- Gke terraform and guide (#493)
- support the affinity feature of k8s which define the rule of assigning pods to nodes (#475)
- Support adding additional pod annotations for PD/TiKV/TiDB (#500)
- Document about PD configuration issue (#504)
- Refine aliyun and aws cloud tidb configurations (#492)
- tidb-operator: update wording and add note (#502)
- Support stable scheduling for TiDB (#477)
- fix make lint (#495)
- Support updating configuration on the fly (#479)
- docs/aws: update AWS deploy docs after testing (#491)
- add release-note to pull_request_template.md (#490)
- Design proposal of stable scheduling in TiDB (#466)
- Update DinD image to make it possible to configure HTTP proxies (#485)
- readme: fix a broken link (#489)
- Fixed typo (#483)
- Refactored e2e test
- Added stability test, 7x24 running
- One-command deployment for AWS, Aliyun
- Minikube deployment for testing
- Tkctl cli tool
- Refactor backup chart for ease use
- Refine initializer job
- Grafana monitor dashboard improved, support multi-version
- Improved user guide
- Contributing documentation
- Fix PD start script, add join file when startup
- Fix the issue that TiKV failover takes too long
- Fix PD ha when replicas is less than 3
- Fix a tidb-scheduler acquireLock bug and emit event when scheduled failed
- Fix scheduler ha bug with defer deleting pods
- Fix bug when using shareinformer without deepcopy
- Remove pushgateway from TiKV pod
- Add GitHub templates for issue reporting and PR
- Automatically set the scheduler K8s version
- Switch to go module
- Support slow log of TiDB
- Don't initialize when there is no tidb.password (#282)
- fix join script (#285)
- Document tool setup and e2e test detail in Contributing.md (#288)
- Update setup.md (#281)
- Support slow log tailing sidecar for tidb instance (#290)
- Flexible tidb initializer job with secret set outside of helm (#286)
- Ensure SLOW_LOG_FILE env variable is always set (#298)
- fix setup document description (#300)
- refactor backup (#301)
- Abandon vendor and refresh go.sum (#311)
- set the SLOW_LOG_FILE in the startup script (#307)
- automatically set the scheduler K8s version (#313)
- tidb stability test main function (#306)
- stability: add fault-trigger server (#312)
- Yinliang/backup and restore add adhoc backup and restore function (#316)
- stability: add scale & upgrade case functions (#309)
- add slack (#318)
- log dump when test failed (#317)
- stability: add fault-trigger client (#326)
- monitor checker (#320)
- stability: add blockWriter case for inserting data (#321)
- add scheduled-backup test case (#322)
- stability: port ddl test as a workload (#328)
- stability: use fault-trigger at e2e tests and add some log (#330)
- add binlog deploy and check process (#329)
- fix e2e can not make (#331)
- multi tidb cluster testing (#334)
- fix backup test bugs (#335)
- delete blockWrite.go use blockwrite.go instead (#333)
- remove vendor (#344)
- stability: add more checks for scale & upgrade (#327)
- stability: support more fault injection (#345)
- rewrite e2e (#346)
- stability: add failover test (#349)
- fix ha when replicas is less than 3 (#351)
- stability: add fault-trigger service file (#353)
- fix dind doc (#352)
- Add additionalPrintColumns for TidbCluster CRD (#361)
- refactor stability main function (#363)
- enable admin privilege for prom (#360)
- Updated Readme with New Info (#365)
- Build CLI (#357)
- add extraLabels variable in tidb-cluster chart (#373)
- fix tikv failover (#368)
- Separate and ensure setup before e2e-build (#375)
- Fix codegen.sh and lock related dependencies (#371)
- stability: add sst-file-corruption case (#382)
- use release name as default clusterName (#354)
- Add util class to support to add annotations to Grafana (#378)
- Use grafana provisioning to replace dashboard installer (#388)
- ensure test env is ready before cases running (#386)
- remove monitor config job check (#390)
- Update local-pv documentation (#383)
- Update Jenkins links in README.md (#395)
- fix e2e workflow in CONTRIBUTING.md (#392)
- Support running stability test out of cluster (#397)
- update tidb secret docs and charts (#398)
- Enable blockWriter write pressure in stability test (#399)
- Support debug and ctop commands in CLI (#387)
- marketplace update (#380)
- dashboard:update editable value from true to false (#394)
- add fault inject for kube proxy (#384)
- use ioutil.TempDir() to create charts and operator repo's directories (#405)
- Improve workflow in docs/google-kubernetes-tutorial.md (#400)
- Support plugin start argument for tidb instance (#412)
- Replace govet with official vet tool (#416)
- allocate 24 PVs by default (after 2 clusters are scaled to (#407)
- refine stability (#422)
- Record event as grafana annotation in stability test (#414)
- add GitHub templates for issue reporting and PR (#420)
- add TiDBUpgrading func (#423)
- fix operator chart issue (#419)
- fix stability issues (#433)
- change cert generate method and add pd and kv prestop webhook (#406)
- a tidb-scheduler bug fix and emit event when scheduled failed (#427)
- Shell completion for tkctl (#431)
- Delete a duplicate import (#434)
- add etcd and kube-apiserver faults (#367)
- Fix TiDB Slack link (#444)
- fix scheduler ha bug (#443)
- add terraform script to auto deploy TiDB cluster on AWS (#401)
- Adds instructions to access Grafana in GKE tutorial (#448)
- fix label selector (#437)
- no need to set ClusterIP when syncing headless service (#432)
- docs on how to deploy tidb cluster with tidb-operator in minikube (#451)
- add slack notify (#439)
- fix local dind env (#440)
- Add terraform scripts to support alibaba cloud ACK deployment (#436)
- Fix backup data compare logic (#454)
- stability test: async emit annotations (#438)
- Use TiDB v2.1.8 by default & remove pushgateway (#435)
- Fix bug use shareinformer without copy (#462)
- Add version command for tkctl (#456)
- Add tkctl user manual (#452)
- Fix binlog problem on large scale (#460)
- Copy kubernetes.io/hostname label to PVs (#464)
- AWS EKS tutorial change to new terraform script (#463)
- docs/minikube: update documentation of minikube installation (#471)
- docs/dind: update documentation of DinD installation (#458)
- docs/minikube: add instructions to access Grafana (#476)
- support-multi-version-dashboard (#473)
- docs/aliyun: update aliyun deploy docs after testing (#474)
- GKE local SSD size warning (#467)
- update roadmap (#376)