- tweak of batcher yield
- pack.Payload reuse memory, json.NewEncoder(os.Stdout)
- metrics isolation by cluster
- participant starts slowly: 30s between "starting" and "started" in the log below
- [06/06/17 15:06:11 CST] [TRAC] ( engine.go:281) engine starting...
- [06/06/17 15:06:11 CST] [TRAC] ( engine.go:343) [10.9.1.1:9877] participant starting...
- [06/06/17 15:06:41 CST] [INFO] ( engine.go:349) [10.9.1.1:9877] participant started
- cluster
- monitor resources cost and rebalance
- support multiple projects
- resource group
- FIXME access denied leads to orphan resource
- myslave should have no checkpoint, placed in Input
- enhance Decision.Equals to avoid thundering herd
- myslave server_id must be unique across the cluster
- add Operator for Filter
- count, filter, regex, sort, split, rename
- RowsEvent avro
- increase binlog replication recv buffer size
- alert mysql binlog lags
- dbc participants -i // show internal buffers
- model.RowsEvent add dbus timestamp
- HY000 auto heal
- multiversion config in zk
- controller
- while a participant is electing, shutdown takes a long time (blocked by CreateLiveNode)
- 2 phase rebalance: close participants then notify new resources
- what if RPC fails
- leader.onBecomingLeader is parallel: should be sequential
- hot reload triggers a cluster herd: participants change too much
- when the leader makes a decision, it persists it to zk before RPC, for leader failover
- owner of resource
- leader RPC has epoch info
- if Ack fails (zk crash), resort to local disk (load on startup)
- after engine shutdown, controller still sends RPC
- test cases
- sharded resources
- split brain
- zk dies or kill -9, use cache to continue work
- kill -9 participant/leader, and reschedule
- cluster chaos monkey
- kafka producer qos
- batcher only retries after full batch ack, add timer?
- KafkaConsumer might not be able to Stop
- kguard integration
- router finding matcher is slow
- hot reload on config file changed
- each Input has its own recycle chan, so one blocked Input will not block the others
- when Input stops, Output might still need its OnAck
- KafkaInput plugin
- use scheme to distinguish type of DSN
- plugins Run has no way of panic
- (replication.go:117) [zabbix] invalid table id 2968, no correspond table map event
- make canal, high cpu usage
- because CAS backoff 1us, cpu busy
- ugly design of Input/Output ack mechanism
- we might learn from storm bolt ack
- some goroutine leakage
- telemetry mysql.binlog.lag/tps tag name should be input name
- pipeline
- 1 input, multiple output
- filter to dispatch dbs of a single binlog to different output
- kill Packet.input field
- visualized flow throughput like nifi
- router metrics
- dbusd api server
- logging
- share zkzone instance
- presence and standby mode
- graceful shutdown
- master must drain before leave cluster
- KafkaOutput metrics
- binlog tps
- kafka tps
- lag
- hub is shared, what if a plugin blocks others
- currently, I have no idea how to solve this issue
- Batcher padding
- shutdown kafka
- zk checkpoint vs kafka checkpoint
- kafka follower stops replication
- can a mysql instance with multiple databases have multiple Log/Position?
- kafka sync produce in batch
- DDL binlog
- drop table y;
- trace async producer Successes channel and mark as processed
- metrics
- telemetry and alert
- what if replication conn broken
- position will be stored in zk
- play with binlog_row_image
- project feature for multi-tenant
- bug fix
- kill dbusd, dbusd-slave did not leave cluster
- next log position leads to failure after resume
- KafkaOutput only supports a 1-partition topic for MysqlbinlogInput
- table id issue
- what if invalid position
- router stat wrong Total:142,535,625 0.00B speed:22,671/s 0.00B/s max: 0.00B/0.00B
- ffjson marshalled bytes have a newline (NL) before the ending bracket
- test cases
- restart mysql master
- mysql kill process
- race detection
- tc drop network packets and high latency
- mysql binlog zk session expire
- reset binlog pos, and check kafka did not recv dup events
- MysqlbinlogInput max_event_length
- min.insync.replicas=2, shutdown 1 kafka broker then start
- GTID
- place config to central zk znode and watch changes
- Known issues
- Binlog Dump thread not close github/gh-ost#292
- Roadmap
- pubsub audit reporter
- universal kafka listener and outputer
- a big DELETE statement might kill dbusd
- It might exceed the max event size (1MB); mysql seems to auto-chunk the big event into small events
- It might allocate a very large chunk of memory in the RowsEvent struct
- mysql packet max payload len = (1<<24 - 1) bytes
- OSC tools make 'ALTER' very complex, so dbusd is not able to clear the table columns cache
- use SQL comment to solve it
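
The max payload note under the big DELETE item can be made concrete with a little arithmetic. This is a minimal sketch, not dbusd code: `maxPayloadLen` and `packetCount` are hypothetical names, and the only fact relied on is the MySQL wire protocol rule that a payload is cut into full packets of 1<<24 - 1 bytes and terminated by one shorter packet (empty when the payload is an exact multiple).

```go
package main

import "fmt"

// maxPayloadLen is the largest payload one MySQL protocol packet can carry:
// the length header is 3 bytes, so 1<<24 - 1 = 16777215 bytes.
const maxPayloadLen = 1<<24 - 1

// packetCount (hypothetical helper) returns how many wire packets a payload
// of n bytes occupies: n/maxPayloadLen full packets plus one terminating
// packet that is shorter than maxPayloadLen (possibly empty).
func packetCount(n int) int {
	return n/maxPayloadLen + 1
}

func main() {
	for _, n := range []int{1 << 20, maxPayloadLen - 1, maxPayloadLen, 64 << 20} {
		fmt.Printf("payload=%d bytes -> %d packet(s)\n", n, packetCount(n))
	}
}
```

A reader that reassembles such a payload into one buffer holds all of it in memory at once, which is the allocation concern noted above.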
-
mysqlbinlog input peak with mock output
- 140k events per second
- 30k row events per second
- 260Mb network bandwidth
- KafkaOutput 35K msgs per second
- it takes 2h25m to bring a platform with 2d of lag down to zero lag
-
dryrun MockInput -> MockOutput
- 2.1M packets/s
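
A dry run of this shape can be approximated as below. This is only a sketch under stated assumptions: `packet` and `dryRun` are hypothetical stand-ins, not the real MockInput/MockOutput plugins, and a single buffered channel replaces the engine's router. The number it prints depends on the machine and is not comparable to the figure above.

```go
package main

import (
	"fmt"
	"time"
)

// packet is a hypothetical stand-in for the pipeline's packet type.
type packet struct{ payload []byte }

// dryRun pushes n packets from a mock input goroutine to a mock output loop
// over a buffered channel, and returns how many arrived and the elapsed time.
func dryRun(n, buf int) (int, time.Duration) {
	ch := make(chan *packet, buf)
	start := time.Now()

	go func() { // mock input: emit the same packet n times, then close
		p := &packet{payload: []byte("x")}
		for i := 0; i < n; i++ {
			ch <- p
		}
		close(ch)
	}()

	received := 0
	for range ch { // mock output: count and discard
		received++
	}
	return received, time.Since(start)
}

func main() {
	n, elapsed := dryRun(1000000, 1024)
	fmt.Printf("%d packets in %v (%.0f packets/s)\n", n, elapsed, float64(n)/elapsed.Seconds())
}
```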