What's Changed
Added
- Add workflow info details in QueryToken by @3vilhamster in #6265
- [Wf-diagnostics] Introduce a new api to diagnose a workflow execution by @sankari165 in #6268
- [Wf-Diagnostics] Diagnose workflow execution from cli by @sankari165 in #6271
- More logs for matching simulation tests by @Shaddoll in #6270
- [Wf-Diagnostics] Set query handler for diagnostics workflow to provide result by @sankari165 in #6273
- Add canary jitter workflow debugging log by @bowenxia in #6278
- Matching simulation comparison tool by @taylanisikdemir in #6287
- Add StatsReporter component to estimate QPS by @Shaddoll in #6286
- Support custom address broadcasting for ringpop to work in k8s by @taylanisikdemir in #6288
- [Wf-Diagnostics] emit metrics from diagnostics workflow by @sankari165 in #6299
- Add rolling window QPS tracker by @Shaddoll in #6295
- [Wf-Diagnostics] introduce emitter interface in w/f diagnostics by @sankari165 in #6309
- [Wf-Diagnostics] Introduce Diagnostics starter workflow as parent workflow to run diagnostics by @sankari165 in #6310
- Add more tests for history_replicator by @bowenxia in #6313
- Add a doc introducing scalable tasklist by @Shaddoll in #6319
- Created Shard Manager Service by @jakobht in #6297
- Add more logs when secondary processor has issues by @neil-xie in #6323
- [Wf-Diagnostics] Emit usage logs after workflow diagnostics run by @sankari165 in #6316
- Feature/zonal isolation zone discovery by @davidporter-id-au in #6301
- Introduce new type MatchingPollForActivityTaskResponse by @Shaddoll in #6325
- Introduce weighted load balancer by @Shaddoll in #6315
- Add unit test for history config by @Shaddoll in #6334
- Add unit tests to common/types/history by @timl3136 in #6336
- Add test for replication_task by @bowenxia in #6335
- Added a mode tag to the workflow ID ratelimit metric and log by @jakobht in #6344
- [Wf-Diagnostics] add timeout error to failures by @sankari165 in #6346
- Add more unit tests for common/types/history by @timl3136 in #6341
- Add test for QueryWorkflow by @Shaddoll in #6348
- Add additional unit tests for history and replicator in common/types by @timl3136 in #6347
- [matching] Simplify poller extraction in task list manager by @3vilhamster in #6333
- [Wf-Diagnostics] Introduce new invariant to identify activity and workflow failures by @sankari165 in #6339
- Created a separate listWithRing for services that have a hashring by @jakobht in #6350
- Add additional unit tests for common/types/replicator by @timl3136 in #6353
- Add unit tests for remaining functions in common/types/replicator by @timl3136 in #6356
- Add tests for types/mapper/thrift/admin.go by @natemort in #6352
- Added tests to service/worker/scanner.go by @fimanishi in #6349
- Add tests for transfer_active_task_executor by @fimanishi in #6359
- Added tests for task/task_util.go by @fimanishi in #6362
- Add more logs to inspect OpenSearch missing updates issue by @neil-xie in #6364
- Adds a bit more coverage to the domain callback methods by @davidporter-id-au in #6373
- [Wf-Diagnostics] Include failure issues identification and rootcause in diagnostics by @sankari165 in #6370
- Create interface and mock for matcher by @fimanishi in #6374
- Add TaskListPartitionConfig message to proto by @Shaddoll in #6358
- Support custom yarpc peer chooser for p2p connections by @taylanisikdemir in #6345
Changed
- Refactor visibility triple manager by @neil-xie in #6267
- advance ack-level to avoid querying the same (empty) tasks next time by @dkrotx in #6258
- Concurrency primitives need concurrent tests by @Groxx in #6274
- Simplify common/locks.Lock, 5-10x speedup by @Groxx in #6275
- Update simulation tests results to show matched tasks per tasklist by @Shaddoll in #6276
- Ratelimiter polish / fix: improve zero -> nonzero filling behavior for new ratelimiters by @Groxx in #6280
- Wrap errors from child workflow in canary sanity workflow by @fimanishi in #6279
- Always notify subscribers on membership change by @dkrotx in #6283
- Disconnect dangling pollers on membership lost by @dkrotx in #6272
- error-out if we can't Subscribe to membershipResolver by @dkrotx in #6290
- Easier support for multiple instances locally by @jakobht in #6289
- Refactor pinot custom string query in pinot_query_validator by @bowenxia in #6298
- Update change logs for previous releases by @neil-xie in #6306
- move permember ratelimiter to its own package by @dkrotx in #6304
- Introduce round robin load balancer to matching client by @Shaddoll in #6300
- Update matching simulation test to support round robin load balancer by @Shaddoll in #6311
- Refactor test code for readability by @bowenxia in #6308
- [CLI] upgrade urfave/cli to v2 by @shijiesheng in #6285
- Refactor PeerProvider & hashring interaction by @dkrotx in #6296
- Unit test to cover 88.7% for history replicator by @bowenxia in #6314
- [CLI] replace BackgroundContext with CLI's context by @shijiesheng in #6328
- [Wf-Diagnostics] Refactor to move all timeout related checks under one directory by @sankari165 in #6332
- [CLI] start/signalstart workflow requests should include headers from opentracing SpanContext by @shijiesheng in #6329
- Switch to dependency injection for the main CLI by @Groxx in #6331
- Current refresh interval is too high by @dkrotx in #6357
- Ring member refresh log improvements by @taylanisikdemir in #6361
- Improve unit tests for history/config by @Shaddoll in #6354
- refactor/testing domain update callback by @davidporter-id-au in #6365
Fixed
- Handle custom string not equal case for Pinot by @bowenxia in #6266
- Minor global ratelimiter fix: don't reduce values when "boosting" by @Groxx in #6281
- Fix port string to uint16 parsing by @taylanisikdemir in #6291
- Update latest release auto setup tag name to lower case by @neil-xie in #6292
- [Wf-Diagnostics] Unmarshal metadata for timeout issues and rootcause by @sankari165 in #6294
- Pinot: handle custom keyword type empty value by @bowenxia in #6302
- Minor fix for timer usage by @Shaddoll in #6305
- Fix parent close policy by @Shaddoll in #6307
- Refactor visibility migration code and add support for OpenSearch visibility migration by @neil-xie in #6284
- Fix race condition in Describe handler by @Shaddoll in #6312
- Temporary patch: ensure errors lead to exit(1) in main funcs by @Groxx in #6318
- Bugfix: server was ending when in log-debug mode by @Groxx in #6321
- Fix task reader timer by @Shaddoll in #6324
- CLI cleanup: exit-1 on error, and use consistent error printing everywhere by @Groxx in #6322
- Fix slice init length by @cuishuang in #6293
- [Wf-Diagnostics] fix tasklist name in workflow trigger from frontend api by @sankari165 in #6327
- slow down ratelimiter comparison tests, fix one by @Groxx in #6330
- [Wf-Diagnostics] Point to activity task for activity failures by @sankari165 in #6355
- [Wf-Diagnostics] rootcause simple worker service caused activity and workflow failures by @sankari165 in #6351
- [CLI] fix incorrect propagation of span context in start workflow by @shijiesheng in #6363
- Fix backlog count for sticky tasklist by @Shaddoll in #6367
- Fix bug where the OpenSearch client did not use external version for index requests by @neil-xie in #6368
- Squash NaN bugs and prevent them from coming back by @Groxx in #6375
Removed
- Get rid of time.After in for loops by @Shaddoll in #6303
- Removes a dependency on service startup for unneeded services by @davidporter-id-au in #6338
- Reverted #6338 by @davidporter-id-au in #6340
- [CLI] ErrorAndExit deprecated by @samkitshah1262 in #6337
New Contributors
- @cuishuang made their first contribution in #6293
- @samkitshah1262 made their first contribution in #6337
Full Changelog: v1.2.13...v1.2.14