Releases: cadence-workflow/cadence
v0.3.14 Release
Schema Changes
This release includes changes to the Cadence core schema. All the schema changes required by new features are backwards compatible. Please make sure to deploy cadence schema 0.9 using cadence-cassandra-tool before deploying this release of Cadence server. The following changes require schema version 0.9:
- 710b688 - Add customized data map to domain type (#863)
- 3a79fc0 - Apply 5 min delay for standby task (#920)
New Features and Improvements
Cross DC Support (In Development)
We have made a lot of progress on Cross DC replication support for Cadence workflows, but it is not yet ready for production. The feature is code complete and currently in the testing phase. Please see the XDC V1 release for a list of pending tasks that must be finished before this feature is ready for production. The following changes went into this release for XDC support:
- 8f6952d - only global domain will use the v2 domain table (#831)
- 0fecb80 - Support for retrying messages within replicator processor (#827)
- b7d4dea - Multiple bugfixes (#823)
- 1a167d7 - Ut state builder (#808)
- 583fe1c - Handle more edge case when apply events in history replicator (#836)
- 3d019ba - handling case when reset workflow history, the target workflow has already done continue as new (#853)
- 524538d - some behavior change on worker (#847)
- a8958f0 - compact duplicate code, add more logging (#846)
- 278a773 - Add EnableGlobalDomain as dynamic config (#858)
- 9f78eca - bugfix: non global domain workflow, when started, should not generate replication task (#862)
- 710b688 - Add customized data map to domain type (#863)
- 36c617d - retry timer task for 100 times (#872)
- b501bf1 - Optimize replication task generation (#869)
- 0e397c5 - optimize queue ack manager, in case the task IDs are not sequential. (#875)
- 5530848 - handle DomainNotActive error in timer / transfer queue (#878)
- 5cdfeb2 - dynamic set the retry for retryable error on worker processor (#883)
- 622d1a0 - bugfix: buffered replication task should be persisted (#884)
- 515df70 - Make task processors resilient to LimitExceededError (#891)
- ec2c6da - Bugfix: failover trigger should also notify active timer / transfer … (#886)
- 9ea16c0 - Add more metrics and log, remove some dead code (#893)
- e5f07f2 - Make worker retry on 2 level, make history replicator allow option to force buffer events (#894)
- 6edece4 - Update service transient retry error predicate (#904)
- ce946a7 - Bugfix: worker should not retry on non err (#908)
- 6fa8589 - Fix replicationTask for updating domainData (#923)
- 3651ac8 - Replication worker gets stuck on busy retry (#938)
- 77a9979 - include service tag on logs emitted by worker role (#939)
- 3c9a409 - Handle service deployment causing buffer events not flushed (#940)
- 5ab74ef - bugfix: workflow can be reset after finished, while close transfer task is already created, causing panic (#935)
- fcc9b0a - Reset of mutable state should use event version (#937)
- cb87bd7 - Simplify replication worker retry logic (#945)
- 8156cde - Minimize failover impact (#948)
- 31b98fa - Pass through request timeout to replicator code path (#961)
- 23ba2b9 - Yarpc error code 13 should not be retriable (#958)
- 302f239 - Few performance optimization. (#960)
- a496241 - Add time sync functionality, to move the view of time of a remote cluster (#952)
- ab7d664 - Bugfix: timer for standby activity is not generated (#968)
- b21b809 - Adding more metrics, fix some bugs (#965)
- 152d202 - Only generate replication task for updated which has history events (#973)
- 3a79fc0 - Apply 5 min delay for standby task (#920)
- ddd0307 - Clear buffers on resetting of mutable state (#978)
- 0a3ff55 - Bugfix replication protocol (#979)
- aa5e689 - Add metrics on shard info (#980)
- 4769b55 - Bugfix: when buffering events, the last write version should remain the same (#982)
- 992a81a - Bugfix: rpc call cancel function should not use defer in for loop (#986)
- 09a1a9c - Bugfix: when failover happens and workflow has pending decision, new events on the active side (after the failover) will be buffered. workflow execution context should first store the standby events (after the failover), then flush the active events. (#991)
- 287a0da - Handle deletion of task during failover (#976)
Dynamic Config
All Cadence service config knobs can now be updated dynamically from an external config store. Check out the dynamic config interface in package to bootstrap your own config store to allow dynamically updating service config.
- 1733b61 - Change cadence config to dynamic (#851)
- 49fd3c0 - Refactor and fix matching dynamic configs (#857)
- 57e8196 - Add test to ensure dynamic config key has mapped value (#922)
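As a rough illustration of the idea behind dynamic config, the sketch below shows a config value that is re-read from a pluggable store on every access and falls back to a default when the store has no entry. All names here (`Client`, `GetIntValue`, `IntProperty`, `inMemoryClient`, and the `persistenceMaxConns` key) are hypothetical and chosen for this example; they are not taken from the Cadence code base.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// Client is a hypothetical interface that an external config store would implement.
type Client interface {
	GetIntValue(key string) (int, error)
}

var errKeyNotFound = errors.New("config key not found")

// inMemoryClient is a toy store used only to demonstrate the interface.
type inMemoryClient struct {
	mu     sync.RWMutex
	values map[string]int
}

func newInMemoryClient() *inMemoryClient {
	return &inMemoryClient{values: make(map[string]int)}
}

// GetIntValue returns the stored value or errKeyNotFound if the key is unset.
func (c *inMemoryClient) GetIntValue(key string) (int, error) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	v, ok := c.values[key]
	if !ok {
		return 0, errKeyNotFound
	}
	return v, nil
}

// Set updates a value at runtime, simulating a change in the external store.
func (c *inMemoryClient) Set(key string, v int) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.values[key] = v
}

// IntProperty returns a function that looks the key up on every call,
// so callers observe updates without a restart. On any lookup error
// it returns the supplied default.
func IntProperty(c Client, key string, defaultValue int) func() int {
	return func() int {
		v, err := c.GetIntValue(key)
		if err != nil {
			return defaultValue
		}
		return v
	}
}

func main() {
	store := newInMemoryClient()
	maxConns := IntProperty(store, "persistenceMaxConns", 10) // hypothetical key

	fmt.Println(maxConns()) // key is unset, so the default is used
	store.Set("persistenceMaxConns", 25)
	fmt.Println(maxConns()) // the updated value is picked up on the next read
}
```

Returning a function rather than a plain value is what makes the knob dynamic: the service holds on to the accessor and calls it each time the setting is needed.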
Stability Fixes
- 50b5d43 - add jitter to timer / transfer / replication queue sanity scan (#864)
- 4dbdfd4 - Return ShardOwnershipLostError as catch all from persistence layer (#873)
- 1844352 - Backoff on shard creation and adjust default number of persistence conns (#876)
- da1ccd6 - Fix missing decision timeout for transient decisions (#889)
- f11fff3 - Rate limit request on matching host (#899)
- 308cfc0 - Deadlock in matching engine (#909)
- ad7b522 - bugfix: rate limiter is not passed as pointer (#906)
- 7eaf360 - Fix leaking goroutine in matching (#911)
- a9a3727 - Fix check idle tasklist to include active task writers (#916)
- 98ea187 - Use current time as reference when timing out activities (#931)
- cbbff60 - Add dedupe logic for heartbeat timer creation (#962)
- 37c98ee - Force close decision on limit exceeded during task processing (#971)
Bug Fixes
- 85dbdf5 - Resolve #711: CLI panics on domain update when domain doesn't exist (#838)
- 1276b22 - bugfix: domain cache previously has a hard expiration, which cannot be updated, this expiration, in combination with periodical refresh of domains in v2 table, can cause domain missing for a short period of time. (#855)
- 11c8e38 - Fix bug in merging domain data (#918)
- 2b92fd6 - Bugfix: deadlock when domain failover (#919)
- fd7a1db - Batch of fixes (#927)
- 1864329 - Fix lock issue for timer ack level (#942)
- 19e4de2 - Add maximum timeout protection (#946)
- 17023c7 - Fix flaky test for visibility (#977)
- c4650a1 - Fix SignalWithStart open visibility not recorded (#974)
- c19a5e6 - Whitelist context deadline exceeded error for retry (#981)
- 9b832fa - Change protection of decision timeout to warn instead of reject (#993)
Operational Improvements
- 4fda6d0 - Implement ListDomain API (#879)
- d3e8f10 - Schema Tool option to configure request timeout for cql client (#900)
- 3e7ac1a - Fix noisy log (#917)
CLI Improvements
Misc
- 893c74c - Add client integration test for data converter (#809)
- e3b6cef - Fix firewall warning for CLI (#833)
- 4f8c0c3 - update docker-compose.yml (#839)
- 922b0e6 - Added header to marker (#798)
- ab30c8a - Fix flaky TestClientDataConverter_Failed (#865)
- 8510333 - Upgrade gocql version (#866)
- ca127f6 - Increase the retry initial interval on matching (#901)
- 1e0ec6c - workflow/child_workflow retry (#885)
- 492faf0 - Revert "workflow/child_workflow retry (#885)" (#910)
- 1b984ea - Add updated thriftrw gen files (#925)
- bc02cc7 - Update client version to 0.7.1 (#924)
- 1d64401 - Fix inconsistent client interface (#954)
- b6715c4 - Release CLI 0.5.4 (#953)
- 1341820 - Add integration test (#975)
- 65f9834 - glide up on cadence dependencies (#987)
v0.3.13 Release
Change log:
8d47f5a V0.3.12 patch (#829)
476c9bb Prevent duplicate user timer creation (#832)
8f6952d only global domain will use the v2 domain table (#831)
8441b49 Admin CLI: add describeHistoryHost (#826)
dccf888 Add ActivityScheduleTimeout deduction logic (#822)
7154288 Fix CLI parseTime for listworkflow (#824)
34c6e50 Fix print event version in CLI (#821)
5f40ae1 CLI: add history event version and full detail options (#817)
0d5583f Use workflowID as partition key on replication message (#820)
b947875 bugfix: should use shard's domain notification version (#818)
9eaa3dc Conflict Resolver bugfix (#816)
54e34f4 Reliable Domain Change Notification (#777)
42f44b6 Fix bug in adminCLI: convert domain name to domainID using domainCache (#812)
6573843 Always enable sticky when worker ask new task from complete (#811)
166ef58 Ut conflict resolver (#806)
772d653 Implement describeMutableState (#805)
1eb53c3 fix typo in getMutableState (#801)
a7ebd05 Multiple bugfixes (#803)
dd00c27 Fix error during Ringpop refresh (#802)
edfa972 Multiple Bugfixes (#794)
4ae0147 CrossDC bugfixes to replication task generation and conflict resolver (#799)
ad90819 Update and move dockerfile-cli, add to auto build (#793)
076fb3d Add Dockerfile for CLI (#730)
0aef123 Add tips/directions for prod setups with cadence-cassandra-tool (#788)
v0.3.12 Release
Bugfix, reset mutable state empty UUID (#784)
- add .vscode to .gitignore
- bugfix: empty uuid in reset mutable state
v0.3.11 Release
This release includes some changes necessary for cross DC (still in progress) and other bugfixes.
New Features and Improvements
CLI Improvement
#606 CLI: make show, list workflow look better
#636 CLI: add describe workflow execution
#640 Add unit tests for CLI commands
#649 Increase CLI version to 0.5.2
#657 Add CLI show workflow progress
New Feature
#621 SignalWithStart API: Cadence now supports sending a signal with guaranteed delivery in the following cases:
- If the workflow is running, the signal is delivered (same as the existing SignalWorkflow behavior);
- If the workflow is not running, it is restarted and then signaled;
- If the workflow is not found, it is started using the input args and then signaled.
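The three SignalWithStart cases above can be sketched as a small decision function. This is only a model of the behavior described in these notes, not the server implementation; the types `WorkflowState` and `Action` and the function `signalWithStartPlan` are hypothetical names introduced for the example.

```go
package main

import "fmt"

// WorkflowState is a hypothetical model of what is known about a workflow ID.
type WorkflowState int

const (
	NotFound WorkflowState = iota // no execution exists for this workflow ID
	NotRunning                    // an execution exists but is not running
	Running                       // an execution is currently running
)

// Action names one step taken on behalf of a SignalWithStart request.
type Action string

const (
	Start  Action = "start"
	Signal Action = "signal"
)

// signalWithStartPlan returns the ordered steps for each case:
// a running workflow is only signaled; in the other two cases a
// workflow is started (or restarted) first and then signaled.
func signalWithStartPlan(state WorkflowState) []Action {
	if state == Running {
		return []Action{Signal}
	}
	return []Action{Start, Signal}
}

func main() {
	names := map[WorkflowState]string{
		NotFound:   "not found",
		NotRunning: "not running",
		Running:    "running",
	}
	for _, s := range []WorkflowState{Running, NotRunning, NotFound} {
		fmt.Printf("%s: %v\n", names[s], signalWithStartPlan(s))
	}
}
```

In every case the plan ends with a signal, which is the delivery guarantee the API is meant to provide.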
Cross DC Support (In Development)
This release has partial changes which are needed for supporting replication of workflow execution state across Cadence clusters. This feature is in development and should not be enabled in production clusters. Most of the changes included in this release either require spinning up a new Cadence role (worker) or are hidden behind a Global Domain feature flag.
#604 Persistence support for replication state for execution
#618 Add helper functions in domain cache for events replication
#619 Bugfix: when setting up new cluster, there should be a way to do replication of domain
#624 Support for generating replication task on workflow execution updates
#628 Support workflow execution CRUD without replication state
#625 DB & schema change for timer / transfer queue cross DC support
#632 Publish replication task to Kafka after reading from replicator queue
#635 Replication task processor bugfixes
#630 Separate timer queue ack manager in separate file, add functionality to timer queue processor to be cluster aware.
#643 Apply replication history events to passive cluster
#639 Add standby timer processing logic, separate existing timer processing logic into active & standby
#650 Add transfer task standby processor skeleton
#671 Add configuration check to enable standby timer processor
Stability Fixes
#616 Shard consistency is not using local quorum
#642 Fix multiple bugs in frontend
#644 Return BadRequestError from beginning
#655 When multiple activities time out, at most one is actually deleted in Cassandra
#658 Recreate activity heartbeat timeout after first timer fire
#665 Cache get function, when in pin mode, should not increase the counter before return
#670 Relax heartbeat timer check to allow heartbeats with incorrect IDs
#667 Mutable state should be reset if the operation is not successful
Schema Changes
#604 Persistence support for replication state for execution
#625 DB & schema change for timer / transfer queue cross DC support
Miscellaneous
#601 history service should do event reordering making sure corresponding events for decision will have exactly the same order and no irrelevant event will be inserted in between, so client can predict the event ID of a corresponding decision.
#602 separate timer in timer queue processor into dedicated file, add UT
#631 Add retry in some frontend API
#647 Add missing fields to WorkflowExecutionStartedEventAttributes
#654 Add new members to receive docker build notifications
#627 Bump cadence-web to 1.1.1
#666 Fix misspell
v0.3.9 Release
This release fixes an issue with the docker image so that it correctly brings up Cadence server.
v0.3.8 Release
This is a patch release for v0.3.7 that fixes the following critical stability issue:
1f1d16f - Visibility records not getting moved to closed_executions for child workflows (#610)
v0.3.7 Release
New Features and Improvements
Cadence CLI
This release includes the long-awaited Cadence CLI. Please see for more details
- 4f0eb40 - Cadence CLI (#577)
- 08652db - CLI: Add list workers of tasklist (#597)
- 0a4abbe - Increase CLI version (#603)
Cross DC Support (In Development)
This release has partial changes which are needed for supporting replication of workflow execution state across Cadence clusters. This feature is in development and should not be enabled in production clusters. Most of the changes included in this release either require spinning up a new Cadence role (worker) or are hidden behind a Global Domain feature flag.
- 1a9baeb - refactor existing domain API for cross DC, refactor existing domain p… (#527)
- 952e86d - add separate config files for cross dc (#530)
- d429f53 - move cross DC domain replication config from 0.4 to 0.5 (#542)
- f059d31 - bugfix: add missing config for docker (#546)
- 2740e6b - Cadence Worker service to host replicator (#563)
- ad78f5f - make register domain aware of active cluster name (#576)
- 095fbd7 - Config changes to start replicator for standby cluster (#575)
- d914e07 - rename domain version to db_version, move failover_version to top level (#582)
- f265fd8 - Kafka based publisher for replication tasks (#585)
- 4a60ac3 - Replicate domain (#586)
- 5c6cb08 - wire replicator transmission to domain APIs (#590)
- 5782955 - use mock kafka producer in frontend before cross DC is ready (#594)
RequestCancel and Signal Decision Improvements
We made multiple fixes to transfer queue processing of the request cancel decision and fixed many edge cases in the handling of both request cancel workflow and signal workflow decisions.
- ce52993 - bugfix: request cancel info should be pass down to persistence layer, add functionality to allow workflow signal and cancellation to specify target child workflow only (#544)
- b1a2a0f - make request cancel workflow idempotent (#595)
Support for Heartbeat Using ActivityID
Dynamic Config
Added support for dynamic config for various service config knobs for Cadence roles. This allows integrating Cadence server with the custom configuration mechanisms used for on-premise deployments.
- 0182700 - Define dynamic config and integrate in service bootstrap (#543)
- 3f14209 - Create type functions and filter options for dynamic config (#587)
- d17af66 - bugfix dynamic config (#596)
Stability Fixes
- 051f9de - bugfix: potential null pointer error in transfer queue processor (#541)
- 24b89d9 - Validation of decision attributes (#555)
- 7d658c9 - bugfix: #573 sticky query should enforce sticky decision timeout (#579)
- 35bf3de - bugfix: parent workflow, when signaling child workflow, can experienc… (#607)
Operational Improvements
Schema Changes
This release includes changes to the Cadence core schema. All the schema changes required by new features are backwards compatible. Please make sure to deploy cadence schema 0.5 using cadence-cassandra-tool before deploying this release of Cadence server. The following changes require schema version 0.5:
- 1a9baeb - refactor existing domain API for cross DC, refactor existing domain p… (#527)
- ce52993 - bugfix: request cancel info should be pass down to persistence layer, add functionality to allow workflow signal and cancellation to specify target child workflow only (#544)
- 4a60ac3 - Replicate domain (#586)
Miscellaneous
- Add the cadence-web UI to docker compose (#525)
- Fix README for cassandra tool (#553)
- 083cb3c - update docker-compose to latest release (#549)
- 9b56a18 - Propagate cassandra port in load schema (#558)
- 9c2d1a1 - Change cassandra test setup to use input port (#552)
- 9f82f5a - update docker to golang 1.9.3 (#580)
v0.3.6 Release
New Features and Improvements
ActivityTimeout Processing
Tasklist Throttling
Bug fix to ensure low throttling numbers do not cause CPU capacity spikes.
- 4725ace - Remove throttling logs and set min burst size (#523)
- 6fe6949 - Optimize task buffer throttling (#526)
- 99f4899 - Fix throttling burst and add debug logs (#529)
Avoid tasklist leaks
Identify and expire unused tasklists
- b7dc627 - Add ttl to expire and avoid leaks of sticky task lists (#510)
- 11a308a - Unload tasklist to avoid leaks when no poller has queried recently (#519)
Cadence UI
Add docker support for cadence UI
Schema changes
Signal Workflow
Add support for sending signals to external workflows from workflow code, and prevent workflows from signaling themselves.
- bd9eb9f - Add signal external workflow decision (#485)
- 8cc9319 - Handle workflow signal itself (#539)
Miscellaneous
v0.3.5 Release
New Features and Improvements
Sticky Query
Support for QueryTask to also use cached decider state rather than replaying the entire execution to generate the query result.
- 4550c4e - Add Sticky Query to Cadence Server (#464)
- 62c7c29 - bugfix sticky query for old client (#481)
StartWorkflowExecution Flags
Support for deduping workflow execution event after completion
- 8eaced5 - implement customized deduplication of start workflow execution API (#463)
- b98524b - add handling of child workflow ID reuse logic (#500)
Long Poll For Workflow Completion
The GetWorkflowExecutionHistory API now supports long polling for the completion event.
- 5e9c49d - implement filter which allow caller choose all events or only close event of when dumping history events (#489)
Tasklist Throttling
Visibility Improvements
- 36c6f54 - add outstanding activities to the result of DescribeWorkflowExecution (#475)
- 5121974 - Add API DescribeTaskList (#483)
PPROF Handler
- 1fdb543 - add pprof config and start up logic (#478)
- 113a644 - pprof should be initialized only once per process (#502)
Stability Fixes
- e3afa22 - expose attempt to decision task (#466)
- c3a2b1e - bugfix timer queue processor (#480)
- bb03ce5 - Add metrics to persistence for visibility (#486)
- 8a89e91 - bugfix: metrics client is initialized 2 times per shard (#484)
- aa29ebb - Fix workflow timeout not created for new execution when ContinuedAsNew (#487)
- 6ff9cbd - Separate history/matching failure metrics from cadence failures (#488)
- 6fd468a - bugfix: domain retention is in days, not in seconds, so when deleting the current execution when finished, we should do a conversion (#490)
- 7a51af2 - Move history and matching failure metric to common (#497)
- 7ca011c - Missing Tasklist name on DecisionTaskScheduled history event (#499)
- 65e165c - ChildWorkflow timeout not communicated to parent execution (#504)
Schema Changes
This release includes changes to the Cadence core schema. All the schema changes required by new features are backwards compatible. Please make sure to deploy cadence schema 0.3 using cadence-cassandra-tool before deploying this release of Cadence server. The following changes require schema version 0.3:
- 4550c4e - Add Sticky Query to Cadence Server (#464)
- 5e9c49d - implement filter which allow caller choose all events or only close event of when dumping history events (#489)
Miscellaneous
v0.3.4 Release
This release fixes some critical stability issues introduced in v0.3.3. You can skip v0.3.3 and directly upgrade to this release on top of v0.3.2, but make sure to upgrade schema as outlined in release notes for v0.3.3.
Stability Fixes
- 6ed2014 - WorkflowExecution stuck fix on transient decision timeout (#450)
- 7ad6d5b - fix buffered events bug (#449)
- fffdc7e - TimerQueueProcessor to scan DB for existing timers on init (#455)
- c844eb7 - check if workflow is closed when processing sticky timeout timer (#458)
- 963d13c - TimerQueueProcessor stuck fix on large backlog on available timers (#460)
Miscellaneous
Schema Changes If Upgrading From v0.3.2
All the schema changes required by new features are backwards compatible. Please make sure to deploy cadence schema 0.2 and visibility schema 0.2 using the cadence-cassandra-tool before deploying this release of Cadence server. The following changes require schema version 0.2: