
Fix Consul CI tests by using compatible Consul version #9701

Merged
ReubenBond merged 4 commits into main from copilot/fix-d133baa7-3713-42d0-a1aa-7849357f0139 on Oct 2, 2025

Copilot AI commented Oct 1, 2025

Summary

Fixes #9678 - Consul CI tests were failing with Consul.ConsulRequestException: invalid key name, keys should respect the "^[a-zA-Z0-9,_./\\-?&=]+$" format

Problem

Recent versions of Consul enforce stricter key naming requirements via the regex pattern ^[a-zA-Z0-9,_./\\-?&=]+$. Two issues were causing test failures:

  1. The SiloAddress format contains characters rejected by newer Consul versions: SiloAddress.ToParsableString() produces strings like 192.168.1.1:11111@123456 (IPv4) or 2001:db8::1:11111@123456 (IPv6), which contain : (colon) and @ (at-sign).

  2. The TestClusterId contains a tab character: TestClusterBuilder.CreateClusterId() was generating cluster IDs like testcluster-2025-10-01\t17-58-32-123, and \t (tab) is also not allowed.
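
To make the two failure modes concrete, here is a minimal Python sketch that checks the example strings above against the pattern quoted in the error message. It assumes the doubled backslash in the message is string escaping, so the effective character class contains a literal hyphen. `is_valid_consul_key` is a hypothetical helper, not part of Consul or Orleans, and the full key layout Orleans writes to Consul is not reproduced here.

```python
import re

# Character class taken from the Consul error message. Assumption: "\\-" in
# the message is an escaped "\-", i.e. a literal hyphen inside the class.
KEY_PATTERN = re.compile(r"[a-zA-Z0-9,_./\-?&=]+")


def is_valid_consul_key(key: str) -> bool:
    """Return True if every character of `key` is in the allowed set."""
    return KEY_PATTERN.fullmatch(key) is not None


samples = [
    "192.168.1.1:11111@123456",              # IPv4 SiloAddress: ':' and '@'
    "2001:db8::1:11111@123456",              # IPv6 SiloAddress: ':' and '@'
    "testcluster-2025-10-01\t17-58-32-123",  # cluster ID containing a tab
    "testcluster-2025-10-01-17-58-32-123",   # cluster ID without the tab
]

for sample in samples:
    print(f"{sample!r}: {is_valid_consul_key(sample)}")
```

Under that assumption, only the last sample is accepted; the first three each contain at least one character outside the allowed set.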

Solution

1. Use Consul 1.19 for CI tests

Modified the CI workflow to use Consul 1.19 instead of the latest version. Consul 1.19 has more relaxed key validation that allows : and @ characters in key names, which are used by Orleans' SiloAddress format.

Changes to .github/workflows/ci.yml:

  • Replaced the GitHub Actions service definition with a manual Docker container start
  • Pinned Consul version to hashicorp/consul:1.19
  • Added proper container cleanup step

This approach avoids breaking changes to existing deployments while fixing the CI tests.

2. Fix TestClusterId format

Modified TestClusterBuilder.cs to remove the tab character from the date format:

// Before: "yyyy-MM-dd\tHH-mm-ss" → "testcluster-2025-10-01\t17-58-32-123"
// After:  "yyyy-MM-dd-HH-mm-ss"  → "testcluster-2025-10-01-17-58-32-123"
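
The effect of the format change can be reproduced outside .NET. The sketch below is a Python analogue, not the Orleans code: `strftime` directives stand in for the .NET custom format specifiers, the millisecond suffix seen in the example IDs is left out, and `make_cluster_id` and `rejected_chars` are hypothetical names.

```python
from datetime import datetime

# Characters permitted by the pattern in the Consul error message
# (assuming the escaped hyphen is a literal '-').
ALLOWED = set(
    "abcdefghijklmnopqrstuvwxyz"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "0123456789,_./-?&="
)


def make_cluster_id(now: datetime, fmt: str) -> str:
    """Build a test cluster ID from a timestamp and a strftime format."""
    return "testcluster-" + now.strftime(fmt)


def rejected_chars(key: str) -> list[str]:
    """List the distinct characters in `key` that fall outside ALLOWED."""
    return sorted(set(key) - ALLOWED)


now = datetime(2025, 10, 1, 17, 58, 32)
before = make_cluster_id(now, "%Y-%m-%d\t%H-%M-%S")  # tab between date and time
after = make_cluster_id(now, "%Y-%m-%d-%H-%M-%S")    # hyphen instead of tab

print(repr(before), rejected_chars(before))
print(repr(after), rejected_chars(after))
```

With the tab replaced by a hyphen, the generated ID consists only of letters, digits, and hyphens, so no character is rejected.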

Impact

  • ✅ Fixes failing Consul CI tests
  • ✅ No breaking changes - production code unchanged
  • ✅ CI uses appropriate Consul version for Orleans' key format
  • ✅ TestClusterId fix prevents similar issues with tab characters

Testing

The solution maintains full compatibility with existing Orleans deployments while ensuring CI tests pass with an appropriate version of Consul.

Original prompt

This section details the original issue you should resolve

<issue_title>Consul CI tests are broken</issue_title>
<issue_description>We need to investigate and fix this test breakage:
E.g., see #9678

See https://productionresultssa11.blob.core.windows.net/actions-results/5d099378-bfe4-4bef-814a-2231363bb943/workflow-job-run-5b06e00c-f8b4-54fc-8785-9daab924fe1c/logs/job/job-logs.txt?rsct=text%2Fplain&se=2025-10-01T21%3A55%3A28Z&sig=lWobb5g7pXkVzuqbAJmrPYSaHVumyOPjcGT9zT9tueE%3D&ske=2025-10-02T08%3A19%3A29Z&skoid=ca7593d4-ee42-46cd-af88-8b886a2f84eb&sks=b&skt=2025-10-01T20%3A19%3A29Z&sktid=398a6654-997b-47e9-b12b-9515b896b4de&skv=2025-11-05&sp=r&spr=https&sr=b&st=2025-10-01T21%3A45%3A23Z&sv=2025-11-05

Example:

2025-10-01T17:58:32.5795166Z Starting test execution, please wait...
2025-10-01T17:58:32.6176040Z Starting test execution, please wait...
2025-10-01T17:58:32.6325631Z [xUnit.net 00:00:02.34]     Consul.Tests.ConsulMembershipTableTest.MembershipTable_Consul_CleanupDefunctSiloEntries [FAIL]
2025-10-01T17:58:32.6326947Z A total of 1 test files matched the specified pattern.
2025-10-01T17:58:32.6670126Z [xUnit.net 00:00:02.38]     Consul.Tests.ConsulMembershipTableTest.MembershipTable_Consul_UpdateIAmAlive [FAIL]
2025-10-01T17:58:32.6890279Z A total of 1 test files matched the specified pattern.
2025-10-01T17:58:33.2851903Z 
2025-10-01T17:58:33.2882721Z ##[error]Consul.ConsulRequestException : Unexpected response, status code InternalServerError: invalid key name, keys should respect the "^[a-zA-Z0-9,_./\\-?&=]+$" format
2025-10-01T17:58:33.2905811Z 
2025-10-01T17:58:33.2906548Z   Failed Consul.Tests.ConsulMembershipTableTest.MembershipTable_Consul_CleanupDefunctSiloEntries [139 ms]
2025-10-01T17:58:33.2907494Z   Error Message:
2025-10-01T17:58:33.2908577Z    Consul.ConsulRequestException : Unexpected response, status code InternalServerError: invalid key name, keys should respect the "^[a-zA-Z0-9,_./\\-?&=]+$" format
2025-10-01T17:58:33.2909768Z   Stack Trace:
2025-10-01T17:58:33.2910442Z      at Consul.DeleteReturnRequest`1.Execute(CancellationToken ct)
2025-10-01T17:58:33.2912141Z    at Orleans.Runtime.Membership.ConsulBasedMembershipTable.CleanupDefunctSiloEntries(DateTimeOffset beforeDate) in /_/src/Orleans.Clustering.Consul/ConsulBasedMembershipTable.cs:line 238
2025-10-01T17:58:33.2914809Z    at UnitTests.MembershipTests.MembershipTableTestsBase.MembershipTable_CleanupDefunctSiloEntries(Boolean extendedProtocol) in /_/test/TesterInternal/MembershipTests/MembershipTableTestsBase.cs:line 480
2025-10-01T17:58:33.2917725Z    at Consul.Tests.ConsulMembershipTableTest.MembershipTable_Consul_CleanupDefunctSiloEntries() in /_/test/Extensions/Consul.Tests/ConsulMembershipTableTest.cs:line 152
2025-10-01T17:58:33.2919187Z --- End of stack trace from previous location ---
2025-10-01T17:58:33.2919575Z 
2025-10-01T17:58:33.2922456Z ##[error]Consul.ConsulRequestException : Unexpected response, status code InternalServerError: invalid key name, keys should respect the "^[a-zA-Z0-9,_./\\-?&=]+$" format
2025-10-01T17:58:33.2924129Z 
2025-10-01T17:58:33.2924652Z   Failed Consul.Tests.ConsulMembershipTableTest.MembershipTable_Consul_UpdateIAmAlive [41 ms]
2025-10-01T17:58:33.2925489Z   Error Message:
2025-10-01T17:58:33.2926566Z    Consul.ConsulRequestException : Unexpected response, status code InternalServerError: invalid key name, keys should respect the "^[a-zA-Z0-9,_./\\-?&=]+$" format
2025-10-01T17:58:33.2927728Z   Stack Trace:
2025-10-01T17:58:33.2928148Z      at Consul.PutRequest`2.Execute(CancellationToken ct)
2025-10-01T17:58:33.2929616Z    at Orleans.Runtime.Membership.ConsulBasedMembershipTable.UpdateIAmAlive(MembershipEntry entry) in /_/src/Orleans.Clustering.Consul/ConsulBasedMembershipTable.cs:line 148
2025-10-01T17:58:33.2932333Z    at UnitTests.MembershipTests.MembershipTableTestsBase.MembershipTable_UpdateIAmAlive(Boolean extendedProtocol) in /_/test/TesterInternal/MembershipTests/MembershipTableTestsBase.cs:line 414
2025-10-01T17:58:33.2934714Z    at Consul.Tests.ConsulMembershipTableTest.MembershipTable_Consul_UpdateIAmAlive() in /_/test/Extensions/Consul.Tests/ConsulMembershipTableTest.cs:line 140
2025-10-01T17:58:33.2936065Z --- End of stack trace from previous location ---
2025-10-01T17:58:33.8761633Z No test matches the given testcase filter `Category=Consul&(Category=BVT|Category=SlowBVT|Category=Functional)` in /home/runner/work/orleans/orleans/test/Orleans.Journaling.Tests/bin/Debug/net8.0/Orleans.Journaling.Tests.dll
2025-10-01T17:58:33.9646411Z Data collector 'Blame' message: All tests finished running, Sequence file will not be generated.
2025-10-01T17:58:34.0402218Z Results File: /home/runner/work/orleans/orleans/test/Orleans.Journaling.Tests/TestResults/test_results_Consul_net8.0.trx
</issue_description>

## Comments on the Issue (you are @copilot in this section)

<comments>
</comments>


</details>
Fixes dotnet/orleans#9700


Copilot AI changed the title from "[WIP] Consul CI tests are broken" to "Fix Consul CI tests by encoding invalid characters in key names" on Oct 1, 2025
Copilot AI requested a review from ReubenBond October 1, 2025 22:32
var encodedAddress = keyParts[^1]
    .Replace("_a_", "@")  // Restore at-sign
    .Replace("_c_", ":"); // Restore all colons
ret.Address = SiloAddress.FromParsableString(encodedAddress);

Member

@copilot this change and the above change would be BREAKING changes. That is not allowed. Instead, please follow the advice in this thread to configure Consul such that these keys are allowed: hashicorp/consul#22822 (comment)

Contributor Author

Reverted the key encoding changes in commit c414d4c. The CI workflow now uses Consul 1.19 which has more relaxed key validation and allows : and @ characters in key names, avoiding any breaking changes to existing deployments.

Copilot AI changed the title from "Fix Consul CI tests by encoding invalid characters in key names" to "Fix Consul CI tests by using compatible Consul version" on Oct 1, 2025
Copilot AI requested a review from ReubenBond October 1, 2025 23:37
Copilot AI and others added 4 commits October 1, 2025 16:37
Co-authored-by: ReubenBond <203839+ReubenBond@users.noreply.github.com>
Co-authored-by: ReubenBond <203839+ReubenBond@users.noreply.github.com>
Co-authored-by: ReubenBond <203839+ReubenBond@users.noreply.github.com>
Co-authored-by: ReubenBond <203839+ReubenBond@users.noreply.github.com>
ReubenBond force-pushed the copilot/fix-d133baa7-3713-42d0-a1aa-7849357f0139 branch from c414d4c to 9f357c1 on October 1, 2025 23:37
ReubenBond marked this pull request as ready for review on October 1, 2025 23:48
@ReubenBond
Member

I guess it's fine for now.

ReubenBond merged commit cf72999 into main on Oct 2, 2025
33 of 36 checks passed
ReubenBond deleted the copilot/fix-d133baa7-3713-42d0-a1aa-7849357f0139 branch on October 2, 2025 17:22
ledjon-behluli pushed a commit to ledjon-behluli/orleans that referenced this pull request Oct 26, 2025
* Fix Consul key format to comply with allowed characters

Co-authored-by: ReubenBond <203839+ReubenBond@users.noreply.github.com>

* Use custom encoding for Consul keys instead of URL encoding

Co-authored-by: ReubenBond <203839+ReubenBond@users.noreply.github.com>

* Simplify encoding logic to handle both IPv4 and IPv6

Co-authored-by: ReubenBond <203839+ReubenBond@users.noreply.github.com>

* Revert breaking changes and use older Consul version for tests

Co-authored-by: ReubenBond <203839+ReubenBond@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: ReubenBond <203839+ReubenBond@users.noreply.github.com>
github-merge-queue bot pushed a commit that referenced this pull request Oct 31, 2025
* retire and purge state machines

* Update src/Orleans.Journaling/StateMachineManager.cs

Co-authored-by: Reuben Bond <203839+ReubenBond@users.noreply.github.com>

* move comments up

* wip

* time-based retirement logic

* last touches

* remove test project

* remove unused usings

* prevent duplicate key error during state machine resurrection

* ensure buffered data is re-applied if machine comes back

* remove hacky assertion in tests

* Update src/Orleans.Journaling/StateMachineManager.cs

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update src/Orleans.Journaling/StateMachineManager.cs

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update test/Orleans.Journaling.Tests/StateMachineManagerTests.cs

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update test/Orleans.Journaling.Tests/StateMachineManagerTests.cs

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update test/Orleans.Journaling.Tests/StateMachineManagerTests.cs

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Fix table creation logging to reflect existing tables correctly (#9696)

The `CreateIfNotExistsAsync` method's response will have a 409-statuscode if the table already exists and will always return the `TableItem` if the call is successful.

* Fix ZooKeeper CI tests: correct service name and use official Docker image (#9699)

* Initial plan

* Fix ZooKeeper CI tests: correct service name and use specific image tag

Co-authored-by: ReubenBond <203839+ReubenBond@users.noreply.github.com>

* Use official ZooKeeper Docker image instead of bitnami

Co-authored-by: ReubenBond <203839+ReubenBond@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: ReubenBond <203839+ReubenBond@users.noreply.github.com>

* Fix Consul CI tests by using compatible Consul version (#9701)

* Fix Consul key format to comply with allowed characters

Co-authored-by: ReubenBond <203839+ReubenBond@users.noreply.github.com>

* Use custom encoding for Consul keys instead of URL encoding

Co-authored-by: ReubenBond <203839+ReubenBond@users.noreply.github.com>

* Simplify encoding logic to handle both IPv4 and IPv6

Co-authored-by: ReubenBond <203839+ReubenBond@users.noreply.github.com>

* Revert breaking changes and use older Consul version for tests

Co-authored-by: ReubenBond <203839+ReubenBond@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: ReubenBond <203839+ReubenBond@users.noreply.github.com>

* Remove explicit ActivityStatusCode.Ok setting to comply with OpenTelemetry specification (#9703)

* Initial plan

* Remove explicit ActivityStatusCode.Ok per OpenTelemetry spec

Co-authored-by: ReubenBond <203839+ReubenBond@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: ReubenBond <203839+ReubenBond@users.noreply.github.com>

* chore(deps): bump dotnet-sdk from 9.0.303 to 9.0.305 (#9677)

Bumps [dotnet-sdk](https://github.com/dotnet/sdk) from 9.0.303 to 9.0.305.
- [Release notes](https://github.com/dotnet/sdk/releases)
- [Commits](dotnet/sdk@v9.0.303...v9.0.305)

---
updated-dependencies:
- dependency-name: dotnet-sdk
  dependency-version: 9.0.305
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Fix race condition in `TransactionInfo.Fork` `PendingCalls` increment (#9702)

* [FIX] Potential NRE in the activation repartitioner (#9713)

fix potential NRE in the activation repartitioner

* Remove infinite timespan reminder checks (#9715)

* Timeout.InfiniteTimeSpan is not allowed for reminders

* Specifically check for InfiniteTimeSpan to make clear its usage is not allowed (and to allow for an eventual future change of the InfiniteTimeSpan value)

---------

Co-authored-by: Jerremy Koot <jerremy.koot@gmail.com>

* ResponseCompletionSource: RunContinuationsAsynchronously (#9724)

* chore(deps): bump dotnet-sdk from 9.0.305 to 9.0.306 (#9720)

Bumps [dotnet-sdk](https://github.com/dotnet/sdk) from 9.0.305 to 9.0.306.
- [Release notes](https://github.com/dotnet/sdk/releases)
- [Commits](dotnet/sdk@v9.0.305...v9.0.306)

---
updated-dependencies:
- dependency-name: dotnet-sdk
  dependency-version: 9.0.306
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Add `BigInteger` codec (#9669)

* StateMachineManager: start work loop during activation (#9725)

* Flag PostgreSQL as asynchronous (#9705)

Changed isSynchronousAdoNetImplementation from true to false for PostgreSQL thanks to updates in Npgsql 8.0.
npgsql/npgsql#1130

* Enable GitHub merge queue (#9727)

Enable merge queue

* Add missing validation for the NATS stream provider (#9668)

Add missing validation for the Nats stream provider

* `ActivationDataActivatorProvider`: run grain constructor on grain scheduler (#9726)

* ActivationDataActivatorProvider: always run grain constructor on ActivationTaskScheduler

* ActivationDataActivatorProvider: always run grain constructor on ActivationTaskScheduler

* Modifying Endpoint to use Pascale Casing as EndPoint for consistency (#9671)

Co-authored-by: Ramzi Mourtada <ramourt@microsoft.com>

* treat retirement tracker as a first-class (internal) machine

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: Reuben Bond <203839+ReubenBond@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Egil Hansen <egil@assimilated.dk>
Co-authored-by: Copilot <198982749+Copilot@users.noreply.github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Bk <bknote71@gmail.com>
Co-authored-by: Da-Teach <da_teach@thehackerwithin.com>
Co-authored-by: Jerremy Koot <jerremy.koot@gmail.com>
Co-authored-by: Gutemberg Ribeiro <gutemberg@outlook.com>
Co-authored-by: Burak Büyükatlı <burak.buyukatli@gmail.com>
Co-authored-by: Ramzi Mourtada <ramzimort@gmail.com>
Co-authored-by: Ramzi Mourtada <ramourt@microsoft.com>
github-actions bot locked and limited conversation to collaborators on Nov 2, 2025