Skip to content

improve OAuth token refresh#1037

Merged
eschultink merged 11 commits intorc-v0.5.12from
s211-multiple-tokens
Nov 11, 2025
Merged

improve OAuth token refresh#1037
eschultink merged 11 commits intorc-v0.5.12from
s211-multiple-tokens

Conversation

@eschultink
Copy link
Member

@eschultink eschultink commented Oct 29, 2025

Features

  • add SecretStore method to get potentially multiple versions of secrets
  • refactor where proactive refresh, token caching happens; more natural to manage in the strategy, rather than in refresh impl
  • use getAvailableVersions to grab up to 5 versions; sort by

Future-work

  • thread-safety; lambda has hard cap of 1 concurrent request per lambda; gcp gen2 doesn't have this cap, but currently set to 1 as well; we should 1) make this thread-safe by adding locks, 2) inc concurrency in the gcp case
  • use getAvailableVersions to try other access tokens, rather than refresh; this is not straightforward in current structure; we utilize Google's library, so only implement the refresh() call; we don't directly know which token, if any, the client attempted request with; or why it's trying to refresh (expiry? 401 on response? something else?)
  • use getAvailableVersions to try other refresh tokens, if refresh failse with one; similarly, not straightforward using current lib. the whole concept of using a refreshToken is hidden in the payload builder implementation. This just builds the request, without knowledge of what token was used previously, or why refresh failed with last token.

I believe only jira, confluence, github-enterprise-server, and dropbox-buisness (deprecated) currently use this oauth2tokenRefresh strateyg + grant_type=refresh_token. @aperez-worklytics - i am not jira/confluence admin of our test instance, so haven't been able to test.

Change implications

  • this is a risky change, as token refreshes brittle for Jira, Confluence, Zoom - which are finicky about having multiple tokens outstanding for a given connection.
  • dependencies added/changed? no
  • something important to note in future release notes? no

Note

Introduce versioned secret access across stores and refactor OAuth refresh to cache/proactively refresh and select the best shared access token.

  • Secrets/Config:
    • Add ConfigService.ConfigValueVersion and SecretStore#getAvailableVersions for versioned values.
    • Implement version listing in SecretManagerConfigService (GCP), ParameterStoreConfigService (AWS SSM), and SecretsManagerSecretStore (AWS SM); return empty for env-var-only.
    • Wire-through support in CompositeSecretStore, CompositeConfigService, and CachingConfigServiceDecorator.
  • OAuth Refresh Strategy (OAuthRefreshTokenSourceAuthStrategy):
    • Move token caching and proactive refresh logic into the strategy; support shared-token usage toggle and cacheability.
    • On startup/refresh, read up to 5 access-token versions via getAvailableVersions; choose by latest expiration, then version.
    • Use Lazy refresh handler; keep lock-based refresh coordination; rotate refresh token if provided and old-enough.
    • Update tests; make AccessTokenDto.expirationDate nullable (Long).

Written by Cursor Bugbot for commit 982d5f9. This will update automatically on new commits. Configure here.

@eschultink eschultink self-assigned this Oct 29, 2025
@Inject
Clock clock;
@Inject
Lazy<OAuth2CredentialsWithRefresh.OAuth2RefreshHandler> refreshHandlerProvider;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

made lazy to avoid cycle, bc refresh handler now uses the strategy as well;



@Override
public Credentials getCredentials(Optional<String> userToImpersonate) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is the main part of the change; proactively pulling the accessToken here from config/cache; this

  1. solves an old TODO; main benefit of which is to avoid apparent refresh() call, although i think in practice that refresh() call would have pulled token from cache anyways
  2. better proactive refresh; I think the proactive token refresh check may never have been really hit, as was buried in the refresh handler - so only ever called when the accessToken needed to be refreshed, which defeats the point.

@eschultink eschultink marked this pull request as ready for review October 30, 2025 03:44
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR refactors OAuth token refresh logic to improve reliability and add support for retrieving multiple secret versions. The key change is moving proactive token refresh and caching from the TokenRefreshHandlerImpl to the OAuthRefreshTokenSourceAuthStrategy, making the architecture more maintainable. Additionally, a new getAvailableVersions method is added to SecretStore to support retrieving previous versions of secrets, laying groundwork for fallback retry logic with N-1 token versions.

Key Changes:

  • Moved token caching and refresh logic from TokenRefreshHandlerImpl to OAuthRefreshTokenSourceAuthStrategy
  • Added getAvailableVersions() method to SecretStore interface with implementations for GCP Secret Manager, AWS Secrets Manager, and AWS Parameter Store
  • Refactored tests to reflect the new architecture

Reviewed Changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
java/core/src/main/java/co/worklytics/psoxy/gateway/SecretStore.java Added getAvailableVersions() method to interface for retrieving secret version history
java/core/src/main/java/co/worklytics/psoxy/gateway/ConfigService.java Added ConfigValueVersion class to represent versioned configuration values
java/core/src/main/java/co/worklytics/psoxy/gateway/impl/oauth/OAuthRefreshTokenSourceAuthStrategy.java Moved token caching, refresh decision logic, and shared token management from TokenRefreshHandlerImpl to strategy class; made strategy @Singleton
java/core/src/test/java/co/worklytics/psoxy/gateway/impl/oauth/OAuthRefreshTokenSourceAuthStrategyTest.java Updated tests to work with refactored architecture where token refresh logic lives in strategy
java/impl/gcp/src/main/java/co/worklytics/psoxy/SecretManagerConfigService.java Implemented getAvailableVersions() for GCP Secret Manager with version sorting and filtering
java/impl/aws/src/main/java/co/worklytics/psoxy/aws/SecretsManagerSecretStore.java Implemented getAvailableVersions() for AWS Secrets Manager filtering to AWSCURRENT/AWSPREVIOUS versions
java/impl/aws/src/main/java/co/worklytics/psoxy/aws/ParameterStoreConfigService.java Implemented getAvailableVersions() for AWS Parameter Store using parameter history
java/impl/cmd-line/src/main/java/co/worklytics/psoxy/CommandLineConfigService.java Implemented getAvailableVersions() returning single current version
java/core/src/main/java/co/worklytics/psoxy/gateway/CompositeSecretStore.java Delegated getAvailableVersions() to preferred/fallback stores
java/core/src/main/java/co/worklytics/psoxy/gateway/impl/CompositeConfigService.java Delegated getAvailableVersions() with fallback logic
java/core/src/main/java/co/worklytics/psoxy/gateway/impl/CachingConfigServiceDecorator.java Delegated getAvailableVersions() without caching version lists
java/core/src/main/java/co/worklytics/psoxy/gateway/impl/oauth/RefreshTokenTokenRequestBuilder.java Reordered imports (formatting change only)

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

ListSecretVersionsRequest request = ListSecretVersionsRequest.newBuilder()
.setParent(secretName.toString())
.setFilter("state:ENABLED")
.setPageSize(Math.max(limit, NUMBER_OF_VERSIONS_TO_RETRIEVE))
Copy link

Copilot AI Oct 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Math.max(limit, NUMBER_OF_VERSIONS_TO_RETRIEVE) logic appears incorrect. This will request more versions than needed when limit is greater than NUMBER_OF_VERSIONS_TO_RETRIEVE, potentially retrieving unnecessary data. Should likely be Math.min(limit, NUMBER_OF_VERSIONS_TO_RETRIEVE) to respect the limit parameter, or simply use limit if NUMBER_OF_VERSIONS_TO_RETRIEVE is a reasonable upper bound.

Suggested change
.setPageSize(Math.max(limit, NUMBER_OF_VERSIONS_TO_RETRIEVE))
.setPageSize(Math.min(limit, NUMBER_OF_VERSIONS_TO_RETRIEVE))

Copilot uses AI. Check for mistakes.
Comment on lines +429 to +430
// TODO: breaks abstraction?? whether there *is* a refresh token at all depends on
// the PayloadBuilder
Copy link

Copilot AI Oct 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] This TODO comment identifies an abstraction violation where the refresh handler is making assumptions about the PayloadBuilder implementation. Consider refactoring to make the presence of refresh tokens part of the PayloadBuilder interface contract, or document why this coupling is acceptable.

Suggested change
// TODO: breaks abstraction?? whether there *is* a refresh token at all depends on
// the PayloadBuilder
// NOTE: This is an intentional abstraction violation. Whether there *is* a refresh token at all
// depends on the PayloadBuilder implementation. This coupling is acceptable here because the
// refresh handler is only used in contexts where PayloadBuilder is guaranteed to provide a refresh token.
// If this changes, refactor to make refresh token presence part of the PayloadBuilder interface contract.

Copilot uses AI. Check for mistakes.
Copy link
Member

@jlorper jlorper left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple of minor comments

protected final Duration MIN_PROACTIVE_TOKEN_REFRESH = Duration.ofMinutes(1L);
@VisibleForTesting
protected final Duration MAX_PROACTIVE_TOKEN_REFRESH =
MIN_PROACTIVE_TOKEN_REFRESH.plusMinutes(9L);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
MIN_PROACTIVE_TOKEN_REFRESH.plusMinutes(9L);
MIN_PROACTIVE_TOKEN_REFRESH.plusMinutes(8L);

1+8 = 9, you mean that or 10?

Comment on lines +104 to +107
REFRESH_ENDPOINT(false, SupportedSource.ENV_VAR_OR_REMOTE), CLIENT_ID(false,
SupportedSource.ENV_VAR_OR_REMOTE), GRANT_TYPE(false,
SupportedSource.ENV_VAR), ACCESS_TOKEN(true,
SupportedSource.ENV_VAR_OR_REMOTE),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

weird indent

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cursor does this; i think it's actually due to IDE (vscode)

(v.versionStages().contains("AWSCURRENT") || v.versionStages().contains("AWSPREVIOUS")))
.sorted((v1, v2) -> {
// Sort by lastAccessedDate if available, otherwise by createdDate
if (v1.lastAccessedDate() != null && v2.lastAccessedDate() != null) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's the rationale for sorting by accessed date? I would expect just createdDate DESC

Optional.ofNullable(p.getRight().getExpirationDate()).orElse(null),
Comparator.nullsLast(Comparator.reverseOrder()))
.thenComparing(
p -> p.getLeft().getVersion(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Double checking: can p be null?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

out of abundance of caution, be defensive; I think in some impl it might be.

throws IOException;
}

// TODO: big enough to move to its own file
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

eschultink and others added 2 commits November 10, 2025 20:29
…/OAuthRefreshTokenSourceAuthStrategy.java

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Copy link

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR is being reviewed by Cursor Bugbot

Details

You are on the Bugbot Free tier. On this plan, Bugbot will review limited PRs each billing cycle.

To receive Bugbot reviews on all of your PRs, visit the Cursor dashboard to activate Pro and start your 14-day free trial.

return ConfigService.ConfigValueVersion.builder()
.value(value.get())
.lastModifiedDate(Instant.ofEpochSecond(version.getCreateTime().getSeconds()))
.version(extractVersionNumber(version.getName()).toString())
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: Version Extraction: Unhandled Null Pointer

Calling .toString() on the result of extractVersionNumber() causes a NullPointerException when the version string cannot be parsed as an integer. The extractVersionNumber method returns null in its catch block, but this null value isn't handled before calling .toString().

Fix in Cursor Fix in Web

if (accessToken == null) {
return true;
}
Instant expiresAt = accessToken.getExpirationTime().toInstant();
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: Null expiration breaks AccessToken refresh.

Calling getExpirationTime().toInstant() on an AccessToken causes a NullPointerException when the token has a null expiration time. Tokens retrieved from secret storage can have null expiration dates (as created by AccessTokenDto.asAccessToken()), but shouldRefresh doesn't handle this case before calling .toInstant() on the potentially null Date.

Fix in Cursor Fix in Web

Comparator.nullsLast(Comparator.reverseOrder()))
.thenComparing(
p -> p.getLeft().getVersion(),
Comparator.reverseOrder()))
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: Reverse order sorting fails on null values.

Using Comparator.reverseOrder() to sort by version throws NullPointerException when versions are null. The getVersion() can return null (as set in CachingConfigServiceDecorator and CompositeConfigService), but Comparator.reverseOrder() doesn't handle null values. Should use Comparator.nullsLast(Comparator.reverseOrder()) instead.

Fix in Cursor Fix in Web

eschultink and others added 2 commits November 10, 2025 20:35
…siteConfigService.java

Co-authored-by: aperez-worklytics <75276364+aperez-worklytics@users.noreply.github.com>
@eschultink eschultink enabled auto-merge (squash) November 11, 2025 04:36
Comparator.nullsLast(Comparator.reverseOrder()))
.thenComparing(
p -> p.getLeft().getVersion(),
Comparator.reverseOrder()))
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: String Comparison Breaks Version Numeric Order

The getVersion() returns a String, but Comparator.reverseOrder() performs lexicographic comparison, not numeric. Version "10" would sort before "2" (as "9", "8", "2", "10", "1" in descending lexicographic order), causing the wrong token to be selected when version numbers exceed single digits. This breaks the intended behavior of selecting the most recent version as a tiebreaker.

Fix in Cursor Fix in Web

@eschultink eschultink merged commit 4494564 into rc-v0.5.12 Nov 11, 2025
49 checks passed
@eschultink eschultink deleted the s211-multiple-tokens branch November 11, 2025 04:54
Copy link

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: Missing Return: Logic Flow Violation

Missing return statement before Optional.empty() at line 120. When property.isEnvVarOnly() is true, execution continues instead of returning early, causing the method to attempt reading from Secret Manager for env-var-only properties, which violates the intended behavior and may result in unexpected errors or incorrect values being returned.

java/impl/gcp/src/main/java/co/worklytics/psoxy/SecretManagerConfigService.java#L118-L121

@Override
public Optional<String> getConfigPropertyAsOptional(ConfigProperty property) {
if (property.isEnvVarOnly()) {
Optional.empty();

Fix in Cursor Fix in Web


eschultink added a commit that referenced this pull request Nov 12, 2025
* Bump org.apache.httpcomponents.client5:httpclient5 in /java/gateway-core

Bumps [org.apache.httpcomponents.client5:httpclient5](https://github.com/apache/httpcomponents-client) from 5.4.1 to 5.4.3.
- [Changelog](https://github.com/apache/httpcomponents-client/blob/rel/v5.4.3/RELEASE_NOTES.txt)
- [Commits](apache/httpcomponents-client@rel/v5.4.1...rel/v5.4.3)

---
updated-dependencies:
- dependency-name: org.apache.httpcomponents.client5:httpclient5
  dependency-version: 5.4.3
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

* prep v0.5.12

* Salesforce rules refresh (#1039)

* Update rules

* Support example with POST

* bump many dependencies (#1038)

* apache commons codec to 1.19.0

* more dep updates

* webhook collector auth perf (#1036)

* add logic to keep lambdas warm

* fix style

* init public keys in mem for webhook collectors on start-up

* aws replay tool (#1034)

* aws replace tool

* add documentation

* webhook collector output prefix support  (#1035)

* improve webhook collector mode configuration

* support webhook output prefix

* fix style

* fix path stuff

* fix missed rename

* fix various test issues

* gmail empty header fix (#1040)

* deal with empty header case more correctly; that's quite clearly a bug

* add test of empty CC case

* GCP VPC doc improvements (#1042)

* improve GCP vpc docs + conditions

* fix style, unrelated

* improve error feedback on network connectivity issues

* doc that GCP VPC needs external connectivity via router/nat for non-google sources

* move validation up to dop

* try to skip compile in tests

* Apply suggestion from @Copilot

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Apply suggestion from @Copilot

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* fix some references to role constants (#1043)

* config to deploy packages as mvn libs via GitHub Packages (#1041)

* improve OAuth token refresh (#1037)

* interface to secret store to get multiple versions of things

* refactor how token refreshin works, so potentially more proactive / better re-use

* pick most recent token based on expirationDate

* cr feedback

* fix refactor

* remove stray bracket

* Update java/core/src/main/java/co/worklytics/psoxy/gateway/impl/oauth/OAuthRefreshTokenSourceAuthStrategy.java

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update java/core/src/main/java/co/worklytics/psoxy/gateway/impl/CompositeConfigService.java

Co-authored-by: aperez-worklytics <75276364+aperez-worklytics@users.noreply.github.com>

* drop sort on lastAccessDate; only date-level granularity, so why bother

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: aperez-worklytics <75276364+aperez-worklytics@users.noreply.github.com>

* more psoxy constants (#1044)

* expose more outputs relevant to gcp-hosted deployments

* gcp min perms to host

* expose gcp network roles, for completenes

* doc gcp vpc roles

* doc that Project IAM Admin is required

* fix docs

* Update infra/modules/psoxy-constants/main.tf

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update infra/modules/psoxy-constants/outputs.tf

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update docs/gcp/vpc.md

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* prep v0.5.12 (#1046)

* fix bad comment

* try to make sbom stuff more robust

* drop mvn plugins that are problematic, see to cause build errors

* use reactor build

* sboms

* fix sbom generation

* update package-lock in psoxy-test

* fix gcp release artifact script

* prompt user if gcp artifact has already been published

* prompt user if aws artifact has already been published

* warn if we're NOT running in the expected place

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: aperez-worklytics <75276364+aperez-worklytics@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants