Add idempotency adapter and E2E coverage #14773

Merged
huaxingao merged 6 commits into apache:main from huaxingao:idempotency_4
Jan 16, 2026
@huaxingao
Contributor

This is the fourth PR for Idempotency-Key support. It introduces test-only idempotency in RESTCatalogAdapter:

  • an in-memory store keyed by Idempotency-Key
  • cached-success replay
  • TTL expiry
  • an optional “503 after success” simulation
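
For readers who have not seen the earlier PRs, the store-and-replay behavior listed above can be sketched roughly as follows. This is an illustration only, not the code in this PR: the class name IdempotencyStoreSketch, the injectable clock, and the execute(...) signature are all assumptions made for the example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Hypothetical sketch of an in-memory idempotency store; not the class added by this PR. */
final class IdempotencyStoreSketch {
  /** Cached outcome of the first successful call for a key. */
  private static final class Entry {
    final Object response;
    final long firstSeenMillis;

    Entry(Object response, long firstSeenMillis) {
      this.response = response;
      this.firstSeenMillis = firstSeenMillis;
    }
  }

  private final Map<String, Entry> entries = new ConcurrentHashMap<>();
  private final long lifetimeMillis;
  private final LongSupplier clock; // injectable so tests can control time

  IdempotencyStoreSketch(long lifetimeMillis, LongSupplier clock) {
    this.lifetimeMillis = lifetimeMillis;
    this.clock = clock;
  }

  /** Runs the action once per live key; later calls with the same key replay the cached result. */
  @SuppressWarnings("unchecked")
  <T> T execute(String idempotencyKey, Supplier<T> action) {
    if (idempotencyKey == null) {
      return action.get(); // no key supplied, so no idempotency handling
    }

    long now = clock.getAsLong();
    Entry entry =
        entries.compute(
            idempotencyKey,
            (key, existing) -> {
              boolean live =
                  existing != null && (now - existing.firstSeenMillis) <= lifetimeMillis;
              // A live entry is replayed; a missing or expired one is treated as a new request.
              return live ? existing : new Entry(action.get(), now);
            });
    return (T) entry.response;
  }
}
```

In this sketch a failing action caches nothing, because compute leaves the mapping unchanged when the remapping function throws. Running the action inside compute keeps the example short but blocks other callers that contend for the same key while it runs.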

Add E2E tests:

  • testIdempotentDuplicateCreateReturnsCached
  • testIdempotencyKeyLifetimeExpiredTreatsAsNew
  • testIdempotentCreateReplayAfterSimulated503

First PR: #14649
Second PR: #14700
Third PR: #14740

github-actions bot added the core label on Dec 5, 2025
long now = System.currentTimeMillis();
boolean expired =
existing.status == IdempotencyEntry.Status.FINALIZED
&& (now - existing.firstSeenMillis) > idempotencyLifetimeMillis;
Contributor

Can we use Instant.now().minusMillis(existing.firstSeenMillis) here? That would be more readable when adding isBefore / isAfter.

Contributor

You could also make this a method on the entry itself, so that you'd only call if (!entry.expired()) here.

Contributor Author

Fixed. Thanks!
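
A minimal sketch of what the two suggestions in this thread could look like when combined: the expiry check moves onto the entry and is written with Instant and isBefore. The class name IdempotencyEntrySketch and the expired(now, lifetime) signature are assumptions for illustration; the thread does not show the final code.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical entry type; field and method names are illustrative, not taken from the PR. */
final class IdempotencyEntrySketch {
  enum Status {
    IN_PROGRESS,
    FINALIZED
  }

  private final Status status;
  private final Instant firstSeen;

  IdempotencyEntrySketch(Status status, Instant firstSeen) {
    this.status = status;
    this.firstSeen = firstSeen;
  }

  /**
   * Mirrors the original condition (now - firstSeenMillis) > lifetimeMillis: only finalized
   * entries expire, and only once their lifetime has fully elapsed.
   */
  boolean expired(Instant now, Duration lifetime) {
    return status == Status.FINALIZED && firstSeen.plus(lifetime).isBefore(now);
  }
}
```

Passing now in as a parameter is a choice made for this sketch so that the check can be tested deterministically; calling Instant.now() inside the method would also work.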

}

// Test hooks/configuration for idempotency behavior
public static void simulate503OnFirstSuccessForKey(String key) {
Contributor

I'm not really a fan of configuring this in such a way. We should explore doing this similarly to how remote scan planning sets the planning behavior for particular tests.

Contributor Author

Thanks for the suggestion. I removed the static hook from CatalogHandlers and mirrored the remote planning pattern:

  • Added a per-adapter IdempotencyBehavior in RESTCatalogAdapter and apply it post‑success in execute(...).
  • Tests configure behavior via the adapter (e.g., adapterForRESTServer.simulate503OnFirstSuccessForKey(key)), while routes still call CatalogHandlers.withIdempotency(...).

Contributor

I don't think we should add this functionality to the adapter at all. Instead, what about just having a custom adapter that simulates this behavior for that particular test case? We already use custom adapters with custom behavior in a lot of other test methods.
My main concern is that we're trying to force "test conditions" into code that shouldn't have them: RESTCatalogAdapter could be used in real usage scenarios, and ideally it shouldn't contain test code. Additionally, this sets a precedent for future code modifications where it might seem OK to add more and more testing conditions to the adapter or to CatalogHandlers.

Contributor

I guess one alternative would be to have a custom adapter in the testFixtures, similar to

class RESTServerCatalogAdapter extends RESTCatalogAdapter {

where we have a custom adapter that performs credential vending.

Contributor Author

@huaxingao, Jan 8, 2026

Thanks — agreed. I removed the test hook from CatalogHandlers and kept RESTCatalogAdapter free of test knobs.
The transient-503 simulation is now implemented only in a test-only adapter used by TestRESTCatalog (HeaderValidatingAdapter), so production-ish code paths don’t carry test conditions.
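
The pattern agreed on in this thread, keeping the base adapter free of test knobs and putting the fault injection in a test-only subclass, might look roughly like the sketch below. AdapterSketch, ServiceUnavailableSketch, and FlakyTestAdapterSketch are hypothetical stand-ins invented for this example; they are not the real RESTCatalogAdapter, the real 503 exception type, or the HeaderValidatingAdapter mentioned above.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

/** Hypothetical stand-in for a production-capable adapter; it carries no test hooks. */
class AdapterSketch {
  <T> T execute(String idempotencyKey, Supplier<T> action) {
    return action.get();
  }
}

/** Hypothetical stand-in for a transient 503 error. */
final class ServiceUnavailableSketch extends RuntimeException {
  ServiceUnavailableSketch(String message) {
    super(message);
  }
}

/** Test-only subclass: lets the mutation succeed, then reports a 503 once for chosen keys. */
final class FlakyTestAdapterSketch extends AdapterSketch {
  private final Set<String> failOnceKeys = ConcurrentHashMap.newKeySet();

  void simulate503OnFirstSuccessForKey(String key) {
    failOnceKeys.add(key);
  }

  @Override
  <T> T execute(String idempotencyKey, Supplier<T> action) {
    // The action runs to completion first, so the server-side effect really happened.
    T result = super.execute(idempotencyKey, action);
    // remove(...) returns true only the first time, so the failure is injected exactly once.
    if (idempotencyKey != null && failOnceKeys.remove(idempotencyKey)) {
      throw new ServiceUnavailableSketch("simulated 503 after success for " + idempotencyKey);
    }
    return result;
  }
}
```

Because the subclass exists only in test code, the base class stays usable in real deployments. The stand-in base class here has no idempotency store, so a retry would run the action again; in the PR's setup the retry is expected to replay the cached result instead.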

AuthSession httpSession = am.initSession(httpBase, conf);
RESTClient http = httpBase.withAuthSession(httpSession);

CreateTableRequest req =
Contributor

It seems a bit weird to create the request here. Can we not mock stuff as we do in a bunch of other tests and verify that the correct requests were sent?

Contributor Author

I’ve added a reusable verify helper and now assert the request shape (method/path/headers, including Idempotency‑Key) after the POSTs. I’m keeping a real CreateTableRequest in the E2E tests because it’s needed to exercise serialization and replay.

@huaxingao force-pushed the idempotency_4 branch 2 times, most recently from 47aae84 to a8a0a16 on December 15, 2025 18:34
@huaxingao
Contributor Author

@nastra @amogh-jahagirdar @singhpk234 Could you please take a look when you have a moment? Thanks!

simulate503OnFirstSuccessKeys.add(key);
}

private static OAuthTokenResponse handleOAuthRequest(Object body) {
Contributor

Can we wrap the plan API endpoints with this too?

Contributor Author

Good point. Plan endpoints aren’t wrapped with withIdempotency yet. I’ll follow up with a separate PR to add Idempotency-Key support for the plan APIs (wrap handlers + add tests) and we can extend the simulation coverage there.

}

/** Test helper to simulate a transient 503 after the first successful mutation for a key. */
public void simulate503OnFirstSuccessForKey(String key) {
Contributor

Nit: should we make it generic, like simulating an arbitrary status code or something?

Contributor Author

Good suggestion. I kept it 503-specific since the test hook is only used to exercise retry behavior for transient 503/commit-state-unknown; happy to generalize later if we add tests needing other codes.

Contributor

It would be great to not add test-specific code to the adapter/catalog handler. See my other comment on this: #14773 (comment)

Contributor

@singhpk234 left a comment

Mostly LGTM, thanks @huaxingao! Left minor comments.

Contributor

@singhpk234 left a comment

LGTM, thanks @huaxingao!

Let's give @nastra and @amogh-jahagirdar some time in case they have further feedback.

@huaxingao
Contributor Author

@nastra I removed the test hook from CatalogHandlers and kept RESTCatalogAdapter free of test knobs.
The transient-503 simulation is now implemented only in a test-only adapter used by TestRESTCatalog (HeaderValidatingAdapter). Could you please check if this looks OK to you? Thanks!

Contributor

@amogh-jahagirdar left a comment

Sorry for the late review. Overall I think it looks reasonable, just minor comments. I'll wait in case @nastra wants to do another pass. Thanks @huaxingao!

Comment on lines 143 to 144
private final Set<String> simulate503OnFirstSuccessKeys =
org.apache.iceberg.relocated.com.google.common.collect.Sets.newConcurrentHashSet();
Contributor

Minor: Maybe this should be a mapping from the key to a generic throwable that's thrown, then we could add other failure cases and make sure idempotency returns or throws as expected.

Contributor Author

Good idea. I generalized the test-only hook to a key -> RuntimeException map so tests can simulate different post-success transient failures while still keeping simulate503OnFirstSuccessForKey as a convenience wrapper for the current case.
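
One way the generalized hook described in this reply could be shaped is shown below, as a hedged sketch rather than the merged code. The names PostSuccessFailuresSketch, failOnceAfterSuccess, throwIfRegistered, and the nested ServiceUnavailable exception are assumptions; only simulate503OnFirstSuccessForKey appears in the thread.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical registry of failures to throw once, after a successful mutation, per key. */
final class PostSuccessFailuresSketch {
  /** Hypothetical stand-in for a transient 503 error. */
  static final class ServiceUnavailable extends RuntimeException {
    ServiceUnavailable(String message) {
      super(message);
    }
  }

  private final Map<String, RuntimeException> failures = new ConcurrentHashMap<>();

  /** Generic form: any RuntimeException can be registered for a key. */
  void failOnceAfterSuccess(String key, RuntimeException failure) {
    failures.put(key, failure);
  }

  /** Convenience wrapper that keeps the original 503-specific entry point. */
  void simulate503OnFirstSuccessForKey(String key) {
    failOnceAfterSuccess(key, new ServiceUnavailable("simulated 503 for " + key));
  }

  /** Intended to be called by a test adapter after the real action has succeeded. */
  void throwIfRegistered(String key) {
    // remove(...) hands back the registered failure at most once per registration.
    RuntimeException failure = key == null ? null : failures.remove(key);
    if (failure != null) {
      throw failure;
    }
  }
}
```

Keeping the 503 wrapper means existing tests need no changes, while new tests can register other exception types for the same key-based mechanism.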

{
if (PropertyUtil.propertyAsBoolean(vars, "purgeRequested", false)) {
CatalogHandlers.purgeTable(catalog, tableIdentFromPathVars(vars));
CatalogHandlers.withIdempotency(
Contributor

Minor: I wonder if withIdempotency even needs to be exposed? Feels like it could just be invoked internally in each of the existing CatalogHandlers methods?

Contributor Author

Good point. The idempotency behavior is driven by the HTTP request headers, so I’m keeping the wrapping at the routing/adapter layer rather than pushing HTTPRequest through every CatalogHandlers.* method.
That said, I agree it doesn’t need to be exposed publicly — I made withIdempotency(...) package-private and it’s only used internally by the REST adapter/routes.

// replaces) the IN_PROGRESS entry for this Idempotency-Key. Only the leader executes the
// action and finalizes the entry; concurrent requests for the same key ("followers") wait on
// the latch and then replay the finalized result/error.
AtomicBoolean isLeader = new AtomicBoolean(false);
Contributor

Minor: I'd probably just call this isFirst. Leader/follower makes it seem more than it really is.

Contributor Author

Changed to isFirst. Thanks
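
The coordination described in the code comment under review (the first request for a key executes the action, concurrent requests for the same key wait on a latch and replay the outcome) can be sketched as follows. SingleFlightSketch and its Entry type are hypothetical and simplified: entries never expire here, and this is not the withIdempotency implementation from the PR.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Supplier;

/** Hypothetical sketch of "first caller executes, others wait and replay". */
final class SingleFlightSketch {
  private static final class Entry {
    final CountDownLatch done = new CountDownLatch(1);
    volatile Object result;
    volatile RuntimeException error;
  }

  private final ConcurrentHashMap<String, Entry> entries = new ConcurrentHashMap<>();

  @SuppressWarnings("unchecked")
  <T> T execute(String key, Supplier<T> action) {
    // computeIfAbsent runs the mapping function at most once per absent key, so exactly one
    // caller observes isFirst == true for a given key.
    AtomicBoolean isFirst = new AtomicBoolean(false);
    Entry entry =
        entries.computeIfAbsent(
            key,
            k -> {
              isFirst.set(true);
              return new Entry();
            });

    if (isFirst.get()) {
      try {
        entry.result = action.get();
      } catch (RuntimeException e) {
        entry.error = e; // finalized errors are replayed too
      } finally {
        entry.done.countDown(); // release every waiting caller
      }
    } else {
      try {
        entry.done.await();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new IllegalStateException("interrupted while waiting for key " + key, e);
      }
    }

    if (entry.error != null) {
      throw entry.error;
    }
    return (T) entry.result;
  }
}
```

The AtomicBoolean is needed because the lambda passed to computeIfAbsent cannot assign to a plain local variable; it simply records whether this caller created the entry.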

Comment on lines 3402 to 3406
TableIdentifier ident = env.first();
Pair<RESTClient, Map<String, String>> httpAndHeaders = env.second();
RESTClient http = httpAndHeaders.first();
Map<String, String> headers = httpAndHeaders.second();
CreateTableRequest req = createReq(ident);
Contributor

Nit: I think this bit is repeated across tests, any way to abstract some of this behind a helper?

Contributor Author

I factored the repeated prepareIdempotentEnv unpacking into a small IdempotentEnv helper.

@huaxingao
Contributor Author

Thanks @amogh-jahagirdar @geruh @nastra @singhpk234 for your review! Are there any more comments? If not, I plan to merge this PR in two days.

@huaxingao merged commit 4f57687 into apache:main on Jan 16, 2026
46 checks passed
@huaxingao
Contributor Author

I merged this PR. If there are any more comments, I will address them in follow-ups. Thank you all for reviewing this PR!

5 participants