@hantangwangd
Member

@hantangwangd hantangwangd commented Jul 18, 2024

Description

This PR fixes the session corruption caused by a failed statement in a non-autocommit transaction, and enables executing ROLLBACK to end the aborted transaction block.

Motivation

Fix: #23246

Test Plan

Contributor checklist

  • Please make sure your submission complies with our development, formatting, commit message, and attribution guidelines.
  • PR description addresses the issue accurately and concisely. If the change is non-trivial, a GitHub Issue is referenced.
  • Documented new properties (with its default value), SQL syntax, functions, or other functionality.
  • If release notes are required, they follow the release notes guidelines.
  • Adequate tests were added if applicable.
  • CI passed.

Release Notes

== RELEASE NOTES ==

General Changes
* Fix the ROLLBACK statement to ensure it successfully aborts non-autocommit transactions corrupted by failed statements

import static com.facebook.presto.spark.util.PrestoSparkUtils.getActionResultWithTimeout;
import static com.facebook.presto.spi.StandardErrorCode.GENERIC_INTERNAL_ERROR;
import static com.facebook.presto.spi.StandardErrorCode.INVALID_SESSION_PROPERTY;
import static com.facebook.presto.sql.QueryUtil.isRollBack;
Contributor

This couples to a new package & module. Probably not worth it for the trivial method you're importing.

Member Author

Thanks for the comment. I also struggled with this for a while. Fixed as you suggested.

@tdcmeehan tdcmeehan self-assigned this Jul 18, 2024
Contributor

@tdcmeehan tdcmeehan left a comment

It seems like ignoreTransactionState is more like isRollingBack. Do you think that's an accurate way of describing it?

@hantangwangd
Member Author

It seems like ignoreTransactionState is more like isRollingBack. Do you think that's an accurate way of describing it?

My thought is that we may want to support other statements in this situation in the future (for example COMMIT: the commit itself should fail, but it should still be able to close the entire transaction). That is why I named this flag ignoreTransactionState rather than isRollback. Does that make sense?
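
For readers following the naming discussion, here is a minimal sketch of what such a flag could gate. It is not the PR's code: `TransactionGateSketch`, `execute`, and `shouldIgnoreTransactionState` are hypothetical names, and the assumption that only ROLLBACK sets the flag is taken from the comment above.

```java
// Illustrative only: none of these names come from the Presto code base.
class TransactionGateSketch
{
    enum TransactionState { ACTIVE, FAILED }

    /**
     * Runs a statement against a transaction in the given state.
     *
     * @param ignoreTransactionState hypothetical flag: when true, the statement may run
     *        even though the transaction has already been marked as failed
     */
    static String execute(TransactionState state, String sql, boolean ignoreTransactionState)
    {
        if (state == TransactionState.FAILED && !ignoreTransactionState) {
            // A failed transaction rejects everything except statements that end it.
            throw new IllegalStateException("transaction is aborted; only statements that end it may run");
        }
        return "executed: " + sql;
    }

    // Assumption: only ROLLBACK sets the flag for now; COMMIT could be added later.
    static boolean shouldIgnoreTransactionState(String sql)
    {
        return sql.trim().equalsIgnoreCase("ROLLBACK");
    }

    public static void main(String[] args)
    {
        String sql = "ROLLBACK";
        System.out.println(execute(TransactionState.FAILED, sql, shouldIgnoreTransactionState(sql)));
    }
}
```

Keeping the decision in one predicate means a later extension to COMMIT would touch `shouldIgnoreTransactionState` only, which is the argument for a name that is not tied to rollback.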

@elharo
Contributor

elharo commented Sep 23, 2024

Can you resolve the conflicts?

@hantangwangd
Member Author

Can you resolve the conflicts?

Sure, the conflicts are resolved!

@Override
public void checkCanSetRole(ConnectorTransactionHandle transaction, ConnectorIdentity identity, AccessControlContext context, String role, String catalogName)
{
SemiTransactionalHiveMetastore metastore = getMetastore(transaction);
Contributor

Can you help me understand why we need to check for null as part of this bug fix?

Member Author

When a statement in a non-autocommit transaction fails, InMemoryTransactionManager.abortInternal() is triggered, which causes the relevant Hive connector transaction manager to clean up its corresponding connector transaction. However, the transaction in InMemoryTransactionManager is only marked as failed, not removed; see https://github.com/prestodb/presto/blob/master/presto-main/src/main/java/com/facebook/presto/execution/QueryStateMachine.java#L918.

Therefore, if we do not check for null here, an NPE will be thrown when the rollback statement is executed.
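
The mismatch described above can be illustrated with a small, self-contained sketch. All names here (`AbortedTransactionSketch`, `onStatementFailure`, and the two maps) are hypothetical stand-ins rather than Presto classes; the sketch only assumes what the explanation states, namely that the engine-side entry is kept and marked failed while the connector-side entry is removed.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: a toy model of the two transaction registries.
class AbortedTransactionSketch
{
    // Stand-in for the engine-side manager: transaction id -> "failed" flag.
    private final Map<String, Boolean> engineTransactions = new HashMap<>();
    // Stand-in for the connector-side manager: transaction id -> metastore name.
    private final Map<String, String> connectorTransactions = new HashMap<>();

    void begin(String txId)
    {
        engineTransactions.put(txId, false);
        connectorTransactions.put(txId, "metastore-for-" + txId);
    }

    // On failure the connector entry is cleaned up, but the engine entry is only marked failed.
    void onStatementFailure(String txId)
    {
        connectorTransactions.remove(txId);
        engineTransactions.put(txId, true);
    }

    boolean isFailed(String txId)
    {
        return Boolean.TRUE.equals(engineTransactions.get(txId));
    }

    // Without a null check, a later lookup dereferences null and throws an NPE.
    int unguardedAccess(String txId)
    {
        String metastore = connectorTransactions.get(txId);
        return metastore.length();
    }

    // With a null check, the caller can proceed (for example, to finish a rollback).
    int guardedAccess(String txId)
    {
        String metastore = connectorTransactions.get(txId);
        return metastore == null ? -1 : metastore.length();
    }
}
```

After `begin("tx1")` followed by `onStatementFailure("tx1")`, the engine still knows about `tx1`, so a ROLLBACK is routed to code that calls `unguardedAccess`, which is where the NPE would surface.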

Contributor

If it's possible for metastore to be null, then shouldn't we check everywhere that we call metastore, and also label it as @Nullable?

@justrelax19

@tdcmeehan and @hantangwangd, is there any update on this PR? It has been open for about 5 months now. Can we merge and close it?
We are also hitting the aborted transaction block issue, and it seems this PR will fix it.

@steveburnett
Contributor

@hantangwangd, to move this one towards merge-ready, when you have time would you rebase and resolve the file conflicts?

@hantangwangd
Member Author

Sure, I will rebase and resolve the conflicts later today. @steveburnett @justrelax19

Contributor

@ZacBlanco ZacBlanco left a comment

I mostly agree with Tim's thought from his comment here

But I also understand your reasoning in the response. Can we add a well-written javadoc which explains the meaning and use of this field so that it's clearer to developers the intention of this name? isRollback seems clear to me, but isIgnoreTransactionState does not

  {
      TransactionalMetadata metadata = hiveTransactionManager.get(transaction);
-     return metadata.getMetastore();
+     return metadata == null ? null : metadata.getMetastore();
Contributor

I think we should annotate this method with @Nullable if we are making the return behavior like this, or return Optional<SemiTransactionalHiveMetastore>

Member Author

Good suggestion. I have changed the method to return an Optional<SemiTransactionalHiveMetastore> and now check the result everywhere we call getMetastore(transaction). Please take a look when convenient!
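
As a rough illustration of the Optional-returning shape discussed in this thread, consider the sketch below. It is not the merged code: `OptionalMetastoreSketch`, the nested `Metastore` class, and `roleExists` are hypothetical, and skipping the check when the metastore is absent is an assumption made for the example, not something the thread confirms.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Illustrative only: a toy lookup that returns Optional instead of a nullable reference.
class OptionalMetastoreSketch
{
    // Hypothetical stand-in for SemiTransactionalHiveMetastore.
    static final class Metastore
    {
        boolean roleExists(String role)
        {
            return "admin".equals(role);
        }
    }

    private final Map<String, Metastore> transactions = new HashMap<>();

    void begin(String txHandle)
    {
        transactions.put(txHandle, new Metastore());
    }

    void cleanUp(String txHandle)
    {
        transactions.remove(txHandle);
    }

    // The return type forces every caller to handle the "already cleaned up" case.
    Optional<Metastore> getMetastore(String txHandle)
    {
        return Optional.ofNullable(transactions.get(txHandle));
    }

    void checkCanSetRole(String txHandle, String role)
    {
        Optional<Metastore> metastore = getMetastore(txHandle);
        if (!metastore.isPresent()) {
            // Assumption: with no live connector transaction there is nothing to check.
            return;
        }
        if (!metastore.get().roleExists(role)) {
            throw new SecurityException("Cannot set role " + role);
        }
    }
}
```

Compared with a `@Nullable` annotation, the `Optional` return type makes a missed check a compile-time concern at each call site rather than a runtime NPE.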

@hantangwangd
Copy link
Member Author

I mostly agree with Tim's thought from his comment here

But I also understand your reasoning in the response. Can we add a well-written javadoc which explains the meaning and use of this field so that it's clearer to developers the intention of this name? isRollback seems clear to me, but isIgnoreTransactionState does not

Thanks for the feedback. I agree that isIgnoreTransactionState might confuse future developers. To make its intention clearer, while keeping it applicable to other statements (for example, maybe COMMIT in the future), do you think it makes sense to call it enableRollback? I'm open on this and don't have a strong preference, so please let me know your thoughts.

@hantangwangd hantangwangd force-pushed the fix_transaction branch 2 times, most recently from 72abe0f to 55570d8 April 21, 2025 10:20
Contributor

@ZacBlanco ZacBlanco left a comment

Two minor things, otherwise LGTM

assertUpdate("drop table if exists test_non_autocommit_table");
}
catch (Exception e) {
// ignored for connector compatibility
Contributor

What does "for connector compatibility" imply here? Do some connectors behave differently?

Member Author

Nice catch! Originally, I put the test case in AbstractTestIntegrationSmokeTest and then found that its subclass TestJdbcIntegrationSmokeTest doesn't support table deletion, so I added this logic for compatibility.

Then I moved the test case to AbstractTestDistributedQueries, since many more subclasses of AbstractTestIntegrationSmokeTest would have to skip this test case because they don't support creating tables.

After rechecking the subclasses, I found that this compatibility logic is not needed in AbstractTestDistributedQueries, so it has now been removed.

Contributor

@ZacBlanco ZacBlanco left a comment

Thanks for this fix! LGTM

@hantangwangd hantangwangd merged commit 34ef2c7 into prestodb:master Apr 22, 2025
98 checks passed
@hantangwangd hantangwangd deleted the fix_transaction branch April 22, 2025 23:12
@ZacBlanco ZacBlanco mentioned this pull request May 29, 2025
21 tasks

Development

Successfully merging this pull request may close these issues.

Session is completely corrupted by the failed statement in a non-autocommit transaction

6 participants