Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ensure the deletion consistency of topic and schema #14608

Merged
merged 3 commits into from
Apr 27, 2022
Merged

Ensure the deletion consistency of topic and schema #14608

merged 3 commits into from
Apr 27, 2022

Conversation

yuruguo
Copy link
Contributor

@yuruguo yuruguo commented Mar 8, 2022

Master Issue: #12795
Dev email: discuss and reply

Motivation

Deleting topic also deletes its schema.

Documentation

  • no-need-doc

@github-actions github-actions bot added the doc-not-needed Your PR changes do not impact docs label Mar 8, 2022
Copy link
Member

@nlu90 nlu90 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM except one question.

futures.add(deleteSchema().thenApply(schemaVersion -> null));
}

futures.add(deleteSchema().thenApply(schemaVersion -> null));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens to topics with multiple versions fo schema? Will all versions be cleaned?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, will delete all the versions

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm, it's a little confusing for using persistent schema storage for a non-persistent topic, after the broker restart and the producer/consumer will not connect to a non-persistent topic, the non-persistent topic is equivalent to being deleted, In some cases that users using a random name for non-persistent topic, after the application restarts, the topic name will be changes, looks like here will introduce a schema resource leak.

@codelipenghui codelipenghui added this to the 2.11.0 milestone Mar 15, 2022
@codelipenghui codelipenghui added the release/important-notice The changes which are important should be mentioned in the release note label Mar 15, 2022
Copy link
Contributor

@codelipenghui codelipenghui left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed there is a behavior change that might impact users' applications if the schema auto uploading is disabled.

Before this change, users can delete the topic without deleting schema, so that the application can connect to the newly created topic with the same topic name and existing schema, but after this change, not able to connect to the topic again.

Hmmm, before, users usually use the delete topic to truncate data of the topic. After #10326 introduced truncate topic, I think for now users should to use the truncate method to cleanup data of the topic. So there is no reason why the user still wants to use this topic, but also delete it.

We'd better clarify in the document.

futures.add(deleteSchema().thenApply(schemaVersion -> null));
}

futures.add(deleteSchema().thenApply(schemaVersion -> null));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, will delete all the versions

futures.add(deleteSchema().thenApply(schemaVersion -> null));
}

futures.add(deleteSchema().thenApply(schemaVersion -> null));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm, it's a little confusing for using persistent schema storage for a non-persistent topic, after the broker restart and the producer/consumer will not connect to a non-persistent topic, the non-persistent topic is equivalent to being deleted, In some cases that users using a random name for non-persistent topic, after the application restarts, the topic name will be changes, looks like here will introduce a schema resource leak.

@yuruguo
Copy link
Contributor Author

yuruguo commented Mar 15, 2022

I noticed there is a behavior change that might impact users' applications if the schema auto uploading is disabled.

Before this change, users can delete the topic without deleting schema, so that the application can connect to the newly created topic with the same topic name and existing schema, but after this change, not able to connect to the topic again.

Hmmm, before, users usually use the delete topic to truncate data of the topic. After #10326 introduced truncate topic, I think for now users should to use the truncate method to cleanup data of the topic. So there is no reason why the user still wants to use this topic, but also delete it.

We'd better clarify in the document.

I have added explanation in the CLI / API.
PTAL again, thx!

Copy link
Contributor

@codelipenghui codelipenghui left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please also help add a test to make sure the schema been removed after the topic has been deleted.

@Parameter(names = { "-d",
"--deleteSchema" }, description = "Delete schema while deleting topic")
@Parameter(names = {"-d", "--deleteSchema"}, description = "Delete schema while deleting topic, "
+ "but the parameter is invalid and the schema is always deleted", hidden = true)
private boolean deleteSchema = false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should also update the default value to true?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can't change the default value to true because that would change its usage.
Before modification: it means false if you don't specify -d, and it means true if you specify -d;
After modification: it means true if you don’t specify -d, but it means false only you specify -d false.
The two are not fully compatible.
It is similar to #12128

@Parameter(names = { "-d",
"--deleteSchema" }, description = "Delete schema while deleting topic")
@Parameter(names = {"-d", "--deleteSchema"}, description = "Delete schema while deleting topic, "
+ "but the parameter is invalid and the schema is always deleted", hidden = true)
private boolean deleteSchema = false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should also update the default value to true?

Copy link
Contributor Author

@yuruguo yuruguo Mar 23, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same with above

Comment on lines -584 to -602
if (deleteSchema) {
return pulsar().getBrokerService()
.deleteSchemaStorage(topicName.getPartition(0).toString())
.thenCompose(unused ->
internalRemovePartitionsAuthenticationPoliciesAsync(numPartitions))
.thenCompose(unused2 ->
internalRemovePartitionsTopicAsync(numPartitions, force));
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If remove these lines, how does the method remove the schema?

Copy link
Contributor Author

@yuruguo yuruguo Mar 23, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment on lines -1031 to -1057
return t.deleteSchema().thenCompose(schemaVersion -> {
log.info("Successfully delete topic {}'s schema of version {}", t.getName(), schemaVersion);
return t.delete();
});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If remove these lines how do we delete the schema?

Copy link
Contributor Author

@yuruguo yuruguo Mar 23, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

t.delete() will delete schema by default.

details: delete() -> delete(boolean,boolean,boolean) -> deleteSchema()

@yuruguo
Copy link
Contributor Author

yuruguo commented Mar 23, 2022

Please also help add a test to make sure the schema been removed after the topic has been deleted.

The test testDeleteTopicAndSchemaForV1 in SchemaTest can cover this change for v1/topic and I have added addtional test for v2/topic.

@yuruguo yuruguo requested a review from codelipenghui March 28, 2022 23:29
@yuruguo yuruguo requested a review from eolivelli April 27, 2022 12:14
@yuruguo yuruguo merged commit c06b17d into apache:master Apr 27, 2022
dlg99 pushed a commit to dlg99/pulsar that referenced this pull request Nov 8, 2022
* Ensure the deletion consistency of topic and schema

(cherry picked from commit c06b17d)
dlg99 pushed a commit to datastax/pulsar that referenced this pull request Nov 8, 2022
* Ensure the deletion consistency of topic and schema

(cherry picked from commit c06b17d)
nicoloboschi added a commit to datastax/pulsar that referenced this pull request Jan 10, 2023
nicoloboschi added a commit to datastax/pulsar that referenced this pull request Jan 10, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/schema doc-not-needed Your PR changes do not impact docs release/important-notice The changes which are important should be mentioned in the release note
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants