-
Notifications
You must be signed in to change notification settings - Fork 3k
API, Flink: Correct Handling of Table Names Containing Dots #14932
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
@peach12345: Thanks for the PR. Could we create unit tests ensuring that the fix works, and making sure that changes added later will not break this behavior again? |
|
@pvary: I will add some tests for those cases. |
mxm
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an interesting find. On the one hand, using dots in table names does not seem to be a good practice. On the other hand, it should be valid in SQL, if properly escaped.
|
|
||
| /** Identifies a table in iceberg catalog. */ | ||
| public class TableIdentifier { | ||
| public class TableIdentifier implements Serializable { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does making this class Serializable really fix the root cause? It looks like the parse(String) method would have to have a different parsing logic to respect escaped dots.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The root cause was the following line:
this.identifier = tableIdentifier.toString();
Instead of storing the string value, I stored the TableIdentifier object itself. As a result, the TableIdentifier class needs to implement Serializable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes sense, but wouldn't TableIdentifier.parse(tableName) still suffer from the same issue?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, you’re right. When using the parse function, dots are treated as separators.
Maybe parsing isn’t the best approach in this case. Would it make more sense to work with the complete object instead? Would of course be a breaking change
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because the information about which part of the string represents the namespace is lost.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the parse function is intended to work with a String. If you pass in the object instead, you could also just use the normal constructor. Parsing something like namespace.`table.name` should work.
| * specific language governing permissions and limitations | ||
| * under the License. | ||
| */ | ||
| package org.apache.iceberg.flink; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't seem to be related directly to Flink, but rather directly to Iceberg core.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please ignore. Will rewrite those tests.
This Pull Request fixes an issue where table names containing dots were incorrectly treated as separators in the Catalog. As a result, table names were unnecessarily parsed and split. The update ensures that dots within table names are handled correctly and no longer trigger the parsing logic.
#12607