Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Code refactor for Iceberg support #792

Merged
merged 14 commits into from
Jul 23, 2024
Merged

Conversation

sfc-gh-alhuang
Copy link
Contributor

The Iceberg support PR on feature branch includes some refactor. Merge the refactor code to reduce conflict when merging feature branch in the future.

This PR includes following changes

  1. Split StreamingIngestStage into StorageManager and StreamingIngestStorage. The StorageManager is responsible for managing storage. This is useful for streaming to Iceberg which a StorageManager could manage multiple storages.
  2. Add SnowflakeServiceClient to handle all api calls to server. As we are going to add some endpoints for Iceberg table streaming, having an all-in-one place api caller makes development easier.

@sfc-gh-alhuang sfc-gh-alhuang marked this pull request as ready for review July 11, 2024 22:14
* @param <T> The type of chunk data
* @param <TLocation> the type of location that's being managed (internal stage / external volume)
*/
interface StorageManager<T, TLocation> {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: IStorageManager, since its an interface? (similarly for all other interfaces?)
I don't think this is something we've really followed consistently everywhere else, but check with Toby on whether this is something we want to start.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, IStorageManager is better

@@ -228,8 +225,14 @@ public class SnowflakeStreamingIngestClientInternal<T> implements SnowflakeStrea
this.setupMetricsForClient();
}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please create a JIRA to track passing this snowflakeServiceClient all the way into OAuthClient, which also does RPC with the snowflake service.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Jira created.

}
DropChannelRequestInternal dropChannelRequest =
new DropChannelRequestInternal(
this.storageManager.getClientPrefix() + "_" + counter.getAndIncrement(),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: should getNextRequestId kind of behavior be in the storage manager perhaps? Can be a future PR, just curious what you think @sfc-gh-tzhang ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense to me, then counter can be moved to storageManager as well

@@ -161,10 +157,6 @@ public String getOffsetToken() {
return this.offsetToken;
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. please undo these changes. you're changing public API surface of a released SDK.

  2. @sfc-gh-tzhang do we have CI tools to catch changes to public API surface? If not lets add a backlog item to catch accidental API surface area changes before a release?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Undo changes.

@@ -26,7 +26,7 @@ public enum ErrorCode {
INVALID_ENCRYPTED_KEY("0018"),
INVALID_DATA_IN_CHUNK("0019"),
IO_ERROR("0020"),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like error code actually gets exposed to customers via SFException.getErrorCode().
Thus this is a breaking change, please don't rename an error code.

@sfc-gh-tzhang pl confirm?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, I tagged you in another thread, please don't modify existing error codes, just add new ones, see #794 (comment)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reverted change.

Copy link
Contributor

@sfc-gh-tzhang sfc-gh-tzhang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left some comments, otherwise LGTM, thanks!

}
DropChannelRequestInternal dropChannelRequest =
new DropChannelRequestInternal(
this.storageManager.getClientPrefix() + "_" + counter.getAndIncrement(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense to me, then counter can be moved to storageManager as well

* @param <T> The type of chunk data
* @param <TLocation> the type of location that's being managed (internal stage / external volume)
*/
interface StorageManager<T, TLocation> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, IStorageManager is better

* The StreamingIngestRequest interface is a marker interface used for type safety in the {@link
* SnowflakeServiceClient} for streaming ingest API request.
*/
interface StreamingIngestRequest {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IStreamingIngestRequest?

@@ -26,7 +26,7 @@ public enum ErrorCode {
INVALID_ENCRYPTED_KEY("0018"),
INVALID_DATA_IN_CHUNK("0019"),
IO_ERROR("0020"),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, I tagged you in another thread, please don't modify existing error codes, just add new ones, see #794 (comment)

@sfc-gh-alhuang sfc-gh-alhuang merged commit e808bbd into master Jul 23, 2024
48 checks passed
@sfc-gh-alhuang sfc-gh-alhuang deleted the iceberg-support-refactor branch July 23, 2024 02:22
sfc-gh-kgaputis pushed a commit to sfc-gh-kgaputis/snowflake-ingest-java that referenced this pull request Sep 12, 2024
Co-authored-by: Hitesh Madan <hitesh.madan@snowflake.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants