-
Notifications
You must be signed in to change notification settings - Fork 183
Add resume functionality for data transfer from Azure Table Storage to Cosmos DB table API #72
base: develop
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the idea is nice, but implementation is broken in multiple ways.
First, I think this should be completely agnostic to the sink that is used. Basically, if source can resume, then it doesn't matter, if sink is aware of this or not.
Second, we should not create random files behind user's back. It should be explicitly asked where user wants to store the continuation token, with an option to just present it on cancel.
Third, current logic to generate the continuation token takes dependency on something that can change in the future without breaking continuation ability. For instance, it generates the "signature" based on the JSON serialized source and target configuration. Now, if user decides to change the batch size in the sink - it will be a new signature, thus resume won't be possible.
There are multiple other issues, so I didn't even finish the review.
I think we should work on a proper design for continuation first, and then jump onto implementation. We already have IDs returned by sources, so technically we can use that as a token for source. Infrastructure knows when item was successfully inserted into the target, so we don't need to modify the sink to report that. Also, to be future-proof we should design the configuration in such a way so that it allows continuation to be stored in other places, than just the local file. We don't have to implement them now, but it should not be a breaking change later.
No description provided.