-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
New fastsync branch #11
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, just a couple small questions/discussions if you want to take a look.
import singer | ||
from singer import metadata | ||
from singer import utils | ||
import singer.metrics as metrics |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove unused imports
@@ -32,7 +42,19 @@ def generate_bookmark_keys(catalog_entry): | |||
bookmark_keys = base_bookmark_keys | |||
|
|||
return bookmark_keys | |||
|
|||
def write_dataframe_record(row, catalog_entry, stream_version, columns, table_stream, time_extracted): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are catalog_entry
or columns
used anywhere? If not can we remove them?
) | ||
|
||
columns.sort() | ||
select_sql = common.fast_sync_generate_select_sql(catalog_entry, columns) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we already discussed this, but again just to confirm - we want full syncs to always default to fast sync right? We don't even want the option to do a full sync without it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's ok if that's what we want to do, but I think my preference is to provide the option to use either fast sync or the Singer spec for the full table refresh. That could look something like moving all of this logic into another method called fastsync_table
and keeping the old logic in sync_table
. From there in init.py in the method do_sync_full_table
we could add a line for if fastsync: fastsync_table
else: sync_table
.
This way we could turn fast sync on on a table by table basis if we wanted.
No description provided.