Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New fastsync branch #11

Merged
merged 1 commit into from
Aug 11, 2022
Merged

New fastsync branch #11

merged 1 commit into from
Aug 11, 2022

Conversation

brose7230
Copy link
Collaborator

No description provided.

@brose7230 brose7230 self-assigned this Jun 9, 2022
Copy link

@eric-roll eric-roll left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, just a couple small questions/discussions if you want to take a look.

import singer
from singer import metadata
from singer import utils
import singer.metrics as metrics

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove unused imports

@@ -32,7 +42,19 @@ def generate_bookmark_keys(catalog_entry):
bookmark_keys = base_bookmark_keys

return bookmark_keys

def write_dataframe_record(row, catalog_entry, stream_version, columns, table_stream, time_extracted):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are catalog_entry or columns used anywhere? If not can we remove them?

)

columns.sort()
select_sql = common.fast_sync_generate_select_sql(catalog_entry, columns)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we already discussed this, but again just to confirm - we want full syncs to always default to fast sync right? We don't even want the option to do a full sync without it?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's ok if that's what we want to do, but I think my preference is to provide the option to use either fast sync or the Singer spec for the full table refresh. That could look something like moving all of this logic into another method called fastsync_table and keeping the old logic in sync_table. From there in init.py in the method do_sync_full_table we could add a line for if fastsync: fastsync_table else: sync_table.

This way we could turn fast sync on on a table by table basis if we wanted.

@brose7230 brose7230 merged commit d0b098a into master Aug 11, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants