This repository has been archived by the owner on Sep 23, 2024. It is now read-only.
The recharge pipeline is currently facing two problems:
It occasionally never finishes and appears to run indefinitely, loading no data into Redshift.
It loads only 50 rows into Redshift no matter what batch size I set.
The batch_size_rows setting is currently 1000. With batch_size_rows set to 100, the pipeline has loaded 100 rows on two occasions, but it has also loaded only 50 rows at that same setting.
This is while using the following components:
Extractor:
name: dev-tap-recharge-subscriptions
inherits from: tap-recharge-subscriptions
Loader:
name: target-redshift-pipelinewise-recharge
variant: transferwise
inherits from: target-redshift
Pipelines:
name: dev-recharge-pipelinewise
Will not finish running and will run infinitely
When the pipeline runs indefinitely, the log line "INFO METRIC: {"type": "timer", "metric": "http_request_duration", "value": 1.217686414718628, "tags": {"http_status_code": 200, "status": "succeeded"}} cmd_type=extractor job_id=dev-recharge-pipelinewise name=dev-tap-recharge-subscriptions run_id=147b6430-da73-4bae-8eb9-26e5aaa733de stdio=stderr" is produced continuously. The pipeline never finishes, and no data is loaded into Redshift. Occasionally, after I stop the pipeline manually and run it again, it completes quickly (in under 20 seconds) and data is loaded into Redshift successfully. I see no pattern in when this happens, and I haven't been able to figure out what causes it.
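One way to narrow this down, assuming the extractor is paging through the API and the repeated successful http_request_duration metrics mean it keeps requesting pages, is to check whether it is cycling over the same page. The sketch below is not the tap's actual code: `paginate`, `fetch_page`, and `max_pages` are hypothetical names used only for illustration. It stops with an error as soon as a page token repeats instead of requesting forever.

```python
from typing import Callable, Iterator, List, Optional, Set, Tuple

# A page is (records, next_token); next_token is None on the last page.
Page = Tuple[List[dict], Optional[str]]


def paginate(
    fetch_page: Callable[[Optional[str]], Page],  # hypothetical API wrapper
    max_pages: int = 10_000,
) -> Iterator[dict]:
    """Yield records page by page, failing fast if pagination loops."""
    seen: Set[str] = set()
    token: Optional[str] = None
    for _ in range(max_pages):
        records, next_token = fetch_page(token)
        yield from records
        if next_token is None:
            return  # normal end of the stream
        if next_token in seen:
            # The API handed back a token we already followed: we would loop forever.
            raise RuntimeError(f"pagination loop detected at token {next_token!r}")
        seen.add(next_token)
        token = next_token
    raise RuntimeError(f"gave up after {max_pages} pages")
```

If a guard like this fires, the problem is in how the next page is computed; if it does not, the extractor may simply be walking a very large result set.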
Only loads 50 rows into redshift no matter what it is set to
The second problem is that the pipeline loads only 50 rows into Redshift no matter what batch size I set. The batch_size_rows setting is currently 1000. With batch_size_rows set to 100, the pipeline has loaded 100 rows on two occasions, but it has also loaded only 50 rows at that same setting.
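For reference, here is a minimal sketch of how a row-count batch setting usually behaves in a buffering loader, assuming batch_size_rows is a flush threshold rather than a cap or target for the number of rows loaded. This is not target-redshift's actual code; `flush_batches` and `rows_loaded` are hypothetical names.

```python
from typing import Iterable, Iterator, List


def flush_batches(records: Iterable[dict], batch_size_rows: int) -> Iterator[List[dict]]:
    """Group records into batches the way a buffering loader typically would."""
    batch: List[dict] = []
    for record in records:
        batch.append(record)
        if len(batch) >= batch_size_rows:
            yield batch  # buffer is full: flush it
            batch = []
    if batch:
        yield batch  # end of stream: flush the final partial batch


def rows_loaded(records_emitted: int, batch_size_rows: int) -> int:
    """Total rows written for a run in which the extractor emits records_emitted rows."""
    records = ({"id": i} for i in range(records_emitted))
    return sum(len(batch) for batch in flush_batches(records, batch_size_rows))


# Under this assumption, the total depends only on what the extractor emits.
print(rows_loaded(50, 1000), rows_loaded(50, 100), rows_loaded(100, 100))
```

If that assumption holds, a run in which the extractor emits 50 records loads 50 rows at any batch size, so the row count would point at how many records the extractor emits per run rather than at the loader setting.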
Troubleshooting Techniques tried
Ways I’ve tried to troubleshoot these issues
Transferwise/Pipelinewise Variant Loader(s):
Creating and configuring a new loader using transferwise to load recharge data
Manually configuring the datamill loader in meltano.yml
Setting the replication key (updated_at) to ascending OR descending in streams.py
Creating a state file for the pipeline to start from, or copying a state file from a working pipeline to use for the recharge pipeline
Recharge Issue Summary: Transferwise
Meltano Version 1.102.0
Linux System
Redshift Database
tap recharge repo: link
target redshift (transferwise): link
infinity pipelines
Reproducing error:
Run pipeline into Redshift table (manually)
Documented Conversations
| Conversation link | Location | Topics discussed |
| --- | --- | --- |
| CRITICAL cursor already closed / connection already closed · Issue #48 · datamill-co/target-redshift | GitHub issue, target-redshift (datamill) | SSL connection has been closed unexpectedly |
| https://meltano.slack.com/archives/C01TCRBBJD7/p1649777577117929 | Slack | SSL connection has been closed unexpectedly |
| https://meltano.slack.com/archives/C01UTUSP34M/p1654122008077529 | Slack | Batch size and SSL connection |