🎉 Source Pinterest: Add report stream for CAMPAIGN level #28672

Merged
lazebnyi merged 3 commits into master from lazebnyi/2010-add-report-stream
Jul 31, 2023

lazebnyi
Collaborator

@lazebnyi lazebnyi commented Jul 25, 2023

What

This change works around the 90-day retrieval limit for Pinterest analytics. The campaign_analytics stream currently returns data for at most the last 90 days, so data that our customers need for informed decisions is missing. By using the endpoint POST https://api.pinterest.com/v5/ad_accounts/{ad_account_id}/reports, we can retrieve data up to 914 days back at DAY granularity.
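As a rough illustration of the two lookback windows mentioned above, the sketch below clamps a requested start date to the oldest date each approach can serve. The 90 and 914 day figures come from this description; the function name `clamp_start_date` and the sample dates are assumptions made for the example, not connector code.

```python
from datetime import date, timedelta

# Lookback limits quoted in the PR description.
ANALYTICS_LOOKBACK_DAYS = 90
REPORT_LOOKBACK_DAYS = 914


def clamp_start_date(requested: date, today: date, max_lookback_days: int) -> date:
    """Return the requested start date, moved forward if it is older than the allowed window."""
    earliest = today - timedelta(days=max_lookback_days)
    return max(requested, earliest)


# Sample values chosen only for illustration.
today = date(2023, 7, 25)
requested = date(2021, 1, 1)

analytics_start = clamp_start_date(requested, today, ANALYTICS_LOOKBACK_DAYS)
report_start = clamp_start_date(requested, today, REPORT_LOOKBACK_DAYS)
print(analytics_start, report_start)
```

With these sample dates the report window reaches roughly two years further back than the analytics window.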

https://github.com/airbytehq/oncall/issues/2010

How

The solution involves the following steps:

  • Generating the Report: Instead of relying on the campaign_analytics stream, we POST a report request to the new Pinterest endpoint, specifying the desired date range and metrics.
  • Waiting for Report Completion: After submitting the report request, we wait for Pinterest to process the data and make the report ready for download.
  • Downloading Enriched Data: Once the report is ready, we fetch it from the URL provided in the response, obtaining a dataset that includes data beyond the 90-day restriction.
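The three steps above can be sketched roughly as follows. This is a minimal illustration, not the connector's implementation: the HTTP calls are passed in as plain callables so the flow can run without network access, and the names `run_report`, `create_report`, `get_status`, `download`, the `token` value and the `IN_PROGRESS` status are assumptions. The failure statuses and the `report_status` / `url` fields mirror names that appear in the code under review.

```python
import time
from typing import Callable, Dict, List

# Terminal failure statuses named in the reviewed code.
TERMINAL_FAILURES = {"DOES_NOT_EXIST", "EXPIRED", "FAILED", "CANCELLED"}


class ReportGenerationFailure(Exception):
    """Raised when the report ends in a terminal failure status."""


def run_report(
    create_report: Callable[[], str],
    get_status: Callable[[str], Dict[str, str]],
    download: Callable[[str], List[dict]],
    poll_interval: float = 1.0,
    max_polls: int = 10,
    sleep: Callable[[float], None] = time.sleep,
) -> List[dict]:
    """Create a report, wait until it is finished, then download its rows."""
    token = create_report()  # step 1: request the report
    for _ in range(max_polls):  # step 2: wait for completion
        status = get_status(token)
        state = status["report_status"]
        if state == "FINISHED":
            return download(status["url"])  # step 3: fetch the data
        if state in TERMINAL_FAILURES:
            raise ReportGenerationFailure(f"Report {token} ended with status {state}")
        sleep(poll_interval)
    raise TimeoutError(f"Report {token} was not ready after {max_polls} polls")


# Usage with in-memory fakes standing in for the Pinterest API.
statuses = iter(
    [
        {"report_status": "IN_PROGRESS"},
        {"report_status": "FINISHED", "url": "https://example.com/report"},
    ]
)
rows = run_report(
    create_report=lambda: "token-1",
    get_status=lambda token: next(statuses),
    download=lambda url: [{"CAMPAIGN_ID": "1", "source": url}],
    sleep=lambda seconds: None,
)
print(rows)
```

Injecting the callables keeps the waiting logic separate from the transport, which also makes the loop easy to unit test.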

🚨 User Impact 🚨

No

Pre-merge Actions

Updating a connector

Community member or Airbyter

  • Grant edit access to maintainers (instructions)
  • Unit & integration tests added

Airbyter

If this is a community PR, the Airbyte engineer reviewing this PR is responsible for the below items.

  • Create a non-forked branch based on this PR and test the below items on it
  • Build is successful
  • If new credentials are required for use in CI, add them to GSM. Instructions.

@octavia-squidington-iii octavia-squidington-iii added area/connectors Connector related issues area/documentation Improvements or additions to documentation connectors/source/pinterest labels Jul 25, 2023
@github-actions
Contributor

github-actions bot commented Jul 25, 2023

Before Merging a Connector Pull Request

Wow! What a great pull request you have here! 🎉

To merge this PR, ensure the following has been done/considered for each connector added or updated:

  • PR name follows PR naming conventions
  • Breaking changes are considered. If a Breaking Change is being introduced, ensure an Airbyte engineer has created a Breaking Change Plan.
  • Connector version has been incremented in the Dockerfile and metadata.yaml according to our Semantic Versioning for Connectors guidelines
  • You've updated the connector's metadata.yaml file with any other relevant changes, including a breakingChanges entry for major version bumps. See metadata.yaml docs
  • Secrets in the connector's spec are annotated with airbyte_secret
  • All documentation files are up to date. (README.md, bootstrap.md, docs.md, etc...)
  • Changelog updated in docs/integrations/<source or destination>/<name>.md with an entry for the new version. See changelog example
  • Migration guide updated in docs/integrations/<source or destination>/<name>-migrations.md with an entry for the new version, if the version is a breaking change. See migration guide example
  • If set, you've ensured the icon is present in the platform-internal repo. (Docs)

If the checklist is complete, but the CI check is failing,

  1. Check for hidden checklists in your PR description

  2. Toggle the github label checklist-action-run on/off to re-run the checklist CI.

@octavia-squidington-iii
Collaborator

source-pinterest test report (commit 97cffbfd29) - ❌

⏲️ Total pipeline duration: 04mn51s

Step Result
Validate airbyte-integrations/connectors/source-pinterest/metadata.yaml
Connector version semver check
QA checks
Code format checks
Connector package install
Build source-pinterest docker image for platform linux/x86_64
Unit tests
Acceptance tests

🔗 View the logs here

Please note that tests only run on PRs that are ready for review. Set your PR to draft mode to avoid flooding the CI engine and upstream service on subsequent commits.
You can run the same pipeline locally on this branch with the airbyte-ci tool using the following command:

airbyte-ci connectors --name=source-pinterest test

@octavia-squidington-iii
Collaborator

source-pinterest test report (commit 098b5ce9fd) - ❌

⏲️ Total pipeline duration: 04mn09s

Step Result
Validate airbyte-integrations/connectors/source-pinterest/metadata.yaml
Connector version semver check
QA checks
Code format checks
Connector package install
Build source-pinterest docker image for platform linux/x86_64
Unit tests
Acceptance tests

🔗 View the logs here

@octavia-squidington-iii
Collaborator

source-pinterest test report (commit 6a8977f6c2) - ✅

⏲️ Total pipeline duration: 09mn02s

Step Result
Validate airbyte-integrations/connectors/source-pinterest/metadata.yaml
Connector version semver check
QA checks
Code format checks
Connector package install
Build source-pinterest docker image for platform linux/x86_64
Unit tests
Acceptance tests

🔗 View the logs here

yield record

def should_retry(self, response: requests.Response) -> bool:
    if isinstance(response.json(), dict):
Collaborator

This check makes no sense; let's remove it.

Collaborator Author

  1. This is the original code, just moved to the stream file.
  2. response.json() can return a list object as well.
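To illustrate the second point, here is a small sketch of why a type guard matters when the decoded body may be either a dict or a list. It is not the connector's `should_retry`: the helper names `parse_body` and `payload_requests_retry`, the `code` field and the retryable code value are hypothetical, and treating an empty or non-JSON body as "do not retry" is an assumption made for the example.

```python
import json
from typing import Any, FrozenSet

# Hypothetical error codes treated as retryable; not taken from the connector.
RETRYABLE_CODES: FrozenSet[int] = frozenset({8})


def parse_body(text: str) -> Any:
    """Parse a response body, returning None when it is not valid JSON."""
    try:
        return json.loads(text)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return None


def payload_requests_retry(payload: Any) -> bool:
    """Only dict payloads can carry an error code; lists and None never trigger a retry."""
    if isinstance(payload, dict):
        return payload.get("code") in RETRYABLE_CODES
    return False


print(payload_requests_retry(parse_body('{"code": 8, "message": "busy"}')))
print(payload_requests_retry(parse_body('[{"id": "1"}, {"id": "2"}]')))
print(payload_requests_retry(parse_body("")))
```

Without the `isinstance` guard, calling `.get` on a list payload would raise `AttributeError`.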

report_info.report_status = report_status
if report_status in {ReportStatus.DOES_NOT_EXIST, ReportStatus.EXPIRED, ReportStatus.FAILED, ReportStatus.CANCELLED}:
    message = "Report generation failed."
    raise ReportGenerationFailure(message)
Collaborator

I suggest keeping a queue of reports: append each failed report to it and retry those reports once we finish iterating over the queue. This should be repeated a couple of times if we keep getting the same error for a report.

Collaborator Author

Building a queue of reports requires overriding the logic at the read method level, because read_records is only aware of one slice at a time. So I think we can keep this solution to fix the oncall issue and add the more complex logic in a follow-up.
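For reference, the queue idea discussed in this thread could look roughly like the sketch below. It is deliberately detached from the CDK's read and read_records methods: `process_with_retry_queue`, the `generate` callable and the `max_attempts` default are assumptions for illustration, and `ReportGenerationFailure` is redefined here only to keep the example self-contained.

```python
from collections import deque
from typing import Callable, Deque, Dict, Iterable, List, Tuple


class ReportGenerationFailure(Exception):
    """Stand-in for the connector's exception of the same name."""


def process_with_retry_queue(
    slices: Iterable[str],
    generate: Callable[[str], str],
    max_attempts: int = 3,
) -> Tuple[Dict[str, str], List[str]]:
    """Process every slice; failed slices go to the back of the queue and are retried later."""
    queue: Deque[Tuple[str, int]] = deque((slice_id, 1) for slice_id in slices)
    results: Dict[str, str] = {}
    failed: List[str] = []
    while queue:
        slice_id, attempt = queue.popleft()
        try:
            results[slice_id] = generate(slice_id)
        except ReportGenerationFailure:
            if attempt < max_attempts:
                queue.append((slice_id, attempt + 1))  # retry after the remaining slices
            else:
                failed.append(slice_id)  # give up on this slice
    return results, failed


# Usage: slice "b" fails once and then succeeds, slice "c" always fails.
attempts: Dict[str, int] = {}


def flaky_generate(slice_id: str) -> str:
    attempts[slice_id] = attempts.get(slice_id, 0) + 1
    if slice_id == "c" or (slice_id == "b" and attempts[slice_id] == 1):
        raise ReportGenerationFailure("Report generation failed.")
    return f"report-{slice_id}"


results, failed = process_with_retry_queue(["a", "b", "c"], flaky_generate)
print(results, failed)
```

Because the queue has to see all slices at once, this kind of logic would sit above the per-slice level, which matches the concern raised in the reply.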


pending_report_status = [report_info for report_info in report_infos if report_info.report_status != ReportStatus.FINISHED]

if len(pending_report_status) > 0:
Collaborator

This should change as well if we implement a queue of reports.

Collaborator Author

raise ReportStatusError(error)
return report_status.report_status, report_status.url

def _fetch_report_data(self, url: str) -> dict:
Collaborator

Perhaps we should wrap _fetch_report_data and _verify_report_status in streams to make them retriable and to unify the interfaces.

Collaborator Author

I understand that _fetch_report_data and _verify_report_status are similar in nature, but they serve distinct purposes. Leaving them as they are keeps the code clear, because each function stays focused on its own task instead of combining the two.

Collaborator

I didn't mean wrapping them in a single stream, but rather in two separate streams. It's OK to leave this for later, once the oncall issue is resolved.
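As a point of comparison for the retriability part of this thread, the sketch below shows a plain retry decorator with exponential backoff. This is a different technique from wrapping the helpers in streams and does not use the CDK's retry machinery; the decorator name `retriable`, its parameters and the stand-in `fetch_report_data` function are all assumptions for illustration.

```python
import functools
import time
from typing import Any, Callable, Tuple, Type


def retriable(
    exceptions: Tuple[Type[BaseException], ...],
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], Any] = time.sleep,
) -> Callable:
    """Retry the wrapped function on the given exceptions, doubling the delay each time."""

    def decorator(func: Callable) -> Callable:
        @functools.wraps(func)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except exceptions:
                    if attempt == max_attempts:
                        raise  # out of attempts: surface the last error
                    sleep(base_delay * 2 ** (attempt - 1))

        return wrapper

    return decorator


# Usage: a stand-in fetch function that fails twice before succeeding.
calls = {"count": 0}
delays = []


@retriable((ConnectionError,), max_attempts=3, base_delay=0.5, sleep=delays.append)
def fetch_report_data(url: str) -> dict:
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("transient failure")
    return {"url": url, "rows": []}


data = fetch_report_data("https://example.com/report")
print(data, delays)
```

A decorator like this keeps the two helpers separate, as the author prefers, while still giving both the same retry behavior.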

@lazebnyi lazebnyi requested a review from davydov-d July 26, 2023 12:04
@lazebnyi lazebnyi merged commit 6768281 into master Jul 31, 2023
@lazebnyi lazebnyi deleted the lazebnyi/2010-add-report-stream branch July 31, 2023 09:54
@sentry-io

sentry-io bot commented Jul 31, 2023

Suspect Issues

This pull request has been deployed and Sentry has observed the following issues:

  • ‼️ requests.exceptions.JSONDecodeError: Expecting value: line 1 column 1 (char 0) /airbyte/integration_code/source_pinterest/stre... View Issue
  • ‼️ requests.exceptions.JSONDecodeError: Expecting value: line 1 column 1 (char 0) /airbyte/integration_code/source_pinterest/stre... View Issue

Labels
area/connectors Connector related issues area/documentation Improvements or additions to documentation connectors/source/pinterest
3 participants