🚨🚨✨ Source Instagram: Add primary keys for UserLifetimeInsights and UserInsights; add airbyte_type to timestamp fields #32500

Merged: 10 commits, Nov 17, 2023


Original file line number Diff line number Diff line change
@@ -7,7 +7,7 @@ data:
connectorSubtype: api
connectorType: source
definitionId: 6acf6b55-4f1e-4fca-944e-1a3caef8aba8
-dockerImageTag: 1.0.16
+dockerImageTag: 2.0.0
dockerRepository: airbyte/source-instagram
githubIssueLabel: source-instagram
icon: instagram.svg
@@ -19,6 +19,13 @@ data:
oss:
enabled: true
releaseStage: generally_available
releases:
breakingChanges:
2.0.0:
message:
This release introduces a default primary key for the streams UserLifetimeInsights and UserInsights.
Additionally, the format of timestamp fields has been updated in the UserLifetimeInsights, UserInsights, Media and Stories streams to include timezone information.
upgradeDeadline: "2023-12-03"
suggestedStreams:
streams:
- media
Original file line number Diff line number Diff line change
@@ -53,7 +53,8 @@
},
"timestamp": {
"type": ["null", "string"],
-"format": "date-time"
+"format": "date-time",
+"airbyte_type": "timestamp_with_timezone"
},
"username": {
"type": ["null", "string"]
@@ -94,7 +95,8 @@
},
"timestamp": {
"type": ["null", "string"],
-"format": "date-time"
+"format": "date-time",
+"airbyte_type": "timestamp_with_timezone"
},
"username": {
"type": ["null", "string"]
Original file line number Diff line number Diff line change
@@ -47,7 +47,8 @@
},
"timestamp": {
"type": ["null", "string"],
-"format": "date-time"
+"format": "date-time",
+"airbyte_type": "timestamp_with_timezone"
},
"username": {
"type": ["null", "string"]
Original file line number Diff line number Diff line change
@@ -9,7 +9,8 @@
},
"date": {
"type": ["null", "string"],
-"format": "date-time"
+"format": "date-time",
+"airbyte_type": "timestamp_with_timezone"
},
"follower_count": {
"type": ["null", "integer"]
Original file line number Diff line number Diff line change
@@ -9,7 +9,8 @@
},
"date": {
"type": ["null", "string"],
-"format": "date-time"
+"format": "date-time",
+"airbyte_type": "timestamp_with_timezone"
},
"metric": {
"type": ["null", "string"]
Original file line number Diff line number Diff line change
@@ -11,6 +11,7 @@
import pendulum
from airbyte_cdk.models import SyncMode
from airbyte_cdk.sources.streams import IncrementalMixin, Stream
from airbyte_cdk.sources.utils.transform import TransformConfig, TypeTransformer
from cached_property import cached_property
from facebook_business.adobjects.igmedia import IGMedia
from facebook_business.exceptions import FacebookRequestError
@@ -19,6 +20,29 @@
from .common import remove_params_from_url


class DatetimeTransformerMixin:
transformer: TypeTransformer = TypeTransformer(TransformConfig.CustomSchemaNormalization)

@staticmethod
@transformer.registerCustomTransform
def custom_transform_datetime_rfc3339(original_value, field_schema):
"""
Transform datetime string to RFC 3339 format
"""
if (
original_value
and "format" in field_schema
and field_schema["format"] == "date-time"
and field_schema["airbyte_type"] == "timestamp_with_timezone"
):
# Parse the ISO format timestamp
dt = pendulum.parse(original_value)

# Convert to RFC 3339 format
return dt.to_rfc3339_string()
return original_value
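
For readers without `pendulum` installed, the conversion performed by the transform above can be approximated with the standard library. This is a minimal sketch, not the connector's code: the function name `to_rfc3339` is hypothetical, it assumes the API returns timestamps shaped like `2020-05-04T07:00:00+0000` (the form used in the updated unit tests), and it reads `airbyte_type` with `.get` so that a schema lacking that key is passed through instead of raising `KeyError`.

```python
from datetime import datetime


def to_rfc3339(original_value, field_schema):
    """Hypothetical stand-in for the pendulum-based transform (stdlib only)."""
    is_tz_timestamp = (
        field_schema.get("format") == "date-time"
        and field_schema.get("airbyte_type") == "timestamp_with_timezone"
    )
    # Empty values and fields not marked as timezone-aware are left unchanged.
    if not original_value or not is_tz_timestamp:
        return original_value
    # Assumed input shape: "YYYY-MM-DDTHH:MM:SS+HHMM".
    parsed = datetime.strptime(original_value, "%Y-%m-%dT%H:%M:%S%z")
    # isoformat() writes the offset with a colon, as RFC 3339 requires.
    return parsed.isoformat()


schema = {
    "type": ["null", "string"],
    "format": "date-time",
    "airbyte_type": "timestamp_with_timezone",
}
converted = to_rfc3339("2020-05-04T07:00:00+0000", schema)
untouched = to_rfc3339("2020-05-04T07:00:00+0000", {"format": "date-time"})
print(converted)  # 2020-05-04T07:00:00+00:00
print(untouched)  # 2020-05-04T07:00:00+0000
```

The only visible change to a record is the offset, `+0000` becoming `+00:00`, which is why the release notes describe it as a format change to timestamp fields.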


class InstagramStream(Stream, ABC):
"""Base stream class"""

@@ -121,10 +145,10 @@ def read_records(
yield self.transform(record)


-class UserLifetimeInsights(InstagramStream):
+class UserLifetimeInsights(DatetimeTransformerMixin, InstagramStream):
"""Docs: https://developers.facebook.com/docs/instagram-api/reference/ig-user/insights"""

-primary_key = None
+primary_key = ["business_account_id", "metric", "date"]
LIFETIME_METRICS = ["audience_city", "audience_country", "audience_gender_age", "audience_locale"]
period = "lifetime"

@@ -156,7 +180,7 @@ def request_params(
return params


-class UserInsights(InstagramIncrementalStream):
+class UserInsights(DatetimeTransformerMixin, InstagramIncrementalStream):
"""Docs: https://developers.facebook.com/docs/instagram-api/reference/ig-user/insights"""

METRICS_BY_PERIOD = {
@@ -176,7 +200,7 @@ class UserInsights(InstagramIncrementalStream):
"lifetime": ["online_followers"],
}

-primary_key = None
+primary_key = ["business_account_id", "date"]
cursor_field = "date"

# For some metrics we can only get insights not older than 30 days, it is Facebook policy
@@ -295,7 +319,7 @@ def _state_has_legacy_format(self, state: Mapping[str, Any]) -> bool:
return False


-class Media(InstagramStream):
+class Media(DatetimeTransformerMixin, InstagramStream):
"""Children objects can only be of the media_type == "CAROUSEL_ALBUM".
And children object does not support INVALID_CHILDREN_FIELDS fields,
so they are excluded when trying to get child objects to avoid the error
@@ -403,7 +427,7 @@ def _get_insights(self, item, account_id) -> Optional[MutableMapping[str, Any]]:
raise error


-class Stories(InstagramStream):
+class Stories(DatetimeTransformerMixin, InstagramStream):
"""Docs: https://developers.facebook.com/docs/instagram-api/reference/ig-user/stories"""

def read_records(
Original file line number Diff line number Diff line change
@@ -208,9 +208,9 @@ def test_user_lifetime_insights_read(api, config, user_insight_data, requests_mo
@pytest.mark.parametrize(
"values,expected",
[
-({"end_time": "test_end_time", "value": "test_value"}, {"date": "test_end_time", "value": "test_value"}),
+({"end_time": "2020-05-04T07:00:00+0000", "value": "test_value"}, {"date": "2020-05-04T07:00:00+0000", "value": "test_value"}),
({"value": "test_value"}, {"date": None, "value": "test_value"}),
-({"end_time": "test_end_time"}, {"date": "test_end_time", "value": None}),
+({"end_time": "2020-05-04T07:00:00+0000"}, {"date": "2020-05-04T07:00:00+0000", "value": None}),
({}, {"date": None, "value": None}),
],
ids=[
9 changes: 9 additions & 0 deletions docs/integrations/sources/instagram-migrations.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,9 @@
# Instagram Migration Guide

## Upgrading to 2.0.0

This release adds a default primary key for the streams UserLifetimeInsights and UserInsights, and updates the format of timestamp fields in the UserLifetimeInsights, UserInsights, Media and Stories streams to include timezone information.

To ensure uninterrupted syncs, users should:
- Refresh the source schema
- Reset affected streams
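
To illustrate why the new primary key matters, the sketch below shows how a composite key such as `["business_account_id", "date"]` lets a consumer collapse repeated rows for the same account and day. It is an illustration only: the helper `dedupe_by_primary_key` and the sample records are hypothetical, and the "last record wins" rule is an assumption about deduplicating sync modes, not a description of any specific destination.

```python
def dedupe_by_primary_key(records, primary_key):
    """Keep the last record seen for each primary-key tuple (hypothetical helper)."""
    latest = {}
    for record in records:
        # Build the composite key from the configured primary-key fields.
        key = tuple(record.get(field) for field in primary_key)
        latest[key] = record
    # Dicts preserve insertion order, so results follow first appearance of each key.
    return list(latest.values())


# Made-up UserInsights-style rows; two of them share the same account and date.
records = [
    {"business_account_id": "acct_1", "date": "2020-05-04T07:00:00+00:00", "follower_count": 10},
    {"business_account_id": "acct_1", "date": "2020-05-04T07:00:00+00:00", "follower_count": 12},
    {"business_account_id": "acct_1", "date": "2020-05-05T07:00:00+00:00", "follower_count": 15},
]
deduped = dedupe_by_primary_key(records, ["business_account_id", "date"])
print(len(deduped))  # 2
```

Rows written before the upgrade carry the old timestamp format, so their `date` values would not match keys built from newly synced rows; this is one plausible reason the guide asks for a reset of the affected streams.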
1 change: 1 addition & 0 deletions docs/integrations/sources/instagram.md
Original file line number Diff line number Diff line change
@@ -93,6 +93,7 @@ AirbyteRecords are required to conform to the [Airbyte type](https://docs.airbyt

| Version | Date | Pull Request | Subject |
|:--------|:-----------|:---------------------------------------------------------|:--------------------------------------------------------------------------------------------------------------------------|
| 2.0.0 | 2023-11-17 | [32500](https://github.com/airbytehq/airbyte/pull/32500) | Add primary keys for UserLifetimeInsights and UserInsights; add airbyte_type to timestamp fields |
| 1.0.16 | 2023-11-17 | [32627](https://github.com/airbytehq/airbyte/pull/32627) | Fix start_date type; fix docs |
| 1.0.15 | 2023-11-14 | [32494](https://github.com/airbytehq/airbyte/pull/32494) | Marked start_date as optional; set max retry time to 10 minutes; add suggested streams |
| 1.0.14 | 2023-11-13 | [32423](https://github.com/airbytehq/airbyte/pull/32423) | Capture media_product_type column in media and stories stream |