✨ Source Mixpanel: add page size to configuration to increase sync speed #41976

Original file line number Diff line number Diff line change
@@ -11,7 +11,7 @@ data:
connectorSubtype: api
connectorType: source
definitionId: 12928b32-bf0a-4f1e-964f-07e12e37153a
-dockerImageTag: 3.2.4
+dockerImageTag: 3.3.0
dockerRepository: airbyte/source-mixpanel
documentationUrl: https://docs.airbyte.com/integrations/sources/mixpanel
githubIssueLabel: source-mixpanel
@@ -3,7 +3,7 @@ requires = ["poetry-core>=1.0.0"]
build-backend = "poetry.core.masonry.api"

[tool.poetry]
-version = "3.2.4"
+version = "3.3.0"
name = "source-mixpanel"
description = "Source implementation for Mixpanel."
authors = ["Airbyte <contact@airbyte.io>"]
@@ -291,7 +291,7 @@ def next_page_token(self, response, last_records: List[Mapping[str, Any]]) -> Op
if total:
self._total = total

-if self._total and page_number is not None and self._total > self.page_size * (page_number + 1):
+if self._total and page_number is not None and self._total > self._page_size * (page_number + 1):
Review comment from the PR author: `self._page_size` is used to get the real page size from PageIncrement rather than the template string '{{ config.page_size }}'.

return {"session_id": decoded_response.get("session_id"), "page": page_number + 1}
else:
self._total = None
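
The check above decides whether another page exists by comparing the reported total with the number of records covered so far. Below is a minimal Python sketch of that arithmetic, assuming zero-based page numbers (as the `page_number + 1` term suggests). `has_next_page` and `requests_needed` are hypothetical helper names for illustration, not part of the connector.

```python
import math
from typing import Optional


def has_next_page(total: Optional[int], page_size: int, page_number: Optional[int]) -> bool:
    """Return True when records remain beyond the current page.

    Assumes page_number is zero-based, so pages 0..page_number cover
    page_size * (page_number + 1) records.
    """
    if not total or page_number is None:
        return False
    return total > page_size * (page_number + 1)


def requests_needed(total: int, page_size: int) -> int:
    """Number of page requests required to read `total` records."""
    return math.ceil(total / page_size)


if __name__ == "__main__":
    # With 2500 records and 1000 per page, page 1 is followed by one more page.
    print(has_next_page(2500, 1000, 1))
    # A larger page size means fewer requests for the same number of records.
    print(requests_needed(50_000, 1000), requests_needed(50_000, 5000))
```

Under these assumptions, raising the page size from 1000 to 5000 cuts the request count for 50,000 records from 50 to 10, which is the speed-up this PR is aiming for.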
@@ -44,6 +44,10 @@ definitions:
- error_message_contains: "Query rate limit exceeded"
action: RETRY
error_message: Query rate limit exceeded.
- http_codes: [500]
error_message_contains: "unknown error"
action: RETRY
error_message: An unknown error occurred
- error_message_contains: "to_date cannot be later than today"
action: FAIL
error_message: Your project timezone must be misconfigured. Please set it to the one defined in your Mixpanel project settings.
@@ -150,7 +154,7 @@ definitions:
type: CustomPaginationStrategy
class_name: "source_mixpanel.components.EngagePaginationStrategy"
start_from_page: 1
-page_size: 1000
+page_size: '{{ config["page_size"] or 1000 }}'
page_token_option:
type: RequestOption
inject_into: request_parameter
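
The `config["page_size"] or 1000` expression in this hunk relies on truthiness: any missing or falsy value falls back to 1000. The Python sketch below mirrors that behavior, since Jinja's `or` follows the same rules as Python's. `resolve_page_size` and `DEFAULT_PAGE_SIZE` are hypothetical names used only for illustration.

```python
from typing import Any, Mapping

DEFAULT_PAGE_SIZE = 1000  # same fallback value as the template expression


def resolve_page_size(config: Mapping[str, Any]) -> int:
    """Return the configured page size, or the default when it is missing or falsy."""
    # `or` falls back for None, 0, and an absent key alike.
    return config.get("page_size") or DEFAULT_PAGE_SIZE


if __name__ == "__main__":
    print(resolve_page_size({}))
    print(resolve_page_size({"page_size": 5000}))
```

One consequence of this style of fallback is that an explicit `0` is treated the same as "not set".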
@@ -72,7 +72,17 @@ def validate_date(name: str, date_str: str, default: pendulum.date) -> pendulum.

@adapt_validate_if_testing
def _validate_and_transform(self, config: MutableMapping[str, Any]):
-project_timezone, start_date, end_date, attribution_window, select_properties_by_default, region, date_window_size, project_id = (
+(
+project_timezone,
+start_date,
+end_date,
+attribution_window,
+select_properties_by_default,
+region,
+date_window_size,
+project_id,
+page_size,
+) = (
config.get("project_timezone", "US/Pacific"),
config.get("start_date"),
config.get("end_date"),
@@ -81,6 +91,7 @@ def _validate_and_transform(self, config: MutableMapping[str, Any]):
config.get("region", "US"),
config.get("date_window_size", 30),
config.get("credentials", dict()).get("project_id"),
config.get("page_size", 1000),
)
try:
project_timezone = pendulum.timezone(project_timezone)
@@ -113,5 +124,6 @@ def _validate_and_transform(self, config: MutableMapping[str, Any]):
config["region"] = region
config["date_window_size"] = date_window_size
config["project_id"] = project_id
config["page_size"] = page_size

return config
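
The pattern in `_validate_and_transform` reads each setting with `config.get(key, default)` and writes the result back into the config. Below is a reduced sketch of that idea for three of the keys shown above. `apply_defaults` and `DEFAULTS` are hypothetical names, and the real method also validates dates and timezones, which is omitted here.

```python
from typing import Any, Dict, Mapping

# Defaults taken from the diff above; only a subset of the keys is shown.
DEFAULTS: Dict[str, Any] = {
    "region": "US",
    "date_window_size": 30,
    "page_size": 1000,
}


def apply_defaults(config: Mapping[str, Any]) -> Dict[str, Any]:
    """Return a copy of config with missing keys filled from DEFAULTS."""
    result = dict(config)  # copy so the caller's mapping is not mutated
    for key, default in DEFAULTS.items():
        # dict.get substitutes the default only when the key is absent;
        # an explicit None is passed through unchanged.
        result[key] = config.get(key, default)
    return result


if __name__ == "__main__":
    print(apply_defaults({"region": "EU"}))
```

Note that `dict.get` does not replace an explicit `None`, so a fallback based on `or` and a fallback based on `get` can behave differently for that input.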
@@ -119,6 +119,14 @@
"type": "integer",
"minimum": 1,
"default": 30
},
Review comment from the PR author: Add the page size to the user-defined configuration.

"page_size": {
"order": 9,
"title": "Page Size",
"description": "The number of records to fetch per request for the engage stream. Default is 1000. If you are experiencing long sync times with this stream, try increasing this value.",
"type": "integer",
"minimum": 1,
"default": 1000
}
}
}
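
The `page_size` property above constrains the value to an integer of at least 1. Below is a minimal sketch of checking those two constraints by hand in Python. `validate_page_size` is a hypothetical helper, not connector code, and in practice a JSON Schema validator would enforce the spec.

```python
from typing import Any


def validate_page_size(value: Any) -> int:
    """Check the constraints from the spec: type integer, minimum 1."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        raise ValueError(f"page_size must be an integer, got {type(value).__name__}")
    if value < 1:
        raise ValueError(f"page_size must be at least 1, got {value}")
    return value


if __name__ == "__main__":
    print(validate_page_size(1000))
    try:
        validate_page_size(0)
    except ValueError as exc:
        print(f"rejected: {exc}")
```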
@@ -23,6 +23,7 @@ def config(start_date):
"start_date": start_date,
"end_date": start_date.add(days=31),
"region": "US",
"page_size": 1000,
}


@@ -135,6 +135,7 @@ def test_streams_string_date(requests_mock, config_raw):
"select_properties_by_default": True,
"region": "EU",
"date_window_size": 10,
"page_size": 1000
},
True,
None,
1 change: 1 addition & 0 deletions docs/integrations/sources/mixpanel.md
@@ -58,6 +58,7 @@ Syncing huge date windows may take longer due to Mixpanel's low API rate-limits

| Version | Date | Pull Request | Subject |
|:--------|:-----------|:---------------------------------------------------------|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| 3.3.0 | 2024-07-15 | [41754](https://github.com/airbytehq/airbyte/pull/41754) | Add engage page size to configuration |
| 3.2.4 | 2024-07-13 | [41754](https://github.com/airbytehq/airbyte/pull/41754) | Update dependencies |
| 3.2.3 | 2024-07-10 | [41420](https://github.com/airbytehq/airbyte/pull/41420) | Update dependencies |
| 3.2.2 | 2024-07-09 | [41289](https://github.com/airbytehq/airbyte/pull/41289) | Update dependencies |