PageNumberPaginator not working with nested resources #1921

Closed
paul-godhouse opened this issue Oct 3, 2024 · 0 comments · Fixed by #1924
Labels
bug Something isn't working

dlt version

1.1.0

Describe the problem

I'm implementing a connector from a REST API to Snowflake and found what seems to be a bug.

I first call the /workflows endpoint, which returns a list of workflows in a single page.
I then call a second endpoint, /workflows/{workflow_uid}/jobs, which returns a list of jobs for each workflow. This endpoint is paginated with a page param.
Let's say I have two workflows (workflow_uid_1 and workflow_uid_2).
dlt starts by making these requests:

  • /workflows/workflow_uid_1/jobs?page=0
  • /workflows/workflow_uid_1/jobs?page=1
  • /workflows/workflow_uid_1/jobs?page=2
  • /workflows/workflow_uid_1/jobs?page=3

It continues until a page comes back empty, which is the intended behavior.
dlt then switches to workflow_uid_2 but starts at page 3 instead of page 0:

  • /workflows/workflow_uid_2/jobs?page=3
  • /workflows/workflow_uid_2/jobs?page=4
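
The request sequence above can be reproduced with a small toy model. This is only a sketch, not dlt's actual code: ToyPagePaginator, fetch_jobs and the PAGES_WITH_DATA counts are hypothetical names and values made up for illustration. It assumes a single paginator instance is shared across all parent workflows and that nothing resets its page counter.

```python
# Toy model only: NOT dlt's paginator implementation.
class ToyPagePaginator:
    """Keeps the current page on the instance and never resets it."""

    def __init__(self, base_page: int = 0) -> None:
        self.current_page = base_page

    def init_request(self, params: dict) -> None:
        # Reuses whatever page the previous parent ended on.
        params["page"] = self.current_page

    def next_page(self, params: dict) -> None:
        self.current_page += 1
        params["page"] = self.current_page


# Hypothetical data: number of non-empty pages per workflow.
PAGES_WITH_DATA = {"workflow_uid_1": 3, "workflow_uid_2": 4}


def fetch_jobs(paginator: ToyPagePaginator, workflow_uids: list) -> list:
    """Return the request URLs issued while paging through each workflow."""
    requests_made = []
    for uid in workflow_uids:
        params: dict = {}
        paginator.init_request(params)
        while True:
            page = params["page"]
            requests_made.append(f"/workflows/{uid}/jobs?page={page}")
            if page >= PAGES_WITH_DATA[uid]:
                break  # empty page: stop paging this workflow
            paginator.next_page(params)
    return requests_made


requests_made = fetch_jobs(ToyPagePaginator(), ["workflow_uid_1", "workflow_uid_2"])
for url in requests_made:
    print(url)
```

In this model, pages 0 to 2 of workflow_uid_2 are never requested, so their jobs would be silently missing from the load.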

Expected behavior

The requests made to /workflows/workflow_uid_2/jobs should start with page=0.

Steps to reproduce

See description

Operating system

macOS

Runtime environment

Local

Python version

3.11

dlt data source

rest_api

dlt destination

Snowflake

Other deployment details

No response

Additional information

The issue can be fixed by adding two lines of code to the init_request method of RangePaginator:

def init_request(self, request: Request) -> None:
    # Reset state so that each new parent resource starts from the first page.
    self._has_next_page = True
    self.current_value = self.base_index
    if request.params is None:
        request.params = {}

    request.params[self.param_name] = self.current_value

This way, the paginator is reset each time it starts on a new parent resource.
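
To illustrate the effect of resetting in init_request, here is a minimal, self-contained sketch. It is a toy model rather than dlt's RangePaginator: ResettingPagePaginator, pages_requested and the page counts are hypothetical, and the only idea borrowed from the proposal is that init_request restores both the page counter and the has-next-page flag.

```python
# Toy model only: NOT dlt's RangePaginator.
class ResettingPagePaginator:
    """Toy paginator whose init_request restores the initial state."""

    def __init__(self, base_page: int = 0) -> None:
        self.base_page = base_page
        self.current_page = base_page
        self.has_next_page = True

    def init_request(self, params: dict) -> None:
        # Reset for every new parent, mirroring the two proposed lines.
        self.has_next_page = True
        self.current_page = self.base_page
        params["page"] = self.current_page

    def update(self, params: dict, data: list) -> None:
        if not data:
            # An empty page ends pagination for the current parent.
            self.has_next_page = False
            return
        self.current_page += 1
        params["page"] = self.current_page


def pages_requested(paginator: ResettingPagePaginator, pages_with_data: int) -> list:
    """Page through one parent and return the page numbers requested."""
    params: dict = {}
    paginator.init_request(params)
    seen = []
    while paginator.has_next_page:
        page = params["page"]
        seen.append(page)
        # Simulated API: pages below pages_with_data contain one job.
        data = ["job"] if page < pages_with_data else []
        paginator.update(params, data)
    return seen


paginator = ResettingPagePaginator()
first = pages_requested(paginator, pages_with_data=3)   # e.g. workflow_uid_1
second = pages_requested(paginator, pages_with_data=4)  # e.g. workflow_uid_2
print(first, second)
```

Although the same paginator instance is reused, the second parent starts again from the base page, which is the behavior the issue asks for.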
