🎉 Source Github: Add MultipleTokenAuthenticator #5223

Merged 18 commits on Aug 19, 2021
Changes from 9 commits

3 changes: 3 additions & 0 deletions airbyte-cdk/python/CHANGELOG.md
@@ -1,5 +1,8 @@
# Changelog

## 0.1.9
Contributor:

Do we need two versions for the same release? Can we just make it a single version?

Contributor Author:

Probably not, but both are already released.

Add multiple token support

## 0.1.8
Allow to fetch primary key info from singer catalog

2 changes: 1 addition & 1 deletion airbyte-cdk/python/README.md
@@ -74,7 +74,7 @@ All tests are located in the `unit_tests` directory. Run `pytest --cov=airbyte_c

1. Bump the package version in `setup.py`
2. Open a PR
3. An Airbyte member must comment `/publish-cdk --dry-run=<true or false>`. Dry runs publish to test.pypi.org.
3. An Airbyte member must comment `/publish-cdk dry-run=true` to publish the package to test.pypi.org or `/publish-cdk dry-run=false` to publish it to the real index of pypi.org.

## Coming Soon

@@ -1,11 +1,12 @@
# Initialize Auth Package
from .core import HttpAuthenticator, NoAuth
from .oauth import Oauth2Authenticator
from .token import TokenAuthenticator
from .token import MultipleTokenAuthenticator, TokenAuthenticator

__all__ = [
"HttpAuthenticator",
"NoAuth",
"Oauth2Authenticator",
"TokenAuthenticator",
"MultipleTokenAuthenticator",
]
17 changes: 17 additions & 0 deletions airbyte-cdk/python/airbyte_cdk/sources/streams/http/auth/token.py
@@ -23,10 +23,16 @@
#


from itertools import cycle
from typing import Any, Mapping

from .core import HttpAuthenticator

TOKEN_SEPARATOR = ","



class TokenAuthenticator(HttpAuthenticator):
    def __init__(self, token: str, auth_method: str = "Bearer", auth_header: str = "Authorization"):
@@ -36,3 +42,14 @@ def __init__(self, token: str, auth_method: str = "Bearer", auth_header: str = "

    def get_auth_header(self) -> Mapping[str, Any]:
        return {self.auth_header: f"{self.auth_method} {self._token}"}


class MultipleTokenAuthenticator(HttpAuthenticator):
Contributor:

Could you add a docstring explaining what this does? For example:

Uses the input list of tokens for authentication in a round-robin fashion. This allows load balancing quota consumption across multiple tokens.

    def __init__(self, tokens: str, auth_method: str = "Bearer", auth_header: str = "Authorization"):
Contributor:

This should take a list of strings rather than a comma-separated string.

Contributor Author:

Good point, updated.

        self.auth_method = auth_method
        self.auth_header = auth_header
        self._tokens = [line.strip() for line in tokens.split(TOKEN_SEPARATOR)]
        self._tokens_iter = cycle(self._tokens)

    def get_auth_header(self) -> Mapping[str, Any]:
        return {self.auth_header: f"{self.auth_method} {next(self._tokens_iter)}"}
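
The review request above (accept a list of strings rather than a comma-separated string, and document the round-robin behavior) could look roughly like the following. This is a minimal standalone sketch, not the code that was merged: `RoundRobinTokenAuth` is a hypothetical name, and it deliberately does not subclass the CDK's `HttpAuthenticator` so that it runs without `airbyte_cdk` installed. Rejecting an empty list is also an assumption, not something the PR specifies.

```python
from itertools import cycle
from typing import Any, List, Mapping


class RoundRobinTokenAuth:
    """
    Hypothetical stand-in for a list-based MultipleTokenAuthenticator.

    Uses the input list of tokens for authentication in a round-robin fashion,
    which spreads quota consumption across the tokens.
    """

    def __init__(self, tokens: List[str], auth_method: str = "Bearer", auth_header: str = "Authorization"):
        # Assumption: an empty list is a configuration error, so fail early.
        if not tokens:
            raise ValueError("at least one token is required")
        self.auth_method = auth_method
        self.auth_header = auth_header
        self._tokens = list(tokens)
        # cycle() repeats the tokens endlessly in their original order.
        self._tokens_iter = cycle(self._tokens)

    def get_auth_header(self) -> Mapping[str, Any]:
        # Each call advances to the next token.
        return {self.auth_header: f"{self.auth_method} {next(self._tokens_iter)}"}
```

Each call to `get_auth_header()` advances the cycle, so with two tokens the third call returns the first token again, matching what `test_multiple_token_authenticator` checks for the string-based version.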
2 changes: 1 addition & 1 deletion airbyte-cdk/python/setup.py
@@ -35,7 +35,7 @@

setup(
name="airbyte-cdk",
version="0.1.8",
version="0.1.9",
description="A framework for writing Airbyte Connectors.",
long_description=README,
long_description_content_type="text/markdown",
@@ -26,7 +26,7 @@
import logging

import requests
from airbyte_cdk.sources.streams.http.auth import NoAuth, Oauth2Authenticator, TokenAuthenticator
from airbyte_cdk.sources.streams.http.auth import MultipleTokenAuthenticator, NoAuth, Oauth2Authenticator, TokenAuthenticator
from requests import Response

LOGGER = logging.getLogger(__name__)
@@ -43,6 +43,16 @@ def test_token_authenticator():
    assert {"Authorization": "Bearer test-token"} == header


def test_multiple_token_authenticator():
    token = MultipleTokenAuthenticator("token1, token2")
    header1 = token.get_auth_header()
    assert {"Authorization": "Bearer token1"} == header1
    header2 = token.get_auth_header()
    assert {"Authorization": "Bearer token2"} == header2
    header3 = token.get_auth_header()
    assert {"Authorization": "Bearer token1"} == header3


def test_no_auth():
    """
    Should always return empty body, no matter how many times token is retrieved.
@@ -2,7 +2,7 @@
"sourceDefinitionId": "ef69ef6e-aa7f-4af1-a01d-ef775033524e",
"name": "GitHub",
"dockerRepository": "airbyte/source-github",
"dockerImageTag": "0.1.4",
"dockerImageTag": "0.1.5",
"documentationUrl": "https://docs.airbyte.io/integrations/sources/github",
"icon": "github.svg"
}
2 changes: 1 addition & 1 deletion airbyte-integrations/connectors/source-github/Dockerfile
Original file line number Diff line number Diff line change
@@ -12,5 +12,5 @@ RUN pip install .
ENV AIRBYTE_ENTRYPOINT "python /airbyte/integration_code/main.py"
ENTRYPOINT ["python", "/airbyte/integration_code/main.py"]

LABEL io.airbyte.version=0.1.4
LABEL io.airbyte.version=0.1.5
LABEL io.airbyte.name=airbyte/source-github
@@ -29,7 +29,7 @@
from airbyte_cdk.models import SyncMode
from airbyte_cdk.sources import AbstractSource
from airbyte_cdk.sources.streams import Stream
from airbyte_cdk.sources.streams.http.auth import TokenAuthenticator
from airbyte_cdk.sources.streams.http.auth import MultipleTokenAuthenticator, TokenAuthenticator

from .streams import (
Assignees,
@@ -52,6 +52,9 @@
)


TOKEN_SEPARATOR = ","


class SourceGithub(AbstractSource):
    def _generate_repositories(self, config: Mapping[str, Any], authenticator: TokenAuthenticator) -> List[str]:
        organizations = list(filter(None, config["organization"].split(" ")))
@@ -74,7 +77,10 @@ def _generate_repositories(self, config: Mapping[str, Any], authenticator: Token

    def check_connection(self, logger: AirbyteLogger, config: Mapping[str, Any]) -> Tuple[bool, Any]:
        try:
            authenticator = TokenAuthenticator(token=config["access_token"], auth_method="token")
            if TOKEN_SEPARATOR in config["access_token"]:
                authenticator = MultipleTokenAuthenticator(tokens=config["access_token"], auth_method="token")
            else:
                authenticator = TokenAuthenticator(token=config["access_token"], auth_method="token")
            repositories = self._generate_repositories(config=config, authenticator=authenticator)

            # We should use the most poorly filled stream to use the `list` method, because when using the `next` method, we can get the `StopIteration` error.
@@ -86,7 +92,10 @@ def check_connection(self, logger: AirbyteLogger, config: Mapping[str, Any]) ->
            return False, repr(e)

    def streams(self, config: Mapping[str, Any]) -> List[Stream]:
        authenticator = TokenAuthenticator(token=config["access_token"], auth_method="token")
        if TOKEN_SEPARATOR in config["access_token"]:
            authenticator = MultipleTokenAuthenticator(tokens=config["access_token"], auth_method="token")
        else:
            authenticator = TokenAuthenticator(token=config["access_token"], auth_method="token")
        repositories = self._generate_repositories(config=config, authenticator=authenticator)
        full_refresh_args = {"authenticator": authenticator, "repositories": repositories}
        incremental_args = {**full_refresh_args, "start_date": config["start_date"]}
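
Both `check_connection` and `streams` repeat the same separator check on `config["access_token"]`. One possible way to centralize it, while also accepting a list of tokens as well as a comma-separated string, is a small normalizing helper. `normalize_tokens` is a hypothetical name that does not appear in this PR, and dropping empty entries (for example from a trailing comma) is an assumption rather than the behavior of the code shown here.

```python
from typing import List, Sequence, Union

TOKEN_SEPARATOR = ","


def normalize_tokens(value: Union[str, Sequence[str]]) -> List[str]:
    """Hypothetical helper: turn a config value into a clean list of tokens."""
    # A str is also a Sequence, so it has to be checked first.
    parts = value.split(TOKEN_SEPARATOR) if isinstance(value, str) else value
    # Assumption: surrounding whitespace and empty entries carry no meaning.
    return [part.strip() for part in parts if part.strip()]
```

A caller could then pick the single-token or multiple-token authenticator based on the length of the returned list instead of checking for the separator in two places.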
@@ -9,7 +9,7 @@
"properties": {
"access_token": {
"type": "string",
"description": "Log into Github and then generate a <a href=\"https://github.com/settings/tokens\"> personal access token</a>.",
"description": "Log into Github and then generate a <a href=\"https://github.com/settings/tokens\"> personal access token</a>. Separate multiple tokens with \",\"",
Contributor:

Ideally, type: string would be an array of strings too. However, that is not great for backwards compatibility.

Could you create an issue to update this to a list of strings once #5396 is resolved?

"airbyte_secret": true
},
"organization": {
2 changes: 1 addition & 1 deletion docs/connector-development/cdk-python/README.md
@@ -94,7 +94,7 @@ All tests are located in the `unit_tests` directory. Run `pytest --cov=airbyte_c

1. Bump the package version in `setup.py`
2. Open a PR
3. An Airbyte member must comment `/publish-cdk --dry-run=<true or false>`. Dry runs publish to test.pypi.org.
3. An Airbyte member must comment `/publish-cdk dry-run=true` to publish the package to test.pypi.org or `/publish-cdk dry-run=false` to publish it to the real index of pypi.org.

## Coming Soon