🐛 Source S3: Make Advanced Reader Options and Advanced Options truly `Optional` (#23669)
bazarnov authored Mar 3, 2023
1 parent 6e985c0 commit 6a6039b
Showing 11 changed files with 15 additions and 21 deletions.
@@ -1649,7 +1649,7 @@
- name: S3
sourceDefinitionId: 69589781-7828-43c5-9f63-8925b1c1ccc2
dockerRepository: airbyte/source-s3
- dockerImageTag: 1.0.1
+ dockerImageTag: 1.0.2
documentationUrl: https://docs.airbyte.com/integrations/sources/s3
icon: s3.svg
sourceType: file
@@ -12838,7 +12838,7 @@
supportsNormalization: false
supportsDBT: false
supported_destination_sync_modes: []
- - dockerImage: "airbyte/source-s3:1.0.1"
+ - dockerImage: "airbyte/source-s3:1.0.2"
spec:
documentationUrl: "https://docs.airbyte.com/integrations/sources/s3"
changelogUrl: "https://docs.airbyte.com/integrations/sources/s3"
@@ -12944,7 +12944,6 @@
\ <a href=\"https://arrow.apache.org/docs/python/generated/pyarrow.csv.ConvertOptions.html#pyarrow.csv.ConvertOptions\"\
\ target=\"_blank\">detailed here</a>. 'column_types' is used internally\
\ to handle schema so overriding that would likely cause problems."
- default: "{}"
examples:
- "{\"timestamp_parsers\": [\"%m/%d/%Y %H:%M\", \"%Y/%m/%d %H:%M\"\
], \"strings_can_be_null\": true, \"null_values\": [\"NA\", \"NULL\"\
@@ -12959,7 +12958,6 @@
\ here if your CSV doesn't have header, or if you want to use custom\
\ column names. 'block_size' and 'encoding' are already used above,\
\ specify them again here will override the values above."
- default: "{}"
examples:
- "{\"column_names\": [\"column1\", \"column2\"]}"
order: 8
2 changes: 1 addition & 1 deletion airbyte-integrations/connectors/source-s3/Dockerfile
@@ -17,5 +17,5 @@ COPY source_s3 ./source_s3
ENV AIRBYTE_ENTRYPOINT "python /airbyte/integration_code/main.py"
ENTRYPOINT ["python", "/airbyte/integration_code/main.py"]

- LABEL io.airbyte.version=1.0.1
+ LABEL io.airbyte.version=1.0.2
LABEL io.airbyte.name=airbyte/source-s3
7 changes: 4 additions & 3 deletions airbyte-integrations/connectors/source-s3/README.md
@@ -60,7 +60,7 @@ python main.py read --config secrets/config.json --catalog integration_tests/con
#### Build
First, make sure you build the latest Docker image:
```
- docker build . -t airbyte/source-s3:dev
+ docker build . --no-cache -t airbyte/source-s3:dev
```

You can also build the connector image via Gradle:
@@ -82,7 +82,7 @@ docker run --rm -v $(pwd)/secrets:/secrets -v $(pwd)/integration_tests:/integrat
Make sure to familiarize yourself with [pytest test discovery](https://docs.pytest.org/en/latest/goodpractices.html#test-discovery) to know how your test files and methods should be named.
First install test dependencies into your virtual environment:
```
- pip install .[tests]
+ pip install '.[tests]'
```
### Unit Tests
To run unit tests locally, from the connector directory run:
@@ -102,7 +102,8 @@ Customize `acceptance-test-config.yml` file to configure tests. See [Connector A
If your connector requires to create or destroy resources for use during acceptance tests create fixtures for it and place them inside integration_tests/acceptance.py.
To run your integration tests with acceptance tests, from the connector root, run
```
- python -m pytest integration_tests -p integration_tests.acceptance
+ docker build . --no-cache -t airbyte/source-s3:dev \
+ && python -m pytest -p connector_acceptance_test.plugin
```
To run your integration tests with docker

@@ -3,7 +3,7 @@ test_strictness_level: high
acceptance_tests:
spec:
tests:
- - spec_path: integration_tests/spec.json
+ - spec_path: integration_tests/spec.json

connection:
tests:
@@ -25,8 +25,6 @@ acceptance_tests:
- config_path: secrets/config.json
- config_path: secrets/parquet_config.json
- config_path: secrets/avro_config.json
- backward_compatibility_tests_config:
- disable_for_version: 0.1.32
- config_path: secrets/jsonl_config.json
- config_path: secrets/jsonl_newlines_config.json

@@ -6,7 +6,7 @@
"aws_access_key_id": "123456",
"aws_secret_access_key": "123456key",
"path_prefix": "",
- "endpoint": "http://10.0.194.47:9000"
+ "endpoint": "http://10.0.45.4:9000"
},
"format": {
"filetype": "csv",
@@ -91,7 +91,6 @@
"additional_reader_options": {
"title": "Additional Reader Options",
"description": "Optionally add a valid JSON string here to provide additional options to the csv reader. Mappings must correspond to options <a href=\"https://arrow.apache.org/docs/python/generated/pyarrow.csv.ConvertOptions.html#pyarrow.csv.ConvertOptions\" target=\"_blank\">detailed here</a>. 'column_types' is used internally to handle schema so overriding that would likely cause problems.",
- "default": "{}",
"examples": [
"{\"timestamp_parsers\": [\"%m/%d/%Y %H:%M\", \"%Y/%m/%d %H:%M\"], \"strings_can_be_null\": true, \"null_values\": [\"NA\", \"NULL\"]}"
],
@@ -101,7 +100,6 @@
"advanced_options": {
"title": "Advanced Options",
"description": "Optionally add a valid JSON string here to provide additional <a href=\"https://arrow.apache.org/docs/python/generated/pyarrow.csv.ReadOptions.html#pyarrow.csv.ReadOptions\" target=\"_blank\">Pyarrow ReadOptions</a>. Specify 'column_names' here if your CSV doesn't have header, or if you want to use custom column names. 'block_size' and 'encoding' are already used above, specify them again here will override the values above.",
- "default": "{}",
"examples": ["{\"column_names\": [\"column1\", \"column2\"]}"],
"order": 8,
"type": "string"
@@ -57,16 +57,14 @@ class Config:
description="Whether newline characters are allowed in CSV values. Turning this on may affect performance. Leave blank to default to False.",
order=6,
)
- additional_reader_options: str = Field(
- default="{}",
+ additional_reader_options: Optional[str] = Field(
description='Optionally add a valid JSON string here to provide additional options to the csv reader. Mappings must correspond to options <a href="https://arrow.apache.org/docs/python/generated/pyarrow.csv.ConvertOptions.html#pyarrow.csv.ConvertOptions" target="_blank">detailed here</a>. \'column_types\' is used internally to handle schema so overriding that would likely cause problems.',
examples=[
'{"timestamp_parsers": ["%m/%d/%Y %H:%M", "%Y/%m/%d %H:%M"], "strings_can_be_null": true, "null_values": ["NA", "NULL"]}'
],
order=7,
)
- advanced_options: str = Field(
- default="{}",
+ advanced_options: Optional[str] = Field(
description="Optionally add a valid JSON string here to provide additional <a href=\"https://arrow.apache.org/docs/python/generated/pyarrow.csv.ReadOptions.html#pyarrow.csv.ReadOptions\" target=\"_blank\">Pyarrow ReadOptions</a>. Specify 'column_names' here if your CSV doesn't have header, or if you want to use custom column names. 'block_size' and 'encoding' are already used above, specify them again here will override the values above.",
examples=['{"column_names": ["column1", "column2"]}'],
order=8,
4 changes: 2 additions & 2 deletions airbyte-integrations/connectors/source-s3/source_s3/utils.py
@@ -50,5 +50,5 @@ def multiprocess_queuer(func: Callable, queue: mp.Queue, *args: Any, **kwargs: A
queue.put(dill.loads(func)(*args, **kwargs))


- def get_value_or_json_if_empty_string(options: str) -> str:
-     return options.strip() or "{}"
+ def get_value_or_json_if_empty_string(options: str = None) -> str:
+     return options.strip() if options else "{}"
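
Review note on the helper change above: the old body called `.strip()` on its argument unconditionally, so a `None` value (possible once both fields are `Optional[str]` with no default) would raise `AttributeError`. The sketch below reproduces both bodies side by side using only the standard library. The names `old_helper`, `new_helper`, `normalize_options`, and `parse_options` are illustrative and not part of the connector. `normalize_options` is a hypothetical variant, shown because the new one-liner returns an empty string rather than `"{}"` for whitespace-only input.

```python
import json
from typing import Optional


def old_helper(options: str) -> str:
    # Body of the pre-1.0.2 helper: assumes a string, so None raises AttributeError.
    return options.strip() or "{}"


def new_helper(options: Optional[str] = None) -> str:
    # Body of the 1.0.2 helper: None and "" both fall back to "{}".
    # Whitespace-only input is truthy, so it is stripped down to "".
    return options.strip() if options else "{}"


def normalize_options(options: Optional[str] = None) -> str:
    # Hypothetical variant (an assumption, not the connector's code):
    # treats None, "" and whitespace-only input the same way.
    return (options or "").strip() or "{}"


def parse_options(options: Optional[str] = None) -> dict:
    # Hypothetical helper: decode the normalized JSON string into a dict.
    return json.loads(normalize_options(options))


if __name__ == "__main__":
    print(new_helper(None))
    print(parse_options('{"column_names": ["column1", "column2"]}'))
```

If whitespace-only values can reach the helper (for example from a UI text box), a form like `normalize_options` keeps a later `json.loads` call from failing on an empty string; whether that case occurs in practice is not shown in this commit.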
2 changes: 1 addition & 1 deletion connectors.md
@@ -196,7 +196,7 @@
| **Reply.io** | <img alt="Reply.io icon" src="https://raw.githubusercontent.com/airbytehq/airbyte/master/airbyte-config/init/src/main/resources/icons/reply-io.svg" height="30" height="30"/> | Source | airbyte/source-reply-io:0.1.0 | alpha | [link](https://docs.airbyte.com/integrations/sources/reply-io) | [code](https://github.com/airbytehq/airbyte/tree/master/airbyte-integrations/connectors/source-reply-io) | <small>`8cc6537e-f8a6-423c-b960-e927af76116e`</small> |
| **Retently** | <img alt="Retently icon" src="https://raw.githubusercontent.com/airbytehq/airbyte/master/airbyte-config/init/src/main/resources/icons/retently.svg" height="30" height="30"/> | Source | airbyte/source-retently:0.1.3 | alpha | [link](https://docs.airbyte.com/integrations/sources/retently) | [code](https://github.com/airbytehq/airbyte/tree/master/airbyte-integrations/connectors/source-retently) | <small>`db04ecd1-42e7-4115-9cec-95812905c626`</small> |
| **Rocket.chat** | <img alt="Rocket.chat icon" src="https://raw.githubusercontent.com/airbytehq/airbyte/master/airbyte-config/init/src/main/resources/icons/rocket-chat.svg" height="30" height="30"/> | Source | airbyte/source-rocket-chat:0.1.0 | alpha | [link](https://docs.airbyte.com/integrations/sources/rocket-chat) | [code](https://github.com/airbytehq/airbyte/tree/master/airbyte-integrations/connectors/source-rocket-chat) | <small>`921d9608-3915-450b-8078-0af18801ea1b`</small> |
- | **S3** | <img alt="S3 icon" src="https://raw.githubusercontent.com/airbytehq/airbyte/master/airbyte-config/init/src/main/resources/icons/s3.svg" height="30" height="30"/> | Source | airbyte/source-s3:1.0.1 | generally_available | [link](https://docs.airbyte.com/integrations/sources/s3) | [code](https://github.com/airbytehq/airbyte/tree/master/airbyte-integrations/connectors/source-s3) | <small>`69589781-7828-43c5-9f63-8925b1c1ccc2`</small> |
+ | **S3** | <img alt="S3 icon" src="https://raw.githubusercontent.com/airbytehq/airbyte/master/airbyte-config/init/src/main/resources/icons/s3.svg" height="30" height="30"/> | Source | airbyte/source-s3:1.0.2 | generally_available | [link](https://docs.airbyte.com/integrations/sources/s3) | [code](https://github.com/airbytehq/airbyte/tree/master/airbyte-integrations/connectors/source-s3) | <small>`69589781-7828-43c5-9f63-8925b1c1ccc2`</small> |
| **SAP Fieldglass** | <img alt="SAP Fieldglass icon" src="https://raw.githubusercontent.com/airbytehq/airbyte/master/airbyte-config/init/src/main/resources/icons/sapfieldglass.svg" height="30" height="30"/> | Source | airbyte/source-sap-fieldglass:0.1.0 | alpha | [link](https://docs.airbyte.com/integrations/sources/sap-fieldglass) | [code](https://github.com/airbytehq/airbyte/tree/master/airbyte-integrations/connectors/source-sap-fieldglass) | <small>`ec5f3102-fb31-4916-99ae-864faf8e7e25`</small> |
| **SFTP** | <img alt="SFTP icon" src="https://raw.githubusercontent.com/airbytehq/airbyte/master/airbyte-config/init/src/main/resources/icons/sftp.svg" height="30" height="30"/> | Source | airbyte/source-sftp:0.1.2 | alpha | [link](https://docs.airbyte.com/integrations/sources/sftp) | [code](https://github.com/airbytehq/airbyte/tree/master/airbyte-integrations/connectors/source-sftp) | <small>`a827c52e-791c-4135-a245-e233c5255199`</small> |
| **SFTP Bulk** | <img alt="SFTP Bulk icon" src="https://raw.githubusercontent.com/airbytehq/airbyte/master/airbyte-config/init/src/main/resources/icons/sftp.svg" height="30" height="30"/> | Source | airbyte/source-sftp-bulk:0.1.0 | alpha | [link](https://docs.airbyte.com/integrations/sources/sftp-bulk) | [code](https://github.com/airbytehq/airbyte/tree/master/airbyte-integrations/connectors/source-sftp-bulk) | <small>`31e3242f-dee7-4cdc-a4b8-8e06c5458517`</small> |
1 change: 1 addition & 0 deletions docs/integrations/sources/s3.md
@@ -209,6 +209,7 @@ The Jsonl parser uses pyarrow hence,only the line-delimited JSON format is suppo
| Version | Date | Pull Request | Subject |
|:--------|:-----------|:----------------------------------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------------|
+ | 1.0.2 | 2023-03-02 | [23669](https://github.com/airbytehq/airbyte/pull/23669)| Made `Advanced Reader Options` and `Advanced Options` truly `optional` for `CSV` format |
| 1.0.1 | 2023-02-27 | [23502](https://github.com/airbytehq/airbyte/pull/23502) | Fix error handling |
| 1.0.0 | 2023-02-17 | [23198](https://github.com/airbytehq/airbyte/pull/23198) | Fix Avro schema discovery |
| 0.1.32 | 2023-02-07 | [22500](https://github.com/airbytehq/airbyte/pull/22500) | Speed up discovery |