Protocol change: Define a set of well-known data types (#17486)
* add WellKnownTypes.yaml

* rename to snakecase + put in airbyte-protocol

* add examples

* more descriptions

* descriptions, more restrictions, better regex

* update documentation

* explicitly call out BC support
edgao authored and nataly committed Nov 3, 2022
1 parent bfcd0af commit 5801604
Showing 2 changed files with 122 additions and 57 deletions.
@@ -0,0 +1,75 @@
# Regexes are purely illustrative and need to be workshopped: BC dates shouldn't require 4-digit years; years may have >=5 digits; etc
definitions:
  String:
    type: string
    description: Arbitrary text
  BinaryData:
    type: string
    description: >
      Arbitrary binary data. Represented as base64-encoded strings in the JSON transport.
      In the future, if we support other transports, may be encoded differently.
    # All credit to https://stackoverflow.com/a/475217 for this pattern
    pattern: ^(?:[A-Za-z0-9+/]{4})*(?:[A-Za-z0-9+/]{2}==|[A-Za-z0-9+/]{3}=)?$
  Date:
    type: string
    # Examples:
    #   2022-01-23
    #   2022-01-23 BC
    # format: date is a superset of what we want, so we cannot use it here (e.g. it accepts 2-digit years)
    pattern: ^\d{4}-\d{2}-\d{2}( BC)?$
    description: RFC 3339§5.6's full-date format, extended with BC era support
  TimestampWithTimezone:
    type: string
    # Examples:
    #   2022-01-23T01:23:45Z
    #   2022-01-23T01:23:45.678-11:30 BC
    # format: date-time is a superset of what we want, so we cannot use it here (e.g. it accepts 2-digit years)
    pattern: ^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?(Z|[+\-]\d{1,2}:\d{2})( BC)?$
    description: >
      An instant in time. Frequently simply referred to as just a timestamp, or timestamptz.
      Uses RFC 3339§5.6's date-time format, requiring a "T" separator, and extended with BC era support.
      Note that we do _not_ accept Unix epochs here.
  TimestampWithoutTimezone:
    type: string
    # Examples:
    #   2022-01-23T01:23:45
    #   2022-01-23T01:23:45.678 BC
    #   2022-01-23T01:23:45.678
    pattern: ^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?( BC)?$
    description: >
      Also known as a localdatetime, or just datetime.
      Under RFC 3339§5.6, this would be represented as `full-date "T" partial-time`, extended with BC era support.
  TimeWithTimezone:
    type: string
    # Examples:
    #   01:23:45Z
    #   01:23:45.678-11:30
    pattern: ^\d{2}:\d{2}:\d{2}(\.\d+)?(Z|[+\-]\d{1,2}:\d{2})$
    description: An RFC 3339§5.6 full-time
  TimeWithoutTimezone:
    type: string
    # Examples:
    #   01:23:45
    #   01:23:45.678
    pattern: ^\d{2}:\d{2}:\d{2}(\.\d+)?$
    description: An RFC 3339§5.6 partial-time
  Number:
    type: string
    oneOf:
      - pattern: -?(0|[0-9]\d*)(\.\d+)?
      - enum:
          - Infinity
          - -Infinity
          - NaN
    description: Note the mix of regex validation for normal numbers, and enum validation for special values.
  Integer:
    type: string
    oneOf:
      - pattern: -?(0|[0-9]\d*)
      - enum:
          - Infinity
          - -Infinity
          - NaN
  Boolean:
    type: boolean
    description: Note the direct usage of a primitive boolean rather than string. Unlike Numbers and Integers, we don't expect unusual values here.
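
To make the definitions above concrete, here is a minimal Python sketch of how a consumer might check string values against a few of these patterns. It is not part of the commit. The helper names (`matches_pattern`, `is_valid_number`) and the `PATTERNS` table are hypothetical, and Python's `re` module only approximates the ECMA-262 regex dialect that JSON Schema recommends. The patterns themselves are copied verbatim from the file. The sketch uses `re.search` because JSON Schema's `pattern` keyword is not implicitly anchored.

```python
import re

# Patterns copied verbatim from the definitions in the diff.
PATTERNS = {
    "Date": r"^\d{4}-\d{2}-\d{2}( BC)?$",
    "TimestampWithTimezone": (
        r"^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?"
        r"(Z|[+\-]\d{1,2}:\d{2})( BC)?$"
    ),
    "Number": r"-?(0|[0-9]\d*)(\.\d+)?",
}

# The enum branch of the Number definition.
SPECIAL_NUMBERS = {"Infinity", "-Infinity", "NaN"}


def matches_pattern(type_name: str, value: str) -> bool:
    """Hypothetical helper: apply a pattern with JSON Schema-like semantics.

    JSON Schema does not anchor patterns implicitly, so this uses
    re.search rather than re.fullmatch. re.ASCII keeps \\d to 0-9.
    """
    return re.search(PATTERNS[type_name], value, flags=re.ASCII) is not None


def is_valid_number(value: str) -> bool:
    """Hypothetical helper: emulate the oneOf in the Number definition.

    oneOf requires exactly one branch (pattern or enum) to accept.
    """
    branches = [matches_pattern("Number", value), value in SPECIAL_NUMBERS]
    return sum(branches) == 1


if __name__ == "__main__":
    for sample in ["2022-01-23", "2022-01-23 BC", "22-01-23"]:
        print("Date", repr(sample), matches_pattern("Date", sample))
    for sample in ["1.5", "NaN", "abc", "abc1"]:
        print("Number", repr(sample), is_valid_number(sample))
```

Under these assumptions, `is_valid_number("abc1")` comes out as accepted, because the Number pattern has no `^` and `$` anchors and therefore matches the trailing `1`. That is in line with the file's own header comment that the regexes are illustrative and still need to be workshopped.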