MongoDB: Improve type mapper, add support for more BSON types
DatetimeMS, Decimal128, Int64
amotl committed Aug 21, 2024
1 parent d1c0740 commit 5f47609
Showing 3 changed files with 19 additions and 5 deletions.
2 changes: 2 additions & 0 deletions CHANGES.md
@@ -4,6 +4,8 @@
 ## Unreleased
 - MongoDB: Improve type mapper by discriminating between
   `INTEGER` and `BIGINT`
+- MongoDB: Improve type mapper by supporting BSON `DatetimeMS`,
+  `Decimal128`, and `Int64` types

 ## 2024/08/19 v0.0.17
 - Processor: Updated Kinesis Lambda processor to understand AWS DMS
4 changes: 3 additions & 1 deletion cratedb_toolkit/io/mongodb/extract.py
@@ -170,7 +170,9 @@ def extract_schema_from_array(array: list, schema: dict):
     bson.ObjectId: "OID",
     bson.datetime.datetime: "DATETIME",
     bson.Timestamp: "TIMESTAMP",
-    bson.int64.Int64: "INT64",
+    bson.DatetimeMS: "TIMESTAMP",
+    bson.Decimal128: "DOUBLE",
+    bson.Int64: "INT64",
     # primitive types
     str: "STRING",
     bool: "BOOLEAN",
18 changes: 14 additions & 4 deletions tests/io/mongodb/test_extract.py
@@ -40,11 +40,21 @@ def test_integer_types(self):

     def test_bson_types(self):
         data = {
-            "a": bson.ObjectId("55153a8014829a865bbf700d"),
-            "b": bson.datetime.datetime.now(),
-            "c": bson.Timestamp(0, 0),
+            "datetime": bson.datetime.datetime.now(),
+            "datetimems": bson.DatetimeMS(1563051934000),
+            "decimal128": bson.Decimal128("42.42"),
+            "int64": bson.Int64(42),
+            "objectid": bson.ObjectId("55153a8014829a865bbf700d"),
+            "timestamp": bson.Timestamp(0, 0),
         }
+        expected = {
+            "datetime": "DATETIME",
+            "datetimems": "TIMESTAMP",
+            "decimal128": "DOUBLE",
+            "int64": "INT64",
+            "objectid": "OID",
+            "timestamp": "TIMESTAMP",
+        }
-        expected = {"a": "OID", "b": "DATETIME", "c": "TIMESTAMP"}
         schema = trim_schema(extract.extract_schema_from_document(data, {}))
         self.assertDictEqual(schema, expected)
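
The mapping changed in `extract.py` is keyed by Python type. A minimal sketch of how such a type-keyed lookup might behave follows. It is not the toolkit's actual implementation: `Int64`, `Decimal128`, and `DatetimeMS` below are hypothetical stand-in classes so the example runs without the `bson` package, and `type_name`, `sketch_schema`, and the `"UNKNOWN"` fallback are illustrative names only.

```python
import datetime
from decimal import Decimal


# Hypothetical stand-ins for the BSON classes named in the diff, so this
# sketch runs without the `bson` package. They are NOT the real types.
class Int64(int):
    """Stand-in: an int subclass, used to show why exact-type lookup matters."""


class Decimal128:
    """Stand-in: wraps a decimal value given as a string."""

    def __init__(self, value: str):
        self.value = Decimal(value)


class DatetimeMS:
    """Stand-in: wraps milliseconds since the Unix epoch."""

    def __init__(self, milliseconds: int):
        self.milliseconds = milliseconds


# Type-keyed mapping, mirroring the entries shown in the diff.
TYPE_MAP = {
    datetime.datetime: "DATETIME",
    DatetimeMS: "TIMESTAMP",
    Decimal128: "DOUBLE",
    Int64: "INT64",
    # primitive types
    str: "STRING",
    bool: "BOOLEAN",
}


def type_name(value) -> str:
    """Look up the exact type of `value`; "UNKNOWN" is an assumed fallback."""
    return TYPE_MAP.get(type(value), "UNKNOWN")


def sketch_schema(document: dict) -> dict:
    """Map each top-level field of a document to its type name."""
    return {key: type_name(value) for key, value in document.items()}
```

Looking up `type(value)` rather than using `isinstance` keeps the `Int64` stand-in (a subclass of `int`) distinct from a plain integer, which is the kind of discrimination the changelog entry describes.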
