Add column_dtypes to forcibly convert the type #594

Merged
merged 1 commit into from
Jun 6, 2024

grieve54706
Contributor

@grieve54706 grieve54706 commented Jun 5, 2024

Description

This PR adds a columnDtypes option to the query API. The columns of the result data frame are converted to the dtypes given in the request.
Datetime values are also formatted as strings.
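
A minimal pandas sketch of the idea described above. This is not the code in this PR: the helper name `apply_column_dtypes`, the special handling of `"datetime64"`, and the format strings are assumptions chosen to match the response shown below.

```python
import pandas as pd


def apply_column_dtypes(df: pd.DataFrame, column_dtypes: dict) -> pd.DataFrame:
    """Hypothetical helper: cast the requested columns, then stringify datetimes."""
    out = df.copy()

    # Step 1: convert each requested column to the requested dtype.
    for column, dtype in column_dtypes.items():
        if dtype == "datetime64":
            # pd.to_datetime keeps an existing time zone and parses dates/strings.
            out[column] = pd.to_datetime(out[column])
        else:
            out[column] = out[column].astype(dtype)

    # Step 2: format every datetime column as a string.
    for column in list(out.columns):
        series = out[column]
        if isinstance(series.dtype, pd.DatetimeTZDtype):
            # Time-zone-aware values get the zone name appended.
            out[column] = series.dt.strftime("%Y-%m-%d %H:%M:%S.%f %Z")
        elif pd.api.types.is_datetime64_dtype(series.dtype):
            out[column] = series.dt.strftime("%Y-%m-%d %H:%M:%S.%f")
    return out


df = pd.DataFrame(
    {
        "totalprice": ["172799.49"],
        "timestamp": [pd.Timestamp("2024-01-01 23:59:59")],
        "timestamptz": [pd.Timestamp("2024-01-01 23:59:59", tz="UTC")],
    }
)
result = apply_column_dtypes(
    df,
    {"totalprice": "float", "timestamp": "datetime64", "timestamptz": "datetime64"},
)
print(result.iloc[0].tolist())
```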

HTTP example

POST /v2/ibis/postgres/query
{
  "connectionInfo": {...},
  "manifestStr": "base64 string of manifest",
  "sql": "SELECT * FROM \"Orders\" LIMIT 1",
  "columnDtypes": {
      "totalprice": "float",
      "orderdate": "datetime64",
      "timestamp": "datetime64",
      "timestamptz": "datetime64"
  }
}
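
As a rough sketch, the request body above could be built in Python like this. The helper name `build_query_payload`, the empty `connection_info`, and the manifest content are placeholders, and encoding the manifest as base64 of JSON is an assumption. `json.dumps` takes care of escaping the double quotes inside the SQL string.

```python
import base64
import json

# Placeholder manifest; the real manifest content is not shown in this PR.
manifest = {"models": []}
# Assumption: the manifest is serialized as JSON before base64 encoding.
manifest_str = base64.b64encode(json.dumps(manifest).encode("utf-8")).decode("ascii")


def build_query_payload(sql, column_dtypes, connection_info, manifest_str):
    """Hypothetical helper that assembles the body for POST /v2/ibis/postgres/query."""
    return {
        "connectionInfo": connection_info,
        "manifestStr": manifest_str,
        "sql": sql,
        "columnDtypes": column_dtypes,
    }


payload = build_query_payload(
    sql='SELECT * FROM "Orders" LIMIT 1',
    column_dtypes={
        "totalprice": "float",
        "orderdate": "datetime64",
        "timestamp": "datetime64",
        "timestamptz": "datetime64",
    },
    connection_info={},  # placeholder, fill in real connection details
    manifest_str=manifest_str,
)

# Serialize to valid JSON; quotes inside the SQL are escaped automatically.
body = json.dumps(payload)
print(body)
```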

Response difference

{
    "columns": [
        "orderkey",
        "custkey",
        "orderstatus",
        "totalprice",
        "orderdate",
        "order_cust_key",
        "timestamp",
        "timestamptz"
    ],
    "data": [
        [
            1,
            370,
            "O",
            "172799.49",
-           820540800000,
+           "1996-01-02 00:00:00.000000",
            "1_370",
-           1704153599000,
+           "2024-01-01 23:59:59.000000",
-           1704153599000
+           "2024-01-01 23:59:59.000000 UTC"
        ]
    ],
    "dtypes": {
        "orderkey": "int32",
        "custkey": "int32",
        "orderstatus": "object",
-       "totalprice": "object",
+       "totalprice": "float64",
        "orderdate": "object",
        "order_cust_key": "object",
-       "timestamp": "datetime64[ns]",
+       "timestamp": "object",
-       "timestamptz": "datetime64[ns, UTC]"
+       "timestamptz": "object"
    }
}
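
To read the diff above: the old values look like epoch milliseconds in UTC, and the new values are the same instants written out as strings. A small standard-library sketch (the function name `format_epoch_ms` is made up for illustration) reproduces the mapping:

```python
from datetime import datetime, timezone


def format_epoch_ms(epoch_ms: int, with_tz: bool = False) -> str:
    """Format epoch milliseconds (assumed UTC) like the new response values."""
    # Split into whole seconds and milliseconds to avoid float rounding.
    seconds, millis = divmod(epoch_ms, 1000)
    dt = datetime.fromtimestamp(seconds, tz=timezone.utc).replace(
        microsecond=millis * 1000
    )
    text = dt.strftime("%Y-%m-%d %H:%M:%S.%f")
    return f"{text} UTC" if with_tz else text


print(format_epoch_ms(820540800000))                 # 1996-01-02 00:00:00.000000
print(format_epoch_ms(1704153599000))                # 2024-01-01 23:59:59.000000
print(format_epoch_ms(1704153599000, with_tz=True))  # 2024-01-01 23:59:59.000000 UTC
```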

Additional information

Pandas dtypes: https://pandas.pydata.org/docs/user_guide/basics.html#dtypes

Contributor

@goldmedal goldmedal left a comment

Thanks @grieve54706. I have one question about this.

Comment on lines +108 to +111
'orderdate': 'object',
'order_cust_key': 'object',
'timestamp': 'object',
'timestamptz': 'object'
Contributor

Out of curiosity, why are those columns object after setting a specific column type?

Contributor Author

Because we format the datetime values as strings, and a string column has the object dtype. The dtype reflects the values actually stored in the column. If we wanted to keep the datetime dtype, we could not format the values, but we need the formatted values.
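
A short pandas sketch of that effect, with an illustrative series rather than the PR's data: once `dt.strftime` runs, the column holds plain strings, so the datetime dtype is gone.

```python
import pandas as pd

# A datetime column before formatting.
timestamps = pd.Series(pd.to_datetime(["2024-01-01 23:59:59"]))

# Formatting replaces each datetime with a Python string.
formatted = timestamps.dt.strftime("%Y-%m-%d %H:%M:%S.%f")

print(timestamps.dtype)   # a datetime64 dtype
# object in pandas 2.x (current when this PR merged);
# newer versions may report a dedicated string dtype instead.
print(formatted.dtype)
print(formatted.iloc[0])  # 2024-01-01 23:59:59.000000
```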

Contributor

I see. Thanks. It's good for now.

Contributor

@goldmedal goldmedal left a comment

LGTM

@goldmedal goldmedal merged commit 58f94a3 into main Jun 6, 2024
5 checks passed
@goldmedal goldmedal deleted the feature/ibis/adjust-query-api-with-spec-type branch June 6, 2024 08:04