Better dataframe formatting in REPL #15555

Open
1 task done
universalmind303 opened this issue Jul 31, 2024 · 4 comments
Labels
enhancement [core label] repl repl, jupyter, notebooks, etc

Comments

@universalmind303

Check for existing issues

  • Completed

Describe the feature

When you display a dataframe in the REPL, the output is poorly formatted and the columns do not line up. I tried this with multiple dataframe libraries (polars, daft, pandas). They should be rendered via _repr_html_, as is customary in Jupyter notebooks, but it looks like the __repr__ output is used instead.

The __repr__ output would be fine if it were properly aligned, as it is in VS Code.
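
For reference, here is a minimal sketch of the two hooks being compared. `Table` is a hypothetical class made up for illustration and is not part of polars, daft, or pandas. HTML-capable frontends typically prefer `_repr_html_` when it exists, while a text-only frontend falls back to `__repr__`, whose alignment depends on being rendered in a monospace font.

```python
from html import escape


class Table:
    """Hypothetical two-column table used only to illustrate the two hooks."""

    def __init__(self, rows):
        self.rows = rows  # list of (name, value) tuples

    def __repr__(self):
        # Plain-text form: pad the first column so the values line up
        # when the text is rendered in a monospace font.
        width = max(len(str(name)) for name, _ in self.rows)
        return "\n".join(
            f"{str(name):<{width}}  {value}" for name, value in self.rows
        )

    def _repr_html_(self):
        # Rich form that HTML-capable frontends can pick up instead.
        cells = "".join(
            f"<tr><td>{escape(str(name))}</td><td>{escape(str(value))}</td></tr>"
            for name, value in self.rows
        )
        return f"<table>{cells}</table>"


t = Table([("Alice", 30), ("Bob", 25)])
print(repr(t))
print(t._repr_html_())
```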

If applicable, add mockups / screenshots to help present your vision of the feature

Zed

image

VSCode

image

VSCode __repr__

image

@universalmind303 universalmind303 added admin read Pending admin review enhancement [core label] triage Maintainer needs to classify the issue labels Jul 31, 2024
@rgbkrk
Member

rgbkrk commented Jul 31, 2024

I haven't documented this yet, but you can use the following to get a richer table:

import pandas as pd
pd.set_option('display.html.table_schema', True)

Zed/GPUI does not support HTML output at this time, so we are using the JSON Table Schema output for now.

@rgbkrk rgbkrk removed triage Maintainer needs to classify the issue admin read Pending admin review labels Jul 31, 2024
@universalmind303
Author

@rgbkrk for dataframe objects, how exactly are they serialized to the REPL?

I saw that rank_mime_type will use DataTable if possible, but I wasn't able to figure out how that mime type is determined from the Python object.

Is there a dunder method or other _repr_<something>_ hook the dataframe libraries could add so this is detected automatically?

AFAIK, only pandas supports the aforementioned display.html.table_schema option.
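
To make the question concrete, this is roughly how I assume a frontend picks one output from a mime bundle. It is a sketch only: `rank_mime_type` is the name I saw in the source, but `pick_media_type`, `MEDIA_TYPE_RANK`, and the rank values below are hypothetical and not taken from Zed.

```python
# Hypothetical ranks: higher means richer. These numbers are illustrative
# and are not taken from Zed's source.
MEDIA_TYPE_RANK = {
    "application/vnd.dataresource+json": 3,
    "text/markdown": 2,
    "text/plain": 1,
}


def pick_media_type(bundle):
    """Return the highest-ranked supported media type in a bundle, or None."""
    supported = [media_type for media_type in bundle if media_type in MEDIA_TYPE_RANK]
    if not supported:
        return None
    return max(supported, key=MEDIA_TYPE_RANK.__getitem__)


bundle = {
    "text/plain": "shape: (2, 2)",
    "text/html": "<table>...</table>",  # not ranked here, so it is ignored
    "application/vnd.dataresource+json": {"schema": {"fields": []}, "data": []},
}
chosen = pick_media_type(bundle)
print(chosen)
```

Under that assumption, a library only gets the table view if its bundle actually contains the data resource media type, which is why I am asking how the bundle is built.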

@universalmind303
Author

It also looks like data tables are truncating the last column.

Screen.Recording.2024-08-01.at.12.13.01.PM.mov

@rgbkrk
Member

rgbkrk commented Aug 1, 2024

The media type for this output is application/vnd.dataresource+json. The only _repr_*_ method that can emit it is _repr_mimebundle_. Example:

class DataResource:
    def __init__(self, data):
        self.data = data

    # The media type for this is `application/vnd.dataresource+json`.
    # The only `_repr*_` method to emit this is to use `_repr_mimebundle_`.
    # For more details, refer to: https://ipython.readthedocs.io/en/stable/config/integrating.html
    def _repr_mimebundle_(self, include=None, exclude=None):
        return {
            'application/vnd.dataresource+json': self.data
        }

# Example usage
data = {
    "schema": {
        "fields": [
            {"name": "name", "type": "string"},
            {"name": "age", "type": "integer"}
        ]
    },
    "data": [
        {"name": "Alice", "age": 30},
        {"name": "Bob", "age": 25}
    ]
}

resource = DataResource(data)
resource  # as the last expression in a cell, this triggers rich display

You can also use _ipython_display_, but _repr_mimebundle_ is the repr-based way.
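
For a dataframe library, the same idea might look like the sketch below. `MiniFrame`, `to_data_resource`, and the type mapping are hypothetical stand-ins, not an existing API: a real library would derive field types from its own dtypes rather than from the first value. Including a `text/plain` entry is an assumption that gives frontends without table support something to fall back on.

```python
class MiniFrame:
    """Hypothetical column-oriented frame; stands in for a real dataframe type."""

    # Assumed mapping from Python types to Table Schema field types.
    _FIELD_TYPES = {bool: "boolean", int: "integer", float: "number", str: "string"}

    def __init__(self, columns):
        self.columns = columns  # dict of column name -> list of values

    def _field_type(self, values):
        # Infer from the first value only; a real library would use its dtypes.
        if not values:
            return "any"
        return self._FIELD_TYPES.get(type(values[0]), "any")

    def to_data_resource(self):
        # Build the {"schema": ..., "data": ...} shape used in the example above.
        names = list(self.columns)
        fields = [
            {"name": name, "type": self._field_type(self.columns[name])}
            for name in names
        ]
        rows = [dict(zip(names, row)) for row in zip(*self.columns.values())]
        return {"schema": {"fields": fields}, "data": rows}

    def __repr__(self):
        return f"MiniFrame(columns={list(self.columns)})"

    def _repr_mimebundle_(self, include=None, exclude=None):
        return {
            "application/vnd.dataresource+json": self.to_data_resource(),
            "text/plain": repr(self),  # fallback for text-only frontends
        }


frame = MiniFrame({"name": ["Alice", "Bob"], "age": [30, 25]})
bundle = frame._repr_mimebundle_()
print(bundle["application/vnd.dataresource+json"])
```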


2 participants