DS: Provide rich debugger support for pandas DataFrames #1286

Closed
apryor6 opened this issue Aug 22, 2019 · 70 comments

Comments

@apryor6

apryor6 commented Aug 22, 2019

I build lots of Flask APIs that internally use pandas DataFrames, and almost daily I find myself jumping into the debugger; however, there is no particularly good way to inspect the contents of a DataFrame. Ideally there would be a way to double-click or hover on a DataFrame variable and get a popup that shows the first few rows spreadsheet-style.

The Spyder IDE has something like what I describe.

Currently, I find I am limited to either:

  • Calling print(df.head()) from the debugger console. This is pretty clunky.
  • Writing the DataFrame to a file and re-importing it into a Jupyter notebook, IPython, etc. This method introduces its own class of problems on top of the time it costs.
  • Inspecting the variable from the debugger Watch menu, but for an object as complex as a DataFrame this doesn't work as well. There is an implicitly structured context to a DataFrame that is not fully captured here.
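
For readers who want to try the second workaround in the list above, here is a minimal sketch. The helper names `dump_frame` and `load_frame`, the file name, and the sample data are all hypothetical; `DataFrame.to_pickle` and `pandas.read_pickle` are standard pandas calls. It assumes pandas is installed and that the snapshot is reloaded with a compatible pandas version.

```python
import tempfile
from pathlib import Path

import pandas as pd


def dump_frame(df: pd.DataFrame, path: Path) -> Path:
    """Write df to a pickle file (hypothetical helper, meant to be called from the debug console)."""
    df.to_pickle(path)
    return path


def load_frame(path: Path) -> pd.DataFrame:
    """Reload the snapshot later, for example in a notebook or IPython session."""
    return pd.read_pickle(path)


# Stand-in for whatever DataFrame is in scope at the breakpoint (made-up data).
df = pd.DataFrame({"user": ["a", "b", "c"], "score": [0.1, 0.5, 0.9]})

snapshot = dump_frame(df, Path(tempfile.gettempdir()) / "debug_snapshot.pkl")
restored = load_frame(snapshot)
print(restored.head())
```

Pickle is used here instead of CSV because it keeps dtypes and the index intact, which avoids some of the re-import problems mentioned above. Only load pickle files you created yourself.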

I've felt for some time that this would be a great feature, but honestly I assumed there would be a large number of other data scientists sharing the same request and that it would magically appear. This is me conceding that point and doing what I should have done already: making the request.

Edit: I see a teaser for such functionality from some time ago here, but I don't see this inside of the app. Am I missing something, or is there an update?

@mgsnuno

mgsnuno commented Aug 26, 2019

I share the same need.

While hovering over a pandas DataFrame in debug mode, it would be nice to render the HTML view of the DataFrame head, with horizontal and vertical scrollbars.

@IanMatthewHuff
Member

@apryor6. In the interactive window experience we have a data viewer for pandas dataframes. Is this the type of viewing experience that you would like with a standard VSCode debugging experience?

https://devblogs.microsoft.com/python/python-in-visual-studio-code-april-2019-release/

@mgsnuno

mgsnuno commented Aug 27, 2019

just giving my 2 cents: vscode variables explorer viewer or IPython.display.display are both good options. whatever makes the user experience more consistent.

@apryor6
Author

apryor6 commented Aug 27, 2019

@IanMatthewHuff yes, precisely like this. Ideally, when execution is paused at a breakpoint, an icon would appear next to the name of any DataFrame variable and open this viewer on click, or the viewer would appear on hover.

@IanMatthewHuff
Member

Got it. Thanks for the feedback @apryor6 and @mgsnuno .

@kogakenji

I have the same need. Is there any workaround for this? What are you guys using when debugging a dataframe inside vscode?

@mgsnuno

mgsnuno commented Sep 30, 2019

while debugging, I type in the Debug Console display(dataframe)

display comes from IPython.display.display and is loaded by default in a Jupyter environment.

@kogakenji

Ok. Thanks Nuno! I see. I am not using jupyter environment. I am using python debug inside vscode. Even after importing IPython, I get None when using IPython.display.display(DataFrame).

@DonJayamanne
Contributor

Playing with a few ideas...

[Image: dsExplorer]

@ejohb

ejohb commented Oct 14, 2019

I think this is the only thing still keeping me on PyCharm :)

@szc11121

> Playing with a few ideas...
>
> [Image: dsExplorer]

Could you please tell me how I can find the DATA SCIENCE VARIABLES section in my debug sidebar?

@rchiodo
Contributor

rchiodo commented Nov 20, 2019

@szc11121 the ideas that @DonJayamanne was playing with have not shipped. That feature is not supported yet.

@pyropenguin

Chiming in to voice support for this feature. I like the suggestion from @szc11121, although just having the "Show Variable in Data Viewer" icon directly in the VARIABLES sidebar on the right of each variable it applies to (and double-click the line to open data viewer) might be cleaner. Alternatively, if there were a separate list of Data Science Variables, perhaps make it look and function the same as the Python Interactive Variables table.

@boazdori

boazdori commented Dec 4, 2019

I would like to add my support for this feature. I work a lot with pandas DataFrames, and having tables and variable descriptions helps a lot in the debugging process. This format is available in Spyder, which took it from the MATLAB way of thinking, and PyCharm includes a data science setup that offers it as well. VS Code should have the same setup; this is a must!

@wangluochao902

There is a workaround: debug a cell in a Jupyter notebook. While debugging, the DataFrame is displayed in the notebook. The limitation is that you need to run the code in a Jupyter notebook.

[Image: vscode-debugger-dataframe]

@el-analista

Hi,
+1 for this, as right now the debugger prints line endings as a literal \n instead of actual line breaks. Why doesn't it behave like the terminal?

@Dr-Irv

Dr-Irv commented Jan 11, 2020

> while debugging, I type in the Debug Console display(dataframe)
>
> display comes from IPython.display.display and is loaded by default in a Jupyter environment.

This no longer works with Python extension version 2020.1.57204. See microsoft/ptvsd#2036

@joaohsr

joaohsr commented Jan 29, 2020

Still waiting for a DataFrame Viewer in debug mode :(

@christina-zhou-96

I'm following this to learn when I can switch off PyCharm...

@greazer greazer changed the title Provide rich debugger support for pandas DataFrames DS: Provide rich debugger support for pandas DataFrames Feb 13, 2020
@voochuk

voochuk commented Feb 16, 2020

It would be great if you supported a debug visualizer extension point like Visual Studio's:
https://docs.microsoft.com/en-us/visualstudio/debugger/create-custom-visualizers-of-data?view=vs-2019
Some data science structures, such as large xarrays and tensors, that require introspection using filters, sorting, etc. could then be handled as well.

@fpnick

fpnick commented Feb 24, 2020

Same need here :)

@pearlus

pearlus commented Feb 25, 2020

+1, the lack of this feature is forcing me to PyCharm.

Adding a picture from PyCharm SciView:
[Image]

@maksudmck

I really really need this feature!

@memeplex

memeplex commented Mar 2, 2020

Having a rich view is one great thing, but NOT having a raw string with embedded newlines and everything is one basic thing. Could you at least show the standard dataframe representation, which is OK as far as it goes?
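
One plausible reading of the raw-string complaint, assumed here and not confirmed by the maintainers in this thread, is that the debugger shows the `repr()` of a string result, while a terminal `print()` writes the string itself. The sketch below shows the difference using only the standard library; `table_text` is made-up data shaped like a small DataFrame rendering.

```python
# A string shaped like a small DataFrame rendering (made-up data).
table_text = "   a  b\n0  1  3\n1  2  4"

# repr() escapes the newlines, so the whole table collapses onto one line.
escaped = repr(table_text)
print(escaped)

# print() writes the string itself, so each row lands on its own line.
print(table_text)
```

If this assumption holds, wrapping the expression in `print(...)` in the Debug Console would be the simplest way to get the terminal-style output.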

@aleemkhusro

please add this. This would make my life so simple. The only thing that VS Code doesn't have, and PyCharm does have. Do this, and I'll be in remote debugging heaven. My boss will give me a raise, and I will cure cancer.

@granthussey

This is a feature VSCode desperately needs

@shireenrao

I updated vscode to the latest version, and now see that dataframes are printing nicely. Thank you for this. This is how it looks:
[Image: Annotation 2020-03-10 133746]

@taha-yassine

I think extending this feature to other data formats (e.g. numpy arrays), as in the already supported data viewer, would be great.

@rishabhrishu

Were you able to view all the rows? I get internals on clicking expand icon.

> I updated vscode to the latest version, and now see that dataframes are printing nicely. Thank you for this. This is how it looks:
> [Image: Annotation 2020-03-10 133746]

@ejohb

ejohb commented Sep 4, 2020

Possible solution for integration points microsoft/vscode#70377

Looks very promising.

@bw984

bw984 commented Sep 22, 2020

Our company uses Visual Studio Professional as the primary IDE and to my disbelief viewing dataframes isn't a feature in VSCode or VS Professional! Python in general seems like an afterthought in VS 2019; it's clunky and slow, with buggy autocomplete. The print(df) method in the VSCode Debug Console does not allow you to scroll to columns that are out of the print view. PyCharm is a much smoother development experience at the moment.

@nkkollaw

Looks like everyone just keeps bitching about this without providing any solution to move forward, @bw984—and annoying everyone subscribed to this in the process.

For instance, if your company uses VS as your primary IDE and you're in disbelief, why don't you invest in creating a plugin?

@bw984

bw984 commented Sep 24, 2020

For the $800 per year VS Professional cost I would hope that IntelliSense wouldn't be broken, but it is horribly broken for Python. I would expect the debugger to be able to ignore certain custom raised exceptions, but it can't. I would expect to be able to view the data in a dataframe while debugging, but I can't. These aren't major asks for a "professional" IDE.

Python support feels like a complete afterthought in VS Pro. Until the Python experience at least matches that of the free VS Code, I think additional investment is best spent on companies who are working to give Python developers a first-class experience. At this point in time I'd rather pay for a yearly enterprise license of PyCharm out of my own pocket than be forced to use VS Pro for Python development on the company's dime.

@nkkollaw

I feel your pain, I'm just wondering why you'd post a comment here to complain to the devs without offering any kind of solution—can't you call support and ask for your money back?

A few comments ago I suggested we get some money together on BountyHunt or whatever it's called, and pay somebody to do it since Microsoft seems to be out of cash and developers /s.

What does your company think about that..?

@ma7555

ma7555 commented Sep 30, 2020

> I updated vscode to the latest version, and now see that dataframes are printing nicely. Thank you for this. This is how it looks:
> [Image: Annotation 2020-03-10 133746]

this doesn't work with larger DataFrames.

Is this already integrated? I don't find it

@ssameerr

ssameerr commented Oct 1, 2020

Where does this requirement fit on the VSCODE roadmap?

@boazdori

boazdori commented Oct 1, 2020

What is really needed is the Pycharm science mode!

@bw984

bw984 commented Oct 7, 2020

> > I updated vscode to the latest version, and now see that dataframes are printing nicely. Thank you for this. This is how it looks:
>
> this doesn't work with larger DataFrames.
>
> Is this already integrated? I don't find it

You have to print the dataframe to see that output, it's not built into the debugger at this time. If you have many columns you won't be able to see all of them.
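
A possible way around the hidden-columns limit when printing, sketched under the assumption that pandas is installed: `DataFrame.to_string()` renders every column without the `...` truncation that the default repr applies to wide frames. The helper name `render_full` and the sample frame are hypothetical.

```python
import pandas as pd


def render_full(df: pd.DataFrame, max_rows: int = 20) -> str:
    """Render up to max_rows rows with every column (hypothetical helper)."""
    # head() bounds the number of rows; to_string() does not truncate columns.
    return df.head(max_rows).to_string()


# Made-up wide frame: 40 columns, 3 rows.
wide = pd.DataFrame({f"col_{i}": range(3) for i in range(40)})

text = render_full(wide)
print(text)
```

In the Debug Console this would be called as `print(render_full(df))`. A similar effect should be achievable with `pd.option_context("display.max_columns", None, "display.width", None)` around a plain `print(df)`. Either way the output is one long line per row, so it is only as readable as the console's horizontal scrolling allows.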

@jamiegl

jamiegl commented Oct 19, 2020

I've been using Azure Functions, which is a nuisance to use outside of vscode. This feature would be very helpful to those of us locked into vscode for this reason (although that's not to say I don't really like vscode!). I've been using the to_html() method with the html plugin to some success.
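
The to_html() approach mentioned above could look roughly like this. It is a sketch and not necessarily what the commenter does: the helper name `write_html_preview`, the output file name, and the sample data are hypothetical, and the step of opening the file in an HTML preview extension is left to the reader.

```python
import tempfile
from pathlib import Path

import pandas as pd


def write_html_preview(df: pd.DataFrame, path: Path, max_rows: int = 100) -> Path:
    """Write the first max_rows rows of df as an HTML table (hypothetical helper)."""
    html = df.head(max_rows).to_html()  # returns the table markup as a string
    path.write_text(html, encoding="utf-8")
    return path


# Stand-in for the DataFrame in scope at the breakpoint (made-up data).
df = pd.DataFrame({"name": ["widget", "gadget"], "price": [9.99, 24.5]})

preview = write_html_preview(df, Path(tempfile.gettempdir()) / "df_preview.html")
print(preview)
```

Capping the rows with `head()` keeps the file small enough for a preview pane; raise `max_rows` if you need more of the frame.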

@chopeen

chopeen commented Oct 30, 2020

It's not as convenient as hovering over a DataFrame, but you can install https://github.com/adamerose/pandasgui and start it from the debug console.

To visualize a df variable you run:

from pandasgui import show
show(df)

@ejohb

ejohb commented Oct 30, 2020

> It's not as convenient as hovering over a DataFrame, but you can install https://github.com/adamerose/pandasgui and start it from the debug console.
>
> To visualize a df variable you run:
>
> from pandasgui import show
> show(df)

That's worth knowing about so thanks for mentioning it. Though if you're going to use an external process to visualise, it feels like that process should be Excel (or your favourite vegan alternative), e.g.:

import subprocess

def show(df):
    # Write the DataFrame to a temporary workbook, then open it in Excel.
    df.to_excel('tmp.xlsx')
    subprocess.call(['excel.exe', 'tmp.xlsx'])
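
The snippet above assumes Windows with excel.exe on the PATH, and `to_excel` needs an Excel writer engine such as openpyxl to be installed. A more portable variant, offered only as a sketch, writes a CSV and hands it to whatever application the operating system associates with that file type. The helper names `export_csv` and `open_with_default_app` and the file name are hypothetical.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path

import pandas as pd


def export_csv(df: pd.DataFrame, name: str = "debug_frame.csv") -> Path:
    """Write df to a CSV file in the temp directory (hypothetical helper)."""
    path = Path(tempfile.gettempdir()) / name
    df.to_csv(path)
    return path


def open_with_default_app(path: Path) -> None:
    """Ask the operating system to open path with its associated application."""
    if sys.platform.startswith("win"):
        os.startfile(path)  # only available on Windows
    elif sys.platform == "darwin":
        subprocess.run(["open", str(path)], check=False)
    else:
        subprocess.run(["xdg-open", str(path)], check=False)


# Made-up data standing in for the DataFrame at the breakpoint.
df = pd.DataFrame({"x": [1, 2], "y": [3.5, 4.5]})

csv_path = export_csv(df)
print(csv_path)
# open_with_default_app(csv_path)  # uncomment on a machine with a desktop session
```

CSV drops dtype information, so this is for eyeballing values and not for round-tripping data.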

@chopeen

chopeen commented Nov 23, 2020

I spent some time researching this topic to see how hard it would be to create such a visualizer extension for the Python debugger in VS Code. I found no easy way. There is Debug Visualizer, but its support for Python (not to mention DataFrames) is very limited.

I created a feature request microsoft/vscode#111105 for VS Code to support custom data visualizers. It will be added to the roadmap when it gathers 20 upvotes.

🗳️ Go to microsoft/vscode#111105 and vote by clicking 👍!

@deutschmn

I built myself a solution for numpy arrays in Jupyter notebooks. It works in VS code but also in other Jupyter environments. Probably doesn't entirely replace the feature asked for in this thread, but it's a good enough intermediary solution - for me at least. 😄

[Image: full_example]

https://github.com/deutschmn/ndpretty

@joyceerhl
Contributor

joyceerhl commented Jan 22, 2021

We have added the ability to open DataFrames in our data viewer from the debugger. Here's a GIF of this in action from our blog post:
[Image: DataViewerWhenDebugging]

Our data viewer currently supports an allowlist of data types, with more enhancements on the way. We appreciate your patience and look forward to your feedback!

@agrimaldi74

That's quite a good start!
It takes a few seconds to open, but then it works perfectly.
What I'd like to have is a refresh button to update the content of the variable after it has been modified, instead of having to select it and open in Data Viewer again in a separate tab.
And also the possibility to change the value of a dataframe or array element.
One thing on filtering pandas dataframes:
I can click on the Filter Rows button and enter a condition in any of the white boxes that appear below the column names. The dataframe gets filtered in real time. But when I click the Filter Rows button again, the dataframe stays filtered. I'd expect the filters not to be applied once they are hidden. Users could get confused unless they show the filters again.
My suggestion is that the filters get applied only when the conditions are visible or, alternatively, that some indication is added that the dataframe is filtered even when the conditions are hidden.
Hoping for future improvements!
Thanks a lot for now

@joyceerhl
Contributor

Thanks for your ideas @agrimaldi74, I've created new issues to track them.

We'll be closing this issue now that we have added DataFrame support to the debugger. Please open a new issue if you encounter problems or have suggestions for improving the current experience 😊

@github-actions github-actions bot locked as resolved and limited conversation to collaborators May 6, 2021