Data viewer via debugger doesn't work in remote SSH & WSL scenarios #4065

Closed
joyceerhl opened this issue Dec 1, 2020 · 20 comments
Labels
bug Issue identified by VS Code Team member as probable bug data-viewer notebook-debugging notebook-remote Applies to remote Jupyter Servers

Comments

@joyceerhl
Contributor

There still seem to be some issues with opening the data viewer from the debugger variables window. By far the most common cause of failure is:

NameError: name '_VSCODE_InfoImport' is not defined

This suggests we're failing to eval our import scripts in the debugger somehow. No repro yet.

@joyceerhl joyceerhl added the bug Issue identified by VS Code Team member as probable bug label Dec 1, 2020
@rchiodo
Contributor

rchiodo commented Dec 1, 2020

Might this be the remote problem that Ian is working on?

Oh this might mean we do need this for remote debugging because we use this for more than just the interactive window. @IanMatthewHuff I think your fix to go back to the paths for debugging may not work in all cases.

@IanMatthewHuff
Member

Yeah, my current fix wouldn't clean up this issue, as the debugger is basically the same before and after.

Normally our scenarios with debugger are blocked from being remote scenarios (Run-By-Line and IW cell debugging). The assumption here is that this is trying to open a data frame from a remote debugging session? Yeah, that would have an issue with the paths.

If this is a must fix, then I could pick this up right now, I already have the setup for investigating and I'm in that area of code.

@rchiodo
Contributor

rchiodo commented Dec 1, 2020

IMO the debugger part isn't must fix as that's still behind an experiment.

@edumotya

I think the issue persists. Sometimes, while I am using the debugger with the data viewer, I get:

Error: Traceback (most recent call last):
  File "<string>", line 1, in <module>
NameError: name '_VSCODE_VariableImport' is not defined

@joyceerhl
Contributor Author

Hey @edumotya, could you please provide the specific steps to reproduce the behavior you're seeing? We've been trying to track down some of these issues for a while now. For example, what debug launch configuration are you using, what source code are you debugging, and what variable are you requesting to look at? Are you debugging remotely or locally, and what platform (e.g. WSL, Linux, Windows, Mac) are you using? Thank you!

@edumotya

edumotya commented Mar 12, 2021

Hi @joyceerhl, I am not able to share the source code with you. On the other hand, I can tell you:

  • I am using Ubuntu 20.04.2 LTS.
  • I am debugging locally.
  • The launch configuration is the following:
{
    "python.linting.pylintEnabled": true,
    "python.linting.enabled": true,
    "cSpell.words": [
        "mymodule",
        "gcloud",
        "imgs",
        "jupyterlab"
    ],
    "python.testing.pytestArgs": [
        "mymodule/tests"
    ],
    "python.testing.unittestEnabled": false,
    "python.testing.nosetestsEnabled": false,
    "python.testing.pytestEnabled": true
}

I am not sure... but I think it normally happens in these scenarios:

Scenario 1

  1. I start the code with "Debug test (Multiple)".
  2. Create DataFrame df inside a function f1.
  3. Process DataFrame df inside function f2. Here I have the breakpoint where I use data viewer. At this point it works.
  4. Return DataFrame df back into f1.
  5. Manipulate DataFrame df inside f1 (filtering, adding columns, etc.).
  6. Process the DataFrame again with function f2. Here data viewer crashes.

Scenario 2

  1. I start the code with "Debug test (Multiple)".
  2. Open a DataFrame df in data viewer.
  3. Open a different DataFrame, also named df, in data viewer while the previous one is still open.

I am sorry, but I cannot tell you much more...

@khgouldy

Hi team, I'm also experiencing the error that edumotya receives above, in a scenario very similar to Scenario 1. Mac, Python, debugging locally. The dataframe is small (<500 records / 10 cols), so size is likely not related.

It seems to occur when the DF was created in funcA and passed into funcB as an argument; at a breakpoint in funcB, the data viewer is then unable to open the DF.

@joyceerhl
Contributor Author

@khgould Are you seeing the exact same error message? Any chance you can provide a sample code repro?

@RShevcheCape

Same error here. Running in the remote Docker container in debug mode.
Context: df is created outside of the function and provided as an argument:

def compute_sampling_weights(df, mode="average"):
    ...  # body elided in the original report; the breakpoint is set inside this function

View Value in Data Viewer results in:
"Error: Traceback (most recent call last): File "<string>", line 1, in <module> NameError: name '_VSCODE_VariableImport' is not defined"

@r-moctezuma

Same error here, @joyceerhl. The error happens every time you try to inspect a pandas DataFrame while debugging Python inside a Docker container. Here is a public repo with minimal code to replicate it:
https://github.com/fractalriver/vscode-error

You can run the sample.py outside Docker and see that you can inspect the dataframe 'df' perfectly well. When you debug it by running it inside Docker, it will error out. To replicate, just put a breakpoint in line 9 of sample.py, and try to view the 'df' dataframe in the Data Viewer (View Value in Data Viewer). You will get:

Error: Traceback (most recent call last): File "<string>", line 1, in <module> NameError: name '_VSCODE_VariableImport' is not defined

Hopefully this helps? We do a lot of work with Python/Pandas in Docker, so this is a big topic for us...
Thank you,

@greazer greazer changed the title Resolve bugs with debugger data viewer feature Data viewer via debugger doesn't work in WSL Apr 22, 2021
@greazer greazer changed the title Data viewer via debugger doesn't work in WSL Data viewer via debugger doesn't work in remote scenarios Apr 22, 2021
@greazer greazer added this to the June 2021 Release milestone Apr 22, 2021
@MaxiTrien

@joyceerhl
Same error here when trying to inspect a dataframe while debugging. It's not really clear to me when this error occurs, because sometimes I can open a df and sometimes I get this error.

Error: Traceback (most recent call last): File "<string>", line 1, in <module> NameError: name '_VSCODE_VariableImport' is not defined

I can't really share anything code related but hoping this bug will be fixed.

Thank you for the effort!

@joyceerhl
Contributor Author

joyceerhl commented Apr 23, 2021

There are at least two different problems being reported in this thread.

  1. Docker-specific problem. Unfortunately I cannot reproduce this with the Dockerfile you provided, @r-moctezuma, although I'm running locally in a container using the Remote-Containers extension. Are you running the Docker container locally and using the Remote-Containers extension to connect to it, or running it remotely and using Remote-SSH to connect, or some other configuration?
    [screenshot]

  2. Function scopes for debugger integration. This may be a straight up bug in our debug adapter tracker logic. I believe this is because we are trying to evaluate a variable in a debug scope that doesn't have access to the imported variable scripts. For example, I can reliably trigger the '_VSCODE_VariableImport' is not defined error when attempting to access df in bar below:

import pandas as pd
import numpy as np

def bar(df):
    return df  # Trying to view this in the data viewer in a debug session triggers an error

def foo():
    mydf = pd.DataFrame(np.arange(500 * 30).reshape(500, 30))
    mydf = bar(mydf)  # Viewing `mydf` when stopped on a breakpoint on this line works
    print(mydf)  # Viewing `mydf` when stopped on a breakpoint on this line works

foo()

EDIT: Just did some more debugging, including more details here. That code snippet I included above doesn't repro the problem if you only put a breakpoint on the return df line:
[screenshot: singlebreakpoint]

But if you set breakpoints in two separate function scopes, that's when things fall over. That made me suspect we're importing the scripts into the first function scope only (and we only do the import once because it's slow, so subsequent references in another scope fail):
[screenshot: twobreakpoints]

And sure enough, if I comment out the caching we do here and here, everything seems to work! 🤔

[screenshot: nocache]

It does seem like the problem is that the scripts are evaluated once per debug session in a specific function scope, so the data viewer works for variables in function A, but variables in other scopes won't know anything about the imported scripts. We need to ensure we import the scripts into the global scope.
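
The scoping behavior described above can be reproduced without a debugger. The following is a minimal sketch, not the extension's actual code: it models two stack frames as separate locals dictionaries that share one globals dictionary. `HELPER_SOURCE` and `_helper_info` are hypothetical stand-ins for the real `_VSCODE_*` import scripts.

```python
# Two simulated stack frames: each has its own locals, both share module globals.
module_globals = {}
frame_foo_locals = {"mydf": [1, 2, 3]}
frame_bar_locals = {"df": [1, 2, 3]}

# Hypothetical stand-in for the scripts that define the data viewer helpers.
HELPER_SOURCE = "def _helper_info(value):\n    return {'length': len(value)}\n"

# "Import" the helper into the first frame only, as a once-per-session cache would.
exec(HELPER_SOURCE, module_globals, frame_foo_locals)

# The helper is visible in the frame where it was defined.
info_foo = eval("_helper_info(mydf)", module_globals, frame_foo_locals)

# It is not visible from the other frame's scope.
try:
    eval("_helper_info(df)", module_globals, frame_bar_locals)
    bar_error = None
except NameError as err:
    bar_error = str(err)

# Defining the helper in the shared globals makes it reachable from both frames.
exec(HELPER_SOURCE, module_globals)
info_bar = eval("_helper_info(df)", module_globals, frame_bar_locals)

print(info_foo, bar_error, info_bar)
```

Only the second `exec`, which targets the shared globals, makes the helper usable from the simulated `bar` frame.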

@rchiodo
Contributor

rchiodo commented Apr 24, 2021

Maybe we can skip the frame id on the eval? At least the DAP says it's allowed:

/**
   * Evaluate the expression in the scope of this stack frame. If not specified,
   * the expression is evaluated in the global scope.
   */
  frameId?: number;

https://microsoft.github.io/debug-adapter-protocol/specification#Requests_Evaluate
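
To make the optional field concrete, here is a small sketch of the `arguments` object for an `evaluate` request with and without a frame. The `expression`, `frameId`, and `context` fields come from the DAP specification; the `build_evaluate_arguments` helper and the sample expression are hypothetical and exist only for illustration.

```python
import json


def build_evaluate_arguments(expression, frame_id=None, context="repl"):
    """Build the `arguments` object of a DAP `evaluate` request.

    `frameId` is only included when a frame is given; per the spec, leaving it
    out asks the adapter to evaluate in the global scope.
    """
    arguments = {"expression": expression, "context": context}
    if frame_id is not None:
        arguments["frameId"] = frame_id
    return arguments


# Evaluate relative to a specific stack frame (the frame id 3 is made up).
scoped = build_evaluate_arguments("len(df)", frame_id=3)

# Evaluate without a frame id.
unscoped = build_evaluate_arguments("len(df)")

print(json.dumps(scoped))
print(json.dumps(unscoped))
```

How an adapter interprets a missing `frameId` is up to that adapter, so the second form may not behave identically across debuggers.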

@joyceerhl
Contributor Author

Our existing code seems like it needs to be changed because it will eval the script imports in the topmost frame at the time that the data viewer is first opened, and never again even if that frame then gets popped.

But I can't seem to get global imports working. @rchiodo I think you're suggesting evaluating our Python script imports with frameId unspecified, and then evaluating the variables relative to the topmost frame ID? I tried implementing that and it doesn't work. Maybe a bug in DAP implementation. It looks like this was implemented and is supposed to work: microsoft/ptvsd#1729

The debug session customRequest implementation in VS Code seems to transparently forward the args through to the debug adapter without modification, so I've filed an issue on debugpy to get their help with investigating: microsoft/debugpy#598

In any case we can work around whatever the frameId problem is if we eval our import scripts in every new frame that we receive a data viewer request in. This is strictly worse than evaluating the scripts once in the global scope, but also better than evaluating it on every single data viewer request.
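
The per-frame workaround can be sketched as a cache keyed by frame ID. This is an illustrative model only, not the extension's implementation: `FrameImportCache`, `IMPORT_SCRIPT`, `_hypothetical_info`, and `fake_evaluate` are all made-up names, and the evaluate callback stands in for a real debug adapter request.

```python
# Placeholder for the real import scripts (hypothetical).
IMPORT_SCRIPT = "import _hypothetical_helpers"


class FrameImportCache:
    """Run the import script once per frame, not once per session or per request."""

    def __init__(self, evaluate):
        self._evaluate = evaluate  # callable(expression, frame_id)
        self._imported_frames = set()

    def request_data_viewer(self, variable_name, frame_id):
        # Import the helpers the first time this frame is seen.
        if frame_id not in self._imported_frames:
            self._evaluate(IMPORT_SCRIPT, frame_id)
            self._imported_frames.add(frame_id)
        # Evaluate the variable in the same frame the helpers were imported into.
        return self._evaluate(f"_hypothetical_info({variable_name})", frame_id)


calls = []


def fake_evaluate(expression, frame_id):
    """Record each evaluate call; a real version would talk to the debug adapter."""
    calls.append((expression, frame_id))
    return f"evaluated {expression} in frame {frame_id}"


cache = FrameImportCache(fake_evaluate)
cache.request_data_viewer("mydf", frame_id=1)  # imports, then evaluates
cache.request_data_viewer("mydf", frame_id=1)  # evaluates only
cache.request_data_viewer("df", frame_id=2)    # new frame: imports again

import_calls = [call for call in calls if call[0] == IMPORT_SCRIPT]
print(import_calls)
```

A real implementation would presumably also need to discard cached frame IDs when frames are popped or the session ends; that is not modeled here.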

@joyceerhl
Contributor Author

joyceerhl commented Apr 24, 2021

I think I have a fix for the local-only scenario involving breakpoints in two different function scopes. I created #5627 for that because it is distinct from the Docker scenario. For folks who are encountering this error locally i.e. @edumotya @MaxiTrien and @khgould, can you please try installing this VSIX and letting me know whether it solves the problem for you? https://github.com/microsoft/vscode-jupyter/suites/2575030338/artifacts/56143535 (I don't think this is going to fix the problem when running in Docker.)

@rchiodo in case you're interested, omitting frameId actually doesn't do what I thought it would, because debugpy handles such eval requests by creating a new isolated "global" frame whose contents are inaccessible to any other frame. (And that's perfectly reasonable for Python because every module has its own global scope, i.e. there's no concept of a global frame whose content can be accessed by all children frames.)
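
The point about per-module globals can be shown in plain Python. This is a minimal sketch using two synthetic modules; `module_a`, `module_b`, and `_helper` are hypothetical names, and nothing here reflects debugpy internals.

```python
import types

# Two independent modules, each with its own global namespace.
module_a = types.ModuleType("module_a")
module_b = types.ModuleType("module_b")

# Define a helper in module_a's globals only.
exec("def _helper(value):\n    return len(value)\n", module_a.__dict__)

# The helper resolves when evaluating against module_a's globals.
result_in_a = eval("_helper([1, 2, 3])", module_a.__dict__)

# The same expression fails against module_b's globals.
try:
    eval("_helper([1, 2, 3])", module_b.__dict__)
    error_in_b = None
except NameError as err:
    error_in_b = str(err)

print(result_in_a, error_in_b)
```

A name defined in one module's globals is simply absent from another's, which is why a single "global" import cannot serve every frame.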

@greazer greazer modified the milestones: August 2021, old August 2021 Aug 9, 2021
@greazer greazer added notebook-debugging notebook-remote Applies to remote Jupyter Servers and removed needs-triage labels Sep 2, 2021
@greazer greazer changed the title Data viewer via debugger doesn't work in remote scenarios Data viewer via debugger doesn't work in remote SSH scenarios Sep 2, 2021
@greazer greazer changed the title Data viewer via debugger doesn't work in remote SSH scenarios Data viewer via debugger doesn't work in remote SSH & WSL scenarios Sep 2, 2021
@greazer greazer added the WSL label Sep 2, 2021
@miaoz2001

Hi, I created issue #6705 and it was marked as a dupe of this issue, but it has been nearly a year already. When will it get a chance to be fixed? Am I the only one who has this problem?

@rchiodo
Contributor

rchiodo commented Mar 3, 2022

Sorry, this is probably obvious, but this isn't a high priority at the moment. We use upvotes to determine what items need to be fixed.

However, I just tried this with our latest and it works for me in WSL.

[screenshot]

My suspicion for your original bug was that the python extension and the jupyter extension on the WSL machine were not in sync and didn't run the expected code.

@rchiodo
Contributor

rchiodo commented Mar 3, 2022

I'm going to close this as I think the fix that @joyceerhl applied worked. At least I can't repro.

@rchiodo rchiodo closed this as completed Mar 3, 2022
@miaoz2001

Hi @rchiodo, thanks for the quick reply! Yep, I understand the priority thing; sad that there are no more upvotes for this.

Unfortunately, I still have the problem with the latest version. I think mine is related to Docker. I am using WSL2, Docker, and Python.

[screenshot]

@miaoz2001

Thanks for reopening my old issue. However, I couldn't comment on it, so I have to do it here.
Let's see how many upvotes it gets. I doubt there will be many, though, haha...

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Mar 4, 2023