Jupyter: Some lists and pd.DataFrames not opening in Data Viewer #4886

Closed
maciejkos opened this issue Feb 20, 2021 · 12 comments
Labels: bug (Issue identified by VS Code Team member as probable bug)

@maciejkos
maciejkos commented Feb 20, 2021

Environment data

  • VS Code version: 1.53.2 (user setup)
  • Jupyter Extension version: v2021.2.576440691
  • Python Extension version (available under the Extensions sidebar): v2021.2.582707922
  • OS (Windows | Mac | Linux distro) and version: Windows_NT x64 10.0.19042
  • Python and/or Anaconda version: Python 3.7.8 64-bit
  • Type of virtual environment used: conda
  • Jupyter server running: Local

Expected behaviour

Opening all lists and pd.DataFrames in Data Viewer should display their contents.

Actual behaviour

Some lists and DataFrames are not displayed in Data Viewer.

Steps to reproduce:

  1. Make a list of files (paths to files)
import pathlib

# build a long path
path = pathlib.Path.cwd()
new_path = path / "some_analysis_folder_with_data" / "some_output_folder"
new_path.mkdir(parents=True, exist_ok=True)

# create files with long names in the folder
file_name_stem = "some_file_name_foooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo"
file_suffix = ".gz"

for i in range(1, 8):
    new_file = new_path / f"{file_name_stem}_{i}{file_suffix}"
    new_file.touch()

new_files = list(new_path.iterdir())
# type(new_files[0]) # pathlib.WindowsPath
new_files # this list doesn't open in Data Viewer
  2. Make a list of "fake" files, i.e., strings, not pathlib.WindowsPath objects

import string

# make a new list (fake_files_list) similar to new_files, but with strings, not pathlib.WindowsPath objects
# make sure the elements in fake_files_list are as long as those in new_files

fake_folder_name = str(new_file.parent)
letters_for_fake_file = string.ascii_lowercase[:len(new_files)]
new_file_stem_len = len(new_file.stem[:-2])  # stem length without the "_<i>" suffix

fake_files_list = []

for i in range(len(new_files)):
    fake_files_list.append(f"{fake_folder_name}\\{letters_for_fake_file[i]*new_file_stem_len}_{i}{file_suffix}")

for i in range(len(new_files)):
    assert len(fake_files_list[i]) == len(str(new_files[i])), f"ERROR: Lengths are different for element {i}"

fake_files_list
  3. fake_files_list opens in Data Viewer, but new_files doesn't. I have a similar issue with some DataFrames. The size of the DataFrame does not seem to matter.

[Screenshot: 2021-02-19_21-52-14]

Logs

Output for Jupyter in the Output panel (View → Output, change the drop-down in the upper-right of the Output panel to Jupyter)

> ~\Anaconda3\envs\mobs_abm\python.exe -c "import pandas;print(pandas.__version__)"
Error 2021-02-19 22:09:28: Failure during variable extraction:
---------------------------------------------------------------------------
OverflowError                             Traceback (most recent call last)
<ipython-input-3-8fc409734a9b> in <module>()
----> 1 print(_VSCODE_getDataFrameRows(new_files, 0, 7))

<ipython-input-3-93cda1340104> in _VSCODE_getDataFrameRows(df, start, end)
    136     except:
    137         pass
--> 138     return _VSCODE_pd_json.to_json(None, rows, orient="table", date_format="iso")
    139 
    140 

C:\Users\Christine\Anaconda3\envs\mobs_abm\lib\site-packages\pandas\io\json\_json.py in to_json(path_or_buf, obj, orient, date_format, double_precision, force_ascii, date_unit, default_handler, lines, compression, index, indent)
     82         default_handler=default_handler,
     83         index=index,
---> 84         indent=indent,
     85     ).write()
     86 

C:\Users\Christine\Anaconda3\envs\mobs_abm\lib\site-packages\pandas\io\json\_json.py in write(self)
    142             self.date_format == "iso",
    143             self.default_handler,
--> 144             self.indent,
    145         )
    146 

C:\Users\Christine\Anaconda3\envs\mobs_abm\lib\site-packages\pandas\io\json\_json.py in _write(self, obj, orient, double_precision, ensure_ascii, date_unit, iso_dates, default_handler, indent)
    339             iso_dates,
    340             default_handler,
--> 341             indent,
    342         )
    343 

C:\Users\Christine\Anaconda3\envs\mobs_abm\lib\site-packages\pandas\io\json\_json.py in _write(self, obj, orient, double_precision, ensure_ascii, date_unit, iso_dates, default_handler, indent)
    242             iso_dates,
    243             default_handler,
--> 244             indent,
    245         )
    246 

C:\Users\Christine\Anaconda3\envs\mobs_abm\lib\site-packages\pandas\io\json\_json.py in _write(self, obj, orient, double_precision, ensure_ascii, date_unit, iso_dates, default_handler, indent)
    164             iso_dates=iso_dates,
    165             default_handler=default_handler,
--> 166             indent=indent,
    167         )
    168 

OverflowError: Maximum recursion level reached
Error 2021-02-19 22:09:28: [Error: Failure during variable extraction:
---------------------------------------------------------------------------
OverflowError                             Traceback (most recent call last)
<ipython-input-3-8fc409734a9b> in <module>()
----> 1 print(_VSCODE_getDataFrameRows(new_files, 0, 7))

<ipython-input-3-93cda1340104> in _VSCODE_getDataFrameRows(df, start, end)
    136     except:
    137         pass
--> 138     return _VSCODE_pd_json.to_json(None, rows, orient="table", date_format="iso")
    139 
    140 

C:\Users\Christine\Anaconda3\envs\mobs_abm\lib\site-packages\pandas\io\json\_json.py in to_json(path_or_buf, obj, orient, date_format, double_precision, force_ascii, date_unit, default_handler, lines, compression, index, indent)
     82         default_handler=default_handler,
     83         index=index,
---> 84         indent=indent,
     85     ).write()
     86 

C:\Users\Christine\Anaconda3\envs\mobs_abm\lib\site-packages\pandas\io\json\_json.py in write(self)
    142             self.date_format == "iso",
    143             self.default_handler,
--> 144             self.indent,
    145         )
    146 

C:\Users\Christine\Anaconda3\envs\mobs_abm\lib\site-packages\pandas\io\json\_json.py in _write(self, obj, orient, double_precision, ensure_ascii, date_unit, iso_dates, default_handler, indent)
    339             iso_dates,
    340             default_handler,
--> 341             indent,
    342         )
    343 

C:\Users\Christine\Anaconda3\envs\mobs_abm\lib\site-packages\pandas\io\json\_json.py in _write(self, obj, orient, double_precision, ensure_ascii, date_unit, iso_dates, default_handler, indent)
    242             iso_dates,
    243             default_handler,
--> 244             indent,
    245         )
    246 

C:\Users\Christine\Anaconda3\envs\mobs_abm\lib\site-packages\pandas\io\json\_json.py in _write(self, obj, orient, double_precision, ensure_ascii, date_unit, iso_dates, default_handler, indent)
    164             iso_dates=iso_dates,
    165             default_handler=default_handler,
--> 166             indent=indent,
    167         )
    168 

OverflowError: Maximum recursion level reached
	at k.extractJupyterResultText (c:\Users\Christine\.vscode\extensions\ms-toolsai.jupyter-2021.2.576440691\out\client\extension.js:49:572275)
	at k.deserializeJupyterResult (c:\Users\Christine\.vscode\extensions\ms-toolsai.jupyter-2021.2.576440691\out\client\extension.js:49:572394)
	at k.getDataFrameRows (c:\Users\Christine\.vscode\extensions\ms-toolsai.jupyter-2021.2.576440691\out\client\extension.js:49:570203)
	at async r.getRows (c:\Users\Christine\.vscode\extensions\ms-toolsai.jupyter-2021.2.576440691\out\client\extension.js:49:210853)
	at async c:\Users\Christine\.vscode\extensions\ms-toolsai.jupyter-2021.2.576440691\out\client\extension.js:49:206244
	at async C.wrapRequest (c:\Users\Christine\.vscode\extensions\ms-toolsai.jupyter-2021.2.576440691\out\client\extension.js:49:206473)]

Log from Extension Host.
[2021-02-19 22:02:01.803] [exthost] [warning] [Decorations] CAPPING events from decorations provider vscode.git 4157
[2021-02-19 22:06:00.763] [exthost] [info] ExtensionService#_doActivateExtension vscode.typescript-language-features {"startup":false,"extensionId":{"value":"vscode.typescript-language-features","_lower":"vscode.typescript-language-features"},"activationEvent":"onLanguage:jsonc"}
[2021-02-19 22:06:00.764] [exthost] [info] ExtensionService#loadCommonJSModule file:///c:/Users/Christine/AppData/Local/Programs/Microsoft VS Code/resources/app/extensions/typescript-language-features/dist/extension
[2021-02-19 22:06:06.486] [exthost] [warning] [Decorations] CAPPING events from decorations provider vscode.git 4157
@maciejkos maciejkos added the bug Issue identified by VS Code Team member as probable bug label Feb 20, 2021
@maciejkos maciejkos changed the title Some lists and pd.DataFrames not opening in Data Viewer Jupyter: Some lists and pd.DataFrames not opening in Data Viewer Feb 20, 2021
@DavidKutu
Thanks for reporting it @maciejkos, and thanks for your detailed repro steps. I can also reproduce it using regular Python.
I'll bring this up in our next meeting.

@greazer greazer added the info-needed Issue requires more information from poster label Feb 25, 2021
@greazer
Member

greazer commented Feb 25, 2021

@joyceerhl , please check to see if this is a regression.

@joyceerhl
Contributor

Not a regression, same behavior exists in the December release. Root cause is new_files is an array of Python objects, and we don't support viewing arrays of objects right now. @maciejkos would you expect to see the string representation of the path objects in the data viewer when viewing new_files?
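
Until arrays of objects are supported, one possible workaround is to hand the Data Viewer a list of plain strings instead of the path objects, which matches the report that fake_files_list opens while new_files does not. The following is a minimal sketch, not the extension's own fix; the helper name to_viewable and the sample paths are hypothetical and used only for illustration.

```python
import pathlib

# Types assumed to be safe to display as-is (an assumption for this sketch,
# not a list taken from the extension).
PRIMITIVES = (str, int, float, bool, type(None))


def to_viewable(items):
    """Return a new list where every non-primitive element is replaced by str(element)."""
    return [item if isinstance(item, PRIMITIVES) else str(item) for item in items]


# Hypothetical sample data: PurePosixPath is used so the output is the same on every OS.
paths = [pathlib.PurePosixPath("data") / f"file_{i}.gz" for i in range(3)]
viewable = to_viewable(paths)
print(viewable)
```

Opening viewable rather than paths in the Data Viewer then follows the same code path as a list of strings. The original objects are left untouched, so they can still be used for file operations.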

@maciejkos
Author

Thanks for looking into it @joyceerhl, @DavidKutu, and @greazer!

@joyceerhl I like how PyCharm is doing it.

At the very least, I would like to see a string representation like below, perhaps with a nicer presentation and formatting.
image

What I would really like is the ability to explore the objects in more detail like here.
image

I have a question. I mentioned in my report that I had experienced the same issue a few times with some pd.DataFrames. Unfortunately, I don't remember which ones. Would you venture a guess as to which DataFrames could be affected? I haven't created any DataFrames with pathlib.WindowsPath objects :)

@maciejkos
Author

maciejkos commented Mar 4, 2021

@greazer Do we still need the "info-needed" label? :)

@joyceerhl
Contributor

Sorry for the delayed response. We've had other reports of DataFrames failing to load, e.g. #4367. If you have an example, that would really help us.

@mbijjur
mbijjur commented Mar 15, 2021

I found the same behavior when I tried to view the transposed DataFrame (the regular dataframe has no issues).
This was working fine until the recent update of VS Code.

Version: 1.54.2 (user setup)
Commit: fd6f3bce6709b121a895d042d343d71f317d74e7
Date: 2021-03-11T00:56:19.848Z
Electron: 11.3.0
Chrome: 87.0.4280.141
Node.js: 12.18.3
V8: 8.7.220.31-electron.0
OS: Windows_NT x64 10.0.19042

@mbijjur
mbijjur commented Mar 15, 2021

Hope this helps. The version below works perfectly with the variable viewer, so something that changed between 1.53.2 and 1.54.2 is breaking the Data Viewer.

Version: 1.53.2 (user setup)
Commit: 622cb03f7e070a9670c94bae1a45d78d7181fbd4
Date: 2021-02-11T11:48:04.245Z
Electron: 11.2.1
Chrome: 87.0.4280.141
Node.js: 12.18.3
V8: 8.7.220.31-electron.0
OS: Windows_NT x64 10.0.19042

@joyceerhl
Contributor

Hi @mbijjur can you share a code sample and the version of the Jupyter extension that you have? Or is it the same version with either build of VS Code?

@mbijjur
mbijjur commented Mar 16, 2021

> Hi @mbijjur can you share a code sample and the version of the Jupyter extension that you have? Or is it the same version with either build of VS Code?

It is the same Jupyter extension version. You can try with any simple pandas DataFrame example.
Column names do not appear when the DataFrame is transposed.

image

@joyceerhl
Contributor

@mbijjur thank you, this is a regression in the Jupyter extension (nothing to do with VS Code core), I know why this is happening. Will submit a fix. Sorry for the trouble.

@joyceerhl
Contributor

Update: we fixed the problem with transposed DataFrames in the April release. The original problem with objects not being displayed would be resolved by #1138 so closing as a duplicate of that. Please feel free to upvote that issue instead.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators May 14, 2021