Explore running notebook extension in dedicated extension host #140374

Closed
rebornix opened this issue Jan 9, 2022 · 3 comments · Fixed by microsoft/vscode-jupyter#9492

@rebornix
Member

rebornix commented Jan 9, 2022

We have been receiving reports of slow cell execution in the Jupyter extension. The underlying problem is similar to what Vim users sometimes run into: the extension host runs all extensions, and any heavy computation can slow down or block the event loop, which in turn slows down the Vim extension. The cell execution workflow in VS Code is:

  1. The user presses the Run button
  2. The execution request is sent to the extension host
  3. The Jupyter extension sends the request to the Jupyter kernel
  4. The Jupyter kernel keeps sending output changes to the Jupyter extension
  5. The Jupyter extension converts the raw outputs to the VS Code model and sends them to the UI

Steps 4 and 5 are heavily affected by other extensions running in the same extension host, so we want to explore whether we can run notebook extensions in their own dedicated extension host.
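The effect on steps 4 and 5 can be sketched with a small analogy. This is not VS Code or Jupyter code: it uses Python's asyncio loop as a stand-in for the extension host's event loop, and the names `relay_outputs`, `blocking_neighbor`, and `measure_max_delay` are made up for illustration. One task handles an "output" every 10 ms, while a neighbor on the same loop occasionally blocks synchronously.

```python
import asyncio
import time


async def relay_outputs(n_outputs: int, interval: float) -> list:
    """Stand-in for steps 4 and 5: handle one output per `interval` seconds
    and record how late each one was actually handled."""
    delays = []
    for _ in range(n_outputs):
        scheduled = time.perf_counter() + interval
        await asyncio.sleep(interval)
        delays.append(time.perf_counter() - scheduled)
    return delays


async def blocking_neighbor(block_for: float, rounds: int) -> None:
    """Stand-in for another extension doing heavy synchronous work."""
    for _ in range(rounds):
        time.sleep(block_for)   # synchronous: nothing else on this loop can run
        await asyncio.sleep(0)  # yield briefly, then block again


async def measure_max_delay(with_neighbor: bool) -> float:
    """Return the worst lateness (in seconds) seen by the relay task."""
    relay = asyncio.create_task(relay_outputs(5, 0.01))
    if with_neighbor:
        await asyncio.create_task(blocking_neighbor(0.05, 5))
    return max(await relay)


if __name__ == "__main__":
    print("quiet loop  :", asyncio.run(measure_max_delay(False)))
    print("blocked loop:", asyncio.run(measure_max_delay(True)))
```

In this sketch the relay task does nothing wrong itself; its outputs are late only because it shares a loop with the blocking task, which is the situation described above.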

@rebornix rebornix added the feature-request Request for new features or functionality label Jan 13, 2022
@rebornix rebornix added this to the January 2022 milestone Jan 13, 2022
@rebornix
Member Author

After analyzing the performance issues reported in the Jupyter repo (https://github.com/microsoft/vscode-jupyter/issues?page=2&q=is%3Aissue+label%3Aperf), we found that multiple factors affect cell execution performance:

  1. Starving extension host. As long as there is a blocking call in the extension host, other extensions, such as the notebook extension, are blocked.
  2. Remote extension host latency/throughput (e.g. "Huge latency executing all comments since update", vscode-jupyter#7673 (comment)).
  3. Data transfer overhead. We need to transfer the output data from the raw kernel IO to the UI, and we pay the overhead of encoding and decoding that data.
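To make the third factor concrete, here is a minimal sketch of one encode/decode round trip. The issue does not say which encoding is used between the kernel, the extension host, and the UI, so the base64-inside-JSON message below is purely an assumption chosen for illustration, and `encode_output` and `decode_output` are hypothetical helpers.

```python
import base64
import json


def encode_output(raw: bytes) -> str:
    """Wrap raw output bytes in a JSON text message (assumed wire format)."""
    payload = base64.b64encode(raw).decode("ascii")
    return json.dumps({"mime": "application/octet-stream", "data": payload})


def decode_output(message: str) -> bytes:
    """Reverse of encode_output: parse the JSON and decode the base64 payload."""
    return base64.b64decode(json.loads(message)["data"])


raw = bytes(range(256)) * 12      # 3072 bytes of sample "kernel output"
message = encode_output(raw)

if __name__ == "__main__":
    print("raw bytes    :", len(raw))
    print("message chars:", len(message))
    print("round trip ok:", decode_output(message) == raw)
```

Under this assumed format every output is copied and re-encoded at least once in each direction, and base64 alone grows the payload by about a third, before any time spent in the encoder and decoder.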

One of our hypotheses is that running the notebook extension in a separate extension host can help with the first factor. We have not received concrete bug reports for it yet, but we can easily write mock test suites to reproduce it (say, a mock extension that constantly blocks the event loop). While we continue to investigate which factor contributes most to the perf issue, we can also start brainstorming what we need to tackle to run the notebook extension in a separate extension host. Here are some rough ideas:

  • Moving the notebook kernel extension to a different extension host would break extension dependencies. For example, the Jupyter extension relies on the Python extension for Python interpreter detection, which it looks up through vscode.extensions. The extension API doesn't work across multiple extension hosts.
    • We can generate a dependency graph of notebook-related extensions and move them all to one extension host.
    • We need to ensure that there is only one running instance of any given extension.
    • We can probably support the extension API across extension hosts (by proxying the API).

@rebornix
Member Author

We brainstormed how we can mitigate the starving extension host issue last week and great ideas came out of the discussions, including but not limited to:

  • Allow notebook extensions to create controllers that live in a web worker with a dedicated IPC channel to the UI process. The web worker can talk directly to the Jupyter kernel, transform the data to VS Code internals, and talk directly to the UI.
  • Let users specify a set of extensions that run in dedicated extension hosts.

@rebornix
Member Author

rebornix commented Feb 22, 2022

Some updates from explorations we did this month:

First, the starving extension host problem can be easily reproduced with a misbehaving extension that blocks the event loop:

  • Run a notebook cell such as:

for x in range(10000):
    print(x)

  • Run the command "Start Throttling"
  • Run the cells again; the execution takes much longer than expected

On the idea from the earlier comment: "Allow notebook extension to create controllers living in web worker with a dedicated IPC comm to the UI process. The web worker can talk directly to the Jupyter kernel and transform data to VS Code internals and talk directly to the UI."

This actually won't work, because threads in Node.js still run on the same event loop: if the event loop is blocked, we won't be able to handle other inputs.

Running notebook extensions in a separate extension host can solve this problem (experiment code is tracked in https://github.com/microsoft/vscode/tree/rebornix/notebook-process), but it leaves open questions about extension dependencies.

@rebornix rebornix modified the milestones: February 2022, March 2022 Feb 23, 2022
@rebornix rebornix removed the feature-request Request for new features or functionality label Mar 22, 2022
@github-actions github-actions bot locked and limited conversation to collaborators May 6, 2022