WT should start up fast: profile the startup path and trim anything that takes a while #5907

Open
2 of 3 tasks
Tracked by #13392
ghost opened this issue May 14, 2020 · 42 comments
Labels
Area-Performance Performance-related issue Help Wanted We encourage anyone to jump in on these. Issue-Task It's a feature request, but it doesn't really need a major design. Product-Terminal The new Windows Terminal.

Comments

@ghost

ghost commented May 14, 2020

Steps to reproduce

  1. Click to launch Windows Terminal

Expected behavior

Windows Terminal should be ready instantaneously, like the Windows console or Sublime Text.
Windows Terminal can't be slower than the Windows console.

While Windows Terminal is fast compared with other tools like Visual Studio or iTunes, it is still not fast enough for a terminal application.

Actual behavior

It takes too long to start up. It is not ready instantaneously. It's not as fast as the Windows console.


maintainer note: hijacking OP for task list:

  • Use exactly one ColorPickerFlyout for all tabs, and just redirect it to whatever tab activated it.
  • Delayload the ColorPickerFlyout.
  • Add a setting to disable ALL fragments / dynamic profiles. The app catalog search is expensive.
    • Maybe we can just skip loading dynamic profiles on initial launch, unless we discover that the defaultProfile is a dynamic one... Nah, because what about defterm launches that end up matching a dynamic profile?
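
For illustration, the first two items on the list (one shared flyout, created only when needed) could be combined roughly like this. It is only a sketch in plain C++: `ColorPicker` and `SharedColorPicker` are hypothetical stand-ins, not the real XAML control or any Terminal class.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for the flyout; the real control is a XAML element.
struct ColorPicker
{
    std::string targetTab; // tab the picker currently applies to
};

// Owns at most one picker for the whole window and creates it on first use.
class SharedColorPicker
{
public:
    // Returns the single picker, creating it lazily, and retargets it at `tab`.
    ColorPicker& ShowFor(std::string tab)
    {
        if (!_picker)
        {
            _picker = std::make_unique<ColorPicker>(); // construction cost is paid here, once
            ++_created;
        }
        assert(_picker);
        _picker->targetTab = std::move(tab);
        return *_picker;
    }

    // Number of pickers ever constructed (0 until the first ShowFor call).
    int CreatedCount() const { return _created; }

private:
    std::unique_ptr<ColorPicker> _picker;
    int _created = 0;
};
```

With this shape, launching and opening tabs constructs nothing; the first tab that asks for the picker pays for it, and every later tab reuses the same instance.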
@ghost ghost added Needs-Triage It's a new issue that the core contributor team needs to triage at the next triage meeting Needs-Tag-Fix Doesn't match tag requirements labels May 14, 2020
@zadjii-msft
Member

zadjii-msft commented May 14, 2020

I mean, the Terminal is doing a lot more than the console ever was. I'm not sure there's much more we can do to optimize our UI setup. Conhost was using basically the simplest Win32/GDI interface possible, and the Terminal needs to stand up a XAML stack. Even if we somehow had a server process that already had the settings pre-loaded, we'd still need to stand up the UI stack.

At least the Terminal is faster at processing output than the console ever was, and opening new tabs/panes is certainly faster than opening a new conhost is.

Maybe there's something we can do here to optimize the creation of the XAML stack.


7/21/2022 edit: putting this here so it doesn't ping everyone on this thread.

While investigating another issue:

[image]

  • Parsing the JSON is the green column.
  • Looking for fragments is a full 50% of the settings load cost.
  • Azure Cloud Shell is another 20%. (Is just creating the IAzureConnectionStatics really that expensive? That's crazy. Hard to be sure, I don't have symbols working 😕)
  • In this trace:
    • TryLoadSettings is 44/284 of the AppHost ctor
    • constructing the XAML resources is 62/284 of the AppHost ctor, out of the 139 for just instantiating App
    • instantiating the CmdPal looks like another 30/284
    • Outside of the AppHost ctor:
      • opening a new tab is about 53 (_CreateNewTabFromPane); creating the color picker flyout is 15 of that.
      • [image]

@zadjii-msft zadjii-msft added Area-Performance Performance-related issue Issue-Task It's a feature request, but it doesn't really need a major design. Product-Terminal The new Windows Terminal. labels May 14, 2020
@ghost ghost removed the Needs-Tag-Fix Doesn't match tag requirements label May 14, 2020
@zadjii-msft zadjii-msft added this to the Terminal Backlog milestone May 14, 2020
@DHowett-MSFT DHowett-MSFT changed the title Windows Terminal takes longer than windows console and longer than other apps that need to load instantaneously WT should start up fast: profile the startup path and trim anything that takes a while May 15, 2020
@DHowett-MSFT DHowett-MSFT removed the Needs-Triage It's a new issue that the core contributor team needs to triage at the next triage meeting label May 15, 2020
DHowett pushed a commit that referenced this issue Aug 31, 2022
This commit stores a hash of the `settings.json` file in `ApplicationState`
with which we can detect whether the settings contents actually changed.
Since I only use a small 64-bit hash as opposed to SHA2 for instance,
I'm taking the last write time of the file into account as well.
This allows us to skip calling `UpdateJumplist` on at least the majority of app
launches, which hopefully improves launch performance on devices with slower IO.

Part of #5907.
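
A rough sketch of the change-detection idea described above, in plain C++. The commit message doesn't say which 64-bit hash is used, so FNV-1a is an assumption here, and `SettingsFingerprint`, `MakeFingerprint` and `NeedsJumplistUpdate` are made-up names rather than Terminal APIs.

```cpp
#include <cstdint>
#include <string_view>

// FNV-1a: a small, fast, non-cryptographic 64-bit hash (assumed for illustration).
constexpr std::uint64_t Fnv1a64(std::string_view data)
{
    std::uint64_t hash = 14695981039346656037ull; // FNV offset basis
    for (const char c : data)
    {
        hash ^= static_cast<unsigned char>(c);
        hash *= 1099511628211ull; // FNV prime
    }
    return hash;
}

// What gets persisted between launches: the content hash plus the file's
// last write time, which guards against the (rare) 64-bit hash collision.
struct SettingsFingerprint
{
    std::uint64_t contentHash = 0;
    std::int64_t lastWriteTime = 0; // opaque timestamp, e.g. file time ticks
};

constexpr SettingsFingerprint MakeFingerprint(std::string_view contents, std::int64_t lastWriteTime)
{
    return SettingsFingerprint{ Fnv1a64(contents), lastWriteTime };
}

// True if the expensive follow-up work (such as rebuilding the jump list) is needed.
constexpr bool NeedsJumplistUpdate(const SettingsFingerprint& stored, const SettingsFingerprint& current)
{
    return stored.contentHash != current.contentHash || stored.lastWriteTime != current.lastWriteTime;
}
```

On launch, the caller would compare the stored fingerprint with one computed from the file on disk and only do the slow work when they differ.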

## Validation Steps Performed


* Delete some profiles (see above), save settings, tasks are gone ✅
  FYI For some (...) inexplicable reason, shell task lists are preserved forever
  even if msix applications are uninstalled, etc. So to test whether tasks are
  properly written on first app launch we have to delete some profiles/tasks
  first, otherwise we can't see whether they're actually written later.
* Now exit Windows Terminal, delete `settings.json` and relaunch
* All tasks are back ✅
* With a debugger, ensure that `CascadiaSettings::WriteSettingsToDisk`
  generates the same hash that `LoadAll` reads. ✅
ghost pushed a commit that referenced this issue Sep 9, 2022
This commit reduces the amount of telemetry during general usage by about half.
8 events that weren't really used anymore were removed.
1 new event was added ("AppInitialized") which will help us investigate #5907.
During review 9 events were found that were incorrectly tagged as perf. data.

## Validation Steps Performed
* Launch Windows Terminal
* The "latency" field "AppInitialized" matches the approx. launch time ✅
DHowett pushed a commit that referenced this issue Sep 9, 2022
(cherry picked from commit 37c159a)
Service-Card-Id: 85548958
Service-Version: 1.15
@DHowett
Member

DHowett commented Sep 12, 2022

Discussion idea:

  • Font metrics do not change between machines. We can build a compiled-in cache of font cell sizes at different point sizes for the five most popular fonts (Cascadia Code/Mono, Consolas, Lucida Console, Various types of Fira?) and short circuit the calculation. That will stop us from creating a font collection on every startup.
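
To make the idea concrete, a compiled-in cache could look something like the sketch below. Everything in it is hypothetical: the type names, the `ExampleMono` face, and especially the numbers, which are placeholders and not measured metrics for any real font.

```cpp
#include <array>
#include <optional>
#include <string_view>

struct CellSize
{
    int width;
    int height;
};

struct CachedMetric
{
    std::string_view face;
    int pointSize;
    CellSize cell;
};

// Placeholder entries only: these values are invented for illustration.
constexpr std::array<CachedMetric, 2> kCompiledMetrics{ {
    { "ExampleMono", 12, { 8, 16 } },
    { "ExampleMono", 14, { 9, 19 } },
} };

// Returns the cached cell size, or nullopt if the face/size pair is not in
// the table, in which case the caller falls back to the normal measurement.
constexpr std::optional<CellSize> LookupCell(std::string_view face, int pointSize)
{
    for (const auto& entry : kCompiledMetrics)
    {
        if (entry.face == face && entry.pointSize == pointSize)
        {
            return entry.cell;
        }
    }
    return std::nullopt;
}
```

A cache miss costs one short linear scan, so the fallback path is no slower than it is today; only hits skip creating the font collection.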

@vadimkantorov

#15001 describes a related use case: Windows autoruns that launch quick `cmd.exe some_script.cmd` invocations which print nothing and need no user input. This spins up many Terminal instances, and it's quite slow. What's special about this use case is that if the script completes quickly, doing the full XAML load was wasted work. So if full rendering could be delayed, and then skipped entirely when the process exits within 500ms, it would be a big win.
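
A minimal sketch of that "wait briefly before building the UI" idea, in portable C++. `LaunchDecision` and `DecideAfterGrace` are hypothetical names, and the polling loop is only a stand-in: a real implementation would presumably wait on the child process handle instead.

```cpp
#include <chrono>
#include <thread>

enum class LaunchDecision
{
    SkipUi,  // the client exited within the grace period
    BuildUi, // the client is still running, so the window is worth creating
};

// Polls `hasExited` until it returns true or `grace` elapses.
// `hasExited` is any callable returning bool.
template<typename HasExited>
LaunchDecision DecideAfterGrace(HasExited hasExited,
                                std::chrono::milliseconds grace,
                                std::chrono::milliseconds pollInterval = std::chrono::milliseconds(5))
{
    const auto deadline = std::chrono::steady_clock::now() + grace;
    while (std::chrono::steady_clock::now() < deadline)
    {
        if (hasExited())
        {
            return LaunchDecision::SkipUi;
        }
        std::this_thread::sleep_for(pollInterval);
    }
    // One last check so an exit right at the deadline still counts.
    return hasExited() ? LaunchDecision::SkipUi : LaunchDecision::BuildUi;
}
```

The trade-off is that interactive launches would show their window up to one grace period later, so the delay would likely only make sense for launches that did not come from the user.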

@zadjii-msft zadjii-msft modified the milestones: Terminal v1.18, Up Next Apr 4, 2023
microsoft-github-policy-service bot pushed a commit that referenced this issue Apr 18, 2023
This sets `x:Load` to `false` for the two elements.
On my system, with Windows Defender disabled, this reduces CPU
usage by 15ms and the visual delay during launch by 40ms.

Part of #5907

## PR Checklist
* Ctrl+Shift+P opens command palette ✅
* Context menu opens command palette ✅
* Context menu opens about dialog ✅
DHowett pushed a commit that referenced this issue Apr 19, 2023
DHowett pushed a commit that referenced this issue Apr 20, 2023
This fixes 3 sources for animations:
* `TabView`'s `EntranceThemeTransition` causes tabs to slowly slide in
  from the bottom. Removing the transition requires you to override the
  entire list of transitions obviously, which is a global change. Nice.
  Am I glad I don't need to deal with the complexity of CSS. /s
* `TabBase`, `SettingsTab` and `TerminalTab` were using a lot of
  coroutines with `resume_foreground` even though almost none of the
  functions are called from background tabs in the first place. This
  caused us to miss the initial XAML drawing pass, which resulted in
  animations when the tab icons would asynchronously pop into existence.
  It also appears as if `resume_foreground`, etc. have a very high CPU
  cost attached, which surprises me absolutely not at all given WinRT.

The improvement is difficult to quantify because the run to run
variation is very high. But it seems like this shaves about 10% off
of the ~500ms startup delay on my PC depending on how you measure it.

Part of #5907

## PR Checklist
* It starts when it should ✅
* It doesn't "exit" when it shouldn't ✅
  (Scrolling, Settings reload, Bell `\a`, Progress `\e]9;4;2;80\e\\`)
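
To picture why an unconditional hop to the UI thread can miss the first drawing pass, here is a sketch in standard C++ (not C++/WinRT, and not what the commit itself does). `UiDispatcher` is a hypothetical class: work runs inline when the caller is already on the UI thread and is only queued otherwise.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

class UiDispatcher
{
public:
    // By default, the constructing thread is treated as the UI thread.
    explicit UiDispatcher(std::thread::id uiThread = std::this_thread::get_id()) :
        _uiThread(uiThread) {}

    bool HasThreadAccess() const { return std::this_thread::get_id() == _uiThread; }

    // Runs `work` immediately if already on the UI thread; otherwise queues it.
    // Returns true if the work ran synchronously.
    bool RunOrQueue(std::function<void()> work)
    {
        assert(work);
        if (HasThreadAccess())
        {
            work(); // no hop: the result is visible before the next draw
            return true;
        }
        std::lock_guard<std::mutex> lock(_mutex);
        _pending.push(std::move(work));
        return false;
    }

    // Meant to be called from the UI thread's loop; returns how many items ran.
    std::size_t Drain()
    {
        std::queue<std::function<void()>> batch;
        {
            std::lock_guard<std::mutex> lock(_mutex);
            std::swap(batch, _pending);
        }
        const std::size_t count = batch.size();
        while (!batch.empty())
        {
            batch.front()();
            batch.pop();
        }
        return count;
    }

private:
    std::thread::id _uiThread;
    std::mutex _mutex;
    std::queue<std::function<void()>> _pending;
};
```

Queued work only takes effect once the loop drains it, which is the moment a tab icon would "pop into existence" after the initial frame.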
@microsoft microsoft deleted a comment from LinuxOnTheDesktop Apr 24, 2023
@lhecker
Member

lhecker commented Apr 24, 2023

With the recent barrage of improvements, I've made a new perf trace today. This time using Nvidia Nsight Systems, because it has a neat way to represent delays (zoom in as needed):

[image: trace]

Of the remaining ~320ms launch cost of Windows Terminal (previously ~400ms), about 240ms (75%, previously 60%) are due to WinUI and XAML. There are some things we can do about that, but it'll be very difficult, because WinUI isn't exactly easy to manipulate into being lean. For instance, the C++ XAML generator has a bug where it doesn't emit metadata for system types into our metadata cache. When WinUI then starts, it tries to look up those system types, can't find them, and searches all registered user providers instead. This causes Microsoft.Terminal.Settings.Model.dll and everything else to be loaded, which takes ~10% (maybe more). Preventing this isn't easy, because creating 2 metadata caches (one for WT and one for the settings model) isn't documented and probably not supported. Most of the time is spent in the layout and rendering code [1] though, and that's an area we can't improve upon.

Another 80ms (20%) are caused by our workaround for #11648, a bug which unfortunately still isn't fixed in Windows. I've been thinking about just adding a setting that, if enabled, loads the nearby fonts. I don't think caching the font size is a good idea because that would only improve launch time by 10% instead of the expected 20%. It would worsen the user experience for some but improve it for others. This is what I'd like to fix asap. It's the easiest improvement we can make at this point. Edit: Fixed.

The remaining 80ms (20%) will be difficult to fix. For instance, HWND creation costs 20ms and we need at least 2 (main thread + 1 for each window). Setting up the Monarch COM server and negotiating that costs another 5-10ms, by nature of COM setup being slow. We allow loading fragments via the app extension catalog, which is an LPC and extremely slow (10ms for returning an empty list). That's almost the entire cost already.

Footnotes

  1. FizzBuzzEnterpriseEdition

DHowett pushed a commit that referenced this issue Apr 25, 2023
(cherry picked from commit 35b9e75)
Service-Card-Id: 89001994
Service-Version: 1.17
microsoft-github-policy-service bot pushed a commit that referenced this issue May 3, 2023
`IDWriteFontSetBuilder` is super expensive (~40ms of CPU for building a
single font set on a high-end CPU from ~2021). Let's avoid the cost,
by only constructing it if Cascadia Code is actually missing.
To not overcomplicate the code and to support any additional fonts we
might ship in the future, I'm not checking for the font name, and
instead I just construct the font set whenever any font is missing.

Part of #5907

## Validation Steps Performed
* Breakpoints in FontCache aren't hit ✅
* App doesn't crash ✅
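
The "only build it when something is missing" pattern from the commit above, sketched in standard C++. How the real code detects a missing font isn't described here, so the name lookup below is an assumption, and `FontSet` and `BuildCustomSetIfNeeded` are hypothetical names rather than DirectWrite or Terminal APIs.

```cpp
#include <algorithm>
#include <optional>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for the custom font set that is expensive to build.
struct FontSet
{
    std::vector<std::string> faces;
};

// Invokes `buildExpensiveSet` only if at least one bundled face is not
// installed; otherwise returns nullopt and the system collection is used as-is.
template<typename BuildFn>
std::optional<FontSet> BuildCustomSetIfNeeded(const std::set<std::string>& installed,
                                              const std::vector<std::string>& bundled,
                                              BuildFn buildExpensiveSet)
{
    const bool anyMissing = std::any_of(bundled.begin(), bundled.end(), [&](const std::string& face) {
        return installed.count(face) == 0;
    });
    if (!anyMissing)
    {
        return std::nullopt; // common case: nothing to build, nothing to pay
    }
    return buildExpensiveSet();
}
```

Checking for any missing face, rather than one specific name, keeps the check valid if more fonts are bundled later.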
@vadimkantorov

vadimkantorov commented May 5, 2023

As Terminal users, we can upvote specific issues/bugs that are keeping Terminal from being faster on the XAML GitHub (if it has a GitHub repo), as long as you link them :) The same goes for any linked Feedback items about general appx slowness that you think are slowing Terminal down.

Also, earlier I noted that my Terminal loads a ton of unrelated DLLs, including something related to Maps controls.

@eduarddejong

If I may add something functional here: I can imagine that a possible solution would be to differentiate between 2 different ways of starting the Terminal:

  1. Opening the application explicitly as a user, whether via a shortcut or the right-click menu in a folder.
  2. Automatic launches triggered by console applications when Terminal is configured as the default terminal in Windows, which would otherwise launch the old conhost.exe console window.

When launching the first way, I love features, and startup time does not matter that much.

When launching the second way, the application should be absolutely blazingly fast and load no more resources than it needs, because these launches can happen many times in quick succession, for example when running any kind of automated batch operation.

I don't know if this is possible, but it might be helpful as an idea.

@carlos-zamora carlos-zamora modified the milestones: Up Next, Terminal v1.23 Aug 1, 2024