Build hangs when test spawns processes which do not terminate properly #1183

ValentijnMakkenze · 2022-05-11T12:26:36Z

Hi,

After upgrading NUnit and NUnit.ConsoleRunner our build hangs and errors out after the maximum time has exceeded

We came from
NUnit version=3.13.2
NUnit.ConsoleRunner version=3.12.0

And went to
NUnit version=3.13.3
NUnit.ConsoleRunner version=3.15.0

The main difference is that the project that hangs runs in ApartmentState.STA as it uses some proprietary COM objects.
The projects that don't run in ApartmentState.STA seem to work fine.

The last line in our build log is
Results (nunit3) saved as

We are not seeing the committing line, so I guess the nunit-agent.exe process is not terminating correctly
Committing...

Due to the proprietary COM objects I am unable to provide a small repro.

I did some more experiments and it turns out to be the NUnit.ConsoleRunner upgrade that was the culprit.
I started downgrading it and when I got back to 3.12.0 the test didn't hang anymore, so from 3.13.0 and onwards it seems broken.

We are running through OpenCover and I added --trace=Debug and ran once with 3.12.0 and once with 3.13.0

"C:\Git\Neuron Stelsel Registratie\packages\OpenCover.4.7.1221\tools\OpenCover.Console.exe" -register:administrator "-target:C:\Git\Neuron Stelsel Registratie\packages\NUnit.ConsoleRunner.3.12.0\tools\nunit3-console.exe" "-targetargs:""C:\Git\NEURON~1\\Vicrea.GVR.UI.Tests\bin\Debug\Vicrea.GVR.UI.Tests.dll"" --trace=Debug --result=""C:\Git\NEURON~1\/NUnitVicrea.GVR.UI.Tests.xml""" -output:C:\Git\NEURON~1\\Vicrea.GVR.UI.Tests.xml -filter:+[Vicrea.GVR.UI]* -excludebyfile:*Reference*;*Generated* -excludebyattribute:System.Diagnostics.CodeAnalysis.ExcludeFromCodeCoverageAttribute;System.Runtime.CompilerServices.CompilerGeneratedAttribute;System.CodeDom.Compiler.GeneratedCodeAttribute -skipautoprops -hideskipped:all -returntargetcode

Logfiles:
InternalTrace.29508-version3.12.log
nunit-agent_35952-version3.12-.log
InternalTrace.15540-version3.12.log
nunit-agent_31012-version3.12.log

Kind regards,

Valentijn

The text was updated successfully, but these errors were encountered:

ValentijnMakkenze · 2022-08-29T16:46:38Z

I accidentally did an upgrade to 3.15.2 and the build hangs again, downgrading back to 3.12 fixed it

tcali101 · 2022-08-30T10:56:46Z

We have the same problem. Nunit-console runner just hangs and never terminates after test execution. Version 3.12 works fine and newer versions hang. It could be related to Selenium tests. We are using Chrome / ChromeDriver and the libraries Selenium.Support and Selenium.WebDriver to execute Selenium tests with Nunit.

CharliePoole · 2022-08-30T15:57:15Z

@tcali101 @ValentijnMakkenze I'm marking this confirmed. Unfortunately, the logs from @ValentijnMakkenze show no errors at all. The tests terminate successfully. Therefore the hang is happening in the console runner itself.

Can either of you verify that this happens when not using OpenCover? @tcali101 Can you provide a simple repro, not involving any special software?

ValentijnMakkenze · 2022-08-30T16:33:55Z

Ok so I ran once with 3.12
"packages\NUnit.ConsoleRunner.3.12.0\tools\nunit3-console.exe" "Vicrea.GVR.UI.Tests\bin\Debug\Vicrea.GVR.UI.Tests.dll" --trace=Debug --result="NUnitVicrea.GVR.UI.Tests.xml"

And once with 3.15.2
"packages\NUnit.ConsoleRunner.3.15.2\tools\nunit3-console.exe" "Vicrea.GVR.UI.Tests\bin\Debug\Vicrea.GVR.UI.Tests.dll" --trace=Debug --result="NUnitVicrea.GVR.UI.Tests.xml"

The first one finished, the second one hangs

I had to kill the nunit3-console.exe process to be able to zip the logs, but then I noticed that there were two nunit-agent.exe processes. So maybe the agent doesn't close correctly and the console now notices that and waits.
3.15.2logs.zip
3.12logs.zip

tcali101 · 2022-08-31T14:47:45Z

It has something to do with spawned processes. Selenium tests spawn processes and I managed to reproduce it with a simplistic test.

Create a .NET Framework Class Library project.
Packages:

  <package id="NUnit" version="3.13.3" targetFramework="net48" />
  <package id="Selenium.Support" version="4.4.0" targetFramework="net48" />
  <package id="Selenium.WebDriver" version="4.4.0" targetFramework="net48" />

In a new class:

        [Test]
        public void TestHang()
        {
            var driver = new ChromeDriver(@"C:\git\chromedriver_win32");
            driver.Url = "https://google.com";
            //driver.Close();
            //driver.Quit();
        }

Adjust the path to the chrome driver location, get it from https://chromedriver.chromium.org

I tried something more simplistic like:

        [Test]
        public void Test()
        {
            Process.Start(@"notepad.exe");
        }

But that does not reproduce it.

CharliePoole · 2022-09-24T15:38:18Z

@ValentijnMakkenze
The logs are pretty much identical between the two runs. In particular, the agent terminates normally in both runs and the console does everything it needs to do and then tries to exit.

As you suggest, I think that the agent is spawning some other processes, which do not close correctly. Neither the agent nor the runner process are able to terminate while those child processes are alive. It's possible that you have some subtle error in your tests, similar to the more obvious error in your example. However, I think that the runner and engine should ultimately deal with this, terminating all the child processes forcibly if they don't actually close.

CharliePoole · 2022-09-26T16:44:06Z

Interesting! My initial assumption was that 3.12 didn't leave the ChromDriver process running, hence no hang. But it turns out that 3.12 also fails to stop the ChromeDriver process. The difference is that the console runner was not bothered by the hanging process in 3.12 and terminated properly. In newer releases it hangs rather than terminating.

tcali101 · 2022-09-26T16:52:12Z

The difference is that the console runner was not bothered by the hanging process in 3.12 and terminated properly. In newer releases it hangs rather than terminating.

I think that is the correct behavior, it should not be bothered by any child processes. It should not try to terminate processes. Just ignore any running children and finish.

CharliePoole · 2022-09-26T21:53:53Z

I agree. The trick is to figure out what changed. Since nothing was done deliberately, it has to be a side effect of some other change.

CharliePoole · 2022-09-30T19:13:32Z

Interesting... it seems to be a side effect of adding the TCP transport back in April 2021.

If the TCP transport is started and any child processes are left running, we continue to wait for a connection. This happens even if we don't use that transport (.NET Framework tests use remoting).

CharliePoole · 2022-10-02T14:11:43Z

After going down a lot of TCP rabbit holes, I realized that the thread listening for TCP connections was a foreground thread. Making it a background thread resolves the problem.

CharliePoole · 2022-11-15T09:17:23Z

This issue has been resolved in version 3.16.0

The release is available on:
GitHub.
NuGet packages are also available NuGet.org and
Chocolatey Packages may be found at Chocolatey.org

michael77777777777 · 2023-06-28T11:53:30Z

Hello @CharliePoole

unfortunately this issue is still alive in 3.16.3.

We have a gitlab-runner Windows service installed on our "Windows11 Pro 22H2" build machine. This service allows up to 4 parallel gitlab-pipelines. In each gitlab-pipeline we call nunit3-console.exe to execute our test assembly.

From time to time we observe that the nunit3-console.exe ist not terminated by its parent process.
Using Microsofts "ProcessExplorer" tool to analyze the stuck nunit3.console.exe process shows that the parent process is not existing anymore. This could be one consequence of your fix. When switching to the "Threads" tab in ProcessExplorer, a Windows internal process lifecycle-check must be triggered as the stuck nunit3.console.exe terminates immediately. So I am pretty sure that the nunit3.console.exe process is not terminated because of a threading issue. Maybe due to open TCP-connections?

Also, similar to #1183 (comment)_, the nunit log shows no error at all. All tests have been executed successfully.

Any help is highly appreciated.

CharliePoole · 2023-06-28T13:33:05Z

Hi @michael77777777777

If you read through the comments, you'll see that this issue was actually about non-terminating agent processes when running in the STA. Failing to terminate correctly is a fairly generic error, so your situation may not actually be the same bug.

I thought I'd just mention that in passing in case it helps resolve the issue. I'm no longer working on this project but others will undoubtedly pick up the issue.

Charlie

michael77777777777 · 2023-06-28T15:54:02Z

Thank you for the remark @CharliePoole. You are right, the original cause is likely not the same.

However I am pretty sure that the observed behaviour of stuck nunit3.console.exe goes in a similar direction: TCP connections/threading

Would it be better to create a FollowUp ticket as this one here is already closed?

CharliePoole · 2023-06-28T16:05:28Z

Yes, a new issue with details is most likely best. Threads in the example I worked were not hung but were merely waiting for the process to terminate. Once the process was created as a background process, that happened and all was well. OTOH thread hanging issue tends to be due to very specific causes.

Bear in mind that the engine does not monitor threads created by your tests. Although that was something we once considered, it would have required a fairly large addition to the engine, probably as a new service.

Backport #1183 to version315: Eliminate hang due to TcpListener thread

CharliePoole added Bug Needs Confirmation labels May 11, 2022

CharliePoole added this to the 3.15.1 milestone Jun 7, 2022

CharliePoole modified the milestones: 3.15.1, 3.15.2 Jun 29, 2022

CharliePoole added High Priority and removed Needs Confirmation labels Aug 30, 2022

CharliePoole modified the milestones: 3.15.3, 3.16 Sep 24, 2022

CharliePoole changed the title ~~Build hangs after upgrading NUnit.ConsoleRunner to version 3.13.0~~ Build hangs when test spawns processes which do not terminate properly Sep 25, 2022

This was referenced Oct 2, 2022

Eliminate hang due to TcpListener thread #1230

Closed

Eliminate hang due to TCP listener thread #1231

Merged

CharliePoole closed this as completed Oct 2, 2022

CharliePoole self-assigned this Oct 4, 2022

CharliePoole added the PortToVersion4 Version 3 change needs to be ported to the version 4 code in main label Oct 31, 2022

CharliePoole mentioned this issue Nov 2, 2022

Port fixes for 3.16 release to the 4.0 development code #1258

Closed

22 tasks

CharliePoole mentioned this issue Jan 6, 2023

Fix hang when test processes fail to terminate correctly #1289

Merged

CharliePoole removed the PortToVersion4 Version 3 change needs to be ported to the version 4 code in main label Jan 6, 2023

Tabea4 mentioned this issue Apr 12, 2023

Nunit agent process is not terminated properly #1334

Open

This was referenced Dec 19, 2023

Stuck between versions #1371

Closed

Backport #1183 to 3.15.3: Eliminate hang due to TcpListener thread #1372

Closed

Backport #1183 to version315: Eliminate hang due to TcpListener thread #1373

Merged

OsirisTerje added a commit that referenced this issue Dec 24, 2023

Merge pull request #1373 from ValentijnMakkenze/issue-1371-2

c52aa3d

Backport #1183 to version315: Eliminate hang due to TcpListener thread

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Build hangs when test spawns processes which do not terminate properly #1183

Build hangs when test spawns processes which do not terminate properly #1183

ValentijnMakkenze commented May 11, 2022 •

edited

Loading

ValentijnMakkenze commented Aug 29, 2022

tcali101 commented Aug 30, 2022

CharliePoole commented Aug 30, 2022

ValentijnMakkenze commented Aug 30, 2022

tcali101 commented Aug 31, 2022

CharliePoole commented Sep 24, 2022

CharliePoole commented Sep 26, 2022

tcali101 commented Sep 26, 2022

CharliePoole commented Sep 26, 2022

CharliePoole commented Sep 30, 2022

CharliePoole commented Oct 2, 2022

CharliePoole commented Nov 15, 2022

michael77777777777 commented Jun 28, 2023

CharliePoole commented Jun 28, 2023

michael77777777777 commented Jun 28, 2023

CharliePoole commented Jun 28, 2023

Build hangs when test spawns processes which do not terminate properly #1183

Build hangs when test spawns processes which do not terminate properly #1183

Comments

ValentijnMakkenze commented May 11, 2022 • edited Loading

ValentijnMakkenze commented Aug 29, 2022

tcali101 commented Aug 30, 2022

CharliePoole commented Aug 30, 2022

ValentijnMakkenze commented Aug 30, 2022

tcali101 commented Aug 31, 2022

CharliePoole commented Sep 24, 2022

CharliePoole commented Sep 26, 2022

tcali101 commented Sep 26, 2022

CharliePoole commented Sep 26, 2022

CharliePoole commented Sep 30, 2022

CharliePoole commented Oct 2, 2022

CharliePoole commented Nov 15, 2022

michael77777777777 commented Jun 28, 2023

CharliePoole commented Jun 28, 2023

michael77777777777 commented Jun 28, 2023

CharliePoole commented Jun 28, 2023

ValentijnMakkenze commented May 11, 2022 •

edited

Loading