[🐛 Bug]: Race condition in ruby library for capybara system tests #14454

krschacht · 2024-08-28T22:11:08Z

What happened?

I've been successfully using Capybara in Rails for many months. But about a month ago, my system tests started sporadically failing in my GitHub Actions CI with "Net::ReadTimeout with #<TCPSocket:(closed)>". If I re-run the test suite a few times, it eventually passes. I've tried many workarounds, but none of them resolve the issue. I've also tried rolling my repo back to a state from months ago, when tests were consistently passing, and that doesn't fix it either.

We've spent many hours investigating the cause and we currently think there is a race condition somewhere between chromedriver and selenium. My project is an open source project so here is a direct link to one of the failed CI runs where you can see the full stack trace: https://github.com/AllYourBot/hostedgpt/actions/runs/10533347868/job/29189182499?pr=498

The Net::ReadTimeout is coming from capybara (aka selenium) failing to hit chromedriver when attempting to set up the server. One of my engineers has outlined his read of that stack trace:

- I think the tests run (and fail) before puma is started by capybara.
- The test hung because the server was still running and ruby wouldn't exit.
- It says the TCP socket was closed. Does this mean the socket was open when it started but closed during the exchange? Or that it was never open? I suspect the former because the stack trace is in the middle of a read loop.
- The failure is in the area of code which causes chromedriver to build a new session (i.e., start chrome up):

Also, another thing that suggests a race condition is that when we SSH into the job mid-run, it sometimes fails or hangs for a bit. But if I interrupt the process (^c) and then re-run it, it goes fine.
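
Since a second attempt usually goes through, one stopgap is to retry the step that raises `Net::ReadTimeout`. Below is a minimal sketch, assuming a hypothetical helper named `with_read_timeout_retry` (it is not part of Capybara or Selenium). It only masks the hang and does not address whatever causes it:

```ruby
require "net/http" # loads net/protocol, which defines Net::ReadTimeout

# Hypothetical helper: runs the block and retries it, up to `attempts` times
# in total, when Net::ReadTimeout is raised. Any other error propagates
# immediately, and the last Net::ReadTimeout is re-raised once attempts run out.
def with_read_timeout_retry(attempts: 3, wait: 0)
  tries = 0
  begin
    tries += 1
    yield tries
  rescue Net::ReadTimeout
    raise if tries >= attempts
    sleep(wait) # optional pause before the next attempt
    retry
  end
end
```

In a system test this could wrap whichever call first triggers browser startup, for example `with_read_timeout_retry { visit some_path }`, where `some_path` is a placeholder.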

Capybara Version: 3.39.2
Driver Information (and browser if relevant): selenium-webdriver (4.23.0) using headless chrome

How can we reproduce the issue?

1. On github you can [fork this repo](https://github.com/AllYourBot/hostedgpt)
2. I've configured the GitHub CI Actions to **not** run system tests on forks, so (a) [delete this line](https://github.com/AllYourBot/hostedgpt/blob/main/.github/workflows/rubyonrails.yml#L49) to remove the short circuit, and (b) change the very next "runs-on" line back to `ubuntu-latest`, which is the default GitHub Actions runner.
3. Push a change to the repo to trigger Github CI to run

Relevant log output

You can see the full stack trace: https://github.com/AllYourBot/hostedgpt/actions/runs/10533347868/job/29189182499?pr=498

Operating System

Alpine Linux

Selenium version

4.23.0 of selenium-webdriver gem

What are the browser(s) and version(s) where you see this issue?

Chrome

What are the browser driver(s) and version(s) where you see this issue?

ChromeDriver but not sure how to get version, latest, I think
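
For anyone else unsure of their driver version: running `chromedriver --version` prints it. A small sketch for pulling the version number out of that output, assuming a hypothetical helper named `extract_version` and an illustrative sample string rather than a real log:

```ruby
# Hypothetical helper: extracts the four-part dotted version from the output of
# `chromedriver --version` (or `google-chrome --version`).
# Returns nil when no such version string is present.
def extract_version(output)
  output[/\d+(?:\.\d+){3}/]
end

# Illustrative output only; the text in parentheses is a placeholder.
sample = "ChromeDriver 126.0.6478.61 (sample-revision)"
puts extract_version(sample)

# On a machine with chromedriver on the PATH, this could be called as:
#   extract_version(`chromedriver --version`)
```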

Are you using Selenium Grid?

No


github-actions · 2024-08-28T22:11:25Z

@krschacht, thank you for creating this issue. We will troubleshoot it as soon as we can.

Info for maintainers

Triage this issue by using labels.

If information is missing, add a helpful comment and then I-issue-template label.

If the issue is a question, add the I-question label.

If the issue is valid but there is no time to troubleshoot it, consider adding the help wanted label.

If the issue requires changes or fixes from an external project (e.g., ChromeDriver, GeckoDriver, MSEdgeDriver, W3C), add the applicable G-* label, and it will provide the correct link and auto-close the issue.

After troubleshooting the issue, please add the R-awaiting answer label.

Thank you!

AnrichVS · 2024-09-03T13:38:06Z

Hi,

I recently started experiencing exactly what @krschacht describes.

I believe it might be related to the Chrome version. On my host OS (Arch) the issue occurs, and I'm running:

google-chrome 128.0.6613.84-1

Within a Docker container, with exactly the same code base (mounted from host OS), the issue doesn't occur. It is running:

chromium-112.0.5615.165-r0

I suspect this is related to the Chrome version since it only started happening on my host OS recently after having done a full system upgrade (which also upgraded Chrome).

sickdyd · 2024-09-05T15:12:18Z

We are facing the same problem. Our specs had been working fine for years, and about a month ago they started raising Net::ReadTimeout errors. I tried literally everything I could think of and searched everywhere online, but nothing seems to fix the problem.

sickdyd · 2024-09-10T10:06:48Z

I can confirm that the most recent versions of Chrome seem to be the root cause.

I could solve the problem by using version 126.0.6478.61 for both Chrome and chromedriver.

Not a permanent solution, but for the time being it is better than having specs constantly failing.

Note that the Chrome installer download link requires appending -1 to the version.

# CHROME_DRIVER_VERSION=126.0.6478.61

- name: Install Chrome
  run: |
    # Download specific Chrome version
    wget https://dl.google.com/linux/chrome/deb/pool/main/g/google-chrome-stable/google-chrome-stable_${CHROME_DRIVER_VERSION}-1_amd64.deb
    # Install Chrome
    sudo apt-get install -y --allow-downgrades ./google-chrome-stable_${CHROME_DRIVER_VERSION}-1_amd64.deb

- name: Install ChromeDriver
  run: |
    wget "https://storage.googleapis.com/chrome-for-testing-public/${CHROME_DRIVER_VERSION}/linux64/chromedriver-linux64.zip"
    unzip chromedriver-linux64.zip
    sudo mv chromedriver-linux64/chromedriver /usr/local/bin/
    rm chromedriver-linux64.zip
    rm -rf chromedriver-linux64
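
The two download URLs in the steps above differ only in how the version is embedded, including the `-1` suffix on the Chrome package. A sketch that builds both from one version string, assuming a hypothetical helper named `chrome_download_urls`; the URL patterns are copied from the workflow above and are not verified for other versions:

```ruby
# Hypothetical helper: builds the Chrome .deb and ChromeDriver zip URLs for a
# given version, using the same URL patterns as the workflow steps above.
def chrome_download_urls(version)
  {
    # The Chrome package name carries a "-1" suffix after the version.
    chrome_deb: "https://dl.google.com/linux/chrome/deb/pool/main/g/google-chrome-stable/" \
                "google-chrome-stable_#{version}-1_amd64.deb",
    # The ChromeDriver URL uses the bare version as a path segment.
    chromedriver_zip: "https://storage.googleapis.com/chrome-for-testing-public/" \
                      "#{version}/linux64/chromedriver-linux64.zip"
  }
end

urls = chrome_download_urls("126.0.6478.61")
puts urls[:chrome_deb]
puts urls[:chromedriver_zip]
```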

tvdeyen · 2024-09-10T11:58:40Z

We experience the same issues. We pinned the Chrome version to 127 with this setup:

 Capybara.register_driver :selenium_chrome_headless do |app|
   options = ::Selenium::WebDriver::Chrome::Options.new.tap do |opts|
     opts.add_argument("--headless")
     opts.add_argument("--disable-gpu") if Gem.win_platform?
     # Workaround https://bugs.chromium.org/p/chromedriver/issues/detail?id=2650&q=load&sort=-id&colspec=ID%20Status%20Pri%20Owner%20Summary
     opts.add_argument("--disable-site-isolation-trials")
     opts.add_argument("--window-size=1920,1080")
     opts.add_argument("--disable-search-engine-choice-screen")
+    opts.browser_version = "127"
   end
 
   Capybara::Selenium::Driver.new(app, browser: :chrome, options: options)
 end

and all tests run fine.

But it fails with Chrome 128:

 Capybara.register_driver :selenium_chrome_headless do |app|
   options = ::Selenium::WebDriver::Chrome::Options.new.tap do |opts|
     opts.add_argument("--headless")
     opts.add_argument("--disable-gpu") if Gem.win_platform?
     # Workaround https://bugs.chromium.org/p/chromedriver/issues/detail?id=2650&q=load&sort=-id&colspec=ID%20Status%20Pri%20Owner%20Summary
     opts.add_argument("--disable-site-isolation-trials")
     opts.add_argument("--window-size=1920,1080")
     opts.add_argument("--disable-search-engine-choice-screen")
-    opts.browser_version = "127"
+    opts.browser_version = "128"
   end
 
   Capybara::Selenium::Driver.new(app, browser: :chrome, options: options)
 end

glaszig · 2024-09-13T01:00:20Z

Experiencing the same for the past 1 or 2 months, but I'm using Firefox.

ehutzelman · 2024-09-17T23:12:03Z

Been seeing issues in system tests getting locked up since Chrome 128. Just updated to Chrome 129 and unfortunately still see the same issues. Looks like turning off headless allows the tests to run as expected, but not a great fix.
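
To compare headless against non-headless runs, or Chrome 127 against 128, without editing the driver block each time, the settings could be read from environment variables. A minimal sketch, assuming a hypothetical helper named `chrome_settings` and assumed variable names `CHROME_VERSION` and `HEADLESS` (neither is read by Selenium itself):

```ruby
# Hypothetical helper: derives Chrome settings from an environment-like hash.
# CHROME_VERSION and HEADLESS are assumed names chosen for this sketch.
def chrome_settings(env = ENV)
  args = ["--window-size=1920,1080", "--disable-search-engine-choice-screen"]
  # Headless stays on unless HEADLESS=false is set explicitly.
  args.unshift("--headless") unless env["HEADLESS"] == "false"
  { browser_version: env.fetch("CHROME_VERSION", "127"), arguments: args }
end

settings = chrome_settings({ "CHROME_VERSION" => "128", "HEADLESS" => "false" })
puts settings[:browser_version]
puts settings[:arguments].inspect
```

Inside a `Capybara.register_driver` block like the one posted earlier in this thread, the result could be applied with `settings[:arguments].each { |arg| opts.add_argument(arg) }` and `opts.browser_version = settings[:browser_version]`.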

krschacht added I-defect needs-triaging labels Aug 28, 2024

krschacht mentioned this issue Aug 28, 2024

Race condition between chromedriver and selenium with a good stack trace indicating it teamcapybara/capybara#2770

Open
