Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WebSockets: No timeout associated to Ping-Pong-Frames #24614

Open
ghandmann opened this issue Aug 6, 2020 · 9 comments
Open

WebSockets: No timeout associated to Ping-Pong-Frames #24614

ghandmann opened this issue Aug 6, 2020 · 9 comments
Labels
area-networking Includes servers, yarp, json patch, bedrock, websockets, http client factory, and http abstractions feature-websockets Includes: WebSockets
Milestone

Comments

@ghandmann
Copy link

Describe the bug

The Websocket RFC defines in Section 5.5.2 and 5.5.3 the Ping and Pong frames. ASP.net Core 3.1 is sending a Ping-Frame from the Server to the client in the defined interval from WebSocketOptions.KeepAliveInterval property. So far everything works as expected.

But with a "malicious" client, which is not responding with a mandatory Pong-Frame, the Server just keeps sending Ping-Frames and nothing happens.

After quickly scanning through the repo, i am afraid that there is simply nothing implemented, that checks if the client responds with a pong frame in time. To make this clear: The RFC is not defining what the other side should do, when Pong-Fames are not coming back. But it clearly states that the receiver of a Ping-Frame MUST send a Pong-Frame back. Based on that i was expecting some server-side timeout for the websocket connection (like missing three Pong-Frames terminates the connection).

This behavior currently prevents the server from detecting broken Websocket-Connections where e.G. an intermediate Router has gone away and now WebSocket Close-Frame and/or TCP FIN packet has been transmitted.

To Reproduce

This is not that easily reproducable, since you need a WebSocket-Client which you can "freeze" without sending a Close-Frame to the server. I used Mojo::UserAgent in a perl script to connect a websocket and after a while i paused the process via CTRL+Z on the command line.

I watched the network traffic via Wireshark and verified that Ping-Frames are sent and no Pong-Frames are going back anymore. The red line marks the moment when i froze the client-websocket.

https://sveneppler.de/stuff/asp-net-websocket-ping-pong.png

I could provide a reproducer repo. But since to my knowledge there is just no handling in the server anyway, a reproducer would be "useless".

Further technical details

  • ASP.NET Core 3.1
  • Include the output of dotnet --info
.NET Core SDK (reflecting any global.json):
 Version:   3.1.301
 Commit:    7feb845744

Runtime Environment:
 OS Name:     ubuntu
 OS Version:  18.04
 OS Platform: Linux
 RID:         ubuntu.18.04-x64
 Base Path:   /usr/share/dotnet/sdk/3.1.301/

Host (useful for support):
  Version: 3.1.5
  Commit:  65cd789777

.NET Core SDKs installed:
  2.1.807 [/usr/share/dotnet/sdk]
  2.2.402 [/usr/share/dotnet/sdk]
  3.1.301 [/usr/share/dotnet/sdk]

.NET Core runtimes installed:
  Microsoft.AspNetCore.All 2.1.19 [/usr/share/dotnet/shared/Microsoft.AspNetCore.All]
  Microsoft.AspNetCore.All 2.2.8 [/usr/share/dotnet/shared/Microsoft.AspNetCore.All]
  Microsoft.AspNetCore.App 2.1.19 [/usr/share/dotnet/shared/Microsoft.AspNetCore.App]
  Microsoft.AspNetCore.App 2.2.8 [/usr/share/dotnet/shared/Microsoft.AspNetCore.App]
  Microsoft.AspNetCore.App 3.1.5 [/usr/share/dotnet/shared/Microsoft.AspNetCore.App]
  Microsoft.NETCore.App 2.1.19 [/usr/share/dotnet/shared/Microsoft.NETCore.App]
  Microsoft.NETCore.App 2.2.8 [/usr/share/dotnet/shared/Microsoft.NETCore.App]
  Microsoft.NETCore.App 3.1.5 [/usr/share/dotnet/shared/Microsoft.NETCore.App]

To install additional .NET Core runtimes or SDKs:
  https://aka.ms/dotnet-download

@BrennanConroy
Copy link
Member

BrennanConroy commented Aug 7, 2020

The RFC is not defining what the other side should do, when Pong-Fames are not coming back

Like you said, there isn't anything we're required to be doing in this case. The WebSocket will be closed if the TCP packet from a Ping or write isn't ACKed.

You can implement your own KeepAlive logic with app level messages like SignalR does.

We could consider adding a configurable timeout in the future.

@BrennanConroy BrennanConroy added feature-websockets Includes: WebSockets and removed area-servers labels Aug 7, 2020
@BrennanConroy BrennanConroy added this to the Backlog milestone Aug 7, 2020
@ghost
Copy link

ghost commented Aug 7, 2020

We've moved this issue to the Backlog milestone. This means that it is not going to be worked on for the coming release. We will reassess the backlog following the current release and consider this item at that time. To learn more about our issue management process and to have better expectation regarding different types of issues you can read our Triage Process.

@osexpert
Copy link

osexpert commented Sep 21, 2020

As being resolved duplicate from #26117 I add a summary here:
It is easy to reproduce, pull out network cable:-)
ClientWebSocket does timeout (and throws WebSocketException) after not recieving Ping or Pong. Inconsistent. RFC does not say anything about this either, so why not let the client hang forever too:-D But seriously, RFC only says Pingpong can be used as a tool to implement heartbeat\timeout for endpoints (eg. both parties):
"NOTE: A Ping frame may serve either as a keepalive or as a means to verify that the remote endpoint is still responsive."
You may choose to not use the tools given, and it will work poorly. You may use them and it will work good. The RFC does not care, we do:-)

Here is exception from client side:

System.Net.WebSockets.WebSocketException (0x80004005): The remote party closed the WebSocket connection without completing the close handshake.
 ---> System.IO.IOException: Unable to read data from the transport connection: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
 ---> System.Net.Sockets.SocketException (10060): A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
   --- End of inner exception stack trace ---
   at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.ThrowException(SocketError error, CancellationToken cancellationToken)
   at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.GetResult(Int16 token)
   at System.Net.Http.HttpConnection.ReadBufferedAsyncCore(Memory`1 destination)
   at System.Net.Http.HttpConnection.RawConnectionStream.ReadAsync(Memory`1 buffer, CancellationToken cancellationToken)
   at System.Net.WebSockets.ManagedWebSocket.EnsureBufferContainsAsync(Int32 minimumRequiredBytes, CancellationToken cancellationToken, Boolean throwOnPrematureClosure)
   at System.Net.WebSockets.ManagedWebSocket.ReceiveAsyncPrivate[TWebSocketReceiveResultGetter,TWebSocketReceiveResult](Memory`1 payloadBuffer, CancellationToken cancellationToken, TWebSocketReceiveResultGetter resultGetter)
   at System.Net.WebSockets.ManagedWebSocket.ReceiveAsyncPrivate[TWebSocketReceiveResultGetter,TWebSocketReceiveResult](Memory`1 payloadBuffer, CancellationToken cancellationToken, TWebSocketReceiveResultGetter resultGetter)

It does not seem related to Pingpong, possibly the client dies because of other circumstances. By coincidence the server does not die in the same way. I guess I come to same conclusion as original poster, that .Net WebSocket does not implement any of timeout related to Pingpong.

@osexpert
Copy link

osexpert commented Feb 17, 2021

Another way to reproduce that I found was running the WS client in a VM, just doing stuff in a tight loop (I think I used VirtualBox, but others may work similar). Then diconnect the NIC for the VM (in the VM gui, not in Windows). Wait. Enable again (or don't). Client will crash after a while but server hangs forever.

@bjornen77
Copy link
Contributor

bjornen77 commented Feb 24, 2021

I think that adding a WebSocketOptions.Timeout property would be a good thing. As @BrennanConroy says above, the WebSocket is only closed when the TCP Retransmission times out. This timeout seems to be different on Windows and Linux. By default, on Windows it takes about 21 seconds and on Linux it takes about 15 minutes before the WebSocket is closed.

The ClientWebSocket today only sends "Pong" to the server, and therefore it does not wait for a "Pong" response to arrive. I think that the ClientWebSocket shall send a "Ping" instead, and then verify that a "Pong" is received within the Timeout set. If not received, the connection shall be closed. Or at least it should expose the "received "Pong" message in the WebSocketReceiveResult.MessageType. It would be good if an received "Ping" message from a server is also exposed. Then the application could use these Ping/Pong message types to implement its own timeout logic.

@osexpert
Copy link

osexpert commented Feb 26, 2021

I think that adding a WebSocketOptions.Timeout property would be a good thing. As @BrennanConroy says above, the WebSocket is only closed when the TCP Retransmission times out. This timeout seems to be different on Windows and Linux. By default, on Windows it takes about 21 seconds and on Linux it takes about 15 minutes before the WebSocket is closed.

The ClientWebSocket today only sends "Pong" to the server, and therefore it does not wait for a "Pong" response to arrive. I think that the ClientWebSocket shall send a "Ping" instead, and then verify that a "Pong" is received within the Timeout set. If not received, the connection shall be closed. Or at least it should expose the "received "Pong" message in the WebSocketReceiveResult.MessageType. It would be good if an received "Ping" message from a server is also exposed. Then the application could use these Ping/Pong message types to implement its own timeout logic.

Not sure if the ping\pong logic necessarily needs to change. The main thing is last time data was recieved, either a ping or real data. A successful response to sending data may also count as last recieved (and here, perhaps waiting for Pong to succeed may count as last recieved too, allthou not sure if this can be completely trusted). Edit: in any case, I think the current WebSocket code does make sure that either a ping or real data is sent to other end within the KeepAliveInterval, so I thinks its just code missing that keep track of lastRecieved in other end and terminating connction if lastRecieved > Timeout.

@rogeralsing
Copy link

This timeout seems to be different on Windows and Linux. By default, on Windows it takes about 21 seconds and on Linux it takes about 15 minutes before the WebSocket is closed.

At least we are not alone in this then..
Having a custom setting for this would be great, 15 min is way too long as it gives false positives for us on what connections are active.

@lzxyz
Copy link

lzxyz commented Mar 9, 2022

Always collecting again, never in action

@devsienko
Copy link

It would really nice to have possibility to track the last client's pong. I don't understand why Microsoft engineers haven't still implemented it. To solve this issue peaple have to implemed some custom message types or edit MS code manually. Lol.

@amcasey amcasey added area-networking Includes servers, yarp, json patch, bedrock, websockets, http client factory, and http abstractions and removed area-runtime labels Jun 2, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-networking Includes servers, yarp, json patch, bedrock, websockets, http client factory, and http abstractions feature-websockets Includes: WebSockets
Projects
None yet
Development

No branches or pull requests

10 participants