
Don't send requests with Connection:keep-alive if we have no agent. #488

Closed

Conversation

@glasser (Contributor) commented Sep 24, 2013

Avoids unhandleable ECONNRESETs.

I think this is arguably a Node http bug (in 0.10.18 at least), but: if you run

http.request({headers: {connection: 'keep-alive'}, agent: false});

then during the ClientRequest constructor, self.shouldKeepAlive is set to false because there is no agent. But then it calls (indirectly) the storeHeader function (in http.js), which sets self.shouldKeepAlive to true because you specified the keep-alive header.

Then once the request is over, responseOnEnd (in http.js) runs. Because shouldKeepAlive is true, we do NOT destroy the underlying socket. But we do remove its error listener. And because we do NOT have an agent, nobody is listening for the socket's free event, so the socket is essentially leaked.

That is, we continue to have an open socket to the target server, which has no error listener and which will never be cleaned up.

It's bad enough that this is a resource leak. But to make matters worse: let's say that the target server dies. This socket will emit an ECONNRESET error... and since there is no error listener on the socket (there's a listener on the ClientRequest! but not on the socket), bam, time for an incredibly confusing error to be thrown from the top level, probably crashing your process or triggering uncaughtException.

I think the best workaround here is to ensure that if we have no agent, then we don't send connection: keep-alive. This PR is one implementation of this.
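As a sketch of that workaround (an illustrative helper, not the actual caronte internals): normalize the outgoing headers and force connection: close whenever no agent is configured, so Node never flips shouldKeepAlive back to true.

```javascript
// Sketch of the workaround this PR implements (names are illustrative):
// strip any keep-alive request header whenever no agent is configured.
function sanitizeOutgoingHeaders(headers, hasAgent) {
  var out = {};
  Object.keys(headers || {}).forEach(function (key) {
    out[key.toLowerCase()] = headers[key];
  });
  if (!hasAgent) {
    // Without an agent nobody listens for the socket's 'free' event,
    // so a kept-alive socket would be leaked. Force it closed instead.
    out['connection'] = 'close';
  }
  return out;
}
```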

(I'll work around this a different way in Meteor for now: by providing an explicit Agent.)

@coveralls: Coverage increased (+0.17%) when pulling 5007bd3 on glasser:keep-alive-requires-agent into 94ec6fa on nodejitsu:caronte.

@glasser (Contributor, Author) commented Sep 24, 2013

Hmm. I can pretty easily make a test that shows that without this PR, sockets are leaked. Just something as basic as

var http = require('http');
var server = http.createServer(function (req, res) {
  res.writeHead(200, {'Content-Type': 'text/plain'});
  res.write("Thanks for waiting!");
  res.end();
});
server.listen(4000);
var caronte = require('./lib/caronte');
var proxy = caronte.createProxyServer({target: 'http://127.0.0.1:4000'});
proxy.ee.on('caronte:outgoing:web:error', function (e) {
  console.log('handling error', e);
});
proxy.listen(4010);

Pointing a web browser at localhost:4010 and running lsof -p <pid of proxy server>, you'll see that sockets remain open from the proxy server to the target server; but since agent is false they will never be reused.

Unfortunately I have not been able to make a simple reproduction of the ECONNRESET error without the full complexity of Meteor.

Also, my PR ends up (unless you specify an Agent) turning all connections into connection: close, which is probably a little too strong. I mean, without an Agent the proxy-target communication should be connection: close, but there's no reason the client-proxy communication should be as well (though my PR makes it that way).

@Rush (Contributor) commented Sep 24, 2013

The problem with using an agent is that they are impossibly slow. Only disabling the agent gives you any sort of performance. In my opinion, disabling client-to-proxy keep-alive is totally unacceptable, especially when paired with HTTPS. The handshake can take up to 150ms in my tests, so basically any REST service would be impossibly slow.

@glasser (Contributor, Author) commented Sep 24, 2013

Do we have an understanding of what aspect of using an agent is slow? Am I confused if I think that without an agent, the proxy to server connection is just going to be leaked and never reused?

@yawnt (Contributor) commented Sep 24, 2013

hey @glasser,
i think your solution is a bit too brutal as a workaround to be implemented without significant sacrifices.. we should, imho, post this on joyent/node and see what they say there about it

could you do it since you uncovered this bug and know more about it? if not, i can handle this

thanks for submitting the PR anyway :)

regarding @RushPL 's comment i think he's referring to the performance that ab shows when tested against caronte.. there are a couple of things that make ab behave weirdly (most noticeably the fact that it speaks HTTP/1.0 while node only allows HTTP/1.1 requests, thus preventing me from sending a content length, which ab requires).. using other HTTP/1.1 perf suites hasn't shown, so far, significant penalties in using agents except the usual slow-down due to pooling

@yawnt closed this Sep 24, 2013
@Rush (Contributor) commented Sep 24, 2013

I am actually referring to a situation where many outgoing connections need to be made and a pool (by its nature) is limited. Hence the problem.


@yawnt (Contributor) commented Sep 24, 2013

oh yes, in that case you can just increase the pool size to an absurd number :P

@Rush (Contributor) commented Sep 24, 2013

And hence no need for pool. :-)


@yawnt (Contributor) commented Sep 24, 2013

not really, if you run without pools at some point you're gonna run out of file descriptors and the node process will crash.. with a pool you can have mostly the same perf, but without the horrible crash when you're out of FDs

@glasser (Contributor, Author) commented Sep 24, 2013

I agree that this is sort of brutal. But right now, the most obvious way to create a proxy without any options defaults to no agent. So in the commonish case where the client is sending Connection:keep-alive, bam, your proxy leaks sockets forever, which will certainly crash eventually. (And in my app will probably lead to ECONNRESETs eventually, though I can't reproduce this in a small example.)

Maybe caronte should default to using an agent (perhaps a large one)?

I'm not sure exactly what bug to report to joyent/node. Basically, the issue is this: if, when using http.request, you go out of your way to specify two somewhat-conflicting non-default options ({headers: {connection: 'keep-alive'}, agent: false}) then you get bad behavior (a leaked open socket).

@glasser (Contributor, Author) commented Sep 24, 2013

Oops, hit send too soon.

... but then there's the valid argument of "ok, well, don't specify two conflicting options". The problem is that agent: false is the default for caronte, and that when proxying you probably aren't controlling the headers.

So what should I suggest node does? Maybe it shouldn't set shouldKeepAlive when you specify connection: keep-alive with agent: false. But then the keep-alive is getting ignored!

Or maybe specifying both of those options should throw an error. That's reasonable, except then caronte has to still make a change, either the one from this PR, or to stop supporting agent: false.

@glasser (Contributor, Author) commented Sep 24, 2013

Another potential caronte-side fix: if you're doing agent: false (which is the default), then the proxy->server code should use connection: close just as in this PR. But, if we're making that change, then the web-outgoing connection header pass should NOT copy the proxy response's connection header back to the client; it should keep the connection open if the original client's request said keep-alive, and close it otherwise.

@yawnt (Contributor) commented Sep 25, 2013

ok so i had a talk with a node core contrib about this.. keep-alive connections should only be used, as you said, when there's an agent.. so i think the best way is to make sure that

  • if there's an agent, all is well
  • if there's no agent, the connection between client and proxy should remain keep-alive but the connection between proxy and server should default to close (this also means restoring keep-alive on the response)
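That two-sided policy could be sketched as a small helper (hypothetical names, not actual caronte code): the proxy->target request is forced to close when there is no agent, while the client-facing response restores whatever the original client asked for.

```javascript
// Hypothetical sketch of the two-sided connection policy described above
// (illustrative names, not actual caronte code).
function connectionHeaders(clientReqHeaders, hasAgent) {
  var clientWantsKeepAlive =
    (clientReqHeaders['connection'] || '').toLowerCase() === 'keep-alive';
  return {
    // Header sent on the proxy -> target request: without an agent,
    // always close so no socket is leaked.
    toTarget: hasAgent ? (clientReqHeaders['connection'] || 'keep-alive') : 'close',
    // Header sent on the proxy -> client response ("restore keep-alive").
    toClient: clientWantsKeepAlive ? 'keep-alive' : 'close'
  };
}
```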

can you add the "restore keep-alive" part? when that's done (with tests :D) i'm willing to merge this

thanks!

@yawnt reopened this Sep 25, 2013

@glasser (Contributor, Author) commented Sep 25, 2013

I can look into this, but due to some travel it definitely won't be this week. Already stayed up until 3am on this issue once this week :)

@yawnt (Contributor) commented Sep 25, 2013

no prob, i'll look into it myself ASAP (aka when i'm done with the other issue)

@boutell commented Dec 8, 2013

I'm noticing this behavior too. It can DOS a backend server pretty quickly, actually; lsof shows my backend hanging up immediately on new connections once there are 216 outstanding keep-alive connections from the proxy server, because of course I've hit the default limit of 256 file descriptors on my Mac. Naturally that can be raised, but eventually there's a limit in whatever context.

Eventually these connections time out and the back end is usable again.

I'm curious why node's core http module doesn't have a tunable parameter for breaking connections based on how long it's been since the last request. That would at least keep well-intentioned goofs like leftover keep-alive connections from DOSing the server, without doing something ham-fisted like breaking an incomplete HTTP upload or other long-lived single request, which is a completely different decision.

I think one could write a watchdog in a few lines in userspace: when you see a new req.connection, add a close event handler, and also watch for additional requests with the same connection. If it hasn't been closed or had a new request come in for a few seconds, just close it...
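A minimal sketch of such a watchdog (illustrative only, with made-up helper names, and untested against edge cases like pipelining): arm an idle timer per socket, disarm it while a request is in flight, and re-arm it when the response finishes.

```javascript
// Arm (or re-arm) an idle timer on a socket; the timer handle is stashed
// on the socket so it can be cancelled later.
function armWatchdog(socket, idleMs) {
  clearTimeout(socket._idleTimer);
  socket._idleTimer = setTimeout(function () { socket.destroy(); }, idleMs);
  if (socket._idleTimer.unref) socket._idleTimer.unref(); // don't hold the process open
}

function disarmWatchdog(socket) {
  clearTimeout(socket._idleTimer);
  socket._idleTimer = null;
}

// Wire the watchdog onto an http.Server: sockets idle for more than
// idleMs between requests get destroyed.
function addIdleWatchdog(server, idleMs) {
  server.on('connection', function (socket) {
    armWatchdog(socket, idleMs);
    socket.on('close', function () { disarmWatchdog(socket); });
  });
  server.on('request', function (req, res) {
    disarmWatchdog(req.connection);          // request in flight: no idle clock
    res.on('finish', function () {           // between requests: start the clock
      armWatchdog(req.connection, idleMs);
    });
  });
}
```

For example, addIdleWatchdog(httpServer, 5000) would drop any keep-alive connection that sits idle for more than five seconds.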

But is that necessary or am I missing a piece of existing functionality?

Apologies for getting a little off the beam of the original issue with node-http-proxy. It's a bug for sure, bad enough behavior that a lot of servers would probably just firewall the responsible IP (:

@boutell commented Dec 8, 2013

The problem seems entirely resolved by adding the default agent:

var proxy = httpProxy.createProxyServer({ agent: http.globalAgent });

I did see @RushPL's comment that the use of any agent at all is a performance-killer. I haven't attempted any measurements of that myself, but leaking sockets and not really leveraging keep-alive as a result can't be good either.

@glasser (Contributor, Author) commented Dec 8, 2013

@boutell Yes, it would be great if Node's http server had a way to flip the socket timeout value between one value for "during a request" and one for "between requests". Or a more usable event for "socket is now awaiting another request".

We did something in Meteor recently to try to simulate this but it's hacky: https://github.com/meteor/meteor/blob/devel/packages/webapp/webapp_server.js#L209 https://github.com/meteor/meteor/blob/devel/packages/webapp/webapp_server.js#L448
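A rough sketch of that "two timeout values" idea (the linked Meteor code does something similar but more carefully; names here are illustrative): a long timeout while a request is active, a short one while the socket sits idle between requests.

```javascript
// Illustrative sketch: flip a socket's timeout between an "active"
// value (request in flight) and an "idle" value (between requests).
function flipTimeouts(server, activeMs, idleMs) {
  server.on('connection', function (socket) {
    socket.setTimeout(idleMs);                  // idle until the first request
    socket.on('timeout', function () { socket.destroy(); });
  });
  server.on('request', function (req, res) {
    req.connection.setTimeout(activeMs);        // request in flight: be generous
    res.on('finish', function () {
      req.connection.setTimeout(idleMs);        // back to the short idle clock
    });
  });
}
```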

@boutell commented Dec 8, 2013

Nice, this is what I was suggesting... I think you could build that as a generic npm module, though it no doubt belongs in core.


@yawnt (Contributor) commented Dec 18, 2013

hey,
i've been going back over this one and, using 0.11 (latest master), it doesn't look like it's leaking for me
can you confirm?

@Rush (Contributor) commented Dec 21, 2013

So is it possible with 0.11 to run without an agent and do keep-alive to the target? How would I go about testing it?

@Rush (Contributor) commented Dec 21, 2013

I ran a test with:

    var proxy = httpProxy.createProxyServer({
      agent: null
    });

and proxying with:

    proxy.web(req, res, {
      target: 'http://localhost:80'
    });

Seems that it works and does not leak (checked with lsof).. unless agent: null is ignored, I am not sure.

@yawnt (Contributor) commented Dec 21, 2013

i also tried completely without specifying an agent and it didn't leak either.. waiting to hear from @glasser

@wclr commented Feb 25, 2014

It seems that this fixes #579

jayharris added a commit to jayharris/node-http-proxy that referenced this pull request Mar 26, 2014
@jcrugzz (Contributor) commented Mar 27, 2014

behavior is inconsistent, i accept this as a patch until http is finally fixed in 0.12.x.

Fixed in 89a22bc

@jayharris (Contributor) commented:

This seems to be breaking upgrades (#638). It forces the Connection header to 'close', but on a socket upgrade we don't have an agent yet. Perhaps, instead, it should check that there is no agent AND the connection is not set to 'upgrade'.
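The proposed check could look something like this (a hypothetical helper with made-up names, not the actual patch): only force Connection: close when there is no agent and the request is not an upgrade.

```javascript
// Illustrative sketch of the proposed condition: force the outgoing
// Connection header to 'close' only when we have no agent AND the
// client is not requesting a protocol upgrade.
function shouldForceClose(options, reqHeaders) {
  var connection = (reqHeaders['connection'] || '').toLowerCase();
  return !options.agent && connection !== 'upgrade';
}
```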

@jcrugzz (Contributor) commented May 9, 2014

@jayharris i'd buy that. It seems that in that case it shouldn't leak sockets, as it will only be making one request of that nature. I haven't run into this problem personally since we use an agent in our proxy that is handling sockets. I'd accept that patch.

@jayharris (Contributor) commented:

Awesome. I'll get it submitted in a few hours.
