-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] use llhttp as default HTTP parser #9033
Conversation
Signed-off-by: Derek Argueta <dereka@pinterest.com>
Signed-off-by: Derek Argueta <dereka@pinterest.com>
Signed-off-by: Derek Argueta <dereka@pinterest.com>
Signed-off-by: Derek Argueta <dereka@pinterest.com>
Signed-off-by: Derek Argueta <dereka@pinterest.com>
Fantastic work, @derekargueta ! I'm curious if this will bring any noticeable performance improvements to envoyproxy! |
sendProtocolError(); | ||
throw CodecProtocolException("http/1.1 protocol error: " + | ||
std::string(http_errno_name(HTTP_PARSER_ERRNO(&parser_)))); | ||
std::string(parser_errno_name(parser_get_errno(&parser_)))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a new opportunity for detailed errors here. One could potentially use llhttp_get_error_reason
to get the detailed reason for the error.
source/common/http/http1/BUILD
Outdated
@@ -2,6 +2,7 @@ licenses(["notice"]) # Apache 2 | |||
|
|||
load( | |||
"//bazel:envoy_build_system.bzl", | |||
"envoy_cc_binary", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not used?
ENVOY_CONN_LOG(trace, "message complete", connection_); | ||
if (handling_upgrade_) { | ||
// If this is an upgrade request, swallow the onMessageComplete. The | ||
// upgrade payload will be treated as stream body. | ||
ASSERT(!deferred_end_stream_headers_); | ||
ENVOY_CONN_LOG(trace, "Pausing parser due to upgrade.", connection_); | ||
#ifdef ENVOY_ENABLE_LEGACY_HTTP_PARSER |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we always return HPE_PAUSED and let the calling site call http_parser_pause? i.e. here: https://github.com/envoyproxy/envoy/pull/9033/files#diff-2ba39613a1b78b994ec31ddd03cf42f5R342
same for other places.
nread = llhttp_get_error_pos(parser) - slice; | ||
if (err == HPE_PAUSED_UPGRADE) { | ||
err = HPE_OK; | ||
llhttp_resume_after_upgrade(parser); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we don't always handle upgrade today? don't we need to check handling_upgrade_ here before resume?
@envoyproxy/security-team for review as well. |
@derekargueta awesome stuff. I put this comment in the issue, but is it possible to compile both in for a period of time so we can feature flag which parser to use? /wait-any |
+1 to what Matt said. I'm super excited for the switch and it'd be even better if we can runtime switch this for a release or two! |
FWIW, I've created a PR to introduce lenient parsing to |
@mattklein123 @alyssawilk yeah I started working on a runtime switch. The catch is that |
@derekargueta from a quick glance it doesn't seems have conflicting enum between llhttp and http_parser, seems everything in llhttp is prefixed with |
@lizan sorry I misspoke, it's not the enum's themselves (which you are correct, llhttp is prefixed with When the 2 are included into the same context (directly or transitively):
|
Can we change llhttp to prefix it's enum values? I am not sure that is necessary though. We should have an abstraction layer with its own enum to enable the switch, right? |
@derekargueta thanks that makes sense, yeah then you need two classes implement same interface in different compilation unit to make it work. |
This pull request has been automatically marked as stale because it has not had activity in the last 7 days. It will be closed in 7 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions! |
Signed-off-by: Derek Argueta <dereka@pinterest.com>
Signed-off-by: Derek Argueta <dereka@pinterest.com>
Signed-off-by: Derek Argueta <dereka@pinterest.com>
Still very much a WIP, not yet ready for full review but high-level comments welcome. Wrestling with getting the static-casting right from This introduces quite a few abstractions with the aim of completely decoupling the HTTP parser from the codec/connectionimpl which is necessary for the 2 libraries to coexist. Main components are
Also keeping an eye on #8667 which looks like will have conflicts with this PR |
Signed-off-by: Derek Argueta <dereka@pinterest.com>
Signed-off-by: Derek Argueta <dereka@pinterest.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Huzzah for progress!
some high level comments while you're doing clean up :-)
docs/root/intro/version_history.rst
Outdated
@@ -12,6 +12,7 @@ Version history | |||
* health check: gRPC health checker sets the gRPC deadline to the configured timeout duration. | |||
* http: added the ability to sanitize headers nominated by the Connection header. This new behavior is guarded by envoy.reloadable_features.connection_header_sanitization which defaults to true. | |||
* http: support :ref:`auto_host_rewrite_header<envoy_api_field_config.filter.http.dynamic_forward_proxy.v2alpha.PerRouteConfig.auto_host_rewrite_header>` in the dynamic forward proxy. | |||
* http: use llhttp as default http parser |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Drive by comment as you clean things up, I wouldn't default this on until we've had some fuzzing aimed at it.
I'd be inclined to land default off, make sure we have comparable fuzzing coverage, encourage folks to smoke test it on, fuzz for a few weeks, and then switch the default.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, will flip this off-by-default for now
@@ -314,47 +314,13 @@ void RequestStreamEncoderImpl::encodeHeaders(const HeaderMap& headers, bool end_ | |||
StreamEncoderImpl::encodeHeaders(headers, end_stream); | |||
} | |||
|
|||
http_parser_settings ConnectionImpl::settings_{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While you're doing your clean-up, WDYD of landing some of the pure refactor as its own PR?
It'd let us land some of this ASAP, make for easier and smaller reviews, etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure I can break this up a bit (I was admittedly surprised how large the delta was once I got the refactor building)
|
||
bool ParserFactory::usesLegacyParser() { return use_legacy_parser_; } | ||
|
||
void ParserFactory::useLegacy(bool use_legacy_parser) { use_legacy_parser_ = use_legacy_parser; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As long as we're able to link both, why not make this based on runtime?
I could imagine finding an issue where things don't work as expected, and switching all new connections over is safer than changing command line flags and doing fleet-wide restarts
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you suggesting just runtime value and not introducing a flag? Or keeping the flag but allow runtime to override? I'm guessing the former since the latter could be confusing to someone seeing the flag enabled but not knowing the runtime value is overriding it, but just want to clarify.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I definitely think it's worth having runtime. Neutral about flags - last I heard from Matt most non-google companies don't use command line flags heavily so if you folks don't need it I'd lean towards sticking with one configuration method for consistency, as you say.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I would probably just do runtime.
if (strict_header_validation_) { | ||
if (!Http::HeaderUtility::headerIsValid(header_value)) { | ||
ENVOY_CONN_LOG(debug, "invalid header value: {}", connection_, header_value); | ||
error_code_ = Http::Code::BadRequest; | ||
sendProtocolError(); | ||
throw CodecProtocolException("http/1.1 protocol error: header value contains invalid chars"); | ||
} | ||
} else if (header_value.find('\0') != absl::string_view::npos) { | ||
} else if (ParserFactory::usesLegacyParser() && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this an optimization because we're confident the new parser handles this for us? I'd think we wouldn't want to allow \0 for the new version either.
(calling out because if we make this a runtime switch we can't depend on the previously stable value)
sendProtocolError(); | ||
throw CodecProtocolException("http/1.1 protocol error: " + | ||
std::string(http_errno_name(HTTP_PARSER_ERRNO(&parser_)))); | ||
throw CodecProtocolException("http/1.1 protocol error: " + std::string(parser_->errnoName())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
absl::StrCat would be a good think to use here.
virtual ~Parser() = default; | ||
virtual int execute(const char* slice, int len) PURE; | ||
virtual void resume() PURE; | ||
virtual int pause() PURE; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it make sense to return ParserStatus here? Or would it be too much work to translate from implementation specific status to the ParserStatus?
@@ -607,12 +569,13 @@ void ServerConnectionImpl::onEncodeComplete() { | |||
void ServerConnectionImpl::handlePath(HeaderMapImpl& headers, unsigned int method) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it make sense to make method to be type Method?
|
||
// Currently, CONNECT is not supported, however; http_parser_parse_url needs to know about | ||
// Currently, CONNECT is not supported, however; llhttp_parse_url needs to know about |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do both parsers need to know about CONNECT?
current_header_map_.reset(); | ||
header_parsing_state_ = HeaderParsingState::Done; | ||
|
||
// Returning 2 informs llhttp to not expect a body or further data on this connection. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this code common to both http_parser and llhttp? What would http_parser do with 2?
bazel/repository_locations.bzl
Outdated
strip_prefix = "llhttp-release-v2.0.1", | ||
urls = ["https://github.com/nodejs/llhttp/archive/release/v2.0.1.tar.gz"], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
llhttp 2.0.2 was just released - https://github.com/nodejs/llhttp/releases/tag/release%2Fv2.0.2
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
llhttp 2.0.3 was just released - https://github.com/nodejs/llhttp/releases/tag/release%2Fv2.0.3
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
llhttp 2.0.4 was just released - https://github.com/nodejs/llhttp/releases/tag/v2.0.4
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
llhttp 2.0.5 released - https://github.com/nodejs/llhttp/releases/tag/v2.0.5
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As @indutny mentioned at #5155 (comment), llhttp 2.1.1 released - https://github.com/nodejs/llhttp/releases/tag/v2.1.1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
llhttp 2.2.0 released - https://github.com/nodejs/llhttp/releases/tag/v2.2.0
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
llhttp 3.0.0 released - https://github.com/nodejs/llhttp/releases/tag/release%2Fv3.0.0
Signed-off-by: Derek Argueta <deargueta@tesla.com>
This pull request has been automatically marked as stale because it has not had activity in the last 7 days. It will be closed in 7 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions! |
Signed-off-by: Derek Argueta <darguetap@gmail.com>
Signed-off-by: Derek Argueta <darguetap@gmail.com>
This pull request has been automatically marked as stale because it has not had activity in the last 7 days. It will be closed in 7 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions! |
This pull request has been automatically closed because it has not had activity in the last 14 days. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions! |
Description: Uses llhttp to parse HTTP/1.1 traffic. A compile-time switch is provided to use the legacy Node http-parser. Preprocessor macros were selected due to global naming conflicts between the two libraries. The alternative to make it a runtime switch would require
#include
ing the libraries within separate namespaces (I can backtrack to this option if it's more desirable).TODO
BadRequestNoStream
test to not open a new streamThere are other places that still use the http-parser, notably the http_inspector network filter and the url parsing utilities.
Risk Level: medium
Testing: existing
Docs Changes:
Release Notes:
Fixes #5155
Deprecated: http-parser