Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

How to encode query parameters? #24

Closed
bennnjamin opened this issue Apr 27, 2023 · 11 comments
Closed

How to encode query parameters? #24

bennnjamin opened this issue Apr 27, 2023 · 11 comments
Labels
question Further information is requested

Comments

@bennnjamin
Copy link

I'm primarily using http-client, but almost all query parameter handling is implemented in the parent HttpRequest class here. It doesn't look like there is any place that the special characters in query parameters are being encoded. Is this intentional? Or does it make sense to provide this option?

Please advise the best way going forward.

@trowski trowski added the question Further information is requested label Apr 28, 2023
@trowski
Copy link
Member

trowski commented Apr 28, 2023

Special characters in query parameters are encoded by League\Uri\QueryString::build() before being set as the query component of the request UriInterface.

These characters are also decoded by League\Uri\QueryString::parse() which is used to create the internal array in HttpRequest, so values returned from `getQueryParameter() and friends do not need to be decoded.

@bennnjamin
Copy link
Author

bennnjamin commented Apr 28, 2023

@trowski thanks for the explanation, it looks like a bug in that dependency when using encoding type PHP_QUERY_RFC3986. For now, I am building the query string manually and just setting the URI to the correct query string. In particular, I have to pass phone numbers as query parameters and the bug occurs when encoding a +.

$queryParameters = [
    "phone_number" => "+18001231234",
    "special_characetrs" => " /+t!q_"
];
$queryPairs = [
    ["phone_number", "+18001231234"],
    ["special_characetrs", " /+t!q_"]
];
$queryString = QueryString::build($queryPairs, '&', PHP_QUERY_RFC1738);
print("League\URI PHP_QUERY_RFC1738 $queryString\n");
$queryString = QueryString::build($queryPairs, '&', PHP_QUERY_RFC3986);
print("League\URI PHP_QUERY_RFC3986 $queryString\n");
$queryString = http_build_query($queryParameters, "", null, PHP_QUERY_RFC1738);
print("http_build_query PHP_QUERY_RFC1738 $queryString\n");
$queryString = http_build_query($queryParameters, "", null, PHP_QUERY_RFC3986);
print("http_build_query PHP_QUERY_RFC3986 $queryString\n");

Output:

League\URI PHP_QUERY_RFC1738 phone_number=%2B18001231234&special_characetrs=+%2F%2Bt%21q_
League\URI PHP_QUERY_RFC3986 phone_number=+18001231234&special_characetrs=%20/+t!q_
http_build_query PHP_QUERY_RFC1738 phone_number=%2B18001231234&special_characetrs=+%2F%2Bt%21q_
http_build_query PHP_QUERY_RFC3986 phone_number=%2B18001231234&special_characetrs=%20%2F%2Bt%21q_

@bennnjamin
Copy link
Author

After digging into this further it seems that some older servers aren't really compliant with RFC 3986. Would you be open to providing a way to override or set the encoding type to PHP_QUERY_RFC1738 or PHP_QUERY_RFC3986? League\Uri\QueryString::build() supports it, but it's not exposed publicly in HttpRequest as a parameter.

@bennnjamin bennnjamin reopened this Apr 28, 2023
@kelunik
Copy link
Member

kelunik commented Apr 28, 2023

See also thephpleague/uri-src#109.

+ is also used for whitespace, no? Which spec do browsers follow? Do we simply reference the wrong rfc here?

@bennnjamin
Copy link
Author

@kelunik Yeah that was me that just opened that issue. Honestly I don't know enough about the spec but there are some major inconsistencies in real world implementations. I think that's why it's important to provide an option at the library level, since this could be used to connect to a variety of legacy HTTP servers with no support for recent RFCs. JavaScript's encoderURIComponent encodes a + but encodeURI does not, for example.

In my case, I don't have control over the API, and it expects the + to be encoded, so I have to encode it regardless of whether the RFC says that's correct.

@nyamsprod
Copy link

For reference you have RFC1738, RFC3986, RFC3987 and also the living standard each with their own set of rules around encoding. So it really depends which RFC is followed by which client.

@trowski
Copy link
Member

trowski commented Apr 28, 2023

@nyamsprod Do you have an insight on a good direction here? Does it make the most sense to offer an option as to which encoding is used, or could we use RFC1738 which seems to encode everything?

@nyamsprod
Copy link

From my understanding and research when developping the package, browsers (HTTP Clients) are leaning toward using a version that ressembles RFC1738 to align URI encoding with Forms Data encoding (see the living standard). Whereas servers historically are leaning towards following RFC3986. So like everything in programming the answer is it depends and there is no clear one solution for all

@kelunik
Copy link
Member

kelunik commented Aug 11, 2023

@trowski Seems like we still need to change

$this->uri = $this->uri->withQuery(QueryString::build($pairs) ?? '');
to use the old RFC encoding?

@trowski
Copy link
Member

trowski commented Aug 11, 2023

Yes, I'll update and tag a bugfix release.

@kelunik kelunik closed this as completed Aug 24, 2023
@nyamsprod
Copy link

nyamsprod commented Sep 6, 2023

@trowski @bennnjamin @kelunik after implementing URLSearchParams I can confidently say that there are 3 encoding types for the query component:

  • RFC1738
  • RFC3986
  • application/www-form-urlencoded

Browsers, at least modern browsers that follow the WHATWG group now only uses the later for Forms and query string in order to align all parsing/serializing.

Starting with version 7.1.0 the latter should be accessible via the Modifier class using the encodeQuery method as shown below

use League\Uri\Modifier;
use League\Uri\KeyValuePair\Converter;

$formDataConverter = Converter::new('&')
    ->withEncodingMap(['%20' => '+', '%2A' => '*']);

$uri = Modifier::from($uri)
    ->encodeQuery(formDataConverter)
    ->getUriString();

or the QueryString class

use League\Uri\QueryString;
use League\Uri\KeyValuePair\Converter;

$formDataConverter = Converter::new('&')
    ->withEncodingMap(['%20' => '+', '%2A' => '*']);

$query = QueryString::buildFromPairs($pairs, $formDataConverter);
//$query will be a string or `null` if $pairs is an empty array.

In the next minor version (7.3.0 I presume) you will be able to simply a bit the code to

use League\Uri\Modifier;
use League\Uri\KeyValuePair\Converter;

echo Modifier::from($uri)
    ->encodeQuery(Converter::fromFormData())
    ->getUriString();

and with the QueryString class

use League\Uri\QueryString;
use League\Uri\KeyValuePair\Converter;

echo QueryString::buildFromPairs($pairs,  Converter::fromFormData());
//returns a string or `null` if $pairs is an empty array.

7.3.0 will also give you access to the PHP implementation of URLSearchParams but that's another topic even though it will give you proper representation of how browser and client should handle query string.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
question Further information is requested
Development

No branches or pull requests

4 participants