Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MockMvc no longer handles UTF-8 characters #23219

Closed
momega opened this issue Jul 1, 2019 · 8 comments
Closed

MockMvc no longer handles UTF-8 characters #23219

momega opened this issue Jul 1, 2019 · 8 comments
Assignees
Labels
in: test Issues in the test module in: web Issues in web modules (web, webmvc, webflux, websocket) type: regression A bug that is also a regression
Milestone

Comments

@momega
Copy link

momega commented Jul 1, 2019

I wrote simple controller:

@RestController
public class TestController {

    @GetMapping("/api")
    public Map<String, String> simpleTest() {
        // here is the text with special characters, all in UTF-8
        return Collections.singletonMap("test", "Příliš žluťoučký kůň úpěl ďábelské ódy");
    }
}

Also very simple tutorial-like test:

    @Before
    public void setup() {
        this.mockMvc = MockMvcBuilders.webAppContextSetup(this.wac).build();
    }

    @Test
    public void test() throws Exception {
        this.mockMvc.perform(get("/api"))
                .andDo(print())
                .andExpect(jsonPath("$.test").value("Příliš žluťoučký kůň úpěl ďábelské ódy"));
    }

The test fails in version 5.2.0.M3 now. Test passes in the previous milestone version 5.2.0.M2 and earlier versions.

The issue is somehow related to deprecation of org.springframework.http.MediaType#APPLICATION_JSON_UTF8_VALUE.

Of course it is possible to fix the test by adding accept but I rather do not do that, because APPLICATION_JSON_UTF8_VALUE is deprecated now.

.accept(MediaType.APPLICATION_JSON_UTF8_VALUE)

I have also prepared simple maven project to simulate the problem.
springutf8.zip

It seems to me, that MockMvc client does not handle UTF-8 characters well, like other modern browsers.

@spring-projects-issues spring-projects-issues added the status: waiting-for-triage An issue we've not yet triaged or decided on label Jul 1, 2019
@sbrannen sbrannen added in: test Issues in the test module in: web Issues in web modules (web, webmvc, webflux, websocket) type: regression A bug that is also a regression labels Jul 1, 2019
@sbrannen
Copy link
Member

sbrannen commented Jul 1, 2019

@sdeleuze, it appears that gh-22788 introduced a regression in MockMvc. Do you have time to look into this?

@sbrannen sbrannen removed the status: waiting-for-triage An issue we've not yet triaged or decided on label Jul 1, 2019
@sbrannen sbrannen added this to the 5.2 RC1 milestone Jul 1, 2019
@sbrannen sbrannen changed the title MockMvc does not handle UTF-8 characters MockMvc no longer handles UTF-8 characters Jul 1, 2019
@sdeleuze sdeleuze self-assigned this Jul 1, 2019
@sdeleuze
Copy link
Contributor

sdeleuze commented Jul 8, 2019

@sbrannen @rstoyanchev The situation here is the following.

As expected, the JSON data is written as UTF-8 by Jackson converter/encoder regardless of the ISO-8859-1 default Servlet encoding which is only used when the String serialization is performed by the Servlet container itself.

On MockMVC side, by default MockHttpServletResponse#characterEncoding is set to ISO-8859-1 and used for writing and reading String. For writing text, I think that's relevant since it will mimic the Servlet Container behavior. But for reading, I think current behavior can lead to surprising results like the one raised in this issue.

In order to get a more predictable behavior, I think my proposal would be to change:

public String getContentAsString() throws UnsupportedEncodingException {
	return (this.characterEncoding != null ?
			this.content.toString(this.characterEncoding) : this.content.toString());
}

to

public String getContentAsString() throws UnsupportedEncodingException {
	return (this.characterEncoding != null && this.charset ?
			this.content.toString(this.characterEncoding) : this.content.toString());
}

in order to only use characterEncoding in getContentAsString when it has been set explicitly.

Any thoughts?

@rstoyanchev
Copy link
Contributor

rstoyanchev commented Jul 9, 2019

What about making a change in JsonPathResultMatchers#getContent instead to check if response.isCharset() is set and if not then default to UTF-8 for extracting the content. That 's a better default for JSON vs the general one for the Servlet API and matches the default assumption of the Jackson message converter. Given this was triggered by a change to the JSON media type it seems a more targeted fix too.

@sdeleuze
Copy link
Contributor

sdeleuze commented Jul 9, 2019

I am not sure because response.getContentAsString() is also pretty commonly used, and I am not sure this default to ISO-8859-1 makes sense on reading the mocked response which matches conceptually with the clients like an http library or the browser which have not this unusual (outside Servlet world) default charset. So I think I am still in favor of my previous proposal.

@rstoyanchev
Copy link
Contributor

@sdeleuze I think MockHttpServletResponse is too level and widely used, and for a very long time. This change is likely to break tests. We should probably discuss with the team. What I like about a change in JsonPathResultMatchers#getContent is that we can afford to make a stronger assumption and it will address very specifically the cause for the regression.

sbrannen added a commit to sbrannen/spring-framework that referenced this issue Jul 16, 2019
sdeleuze added a commit to sdeleuze/spring-framework that referenced this issue Jul 16, 2019
sdeleuze added a commit to sdeleuze/spring-framework that referenced this issue Jul 16, 2019
@sgybas
Copy link

sgybas commented Sep 11, 2019

@sdeleuze Thanks for fixing this bug. However, the same bug exists for .andExpect(content().json(expectedJson)) which still uses the default charset in response.getContentAsString() as of Spring 5.2.0.RC2.

Should I open a new issue and provide a project to reproduce the problem?

@sdeleuze
Copy link
Contributor

yes please

@mobject
Copy link

mobject commented Oct 9, 2019

I had to read from getResponse().getContentAsByteArray(); and wrap the byte[] into String.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
in: test Issues in the test module in: web Issues in web modules (web, webmvc, webflux, websocket) type: regression A bug that is also a regression
Projects
None yet
Development

No branches or pull requests

7 participants