Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cunit/charset:test_charset_decode: check for base64 input with invalid length #5006

Conversation

dilyanpalauzov
Copy link
Contributor

Valid base64 encoded input must have specific length, possibly by appending = characters. Here I add a test containing base64 encoded input with invalid length. In this case charset_decode() should fail, but does currently succeed.

The problem is with the Sieve body test, which on MIME parts with Content-Transfer-Encoding: base64 and invalid length of characters produces unpredictable results. Yahoo FBL does send such MIME parts.

This could be handled by producing NULL or "" as output, on such input. https://www.rfc-editor.org/rfc/rfc3548#section-2.2 suggests skipping the padding characters in transit, and appending them just when decoding - this is one more valid approach.

In any case the current implementation skips handling this case and when I ran my code under Address senitizer, this was triggered for unknown input:

AddressSanitizerAddressSanitizer:DEADLYSIGNAL
:DEADLYSIGNAL
=================================================================
==311693==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x7fcbea14256b bp 0x000000000022 sp 0x7fcbdb8d87d8 T169007)
==311693==The signal is caused by a WRITE memory access.
==311693==Hint: address points to the zero page.
ignoring invalid base64 characters: syserror=<Invalid argument> func=<b64_flush>
    #0 0x7fcbea14256b in __memmove_sse2_unaligned_erms ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:377
    #1 0x7fcbed7984bd in buf_appendmap ../lib/util.c:1284
    #2 0x7fcbed7984bd in buf_appendmap ../lib/util.c:1280
    #3 0x7fcbed799db8 in xsyslog_fn ../lib/util.c:2129
    #4 0x7fcbed6cf32c in b64_flush ../lib/charset.c:535
    #5 0x7fcbed6d05c5 in convert_flush ../lib/charset.c:373
    #6 0x7fcbed6d05c5 in convert_catn ../lib/charset.c:407
    #7 0x7fcbed6d4e3e in charset_decode_sha1 ../lib/charset.c:2797
    #8 0x7fcbedaef976 in body_add_content_guid ../imap/message.c:747
    #9 0x7fcbedaf30c2 in message_parse_body ../imap/message.c:787
    #10 0x7fcbedaf3a77 in message_parse_multipart ../imap/message.c:1829
    #11 0x7fcbedaf30f1 in message_parse_body ../imap/message.c:806
    #12 0x7fcbedaf31ad in message_parse_body ../imap/message.c:828
    #13 0x7fcbedaf3a77 in message_parse_multipart ../imap/message.c:1829
    #14 0x7fcbedaf30f1 in message_parse_body ../imap/message.c:806
    #15 0x7fcbedaf325f in message_parse_mapped ../imap/message.c:529
    #16 0x48f481 in cyrus_getbody src/modules/mod_sieve/libcyrus_sieve.cpp:184
    #17 0x7fcbedbaf02c in eval_bc_test ../sieve/bc_eval.c:983
    #18 0x7fcbedbb12af in sieve_eval_bc ../sieve/bc_eval.c:1943
    #19 0x7fcbedbb81be in sieve_execute_bytecode ../sieve/script.c:972
AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:377 in __memmove_sse2_unaligned_erms
==311693==ABORTING

@rsto rsto self-assigned this Aug 21, 2024
@rsto
Copy link
Member

rsto commented Aug 21, 2024

@dilyanpalauzov I can well imagine that there is something we need to fix here, but I am confused by the test code in this pull request:

  • Why do you pass an invalid length parameter to the charset_decode function, that is 11 instead of 1?
  • Do you actually expect charset_decode to return "\xba\xf6\xa5\xb9\xe8" or does your test assert for that because that's what it currently is doing?

@dilyanpalauzov
Copy link
Contributor Author

Why do you pass an invalid length parameter to the charset_decode function, that is 11 instead of 1?

Because I copied another example, which had 11. If I change it to 1, the decoded string is empty ("") - thus correct. If that 11 is used, the returned value is "\xba\xf6\xa5\xb9\xe8".

Do you have an idea how to interpret that backtrace above?

@rsto
Copy link
Member

rsto commented Aug 21, 2024

Do you have an idea how to interpret that backtrace above?

Without having reproduced it locally, it looks as if the byte array in the static buffer buf in xsyslog_fn is pointing to NULL. That's quite surprising. Is that memory access error reported also if you pass the correct length 1 (or 0 for c-string)?

@dilyanpalauzov
Copy link
Contributor Author

I do not know. I have the backtrace since March, where I ran everything under AddressSanitizer, in the context of #4819 and #4785 (comment) . I do not want to run everything again under AddressSanitizer, but to find out of this backtrace some problem.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants