Description
While testing the lz4 encoder, I saw differences between its encoded output and the output of the lz4 cli when block chaining is enabled.
To reproduce
Consider the attached binary input file testfile.zip. The file is zip-compressed and must be extracted first; the example below reads the extracted file as testfile.bin.
Processing the file with the lz4 cli produces the following checksums:
Now consider this reproducing example program:
using K4os.Compression.LZ4;
using K4os.Compression.LZ4.Internal;
using K4os.Compression.LZ4.Streams;

Console.WriteLine("Hello, World!");

using (var source = File.OpenRead("testfile.bin"))
{
    // lz4 -v -B4 -BI -1 --no-frame-crc testfile.bin testfile.lz4-1-independent-expected
    using (var actual = LZ4Stream.Encode(File.Create("testfile.lz4-1-independent-actual"), new LZ4EncoderSettings()
    {
        ChainBlocks = false,
        BlockSize = Mem.K64,
        CompressionLevel = LZ4Level.L00_FAST,
    }))
    {
        source.Position = 0;
        source.CopyTo(actual);
    }
    PrintComparison("testfile.lz4-1-independent-expected", "testfile.lz4-1-independent-actual");

    // lz4 -v -B4 -BD -1 --no-frame-crc testfile.bin testfile.lz4-1-chained-expected
    using (var actual = LZ4Stream.Encode(File.Create("testfile.lz4-1-chained-actual"), new LZ4EncoderSettings()
    {
        ChainBlocks = true,
        BlockSize = Mem.K64,
        CompressionLevel = LZ4Level.L00_FAST,
    }))
    {
        source.Position = 0;
        source.CopyTo(actual);
    }
    PrintComparison("testfile.lz4-1-chained-expected", "testfile.lz4-1-chained-actual");

    // lz4 -v -B4 -BI -3 --no-frame-crc testfile.bin testfile.lz4-3-independent-expected
    using (var actual = LZ4Stream.Encode(File.Create("testfile.lz4-3-independent-actual"), new LZ4EncoderSettings()
    {
        ChainBlocks = false,
        BlockSize = Mem.K64,
        CompressionLevel = LZ4Level.L03_HC,
    }))
    {
        source.Position = 0;
        source.CopyTo(actual);
    }
    PrintComparison("testfile.lz4-3-independent-expected", "testfile.lz4-3-independent-actual");

    // lz4 -v -B4 -BD -3 --no-frame-crc testfile.bin testfile.lz4-3-chained-expected
    using (var actual = LZ4Stream.Encode(File.Create("testfile.lz4-3-chained-actual"), new LZ4EncoderSettings()
    {
        ChainBlocks = true,
        BlockSize = Mem.K64,
        CompressionLevel = LZ4Level.L03_HC,
    }))
    {
        source.Position = 0;
        source.CopyTo(actual);
    }
    PrintComparison("testfile.lz4-3-chained-expected", "testfile.lz4-3-chained-actual");
}

// Compares two files byte for byte and reports whether they match.
static void PrintComparison(string expectedFile, string actualFile)
{
    var expected = File.ReadAllBytes(expectedFile);
    var actual = File.ReadAllBytes(actualFile);
    if (expected.SequenceEqual(actual))
    {
        Console.WriteLine($"The files {expectedFile} and {actualFile} are the same.");
    }
    else
    {
        Console.Error.WriteLine($"The files {expectedFile} and {actualFile} are NOT the same!");
    }
}
It will produce this output:
Hello, World!
The files testfile.lz4-1-independent-expected and testfile.lz4-1-independent-actual are the same.
The files testfile.lz4-1-chained-expected and testfile.lz4-1-chained-actual are NOT the same!
The files testfile.lz4-3-independent-expected and testfile.lz4-3-independent-actual are the same.
The files testfile.lz4-3-chained-expected and testfile.lz4-3-chained-actual are NOT the same!
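The comparison above only says whether the files match. To narrow a mismatch down, it may help to report where the two outputs first diverge. The following is a minimal sketch using only the .NET base library; `FileDiff` and `FirstDifference` are hypothetical names introduced for illustration and are not part of K4os.Compression.LZ4.

```csharp
using System;

static class FileDiff
{
    // Returns the offset of the first differing byte, or -1 if both arrays are identical.
    // If one array is a strict prefix of the other, the length of the shorter one is returned.
    public static long FirstDifference(byte[] expected, byte[] actual)
    {
        int common = Math.Min(expected.Length, actual.Length);
        for (int i = 0; i < common; i++)
        {
            if (expected[i] != actual[i])
            {
                return i;
            }
        }
        return expected.Length == actual.Length ? -1 : common;
    }
}
```

Inside `PrintComparison`, calling `FileDiff.FirstDifference(expected, actual)` in the `else` branch and printing the result together with both lengths would show how far into the file the chained outputs agree.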
Is the output compatible?
Can data be compressed with one and decompressed with the other?
If yes, then this might be interesting, but it is low priority.
I wrote the chaining code myself (just from the spec), so it might behave a little differently.
Also, it can use the x86 or the x64 encoder (which do produce different results), so this is another thing to keep in mind.
So my question is: regardless of the outputs being different, are they compatible?
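One way to check compatibility in the cli-to-library direction is to decode the file produced by the lz4 cli with this library and compare the result with the original input. The sketch below is an assumption about how such a check could look; `RoundTrip` and `DecodesTo` are hypothetical names, and the decoder is passed in as a delegate so the helper itself depends only on the .NET base library.

```csharp
using System;
using System.IO;

static class RoundTrip
{
    // Decodes 'encoded' through the stream returned by 'openDecoder' and
    // reports whether the decoded bytes equal 'original'.
    public static bool DecodesTo(Stream encoded, byte[] original, Func<Stream, Stream> openDecoder)
    {
        using var decoder = openDecoder(encoded);
        using var decoded = new MemoryStream();
        decoder.CopyTo(decoded);
        return decoded.ToArray().AsSpan().SequenceEqual(original);
    }
}
```

With the files from the example, a call such as `RoundTrip.DecodesTo(File.OpenRead("testfile.lz4-1-chained-expected"), File.ReadAllBytes("testfile.bin"), s => LZ4Stream.Decode(s))` would cover the cli-to-library direction. The opposite direction can be checked by running `lz4 -d` on the `-actual` files and comparing the result with testfile.bin.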
They are not the same size. In the tests I have run, this library produces a file that is 1 byte smaller. This is a diff from the last block:
The highlighted bytes are the block header. This library produces a length of 0x00001F65 bytes and the cli produces 0x00001F66 bytes.
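To see which blocks differ in size (such as the 0x1F65 versus 0x1F66 block above), the block headers of both files can be listed side by side. The following is a rough sketch that walks an LZ4 frame as described in the LZ4 frame format specification; `Lz4FrameBlocks` and `ReadBlocks` are hypothetical names, the stream is assumed to be seekable, and the header checksum is skipped rather than validated.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class Lz4FrameBlocks
{
    // Returns, for every data block, the offset of its 4-byte header, the size of
    // its data, and whether the block is stored uncompressed (highest header bit set).
    public static List<(long Offset, int Size, bool Uncompressed)> ReadBlocks(Stream s)
    {
        var r = new BinaryReader(s); // BinaryReader reads little-endian values
        if (r.ReadUInt32() != 0x184D2204)
        {
            throw new InvalidDataException("Not an LZ4 frame.");
        }
        byte flg = r.ReadByte();
        r.ReadByte();                              // BD byte (maximum block size)
        bool blockChecksum = (flg & 0x10) != 0;    // each block is followed by 4 checksum bytes
        if ((flg & 0x08) != 0) r.ReadBytes(8);     // optional content size
        if ((flg & 0x01) != 0) r.ReadBytes(4);     // optional dictionary ID
        r.ReadByte();                              // header checksum (not validated here)

        var blocks = new List<(long Offset, int Size, bool Uncompressed)>();
        while (true)
        {
            long offset = s.Position;
            uint header = r.ReadUInt32();
            if (header == 0) break;                // EndMark terminates the block list
            int size = (int)(header & 0x7FFFFFFF);
            blocks.Add((offset, size, (header & 0x80000000) != 0));
            s.Seek(size + (blockChecksum ? 4 : 0), SeekOrigin.Current);
        }
        return blocks;
    }
}
```

Running `ReadBlocks` on both the `-expected` and the `-actual` file and comparing the two lists entry by entry would show the first block whose size differs, which narrows down where the chained encoders start to diverge.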
Expected behavior
The expected behavior is that the encoder produces the same result as the lz4 cli when chaining is enabled.
Actual behavior
The encoder results are not the same.
Environment
Additional context
I have tried the lz4 cli on both Windows and Linux; it produces the same result on both platforms.