GIF writing should use LZW encoding #5278
Hi Scott,
The short answer is I'm pretty sure this has nothing to do with the LZW encoding.
How I know: I used a couple of utilities, dumpgif and runlzw (in https://github.com/raygard/giflzw/tree/main/utilities), to extract the LZW content and the corresponding unencoded pixel data from both GIF files above (call them ok.gif and bad.gif). I can see that the pixel data is not the same, which accounts for the size difference. I dumped with dumpgif; then, for ok.gif, I used runlzw (which uses the same LZW encoder as the one in PIL) to re-encode the extracted pixel data, and the result exactly matches the LZW data in ok.gif. (That was likely but not guaranteed to happen. Some LZW encoders may get slightly better compression, mostly by manipulating when LZW "clear" codes are emitted, but my encoder sends a clear code as soon as the hash table is full, and most other encoders do the same.)
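The round-trip check described above can be sketched in pure Python. This is only an illustration of GIF-style LZW, not the encoder in PIL or in giflzw: the names `lzw_encode` and `lzw_decode` are made up here, the code stream is left unframed (no 255-byte sub-blocks), and sending the clear code at the first miss after the table fills is an assumption of this sketch.

```python
MAX_CODES = 4096  # GIF LZW codes are at most 12 bits wide


def lzw_encode(pixels, min_code_size):
    """Encode pixel indexes as an unframed GIF-style LZW code stream."""
    if not pixels:
        raise ValueError("need at least one pixel")
    clear = 1 << min_code_size
    eoi = clear + 1
    out = bytearray()
    acc = nbits = 0

    def emit(code, width):
        # Codes are packed least-significant bit first.
        nonlocal acc, nbits
        acc |= code << nbits
        nbits += width
        while nbits >= 8:
            out.append(acc & 0xFF)
            acc >>= 8
            nbits -= 8

    table, next_code, width = {}, eoi + 1, min_code_size + 1
    emit(clear, width)
    it = iter(pixels)
    prefix = next(it)
    for k in it:
        code = table.get((prefix, k))
        if code is not None:
            prefix = code
            continue
        emit(prefix, width)
        if next_code < MAX_CODES:
            table[(prefix, k)] = next_code
            if next_code == 1 << width:  # new code needs one more bit
                width += 1
            next_code += 1
        else:
            # Table full: send a clear code and start over (sketch assumption).
            emit(clear, width)
            table, next_code, width = {}, eoi + 1, min_code_size + 1
        prefix = k
    emit(prefix, width)
    # The decoder adds one more entry after the last code, so mirror its width.
    if next_code < MAX_CODES and next_code == 1 << width:
        width += 1
    emit(eoi, width)
    if nbits:
        out.append(acc & 0xFF)
    return bytes(out)


def lzw_decode(data, min_code_size):
    """Decode an unframed GIF-style LZW code stream back to pixel indexes."""
    clear = 1 << min_code_size
    eoi = clear + 1

    def fresh_table():
        # Two empty placeholders stand in for the clear and end codes.
        return [bytes([i]) for i in range(clear)] + [b"", b""]

    table, width, prev = fresh_table(), min_code_size + 1, None
    out = bytearray()
    acc = nbits = pos = 0
    while True:
        while nbits < width:
            if pos >= len(data):
                raise ValueError("truncated LZW data")
            acc |= data[pos] << nbits
            pos += 1
            nbits += 8
        code = acc & ((1 << width) - 1)
        acc >>= width
        nbits -= width
        if code == clear:
            table, width, prev = fresh_table(), min_code_size + 1, None
            continue
        if code == eoi:
            return bytes(out)
        if code < len(table):
            entry = table[code]
        elif code == len(table) and prev is not None:
            entry = prev + prev[:1]  # code refers to the entry being built
        else:
            raise ValueError("invalid LZW code")
        if prev is not None and len(table) < MAX_CODES:
            table.append(prev + entry[:1])
            if len(table) == 1 << width and width < 12:
                width += 1
        out += entry
        prev = entry
```

Decoding a stream and re-encoding the result only reproduces the original bytes when the encoder makes the same choices about clear codes, which is why a match was likely but not guaranteed.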
So it is still a mystery why the PIL image has trouble in Teams. I have no knowledge of Teams. The only significant difference I can see between these two files is that ok.gif uses GIF disposal method 1 (do not dispose) for all four frames after the first, while bad.gif uses disposal method 3 (restore to previous image) on the fourth frame and 1 on the other three frames (after the first, which has no graphics control extension block in either file and so has no disposal specified.)
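For anyone who wants to check disposal methods without extra tools, here is a rough pure-Python sketch that walks the GIF block structure and reports the disposal method stored in each graphic control extension. The function name `gif_disposal_methods` is made up for this example, and it assumes a well-formed file; it does no LZW decoding.

```python
def gif_disposal_methods(data: bytes) -> list:
    """Return the disposal method from each graphic control extension, in file order."""
    if data[:6] not in (b"GIF87a", b"GIF89a"):
        raise ValueError("not a GIF file")

    def skip_sub_blocks(pos):
        # Sub-blocks are length-prefixed; a zero length ends the chain.
        while data[pos]:
            pos += data[pos] + 1
        return pos + 1

    packed = data[10]  # packed byte of the logical screen descriptor
    pos = 13
    if packed & 0x80:  # global color table present
        pos += 3 * (2 << (packed & 7))

    methods = []
    while pos < len(data):
        block = data[pos]
        if block == 0x3B:  # trailer
            break
        if block == 0x21:  # extension introducer
            if data[pos + 1] == 0xF9:  # graphic control extension
                methods.append((data[pos + 3] >> 2) & 7)
            pos = skip_sub_blocks(pos + 2)
        elif block == 0x2C:  # image descriptor
            flags = data[pos + 9]
            pos += 10
            if flags & 0x80:  # local color table present
                pos += 3 * (2 << (flags & 7))
            pos = skip_sub_blocks(pos + 1)  # +1 skips the LZW minimum code size byte
        else:
            raise ValueError(f"unexpected block type 0x{block:02X}")
    return methods
```

Calling it as `gif_disposal_methods(open("bad.gif", "rb").read())` would list one value per frame that carries a graphic control extension (1 = do not dispose, 3 = restore to previous).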
As to the size difference: ok.gif has a significantly smaller third frame than bad.gif (105 x 105 vs 256 x 129), and also has fewer colors in each of its color tables. GIF writers try to reduce the size by use of disposal methods and transparency to get some degree of "delta compression" between frames, by noting the parts of the image that have not changed and making them transparent. I suspect GIMP does a better job of locating these areas and maybe better quantization to get fewer colors in the tables (or something like that).
Just for kicks, try this version in place of the bad.gif above and see if it shows the same problem:
[image: trythis.gif]
![trythis](https://user-images.githubusercontent.com/24783736/163045124-206d5917-ce21-4d8a-8546-787987505bab.gif)
This is the bad.gif above but patched to use disposal method 1 on all frames after the first.
[EDIT: I was mistaken about the first frame not having a graphic control extension; all the frames do on both files.]
I am curious exactly how you use PIL to create the bad.gif to begin with. Does PIL automatically use different disposal methods from one frame to the next or is that something you do? I am (coincidentally) just starting to get back into trying to improve GIF handling in PIL.
And to answer your question "is this issue fully implemented": No. PIL always encodes GIFs from 8-bit pixels, even when the GIF has 2, 4, 8, 16, 32, 64 or 128 colors. The spec allows shorter pixel codes (indexes to the color tables) to be encoded starting with shorter LZW codes. My encoder can handle that but PIL needs some work to actually do it. That's another thing I'm intending to work on. It should make GIFs that use fewer colors somewhat smaller.
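To make the point about shorter codes concrete: the GIF spec ties the LZW minimum code size to the number of bits needed for a pixel index, with a floor of 2, and the first LZW codes are one bit wider than that. A small sketch (the helper name `lzw_min_code_size` is made up here, not something in PIL):

```python
def lzw_min_code_size(num_colors: int) -> int:
    """LZW minimum code size for a color table with num_colors entries."""
    if not 1 <= num_colors <= 256:
        raise ValueError("GIF color tables hold 1 to 256 entries")
    bits_per_pixel = max(1, (num_colors - 1).bit_length())
    return max(2, bits_per_pixel)  # the spec requires at least 2


for n in (2, 4, 8, 16, 32, 64, 128, 256):
    size = lzw_min_code_size(n)
    print(f"{n:3d} colors: min code size {size}, first codes are {size + 1} bits")
```

Writing every image with a minimum code size of 8 means the stream starts with 9-bit codes even for a 2-color image, where 3-bit codes would do.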
Ray
On Tue, Apr 12, 2022 at 11:20 AM smccombie ***@***.***> wrote:
Hi, is this issue fully implemented? I'm using Pillow to produce animated
GIFs and the output is larger than those produced with GIMP using the
generated Python GIF as its source material. The size difference is mostly
trivial, around 3k, but the real concern I have is around the LZW encoding
of the image data. These files are being shared on the Microsoft Teams
desktop application and for some reason the GIF produced in Python is not
animated after Microsoft's servers touch it, whereas the exact same image
after being re-saved through GIMP is. I've done this comparison on other
graphics produced by others on their machines, same behavior. I've narrowed
the differences down to 8-bit depth (which is always 000 in the packed byte
- 0 bits per major color channel? that makes no sense), ordering of GIF
extensions (this doesn't seem important) and the LZW compression (which I
believe is the culprit and accounts for the file size difference). Any
ideas? I've attached the two example images for reference. If this is not
the right place for asking, please let me know and I'll delete this post
and move to the appropriate place. I was hoping to contact Ray Gardner as
he seems to be the lead developer of the LZW component. Thanks!! Scott M.
GIMP (smaller and works correctly in Teams)
[image: gimp]
<https://user-images.githubusercontent.com/20762922/163018562-bcdba1e7-2d85-44df-bb1c-f08ca3a259d2.gif>
PYTHON (larger and doesn't work correctly in Teams)
[image: python]
<https://user-images.githubusercontent.com/20762922/163018566-fc07e861-d0a9-4c62-a030-f442816d1980.gif>
Holy smokes, yes that image works! And thank you for all of that information, I need a second to digest everything. I'm really an amateur at this, although I did also notice that Python emits the application extension for looping for each frame, which doesn't seem necessary either. But yes, it looks like you solved the problem. For reference, here is the significant portion of the code, nothing spectacular:

    import datetime
    import io

    img = []
    for i in range(LENGTH):  # LENGTH: placeholder for the number of frames
        img.append(get_image(i))  # get_image: placeholder for the function that returns each frame
    buf = io.BytesIO()
    img[0].save(buf, append_images=img[1:], comment=str(datetime.datetime.now()),
                duration=1000, loop=0, format="GIF", optimize=False, save_all=True)

Digging through PIL now to see if I can control the disposal method... or if I can call some underlying library. My issue is further complicated because I'm using Pyodide for Python, so hopefully there's a pure Python solution I can find. Otherwise, I have to patch the WebAssembly :(
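On controlling the disposal method from Python: Pillow's GIF writer accepts a `disposal` option in `save()`. The following is a minimal sketch, assuming Pillow is installed; the three solid-color frames are stand-ins for real images, and whether this alone satisfies Teams is not something this example can show.

```python
import io

from PIL import Image

# Stand-in frames; real code would supply its own images.
frames = [Image.new("RGB", (32, 32), color) for color in ("red", "green", "blue")]

buf = io.BytesIO()
frames[0].save(
    buf,
    format="GIF",
    save_all=True,
    append_images=frames[1:],
    duration=1000,  # milliseconds per frame
    loop=0,         # loop forever
    disposal=1,     # 1 = do not dispose
)
gif_bytes = buf.getvalue()
```

This is plain Pillow with no external tools, so it should not need anything beyond what a Pyodide build of Pillow already provides.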
@smccombie Scott, try just putting

About the "NETSCAPE" looping application extension appearing in each frame: yes, I've seen that, and it would be nice to be rid of it. But I did not see that in the sample bad animation you said came from PIL. How did that happen? Did you do something to remove the extra extensions? And, it is really supposed to appear right after the global color table. I may report this as an issue, but may try to fix it myself first.
@raygard Good morning. I tried that fix, and it turns out it's still not working with disposal = 1. I'm not sure why the original bad file has only 1 application extension but this new one has 5 (1 for each frame). Here's the latest file produced; I tried to simplify it a bit as well to make it easier to review. Any advice would be appreciated, thanks!!

I also removed the extra application extensions and re-arranged the first one to see if that solves my problem. Nope, still doesn't work :(

Here it is working, re-saved with GIMP:

The file is significantly larger too. It still seems unlikely that only 2 colors would inflate the size of the file by nearly 4K; the GCT would only account for 768 bytes of the increase given the 8-bit depth that is always used. I'd have to read up on LZW compression, but this seems like an ideal compression opportunity. Here's ezgif using coalesce to de-optimize the image (full frames for each frame); factoring in the GCT size difference, it seems like 2K of extra size is coming from somewhere.
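The 768-byte figure is just the size of a full 256-entry color table at 3 bytes per entry. A quick sketch of the arithmetic (the helper name `color_table_bytes` is made up for illustration):

```python
def color_table_bytes(num_colors: int) -> int:
    """Bytes used by a GIF color table that can hold num_colors entries."""
    entries = 2  # table sizes are powers of two, from 2 up to 256
    while entries < num_colors:
        entries *= 2
    return 3 * entries  # one byte each for red, green and blue


full = color_table_bytes(256)
two_color = color_table_bytes(2)
print(full, two_color, full - two_color)
```

So a 256-entry table costs 762 bytes more than a 2-entry one, which leaves most of a difference of nearly 4K unexplained by the color table alone.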
@smccombie I love a good mystery, don't you? Let's continue this as a discussion #6207, as I think it has nothing to do with this (LZW) issue. But I would like to get to the bottom of it...
Quick summary for anyone who visits this thread in the future - the above issues should have been fixed as of Pillow 9.2.0.
What did you do?
Create a GIF
What did you expect to happen?
It should use LZW encoding, as most GIF applications do.
What actually happened?
It used Fredrik Lundh's cleverly compatible but suboptimal encoding from 1999.
What are your OS, Python and Pillow versions?
This isn't a bug report, it's an enhancement request. I'm pretty sure Fredrik did it this way to avoid the then-current LZW patent problems with Unisys. Those ended in 2004 after all Unisys patents on LZW expired. I'm puzzled why, 17 years later, LZW has barely been mentioned anywhere regarding GIFs from PIL. I see a mention in #617 and #4644. Both of these say GIFs are currently "uncompressed"; that's not quite right. Fredrik had a clever sort-of-run-length-encoding technique. I think using LZW still won't fix all the concerns in #617. PIL could probably use more work for animated GIFs, but I would need to look into that more.
I see in #617 "Filesize concerns" an invite from @wiredfool "If you’d like to contribute a gif compressor under the PIL license, go for it."
I have one ready to go, well tested at my end in Windows and Linux (WSL2 and Raspios). It's a mod to Gif.h and a major overhaul of GifEncode.c. A drop-in replacement that does not require any other changes.
PIL currently writes all GIFs as 256-color, with full 8 bits in the uncompressed pixels (as in tobytes()). My GifEncode.c writes them this way, but it is ready to go for shorter codes (smaller color tables) when the rest of PIL is (probably GifImagePlugin.py and others).