-
Notifications
You must be signed in to change notification settings - Fork 79
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
JPEG images failling, being identified as MPO #34
Comments
@xtagon You haven't demonstrated this issue to be Pillow's problem yet. It very well may be, but let's be sure about it first. |
@aclark4life Right you are, I didn't intend blame, that's just my best guess based on the fact that Pillow is returning "MPO" as the format, and Pilbox only throws this particular error if the format returned is not in Pilbox's list of known formats. If not Pillow's error, the blame would probably fall on my images being invalid. |
@xtagon No problem. And as I said, "probably" is not particularly helpful in this case 😄. We need to know exactly what is going on and why. Additionally it sounds like pilbox may want to consider adding support for MPO in the event that "real" MPO images are discovered. That's not a workaround, it's a feature add. (Disclaimer: other than its similarity to JPEG I know nothing about MPO. Furthermore, I have no low-level knowledge of any image format; I'm the Pillow fork-author and project leader speaking on behalf of Pillow. @wiredfool is your best hope of getting programming help from the Pillow team.) |
Admittedly, I don't know much (anything) about MPO files, but here's what Pillow supports for it:
So it looks like Pilbox could access the individual JPEG files that comprise the MPO file and Pilbox could resize those files, but I'm not sure that Pillow supports saving the correspondingly resized MPO file. If that's the case, the output for these file types would have to default to JPEG for only the first image in the MPO and MPO would not be an allowable output format. I suspect that would satisfy most needs, but seems less than the ideal behavior. Pilbox is really intended to return the images in format that it receives them and this would complicate the output format logic. Anyway, I'll do more research on it this weekend. |
I can't speak for everyone's needs, but that behavior would suit our needs just fine. As far as our clients are concerned they're giving us a JPEG and expect a JPEG back. If the problem images are truly MPO and not JPEG, they don't know it. |
From my perspective though, this is a client problem in that you should be validating the images and only serving the ones that are supported. I'm not sure what kind of validation you have in place for that, but it would need to be more than a simple extension check. For example, this could have just as easily been a JPS file, which I don't think Pillow supports. With that in mind, if you did want to support MPO, you could convert that to a JPEG and serve that as the source image to be resized. Alternatively, I'd want to expand Pilbox to support more of these read-only formats, but I'm still not entirely convinced of the need yet. |
We don't have any validation other than MIME type/file extension checks before upload. The reason is because these images go directly from the browser to our Amazon S3 storage. The only processing that happens to these images happens via Pilbox behind a CDN. Before we did this, we had a background job triggered after each upload that loaded each image into memory and resized it with rmagick, and as much as we tried to optimize this process it was extremely slow and memory hungry. Pilbox solved this issue and we were super happy to get rid of that background worker all-together and just process images on the fly. The images are NEVER processed on our app server, only on a single Pilbox server for every app. Makes it easy to scale. So, in order to convert from MPO to JPEG before it hits pilbox, or to load it in memory and check the EXIF/metadata, we would have to re-implement this background job (slowing down the upload workflow again, going a step backwards and costing us more to run another server process other than Pilbox). If there is a way to just make Pilbox do it's best with what it's given, we would prefer it. If that doesn't make sense for the Pilbox philosphy, let me know. Thanks for bearing with me... :) |
Not sure what language your application is in, but if it's python, you could use python-magic to check the actual mime type without having to open the entire file in memory (it can just look at the first 1KB of file headers). Though I haven't confirmed that it can correctly identify these MPO files for you. That said, if you can't accurately identify the file types, there's always going to be a risk that Pilbox can't support the supplied format (whether it supports MPO or not). In that case, you'll need to always expect and handle resizing errors. I do this in my applications by having a placeholder image drawn using a |
The app dealing with uploads is written in Ruby. We can run background workers in other languages, but another thing for us is that we want images to be loadable as soon as possible after they're uploaded. If we have a background step, then when the user load's the page for the first time, they see that the images failed to load. With Pilbox only in the pipeline, all images load the first time, even if there is a small delay in processing. Still, you're right that this is going to be an issue with other image formats that purport to be JPEG. We're working on client-side fallbacks right now. I thought we'd attack it from the server-side angle also, so thank you again for your nginx example, it will be useful. I didn't think of using a @wiredfool provided a workaround solution. Adding the following to the top of my Pilbox instance's
This makes img.format return "JPEG" instead of "MPO". This confirms my initial suspicion, which was that if Pillow returns it as a JPEG it would work as a JPEG. |
One more alternative, you can create your own custom application which catches specific errors raised from Here's a generic custom application: https://github.com/agschwender/pilbox#extension If you need more guidance on how to accomplish what you're trying to achieve with the |
Actually, since it wasn't much work, here's what it would look like (warning: wrote this without any testing whatsoever) #!/usr/bin/env python
import tornado.gen
from pilbox.app import PilboxApplication, ImageHandler, \
start_server, parse_command_line
from pilbox.errors import ImageFormatError
class CustomApplication(PilboxApplication):
def get_handlers(self):
return [(r"/", CustomImageHandler)]
class CustomImageHandler(ImageHandler):
def prepare(self):
self.has_error = False
@tornado.gen.coroutine
def get(self):
self.validate_request()
resp = yield self.fetch_image()
try:
self.render_image(resp)
except ImageFormatError:
self.has_error = True
self.render_image(resp)
def get_argument(self, name, default=None):
if self.has_error and name == "op":
return "noop"
return super(CustomImageHandler, self).get_argument(name, default)
if __name__ == "__main__":
parse_command_line()
start_server(CustomApplication()) |
Sweet, I'll try it. I was starting to write my own but my Python is rusty. |
I got it to work (had to change |
Looking at the tornado code, I believe it's just calling this:
You'd want to put that as the first line in the |
@agschwender Works great. I have it working only if
|
No problem.
EDIT: That said, if I were you, I'd only do the fallback on permanent failures, e.g. |
I'll have to rethink the caching thing then. Maybe instead of rendering a fallback, I should just redirect to the original image URL. Thanks again. |
Hi @agschwender - I don't remember why I left this hanging but the workaround for this issue has been serving us well. BTW I would like to thank you for all the support and for writing this library! |
Hi,
Some of our images are failing with the error "Unknown format: MPO". The images are JPEG but Pillow is for some reason thinking they are MPO. I've reported this issue to Pilllow here: python-pillow/Pillow#1138
I realize this is not your issue, it's Pillows, but I thought I'd let you know about the problem. Maybe there's a workaround you can do, where if it detects an MPO it tries to process it as a JPEG anyway? If not, I understand. Thanks for reading.
The text was updated successfully, but these errors were encountered: