Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP]: Type annotations for Image.py #2687

Closed
wants to merge 39 commits into from

Conversation

neiljp
Copy link
Contributor

@neiljp neiljp commented Aug 17, 2017

Fixes #2625 .

Changes proposed in this pull request:

  • Add type annotations for many functions (almost all)
  • Where mypy suggests, and in other places, add annotations for variables (often empty containers or initially-None)
  • Introduce typing names within if False: block, as well as type aliases to simplify annotations
  • Some imports, which may be temporary or unnecessary (based on how I'm running mypy)

Annotation caveats:

  • Some of these annotations may be incorrect, and some still use 'dynamic annotations' (ie. Any); any improvements in that direction would be welcome!
  • The documentation is great in most cases, though in some it is a little vague or does conflict!

There are a few cases where I don't follow why the code is why it is, or where a small change might allow mypy to 'pass' the problem (eg infer types).

I'm currently using a development branch of mypy which is being used in Zulip, but I'm happy to use another version if that'd be easier. mypy doesn't pass this as it stands, with the options:

-2 --ignore-missing-imports --follow-imports=silent --check-untyped-defs

PIL/Image.py Outdated
@@ -1049,6 +1108,7 @@ def crop(self, box=None):
return self._new(self._crop(self.im, box))

def _crop(self, im, box):
# type: (Image, LURD) -> Image
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is actually a core Imaging object, which is not an Image.Image but rather a PIL._imaging object.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It took me a while to realise what was happening (the code is new to me) but as per my comment below this is clearer now.

PIL/Image.py Outdated
@@ -1146,6 +1212,7 @@ def getbbox(self):
return self.im.getbbox()

def getcolors(self, maxcolors=256):
# type: (int) -> Optional[List[Tuple[int, int]]]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why isn't this an Optional[int]?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Optional[int] is equivalent to Union[int, None], which isn't the case here?

Copy link
Member

@wiredfool wiredfool Aug 18, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, so it's that the parameter is always an int, not that it's sometimes required as a named parameter.

I'm new to dealing with typing, especially in python. There's a learning curve here...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, I have to think about this too, with function parameters, as Optional can be a misleading term!

@wiredfool
Copy link
Member

wiredfool commented Aug 18, 2017

I'm having trouble getting mypy to run in a manner that looks remotely correct.

Starting in a py35 virtual environment, just so that I'm not dealing with any python2 stuff. Pip installed mypy (for now), and running with the command line options above. This branch of pillow is installed in the virtualenv with make install. From outside the Pillow directory, from PIL import Image works correctly as expected.

However, mypy can't find the module:

(vpy35)erics@builder-1204-x64:~/test$ mypy -2 --ignore-missing-imports --follow-imports=silent --check-untyped-defs -m PIL.Image
mypy: can't find module 'PIL.Image'
(vpy35)erics@builder-1204-x64:~/test$ mypy -2 --ignore-missing-imports --follow-imports=silent --check-untyped-defs -m PIL
mypy: can't find module 'PIL'

Running from within the PIL directory, I get this:

(vpy35)erics@builder-1204-x64:~/Pillow$ mypy -2 --ignore-missing-imports --follow-imports=silent --check-untyped-defs PIL/Image.py | more
/home/erics/Pillow/PIL/BmpImagePlugin.py:128: error: INTERNAL ERROR -- please report a bug at https://github.com/python/mypy/issues version: 0.521
PIL/Image.py:75: error: Module 'PIL' has no attribute '_imaging'
PIL/Image.py:406: error: Missing return statement
[ ... many more lines]

I'm mostly worried about the Module 'PIL' has no attribute '_imaging' error, as that indicates that mypy isn't able to get the core imaging functions from the C module which has been installed.(*) This is not altogether surprising due to the source file layout and the path tricks needed to be able to access the PIL module from within the Pillow directory. (see test-installed.py for one of the workarounds)

[*] Also leading to the question, can mypy infer types from C extensions, where there is an explicit PyArg_ParseTuple call?

@neiljp
Copy link
Contributor Author

neiljp commented Aug 18, 2017

I've not run mypy using module imports before, so I don't have a response from experience for that case. To be precise, most recently the exact command I'm (now) running is:

mypy PIL/Image.py -2

from the main repo directory, though the absence of those extra flags does have the side effect of checking the imports.

I've not actually built the C code, since as far as I know mypy only checks python and doesn't delve into eg. built object code. On that note, and related to your comments above, I've actually been attempting to make some stubs for the 'core' as '_imaging.pyi'.

I'm thinking that the type aliases/imports might best be in a separate module?

Is this something you think might be merged, down the line?

PIL/Image.py Outdated
self.palette = None
self.info = {}
self.palette = None # type: Optional[ImagePalette.ImagePalette]
self.info = {} # type: Dict[Text, Any]

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of Dict[Text, Any], I think this could be Dict[Text, Union[int, Tuple[int, ...], bytes]].

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can certainly work on this, though I'm not sure exactly what it might represent. If the set of keys/values are limited, a TypedDict might be plausible in the future, even :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Info can be almost anything. It's pulled from the metadata in images, so bytes, string, tuples of int/float/?, float, ints are certainly possible. You can pretty much fit a ham sandwich in tiff metadata, and that's the source for a lot of this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, so leave it at Any for now? That said, I think I read that it could be better to use object where it could be 'anything', but is that going to work for all the types in the python2 that PIL[low] supports?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, leave it as any.

return self

def __exit__(self, *args):
# type: (*Any) -> None

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even though it doesn't really matter in this case (i.e. __exit__ wouldn't do anything with arguments), this (*Any) could I think be constrained to (type, Exception, traceback) where traceback is just a quick import traceback (although I'm rather unfamiliar with this module, having never used it in my own code).

Copy link
Contributor Author

@neiljp neiljp Aug 19, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not too familiar with __exit__, but I believe in python2 that *args can only be one type?
If not, that's certainly plausible, particularly if changing the function signature (which I've been avoiding).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a specific set of what can be called on a context manager. We can change the signature to that, as it's the only way that this should ever get called.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, sorry, I forgot the context...was context-manager.

@@ -632,6 +677,7 @@ def _repr_png_(self):

@property
def __array_interface__(self):
# type: () -> Dict[Text, Any]

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to the NumPy documentation on __array_interface__, and looking over the source file (particularly at _conv_type_shape), I think this particular Any could be constrained further to Tuple[Tuple[int, ...], Text, int, bytes].

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean that to be a Union? From the docs only, and given that this needs to potentially interoperate with other code which exposes all the dictionary keys, I'd imagine this at least needs to include None and List as well?

This may be where it's sometimes easier to use Any, at least initially ;)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 on any here for now, this is an implementation of an interface defined elsewhere for datatype->datatype conversion, it's not something that's going to get hit in ordinary development., so type hints for IDEs aren't going to be essential.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe object?

@@ -667,6 +715,7 @@ def __setstate__(self, state):
self.frombytes(data)

def tobytes(self, encoder_name="raw", *args):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nothing to do with this particular PR, but from the typing, it seems that the function _getencoder is rather...strange. Like why do we need to tweak arguments ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Backwards compatibility. Plugins for PIL were added in the Python 1.6 era, IIRC.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So Pillow targets almost every Python distribution? Not just the currently supported versions?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, but the interface dates from that era, and there could be private image plugins out in the wild using the interface. So we can't just change the interface and touch up our plugins, if we make changes, we need to do backwards compatibility and deprecation periods.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wiredfool as you can probably tell, I'm still new to this, and contributing to open source. I'm trying though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bjd2385 No worries. You clearly know more about type systems than I do.

There's a lot of history in this code base, and some parts of it look strange until you try to work out what it would take to modernize it without breaking everyone else's code. Sometimes updating works, sometimes it doesn't. (tiff metadata is a poster child for that)

@@ -1092,12 +1154,14 @@ def draft(self, mode, size):
pass

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The signature could probably be (Text, Tuple[int, ...]) -> Something...but why does the docstring say it's implemented for JPEG && PCD images when it's just a pass?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re the type, I'd imagine this is (Mode, Size) based on the type aliases, ie. (Text, Tuple[int, int]); or could the tuple be bigger?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's an iterable of length 2. There are places in calling code (i.e. django) where a 2 element list is passed in.

We've hit this usage before when making things tighter with the type/size checking.

Copy link
Contributor Author

@neiljp neiljp Aug 19, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The 'ABC's aren't solid in my mind, but this could be Sequence? (__getitem__, __len__)

Copy link

@emmeowzing emmeowzing Aug 19, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That looks right to me, i.e. Sequence.

I know for some types it's a little ambiguous; like, when should I only use Iterable instead of Sequence? I see for the former we have __abstractmethods__ = frozenset({'__getitem__', '__len__'}), and then for the latter __abstractmethods__ = frozenset({'__iter__'}), but I have yet to see an implementation of an interable in Python that doesn't use __getitem__ with __iter__ to retrieve values. Moreover, most container types have a __len__ method, like list, dict, tuple, etc., so we have that for free (after all, how else would we store the contents internally in a class, whilst providing the machinery/algorithms to work on those contents in a different way, thus defining a new container).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, while it's possible to infer length while iterating, these are supposed to be length==2 so it makes sense to assume it might be tested (which it currently is not?)

@wiredfool Re django etc, does this mean that all such Size usage should not be a tuple?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It means that at one point we checked for a tuple, and broke people on release.

Otoh, if we document it as a tuple, people being typesafe will get it right, and eventually people using list because it works will get linted out as pep8 starts pushing typing.

I think our current stance is we specify a tuple, but we don't reject a 2list (anymore)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bjd2385 Documentation is pulled from Image.Image, but the implementations of the file formats inherit from Image -> ImageFile -> JpegImagePlugin.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wiredfool Right, so specifying Tuple is good for Size, as it is now.

PIL/Image.py Outdated
@@ -1122,6 +1186,7 @@ def filter(self, filter):
return merge(self.mode, ims)

def getbands(self):
# type: () -> Tuple[Any]

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of Tuple[Any], couldn't this be Tuple[str, ...], or...I'm so confused by ImageMode.py, and why are there so many global variables? It should be far easier to narrow down what this method should return as a type.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was due to the same point I raised elsewhere regarding Tuples of unknown length.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

*update: so, probably Tuple[Mode, ...], which is the same as you specified, except that Mode is (in an updated commit I made) an alias for Text, not str.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should return a tuple of single character strings, where len==number of bands/channels.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure there's a way to specify a 'char' in the type (ie. length==1), but we can certainly add a type alias for Band = Text, which documents it nicely?

Aliases of the same underlying type are interchangeable (eg. Band and Mode), but we could go beyond this and make them separate types.

PIL/Image.py Outdated
@@ -1236,6 +1307,7 @@ def getpalette(self):
return None # no palette

def getpixel(self, xy):
# type: (Coord) -> Tuple[Any]

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this Any be constrained to a type of float or int? For example, Tuple[Union[float, int], ...].

Copy link
Contributor Author

@neiljp neiljp Aug 19, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue here is that we're returning a tuple, but whose length varies - though actually also the 'color' type, which I'm not sure how to type offhand?

So, if we alias Color = [something], then ideally the function would return Union[Color, Tuple[Color, Color], Tuple[Color, Color, Color], ...]... but where we go all the way to the maximum number of layers in the image! Since I'm not sure if that's possible to quantify, I stuck with Tuple for now, though it should probably be Union[Color, Tuple] for now, for some value (type) of Color.

That is, unless '...' does something special that I'm not aware of, in this context?

update: apparently it does :) So the solution would then probably be Union[Color, Tuple[Color, ...]] ?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think Union[Color, Tuple[Color, ...]]looks right. But what do you mean by alias Color = [something]? As in Color refers to a list of some values?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See the previous added code at the top of Image.py, now in aliases.py.

We could add Color = float, for example.

@neiljp
Copy link
Contributor Author

neiljp commented Aug 19, 2017

@bjd2385 Thanks for the comments. I've replied to what I could, and have pushed the changes I've made, which include:

  • pulling the aliases into a separate file - maybe rename to _aliases.py, or _types.py?
  • adding some aliases
  • attempt to document _imaging via a stubs file (very much WIP)
  • add 'ignore' annotations for ne/eq, as they seem unclear in general stubs
  • __repr__ return type to str from Text (it's not unicode ever?)
  • changes to imports which hopefully improve the situation

On other projects I'm used to committing solo in PRs; will it be a problem if I force-push, or might you fork from my branch to work/commit?

Copy link

@emmeowzing emmeowzing left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Checked more type signatures.

PIL/Image.py Outdated
@@ -1209,6 +1278,7 @@ def getextrema(self):
return self.im.getextrema()

def getim(self):
# type: () -> Any

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get a type of PyCapsule when running this method. However, I can't find the definition of this type anywhere in the pillow source, except a mention at L3085 in _imaging.c (I'm not terribly familiar with using C extensions with Python).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A PyCapsule is an internal python type: https://docs.python.org/3.6/c-api/capsule.html

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I couldn't find an explicit type for these so far. This may be an issue for mypy or typeshed, or perhaps we just leave it at Any for now?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ohhh, okay, that makes sense! I have no idea how to annotate that then. May as well be Any.

PIL/Image.py Outdated
@@ -1250,6 +1322,7 @@ def getpixel(self, xy):
return self.im.getpixel(xy)

def getprojection(self):
# type: () -> Tuple[List[Any], List[Any]]

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can constrain this return type to Tuple[List[int], List[int]] , but I'm confused a little by what _binary.i8 actually does. I wasn't aware that in Python 2, bytes is str.

PIL/Image.py Outdated
@@ -1262,6 +1335,7 @@ def getprojection(self):
return [i8(c) for c in x], [i8(c) for c in y]

def histogram(self, mask=None, extrema=None):
# type: (Optional[Image], Optional[Any]) -> List[int]

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of Optional[Any], which resolves to the same thing as Any I believe, perhaps this should be Optional[ImagingCore] ? Also, the docstring doesn't have a corresponding :param mentioning this parameter.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're referring to the extrema parameter? I've looked at this again, and added an Extrema alias to aliases.py; let me know what you think.

PIL/Image.py Outdated
@@ -1426,6 +1503,7 @@ def alpha_composite(self, im, dest=(0,0), source=(0,0)):
self.paste(result, box)

def point(self, lut, mode=None):
# type: (Union[List, Callable[[Any], Any]], Optional[Text]) -> Image

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of curiosity -- i'm not familiar with what you're describing with [Any]. Is it meant to be List[Any]?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For a Callable, a fully-typed version is Callable[[PARAM_LIST], RETURNED].

The related PEP (484?) or the mypy docs are pretty good for this :)

Copy link

@emmeowzing emmeowzing Aug 19, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@neiljp Okay, that makes sense. I've never written it that way and PyCharm/Mypy didn't complain, but I'll try it in the future. For complex functions and decorators I think we'd begin to have width issues and have to break an annotation over multiple lines.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What was the alternative form you used?

@@ -2142,6 +2237,7 @@ def __transformer(self, box, image, method, data,
self.im.transform2(box, image.im, method, data, resample, fill)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is this method definition? I'd like to know what fill is, why it defaults to a 1. I don't see where it's used in _imaging.c/_transform2 either; i.e. I see it's parsed, but I don't see it being used anywhere in that function.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it doesn't appear to be used. I'm going to say historical anomaly, as it's passed in and used in ImageTransform, but here in transform2 it's hardwired to 1.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wiredfool could that be removed in a future PR then?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably. It's technically accessible part of the public interface, but as part of the core object it's something that we can change reasonably carefully.

I'm kind of wondering what became of im.transform, since we're just left with im.transform2

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wiredfool that's a good point.

@wiredfool
Copy link
Member

wiredfool commented Aug 19, 2017

On other projects I'm used to committing solo in PRs; will it be a problem if I force-push, or might you fork from my branch to work/commit?

I'd rather not have rebases or force-pushes when there's a lot of discussion going on, as it wipes out the history.

I'm thinking that the type aliases/imports might best be in a separate module?

I think that's a reasonable approach, similar to a header file in C.

Is this something you think might be merged, down the line?

I think so. There are really threefour criteria I see:

  1. Is it a future maintenance burden? Does it mess up the source files or add tons of effort?
  2. Is it complete/correct. (or at least, complete enough to be useful, and what is complete is correct)
  3. Are there tests or other (automated) acceptance criteria that will notify us when we break something.
  4. Documentation.

I think I'm comfortable saying that we'll hit 1. The impact to the source doesn't look bad from here, and I think on an incremental basis, it's not going to add too much effort to any individual feature.

2 is an ongoing effort.

3 would be a hook into our testing system. A new test case, e.g. Tests/test_types.py Gate it on typing and mypy being installed, run mypy on any source files, and have a pass / fail based on the results. This needs to fail on a new method in Image.Image with missing types or incorrect types. If we don't have this, the type support is likely to decay.

4 Documentation may not need to be huge, but there should be some.

PIL/Image.py Outdated
return self.size[1]

def _new(self, im):
# type: (Image) -> Image
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's (ImagingCore) -> Image

Anything that's assigned to self.im had better be an imaging core.

(realizing that mypy should us that...)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wiredfool Regarding ImagingCore, what about this type?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wiredfool I'd actually just made this change, after fixing some imports too; I take it you've seen _imaging.pyi?

PIL/Image.py Outdated
@@ -599,7 +635,7 @@ def _dump(self, file=None, format=None, **options):
self.save(file, format, **options)
return file

def __eq__(self, other):
def __eq__(self, other): # type: ignore # type: (Image) -> bool
return (self.__class__.__name__ == other.__class__.__name__ and
Copy link
Member

@wiredfool wiredfool Aug 19, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The __eq__ and other __[comparison]__ methods take any object, as they're the implementation of img == a, and img == True or img == 1. They're valid comparisons even if the other object isn't an image. False, but valid.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the problem I read about was the return type (for some reason), rather than the parameter, but we can certainly look at this again.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

...and it seems to work 👍

PIL/Image.py Outdated
@@ -655,6 +695,7 @@ def __getstate__(self):
self.tobytes()]

def __setstate__(self, state):
# type: (List) -> None
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pretty sure this is actually (tuple), as this is the pickle interface.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, yes, I worked backwards from getstate passing out a list :)

PIL/Image.py Outdated
if ymargin is None:
ymargin = xmargin
self.load()
return self._new(self.im.expand(xmargin, ymargin, 0))

def filter(self, filter):
# type: (ImageFilter.Filter) -> Image
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can be a callable returning a filter as well. (ducktyped at that, that implements filter(ImagingCore))

@neiljp
Copy link
Contributor Author

neiljp commented Aug 19, 2017

I'd rather not have rebases or force-pushes when there's a lot of discussion going on, as it wipes out the history.

Unfortunately I unthinkingly just did one earlier; ok this time?

I'm thinking that the type aliases/imports might best be in a separate module?

I think that's a reasonable approach, similar to a header file in C.

That's in now, of course, though I'm not sure of the best name.

Is this something you think might be merged, down the line?

I think so. There are really threefour criteria I see:

Is it a future maintenance burden? Does it mess up the source files or add tons of effort?
Is it complete/correct. (or at least, complete enough to be useful, and what is complete is correct)
Are there tests or other (automated) acceptance criteria that will notify us when we break something.
Documentation.

I think I'm comfortable saying that we'll hit 1. The impact to the source doesn't look bad from here, and I think on an incremental basis, it's not going to add too much effort to any individual feature.
👍
2 is an ongoing effort.
👍
3 would be a hook into our testing system. A new test case, e.g. Tests/test_types.py Gate it on typing and mypy being installed, run mypy on any source files, and have a pass / fail based on the results. This needs to fail on a new method in Image.Image with missing types or incorrect types. If we don't have this, the type support is likely to decay.
Right, happy to work on this and possibly compare to eg. Zulip.
4 Documentation may not need to be huge, but there should be some.
👍

PIL/Image.py Outdated
@@ -1426,6 +1496,7 @@ def alpha_composite(self, im, dest=(0,0), source=(0,0)):
self.paste(result, box)

def point(self, lut, mode=None):
# type: (Union[List, Callable[[Any], Any]], Optional[Mode]) -> Image
Copy link
Member

@wiredfool wiredfool Aug 19, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also ImagePointHandler. And the Callable should be something like Callable[[Numeric],Numeric], though the two numeric values don't necessarily need to be the same type. (i.e., float->int is possible)

(not that I've seen an ImagePointHandler, but there's a special case there below. )

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also that's a List[float]?
We can just use float here, since both are possible?

@@ -1784,6 +1862,7 @@ def rotate(self, angle, resample=NEAREST, expand=0, center=None,
]

def transform(x, y, matrix):
# type: (float, float, Sequence[float]) -> Tuple[float, float]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't the Matrix here have to be a Tuple due to the expansion?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a) The code uses a List :)
b) I tested with 2.7 and 3.5, and n=[3,4];(c,d)=n;c;d gives what you'd hope for.
c) This is only used within rotate, so if it passes mypy we're probably good, in terms of defining a public interface at least.

PIL/Image.py Outdated
@@ -1814,6 +1893,7 @@ def transform(x, y, matrix):
return self.transform((w, h), AFFINE, matrix, resample)

def save(self, fp, format=None, **params):
# type: (Union[Text, pathlib.Path, BytesIO], Optional[Text], **Any) -> None
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this can also be a StringIO.StringIO in py2, but that's a py2 only type.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is that as the filename to open or the file object to 'write' to? StringIO is unicode, at least by 2.6/7?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

StringIO.StringIO is binary capable in py2 (and removed in py3), but it was replaced with the io.*Io classes. It's the file to write to, and in open, read from. But I have no idea how that's going to interact with mypy, as it's only defined in py2. This may be another case where it's unlikely that people who are using typing are going to be using StringIO, as they're going to be modernizing other sections of their code, and io.BytesIO is a drop in replacement.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is resolved by using BinaryIO, which is defined in typing; I've come across this before while trying to resolve an issue in Zulip. Also see stack overflow.
I've made this change in an upcoming set of commits.

PIL/Image.py Outdated
@@ -2223,6 +2317,7 @@ def _check_size(size):


def new(mode, size, color=0):
# type: (Mode, Size, Optional[Union[int, float, Tuple, Text]]) -> Image
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This union might make sense aliased/named as a Colorspec. (and the Tuple is going to be 1..4 of ints at this point)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Colorspec is preferable to eg. simply Color?
  • Are all 'color' uses going to be of that type, eg. for getpixel/putpixel and no doubt others?
  • I'm not sure if we gain anything with having Union[int, float] as part of that, cf float?

Either way, something like:
Union[int, float, Tuple[int], Tuple[int, int], Tuple[int, int, int], Tuple[int, int, int, int], Text]
or (simplified, if the Tuple values are still ints?):
Union[float, Tuple[int], Tuple[int, int], Tuple[int, int, int], Tuple[int, int, int, int], Text]?

Copy link
Member

@wiredfool wiredfool Aug 20, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, not all colors are going to take the text equivalents, though the draw, fill, (and maybe paste) are going to. The pyaccess and putpixel are much more constrained to the underlying type of the image, and getpixel is going to be exactly related to the underlying pixel type. (Note that's the ImageColor module)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've now tried to work on this in the latest 3 commits.

PIL/aliases.py Outdated
Matrix4 = Tuple[float, float, float, float]
Matrix12 = Tuple[float, float, float, float, float, float, float, float, float, float, float, float]
Mode = str
Extrema = Union[Tuple[Any, Any], Tuple[Tuple[Any, Any], ...]] # Any -> float?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the Extrema can be tuples of either int or floats. (and potentially nested)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pixels are going to either be:

Single int/float, of up to 32 bit width, or tuples of 1 to 4 8 bit ints

I'm not sure if the type system can or should distinguish uint8 vs int32 at the python level.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As per above, likely Extrema can be simply of floats?

As above, would this cover it?
Union[float, Tuple[int], Tuple[int, int], Tuple[int, int, int], Tuple[int, int, int, int], Text] ?

Is Text a special case for new ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the C layer, Extrema match line 11 above, where Any is either int/float, and each one has to be consistent and match the image type. So, Tuple[int, float] is invalid. Actually, due to a current limitation, multiband images are restricted to being 8 bit ints, so the actual type could be:

Extrema = Union[Tuple[float, float], Tuple[int,int], Tuple[Tuple[int,int],...]]

For pixels, text is a special case of input. I'd have to check the code to see if the single band float came back in a tuple, but I expect that it does.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's mark it as that for now 👍

@neiljp
Copy link
Contributor Author

neiljp commented Aug 20, 2017

I managed to extract the commits from my rebased branch and they seem to have gone through ok :)

PIL/Image.py Outdated
Image.__init__(self)
self.tile = []
self.tile = [] # type: List[Tuple] ## FIXME More detail? Should tile be in __init__ & _new?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tile is a messy type, and really would be good to document. ImageFile is probably the core of where that happens, along with all of the decoders.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added some comments with some notes of potential values, but it's not used in the current round of actual type-checking, within Image.py.

@neiljp
Copy link
Contributor Author

neiljp commented Aug 21, 2017

I think I've followed most of the changes except for the Color ones, and added others besides, perhaps most noticeably a huge chunk of the stubs in _imaging.pyi.

Most of the remaining things to tag/discuss in Image.py should now be tagged with 'FIXME TYPING' in a comment, though of course any/all of the typing that's already there can be checked too!

Note the related PR for code changes; I'm keeping this one for just annotations if at all possible, whereas the other has small commits which improve the code and/or allow type inference more easily.

PIL/_imaging.pyi Outdated
def putpixel(self, xy: XY, value: Any) -> None: ...
def pixel_access(self, readonly: int) -> Any: ...

def convert(self, mode: Mode, dither: int=..., paletteimage: Any=...) -> ImagingCore: ... ### what is paletteimage?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ImagingCore, see Image.Quantize

PIL/_imaging.pyi Outdated
def convert_transparent(self, mode: Mode, t: Union[Tuple[int, int, int], int]) -> ImagingCore: ... ### Name t better?
def copy(self) -> ImagingCore: ...
def crop(self, bbox: LURD) -> ImagingCore: ...
def expand(self, xmargin: int, ymargin: int, mode_index: int=0) -> Any: ... ### Include default?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be ImagingCore

PIL/_imaging.pyi Outdated
def copy(self) -> ImagingCore: ...
def crop(self, bbox: LURD) -> ImagingCore: ...
def expand(self, xmargin: int, ymargin: int, mode_index: int=0) -> Any: ... ### Include default?
def filter(self, size: Size, divisor: float, offset: float, kernel: Any) -> ImagingCore: ... ### what is kernel?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A list of float32, length size[x] * size[y]

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've set this to List[float]. It's not so important, but is this more accurately 'kerneldata' as a label?

PIL/_imaging.pyi Outdated
def crop(self, bbox: LURD) -> ImagingCore: ...
def expand(self, xmargin: int, ymargin: int, mode_index: int=0) -> Any: ... ### Include default?
def filter(self, size: Size, divisor: float, offset: float, kernel: Any) -> ImagingCore: ... ### what is kernel?
def histogram(self, tuple2: Tuple[float, float]=..., mask: ImagingCore=...) -> List[int]: ... ### Do these have default values? Do they take None? What is tuple2?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tuple2 is the extrema for the channel that's being processed, It's optional, and either ints or floats.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They have default values, but can they be None? I'm not completely following the C interface code in all the cases.

PIL/_imaging.pyi Outdated
# def offset(self): ... ### Function removed?

def paste(self, core: ImagingCore, box: LURD, coremask: ImagingCore) -> None: ...
def point(self, lut: Any, mode: Optional[Mode]) -> ImagingCore: ... ### Clarify Any
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a list of floats or ints.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've made this Union[List[int], List[float]], rather than List[Union[int, float]]; right?

PIL/_imaging.pyi Outdated

def resize(self, size: Size, resample: int=...) -> ImagingCore: ...
def transpose(self, method: int) -> ImagingCore: ...
def transform2(self, box: LURD, core: ImagingCore, method: int, data: Any, resample: int=..., fill: int=...) -> None: ... ### What is data?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A list of float. They're the matrix entries for the transform, n=6 or 8

PIL/_imaging.pyi Outdated
def isblock(self) -> int: ...

def getbbox(self) -> Optional[LURD]: ...
def getcolors(self, maxcolors: int) -> List[Tuple[int, Any]]: ... ### Any is a 'Color'?
Copy link
Member

@wiredfool wiredfool Aug 21, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any is Union[Int, Float, Tuple[int,int], Tuple[int,int,int],Tuple[int,int,int,int]]

Answering the questing from last night, do single band images have their values wrapped in a tuple.

PIL/_imaging.pyi Outdated
def getbbox(self) -> Optional[LURD]: ...
def getcolors(self, maxcolors: int) -> List[Tuple[int, Any]]: ... ### Any is a 'Color'?
def getextrema(self) -> Tuple[float, float]: ...
def getprojection(self) -> Tuple[Any, Any]: ... ### Need types
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tuple[List[int],List[int]] where len(list[i]) = size[i]

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure that constraint can go in the type!

@emmeowzing
Copy link

Out of curiosity, what is holding up a merge of this PR at this point? I'm still becoming familiar with GitHub and making contributions.

@neiljp
Copy link
Contributor Author

neiljp commented Nov 10, 2017

@radarhere @wiredfool
I appreciate this is likely not mergeable as it stands, but this has been quiet since the rebase so just checking in to see how you feel about this PR.

@wiredfool
Copy link
Member

wiredfool commented Nov 10, 2017

Looking at the criteria here: #2687 (comment)

  1. Is it a future maintenance burden? Does it mess up the source files or add tons of effort?
  2. Is it complete/correct. (or at least, complete enough to be useful, and what is complete is correct)
  3. Are there tests or other (automated) acceptance criteria that will notify us when we break something.
  4. Documentation.

Right now I think we're missing on 2 and 3. When I look at the output of mypy PIL/Image.py -2, I'm seeing a ton of output, and it appears to be more along the lines of issues in the typing system rather than underlying issues with the code. Until that output is manageable, then I'm not sure how we'd either:

  • ensure that nothing is broken on changes
  • have useful output for consumers of the library.

Some of this can be traced to the typing system having trouble with dynamic usage, where either we're dispatching off of a type because we're flexible about the item passed in (e.g. filename or open file like object) or we're ducktyping and it's not picking that up. But for it to be useful, we've got to find a way to deal with it.

@wiredfool
Copy link
Member

https://github.com/wiredfool/Pillow/tree/pr_2687 has a couple more commits to move closer.

@wiredfool
Copy link
Member

@radarhere radarhere changed the title [WIP]: Type annotations for Image.py. [WIP]: Type annotations for Image.py Dec 6, 2017
@wiredfool
Copy link
Member

https://github.com/wiredfool/Pillow/tree/pr_2687 now passes mypy --ignore-missing-imports -2 src/PIL/Image.py with no errors with the latest released mypy as of this weekend.

Getting to this point took more effort than I would have liked -- including some 'single static type' style rewriting and some selective ignoring of constructs that are difficult. Python's ducktyping does not play well with it's type system, in that there are many cases where an attribute is checked, and then that constraint is ignored in the then/else clause. There are also lots of places where a 2 or 4-tuple of ints is checked for length, and then interpreted as an Tuple[Any,...]. Finally, I'm unclear if it's even possible to have a type that is 'SupportsBufferInterface' because as far as I can tell, that's a C api level attribute.

Never the less, that branch now passes the tests, and in theory can remain that way. I'm not sure that it's entirely right, I'm not sure that it's the cleanest way to do it, but it does work.

@hugovk
Copy link
Member

hugovk commented Mar 24, 2018

See #2941 for a continuation of this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support for typing?
5 participants