Add dominant color method #4874

392781 · 2020-08-18T22:15:37Z

Changes proposed in this pull request:

Adds a method to determine the dominant colors in an image using k-means clustering

I added some very basic tests, but since I'm new to open source contributing, I'm open to suggestions for improving them.
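For readers unfamiliar with the approach, here is a minimal pure-Python sketch of k-means clustering over pixel tuples. It is not the code from this PR: the function name `dominant_colors`, its parameters, and the fixed seed are assumptions made for illustration.

```python
import random


def dominant_colors(pixels, k=3, iterations=10, seed=0):
    """Hypothetical helper: plain k-means over a list of pixel tuples.

    Returns up to k cluster centers, largest cluster first.
    """
    rng = random.Random(seed)  # fixed seed keeps repeated runs identical
    unique = list(dict.fromkeys(pixels))  # distinct colors, original order
    centers = rng.sample(unique, min(k, len(unique)))

    def assign(current):
        # Assignment step: attach every pixel to its nearest center.
        clusters = [[] for _ in current]
        for p in pixels:
            distances = [
                sum((a - b) ** 2 for a, b in zip(p, c)) for c in current
            ]
            clusters[distances.index(min(distances))].append(p)
        return clusters

    for _ in range(iterations):
        clusters = assign(centers)
        # Update step: move each center to the mean of its members.
        updated = [
            tuple(sum(ch) / len(members) for ch in zip(*members))
            if members else center
            for center, members in zip(centers, clusters)
        ]
        if updated == centers:
            break  # converged
        centers = updated

    # Rank the final centers by how many pixels they attract.
    ranked = sorted(
        zip(centers, assign(centers)),
        key=lambda pair: len(pair[1]),
        reverse=True,
    )
    return [tuple(round(ch) for ch in center) for center, _ in ranked]
```

For example, `dominant_colors([(255, 0, 0)] * 5 + [(0, 0, 255)] * 3, k=2)` returns `[(255, 0, 0), (0, 0, 255)]`. Each iteration visits every pixel once per center, which is why large images are slow.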

392781 · 2020-08-19T19:35:22Z

One thing to note is that k-means clustering takes a few seconds to compute for larger images, which is bothersome. I'm currently working on improving the speed of the algorithm.

One catch-all way of doing this is to rescale the image for faster results. This solution is simple and doesn't require a lot of bloated code on top of the algorithm. The cost is a loss of pixel information through rescaling, so the dominant colors will come out different from those of the original image.

Another is to do a bit of math on the color spaces used and throw out some less necessary pixels (similar to how Color-Thief does it in the RGBA space). The benefit is avoiding some computation. The cost is reduced readability of the method and the need to cover all supported color spaces. One way to sidestep this is to convert to a more familiar color space such as HSV, RGB, or RGBA; however, I'm not that knowledgeable about these spaces or the effects of converting between them (it seems colorthief does this, so I'm not sure).
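The pixel-exclusion idea can be sketched as follows. This is loosely modeled on the kind of filtering Color Thief applies to RGBA data (skipping mostly transparent and near-white pixels, and sampling every n-th pixel). The thresholds, the `step` parameter, and the function name are illustrative assumptions, not values taken from this PR.

```python
def filter_rgba_pixels(pixels, step=10, min_alpha=125, white_cutoff=250):
    """Hypothetical pre-filter for RGBA pixel tuples before clustering."""
    kept = []
    for r, g, b, a in pixels[::step]:  # sample every `step`-th pixel
        if a < min_alpha:
            continue  # mostly transparent: contributes little visible color
        if r > white_cutoff and g > white_cutoff and b > white_cutoff:
            continue  # near-white: often just background
        kept.append((r, g, b))
    return kept
```

A filter like this only makes sense for RGBA input, which is the coverage problem described above: every other mode would need its own rules or a conversion first.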

I'm hesitant to implement option 2 because I do not want this method to become too bloated or more complicated than it already is... But these are some ideas to think about. For now I'll test out resizing as an option and maybe play around with converting modes.

EDIT: I decided to implement the option to rescale since excluding colors yielded mixed results.
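To give a rough sense of why rescaling helps, the sketch below computes a reduced size that fits inside a bounding box while preserving aspect ratio, then compares pixel counts. The helper name `fit_within` and the 100x100 default are assumptions for illustration; Pillow's own `Image.thumbnail` performs the actual resize and may round slightly differently.

```python
def fit_within(size, max_size=(100, 100)):
    """Hypothetical helper: scale (width, height) down to fit max_size.

    Preserves aspect ratio and never upscales.
    """
    width, height = size
    max_w, max_h = max_size
    scale = min(max_w / width, max_h / height, 1.0)
    return max(1, round(width * scale)), max(1, round(height * scale))


original = (1920, 1080)
reduced = fit_within(original)  # (100, 56)
# k-means visits every pixel on each iteration, so the work shrinks
# roughly in proportion to the pixel count.
ratio = (original[0] * original[1]) / (reduced[0] * reduced[1])
```

For a 1080p image this is a reduction of more than 300x in pixels, at the cost of the averaging that downscaling introduces.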

hugovk · 2020-08-20T06:31:18Z

The Image module is pretty big and I wonder if this would be better suited in another one?

hugovk · 2020-08-20T06:44:31Z

Tests/test_image_getdominantcolors.py

+
+def test_getdominantcolors():
+    def getdominantcolors(mode):
+        im = hopper(mode)


I see what you mean about it being slow: `pytest Tests/test_image_getdominantcolors.py` takes ~5.67s on my Mac.

Resizing to (10, 10) reduces it to 0.23s.
Resizing to (20, 20) reduces it to 0.41s.

Do you think 10x10 might be too small to be useful?

Suggested change

-        im = hopper(mode)
+        im = hopper(mode)
+        im.thumbnail((10, 10))

Small sizes like that give a very rough approximation, especially if you're looking for a palette of more than 3 colors. The colors returned tend to be much darker. I found that (100, 100) gives somewhat decent results. The issue is exacerbated with large images (e.g. 1080p wallpapers).

hugovk · 2020-08-20T07:09:13Z

Tests/test_image_getdominantcolors.py

+    assert getdominantcolors("YCbCr") == 3
+    assert getdominantcolors("CMYK") == 3
+    assert getdominantcolors("RGBA") == 3
+    assert getdominantcolors("HSV") == 3


We can improve these tests so they check more than just that the result has length three.

We can also use @pytest.mark.parametrize so that each test case runs independently and cannot depend on an earlier case.

Here's an example:

    import pytest

    from .helper import hopper


    @pytest.mark.parametrize(
        "test_mode, expected",
        [
            ("F", [28.8104386146252, 66.26185757773263, 127.53211228743844]),
            ("I", [136, 94, 40]),
            ("L", [25, 63, 127]),
            ("P", [11, 71, 159]),
            ("RGB", [(172, 117, 94), (53, 44, 55), (95, 127, 185)]),
            ("YCbCr", [(130, 108, 155), (123, 163, 105), (47, 131, 131)]),
            ("CMYK", [(201, 210, 199, 0), (159, 127, 69, 0), (82, 137, 160, 0)]),
            ("RGBA", [(31, 24, 37, 255), (129, 125, 150, 255), (87, 67, 72, 255)]),
            ("HSV", [(140, 131, 85), (177, 70, 46), (97, 131, 186)]),
        ],
    )
    def test_getdominantcolors(test_mode, expected):
        def getdominantcolors(mode):
            im = hopper(mode)
            im.thumbnail((10, 10))
            colors = im.getdominantcolors()
            return colors

        assert getdominantcolors(test_mode) == expected

What do you think?

Do those expected return values look right?

Might they be different on other systems? Especially the first one. If so, we can adjust it somewhat, possibly have a special case for that.

It would be good to add further test_X functions to verify the other parameters of im.getdominantcolors, plus warnings and exceptions.

I think my earlier comment never posted. The values for HSV are way off; they were plainly wrong even when I tested yesterday. I think the way I implemented the algorithm doesn't play nicely with the way I approximate the centers in that color space.

In addition, when I convert CMYK and YCbCr to RGB, the results tend to be off by a few color values. This isn't too severe a problem, though.

One solution to the HSV case is to simply convert to RGBA and do the calculations there. I'm going to see if I can fix what's happening before I resort to this option.
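One possible contributor to the HSV results (an assumption; the thread does not confirm the cause) is that hue is an angle, so an arithmetic mean of hues on either side of the wrap-around point lands on the opposite side of the color wheel. Below is a minimal sketch of a circular mean, assuming hue is stored on a 0-255 scale as in Pillow's HSV mode. The helper name is hypothetical.

```python
import math


def circular_mean_hue(hues, scale=256):
    """Hypothetical helper: mean of hue values treated as angles."""
    angles = [2 * math.pi * h / scale for h in hues]
    x = sum(math.cos(a) for a in angles)
    y = sum(math.sin(a) for a in angles)
    mean_angle = math.atan2(y, x) % (2 * math.pi)
    return mean_angle * scale / (2 * math.pi)


reds = [250, 6]  # two reddish hues on either side of the wrap-around
naive = sum(reds) / len(reds)       # 128.0, which is cyan
circular = circular_mean_hue(reds)  # very close to 0 (or 256), which is red
```

Converting to RGBA before clustering, as proposed above, sidesteps the issue because every RGBA channel is linear rather than circular.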

My second concern is that the returned colors are somewhat subjective to the viewer. If, say, we used some other implementation as a baseline, k-means may start from a randomized state and return slightly different results. Perhaps I could test within a range of accepted values?
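Testing within a range of accepted values could look something like the sketch below. The helper name `colors_close` and the default tolerance are assumptions. Sorting before comparing is a simplification that ignores cluster order but can mispair colors whose leading channels are nearly equal.

```python
import math


def colors_close(actual, expected, tolerance=2):
    """Hypothetical test helper: compare two color lists channel by channel.

    Colors may be scalars (e.g. modes "L" or "F") or tuples (e.g. "RGB").
    """
    def as_tuple(color):
        return color if isinstance(color, tuple) else (color,)

    # Sort both sides so the order in which clusters are returned is ignored.
    actual = sorted(as_tuple(c) for c in actual)
    expected = sorted(as_tuple(c) for c in expected)
    if len(actual) != len(expected):
        return False
    return all(
        len(a) == len(e)
        and all(math.isclose(x, y, abs_tol=tolerance) for x, y in zip(a, e))
        for a, e in zip(actual, expected)
    )
```

A test could then assert `colors_close(im.getdominantcolors(), expected)` instead of exact equality, with a smaller tolerance for the floating-point "F" mode values.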

392781 · 2020-08-20T15:56:43Z

The Image module is pretty big and I wonder if this would be better suited in another one?

* [`ImageColor`](https://pillow.readthedocs.io/en/stable/reference/ImageColor.html)

* [`ImageOps`](https://pillow.readthedocs.io/en/stable/reference/ImageOps.html)

* [`ImageStat`](https://pillow.readthedocs.io/en/stable/reference/ImageStat.html)

I've been thinking the same thing. I think if it were placed anywhere else it may be in ImageOps... ImageColor doesn't feel like a great fit for what this does unless maybe it gives an option to return colors in string formats.

nulano · 2020-08-20T16:11:50Z

I'm not sure ImageOps is the right place. It is a set of functions that return a processed variation of the source image, not properties of an image.

The ImageOps module contains a number of ‘ready-made’ image processing operations.

ImageColor doesn't seem right either:

The ImageColor module contains color tables and converters from CSS3-style color specifiers to RGB tuples. This module is used by PIL.Image.new() and the ImageDraw module, among others.

ImageStat looks like the closest match, but the PIL.ImageStat.Stat class caches results and the proposed function takes parameters. Maybe make this a module function in ImageStat (e.g. PIL.ImageStat.dominant_colors(im, ...))?

The ImageStat module calculates global statistics for an image, or for a region of an image.
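The distinction drawn above, between cached per-instance statistics and a function that takes parameters, can be illustrated with a toy sketch. `StatLike` and `dominant_colors` are hypothetical names, not Pillow APIs, and the most-common-value counting stands in for k-means only to keep the example short.

```python
from collections import Counter
from functools import cached_property


class StatLike:
    """Toy stand-in for a statistics class that caches per-instance results."""

    def __init__(self, pixels):
        self.pixels = list(pixels)

    @cached_property
    def mean(self):
        # Computed once; there is no argument to vary, so caching is simple.
        return sum(self.pixels) / len(self.pixels)


def dominant_colors(pixels, numcolors=3):
    """Hypothetical module-level function: parameters vary per call,
    so nothing is cached on an instance."""
    return [value for value, _ in Counter(pixels).most_common(numcolors)]
```

A cached attribute has a single value per instance, whereas the result here depends on `numcolors`, which is why a module-level function may be the more natural fit.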

392781 added 4 commits August 17, 2020 14:37

Added getdominantcolors method - Finds the dominant colors in an image using k-means clustering algorithm

eda259b

Added support for various image modes

f7afdda

Created some basic tests of getdominantcolors functionality

6f3e77b

Added some comments

9c0f98a

radarhere added the Enhancement label Aug 19, 2020

radarhere reviewed Aug 19, 2020

Tests/test_image_getdominantcolors.py

radarhere reviewed Aug 19, 2020

src/PIL/Image.py

radarhere reviewed Aug 19, 2020

src/PIL/Image.py

radarhere reviewed Aug 19, 2020

src/PIL/Image.py

392781 and others added 3 commits August 19, 2020 08:28

Update src/PIL/Image.py

8cce088

Co-authored-by: Andrew Murray <3112309+radarhere@users.noreply.github.com>

Update src/PIL/Image.py

37ff4b9

Co-authored-by: Andrew Murray <3112309+radarhere@users.noreply.github.com>

Update Tests/test_image_getdominantcolors.py

c48e871

Co-authored-by: Andrew Murray <3112309+radarhere@users.noreply.github.com>

Added a quality resizing option for larger images

21315e5

hugovk reviewed Aug 20, 2020

radarhere reviewed Oct 24, 2020

src/PIL/Image.py

Update src/PIL/Image.py

5593b62

Co-authored-by: Andrew Murray <3112309+radarhere@users.noreply.github.com>

hugovk force-pushed the master branch from 747e44b to 9e3ad5e Compare December 18, 2020 20:12

radarhere reviewed Mar 28, 2021

src/PIL/Image.py

Simplified check for number of channels

7725aa5

392781 closed this by deleting the head repository Sep 5, 2023
