I have a function that eat a couple of seconds. The function should return the Top(n) colors in a given Image. The return must be sorted, so i could work with the rgb values from the first, second, third top most color.
On the fist hand i had an PIL.Image Object that i looped over x,y coords and counted it in an defaultdict. I've replaced the PIL Object in my project with a Numpy array what gived me a big Boost but i don't know how to replace the defaultdict in this situation.
My current solution:
import numpy as np
from scipy import misc # for example Image
from collections import defaultdict
def count_colors(img, n):
img = img.reshape(-1, img.shape[-1])
color = defaultdict(int)
for pixel in img:
rgb = (pixel[0], pixel[1], pixel[2])
color[rgb] += 1
sorted_color = sorted(color.items(), key=lambda k_v: k_v[1], reverse=True)
sorted_color = sorted_color[:n]
return sorted_color
img = misc.face() # example Numpy Image array
top_colors = count_colors(img, n=5)
display(top_colors)
Current Output:
[((9, 9, 9), 1062),
((10, 10, 10), 700),
((8, 8, 8), 668),
((9, 7, 8), 586),
((9, 7, 10), 579)]
Is there a real Numpy way to solve this?
Approach #1
We can use np.unique(.., axis=0, return_counts=True)
to get counts for each unique color and then np.argpartition
for the top N colors among them for a compact vectorized solution -
def topN_colors(img, N):
unqc,C = np.unique(img.reshape(-1,img.shape[-1]), axis=0, return_counts=True)
topNidx = np.argpartition(C,-N)[-N:]
return unqc[topNidx], C[topNidx]
Approach #2
Another mainly based on a 24-bit integer 2D reduction for a hopefully more efficient solution -
# https://stackoverflow.com/a/57236217/ @tstanisl
def scalarize(x):
# compute x[...,2]*65536+x[...,1]*256+x[...,0] in efficient way
y = x[...,2].astype('u4')
y <<= 8
y +=x[...,1]
y <<= 8
y += x[...,0]
return y
def topN_colors_v2(img, N):
img2D = scalarize(img)
unq,idx,C = np.unique(img2D, return_index=True, return_counts=True)
topNidx = np.argpartition(C,-N)[-N:]
return img.reshape(-1,img.shape[-1])[idx[topNidx]], C[topNidx]
Note that argpartition
doesn't keep the order. To keep order, use range()
with it. More info. So, in np.argpartition
replace -N
with range(-N,0)
to get colors and their counts in ascending order. For descending order, simply flip the final outputs.
Verification with sample
# Sample setup
np.random.seed(0)
# some random set colors
colors = np.array([[2,5,6],[1,2,3],[6,7,8],[5,3,1],[7,4,2]])
# Random image with colors chosen off colors
idx = np.random.randint(0,len(colors),(50,40))
img = colors[idx]
img = img.astype(np.uint8)
# Given that we know the unique colors as `colors` and the indices
# use to get the image `img, let's "manually" compute the
# top N=2 colors and their counts
In [45]: count = np.bincount(idx.ravel())
In [46]: colors[count.argsort()[-2:]]
Out[46]:
array([[1, 2, 3],
[5, 3, 1]], dtype=uint8)
In [47]: count[count.argsort()[-2:]]
Out[47]: array([393, 446])
# Verify our solution
In [48]: topN_colors(img, N=2)
Out[48]:
(array([[1, 2, 3],
[5, 3, 1]], dtype=uint8),
array([393, 446]))
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With