I'm doing some data analysis of images. For some of the analysis I would like to convert the image's pixels from RGB, in which they are originally stored, to HSV.
At the moment I'm using this code:
def generate_hsv(im):
    coords = product(range(im.shape[0]), range(im.shape[1]))
    num_cores = multiprocessing.cpu_count()
    m = Parallel(n_jobs=num_cores)(delayed(process_pixels)(im[i]) for i in coords)
    return np.array(m).reshape(im.shape)
Where process_pixels is just a wrapper for my conversion function:
def process_pixels(pixel):
    return rgb_to_hsv(pixel[0], pixel[1], pixel[2])
The thing is it runs sluggishly.
Is there a more efficient way to do this? Or a better way to parallelize?
As Warren Weckesser said, the conversion function is problematic. I ended up using matplotlib:
matplotlib.colors.rgb_to_hsv(arr)
It now runs a million times faster.
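For anyone trying the same call, here is a minimal sketch of how it might be used. The tiny two-pixel image is made up for illustration, and the scaling step assumes matplotlib's documented expectation that RGB values lie in [0, 1], so a uint8 image is divided by 255 first.

```python
import numpy as np
import matplotlib.colors

# Made-up 1x2 RGB image: one pure red pixel, one mid-gray pixel.
img_u8 = np.array([[[255, 0, 0], [128, 128, 128]]], dtype=np.uint8)

# Scale to floats in [0, 1] before converting (assumed input convention).
hsv = matplotlib.colors.rgb_to_hsv(img_u8.astype(np.float64) / 255.0)

print(hsv.shape)   # same shape as the input image
print(hsv[0, 0])   # hue, saturation, value of the red pixel
```

The whole array is converted in one call, so there is no per-pixel Python overhead and no need for joblib.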
The colorsys module implements the conversion one pixel at a time, with the input expected as (R,G,B). colorsys's implementation is listed below -
def rgb_to_hsv(r, g, b):
    maxc = max(r, g, b)
    minc = min(r, g, b)
    v = maxc
    if minc == maxc:
        return 0.0, 0.0, v
    s = (maxc-minc) / maxc
    rc = (maxc-r) / (maxc-minc)
    gc = (maxc-g) / (maxc-minc)
    bc = (maxc-b) / (maxc-minc)
    if r == maxc:
        h = bc-gc
    elif g == maxc:
        h = 2.0+rc-bc
    else:
        h = 4.0+gc-rc
    h = (h/6.0) % 1.0
    return h, s, v
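To see which branch each kind of pixel takes, here is a small sketch that calls the standard-library function on a few hand-picked colours. The sample values are my own illustrative choices, given as floats in [0, 1].

```python
import colorsys

# Hand-picked sample pixels (illustrative), channels as floats in [0, 1].
samples = {
    "red": (1.0, 0.0, 0.0),    # r is the max: first branch
    "green": (0.0, 1.0, 0.0),  # g is the max: elif branch
    "blue": (0.0, 0.0, 1.0),   # b is the max: else branch
    "gray": (0.5, 0.5, 0.5),   # minc == maxc: early return
}

hsv = {name: colorsys.rgb_to_hsv(*rgb) for name, rgb in samples.items()}

for name, (h, s, v) in hsv.items():
    print(f"{name:>5}: h={h:.3f} s={s:.3f} v={v:.3f}")
```

The gray pixel is the case to watch when vectorizing, since every division by (maxc-minc) would be a division by zero if the early return were not there.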
I have gone in with the assumption that the image being read is in (B,G,R) format, as is done with OpenCV's cv2.imread. So, let's vectorize the above-mentioned function so that we can work with all pixels at once. For vectorization, the usually preferred method is broadcasting. With it, a vectorized implementation of rgb_to_hsv would look something like this (please notice how corresponding parts from the loopy code are transferred here) -
def rgb_to_hsv_vectorized(img): # img with BGR format
    maxc = img.max(-1)
    minc = img.min(-1)

    out = np.zeros(img.shape)
    out[:,:,2] = maxc
    out[:,:,1] = (maxc-minc) / maxc

    divs = (maxc[...,None] - img)/ ((maxc-minc)[...,None])
    cond1 = divs[...,0] - divs[...,1]
    cond2 = 2.0 + divs[...,2] - divs[...,0]
    h = 4.0 + divs[...,1] - divs[...,2]
    h[img[...,2]==maxc] = cond1[img[...,2]==maxc]
    h[img[...,1]==maxc] = cond2[img[...,1]==maxc]
    out[:,:,0] = (h/6.0) % 1.0

    out[minc == maxc,:2] = 0
    return out
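The broadcasting step in the function above is the `maxc[...,None] - img` line: adding a trailing axis lets a per-pixel value of shape (H, W) combine with the (H, W, 3) image. Here is a minimal sketch of just that step on a made-up 1x2 image (the pixel values are illustrative, and floats are used to keep the arithmetic simple).

```python
import numpy as np

# Tiny made-up image: 1 row, 2 pixels, 3 channels.
img = np.array([[[10.0, 20.0, 40.0],
                 [5.0, 5.0, 25.0]]])

maxc = img.max(-1)   # shape (1, 2): per-pixel maximum
minc = img.min(-1)   # shape (1, 2): per-pixel minimum

# maxc[..., None] has shape (1, 2, 1), so it broadcasts against (1, 2, 3).
divs = (maxc[..., None] - img) / (maxc - minc)[..., None]

print(maxc.shape, maxc[..., None].shape, divs.shape)
print(divs)
```

Each slice `divs[..., k]` then plays the role of one of the scalar `rc`, `gc`, `bc` values from the per-pixel code, computed for every pixel at once.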
Runtime test
Let's time it for a standard RGB image of size (256,256), created from random integers in [0,255).
Here's a typical way to use colorsys's rgb_to_hsv on an image of pixels:
def rgb_to_hsv_loopy(img):
    out_loopy = np.zeros(img.shape)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out_loopy[i,j] = colorsys.rgb_to_hsv(img[i,j,2],img[i,j,1],img[i,j,0])
    return out_loopy
As alternatives, there are also matplotlib's and OpenCV's color conversion versions, but they seem to produce different results. For the sake of timings, let's include them anyway.
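Part of the mismatch is probably a difference in conventions rather than in the math. As far as I know, for 8-bit input OpenCV stores H as degrees divided by two (so [0, 180)) and scales S and V to [0, 255], while matplotlib expects RGB channel order and values in [0, 1]. The sketch below rescales colorsys-style HSV to the OpenCV 8-bit layout. The helper name `hsv01_to_opencv_u8` is hypothetical, and it assumes all three input channels are in [0, 1].

```python
import numpy as np

def hsv01_to_opencv_u8(hsv):
    """Rescale HSV with all channels in [0, 1] to an OpenCV-like 8-bit layout.

    Hypothetical helper for illustration: H -> [0, 180), S and V -> [0, 255].
    """
    hsv = np.asarray(hsv, dtype=np.float64)
    scale = np.array([180.0, 255.0, 255.0])
    out = np.rint(hsv * scale)
    out[..., 0] %= 180   # hue wraps around, so 180 maps back to 0
    return out.astype(np.uint8)

# Pure green in colorsys terms: h = 1/3, s = 1, v = 1.
green = np.array([[[1.0 / 3.0, 1.0, 1.0]]])
green_u8 = hsv01_to_opencv_u8(green)
print(green_u8)
```

OpenCV's own rounding may differ by one in places, so treat this as a way to compare outputs approximately rather than bit for bit.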
In [69]: img = np.random.randint(0,255,(256,256,3)).astype('uint8')
In [70]: %timeit rgb_to_hsv_loopy(img)
1 loops, best of 3: 987 ms per loop
In [71]: %timeit matplotlib.colors.rgb_to_hsv(img)
10 loops, best of 3: 22.7 ms per loop
In [72]: %timeit cv2.cvtColor(img,cv2.COLOR_BGR2HSV)
1000 loops, best of 3: 1.23 ms per loop
In [73]: %timeit rgb_to_hsv_vectorized(img)
100 loops, best of 3: 13.4 ms per loop
In [74]: np.allclose(rgb_to_hsv_vectorized(img),rgb_to_hsv_loopy(img))
Out[74]: True # Making sure vectorized version replicates intended behavior
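The check above can be reproduced on a small image with the self-contained sketch below. It uses a variant of the vectorized function that I have written for illustration (the name `rgb_to_hsv_vectorized_safe` is not from the original answer): it swaps the zero divisors on gray pixels for a placeholder, which should avoid the divide-by-zero RuntimeWarnings the version above can emit, and it uses nested `np.where` in place of masked assignment.

```python
import colorsys
import numpy as np

def rgb_to_hsv_vectorized_safe(img):
    """BGR image (H, W, 3) -> HSV, mirroring colorsys branch by branch."""
    img = np.asarray(img, dtype=np.float64)
    maxc = img.max(-1)
    minc = img.min(-1)
    span = maxc - minc
    gray = span == 0                         # colorsys returns h = s = 0 here
    safe_span = np.where(gray, 1.0, span)    # placeholder divisor for gray pixels
    safe_max = np.where(maxc == 0, 1.0, maxc)

    divs = (maxc[..., None] - img) / safe_span[..., None]
    bc, gc, rc = divs[..., 0], divs[..., 1], divs[..., 2]   # BGR channel order

    # Nested np.where reproduces the if / elif / else priority of colorsys.
    h = np.where(img[..., 2] == maxc, bc - gc,
                 np.where(img[..., 1] == maxc, 2.0 + rc - bc, 4.0 + gc - rc))

    out = np.empty(img.shape)
    out[..., 0] = np.where(gray, 0.0, (h / 6.0) % 1.0)
    out[..., 1] = np.where(gray, 0.0, span / safe_max)
    out[..., 2] = maxc
    return out

gen = np.random.default_rng(0)
img = gen.integers(0, 256, size=(8, 8, 3)).astype(np.uint8)
img[0, 0] = (0, 0, 0)   # force a black pixel
img[0, 1] = (7, 7, 7)   # force a gray pixel

# Reference result: colorsys applied pixel by pixel, BGR -> (r, g, b).
expected = np.array([[colorsys.rgb_to_hsv(float(p[2]), float(p[1]), float(p[0]))
                      for p in row] for row in img])
result = rgb_to_hsv_vectorized_safe(img)
print(np.allclose(result, expected))
```

The forced black and gray pixels are there because a random image rarely contains them, and they are exactly the inputs that hit the division-by-zero path.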