I'm dealing with very large image arrays of uint16 data that I would like to downscale and convert to uint8.
My initial way of doing this caused a MemoryError because of an intermediary float64 array:
import numpy

img = numpy.ones((29632, 60810, 3), dtype=numpy.uint16)
if img.dtype == numpy.uint16:
    multiplier = numpy.iinfo(numpy.uint8).max / numpy.iinfo(numpy.uint16).max
    img = (img * multiplier).astype(numpy.uint8, order="C")
I then tried to do the multiplication in place, the following way:
if img.dtype == numpy.uint16:
    multiplier = numpy.iinfo(numpy.uint8).max / numpy.iinfo(numpy.uint16).max
    img *= multiplier
    img = img.astype(numpy.uint8, order="C")
But I run into the following error:
TypeError: Cannot cast ufunc multiply output from dtype('float64') to dtype('uint16') with casting rule 'same_kind'
Do you know of a way to perform this operation while minimizing the memory footprint?
Where can I change the casting rule mentioned in the error message?
Q : "Do you know of a way to perform this operation while minimizing the memory footprint?"
First, let's get the [SPACE]-domain sizing right. The base array is a 29k6 x 60k8 x RGB x 2 B in-memory object:
>>> 29632 * 60810 * 3 * 2 / 1E9 ~ 10.81 [GB]
having eaten some 11 [GB] of RAM.
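The same arithmetic can be repeated for the other dtypes involved, which shows why the float64 intermediate is the problem. This is only a sketch: size_gb is a hypothetical helper written for this post, not a numpy function, and it computes sizes from the shape without allocating anything.

```python
import numpy

shape = (29632, 60810, 3)  # image shape from the question


def size_gb(shape, dtype):
    """Size in GB (10**9 bytes) of an array with this shape and dtype.

    Hypothetical helper: nothing is allocated, only the item count and
    the per-item size are multiplied.
    """
    n_items = 1
    for dim in shape:
        n_items *= dim
    return n_items * numpy.dtype(dtype).itemsize / 1e9


# uint16 is the input, float64 the temporary, uint8 the wanted result.
for dtype in (numpy.uint16, numpy.float64, numpy.uint8):
    print(numpy.dtype(dtype).name, round(size_gb(shape, dtype), 2), "GB")
```

The float64 temporary is four times the size of the uint16 input, and it has to coexist with that input while `img * multiplier` is evaluated.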
Any operation will need some additional space. If a TB-class [SPACE]-domain were available for purely in-memory, numpy-vectorised tricks, we would be done here.
Given that the O/P's task was to minimise the memory footprint, moving all the arrays and their operations into numpy.memmap() objects will solve it.
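A minimal sketch of how the memmap idea could look, under several assumptions: the file names, the small stand-in shape, the chunk size and the helper downscale_to_uint8 are all made up for illustration, and the real uint16 data would already have to live in a raw file on disk. The conversion runs one block of rows at a time, so only that block is ever promoted to float64.

```python
import os
import tempfile

import numpy


def downscale_to_uint8(src, dst, rows_per_chunk=256):
    """Scale uint16 `src` into uint8 `dst`, one block of rows at a time."""
    multiplier = numpy.iinfo(numpy.uint8).max / numpy.iinfo(numpy.uint16).max
    for start in range(0, src.shape[0], rows_per_chunk):
        stop = start + rows_per_chunk
        # The float64 temporary covers only rows start..stop, not the image.
        dst[start:stop] = (src[start:stop] * multiplier).astype(numpy.uint8)


# Small stand-in shape so the sketch runs anywhere; the question's image
# would use (29632, 60810, 3).
shape = (64, 48, 3)
tmpdir = tempfile.mkdtemp()

# Disk-backed input, filled with deterministic test data.
src = numpy.memmap(os.path.join(tmpdir, "src.dat"),
                   dtype=numpy.uint16, mode="w+", shape=shape)
src[:] = (numpy.arange(src.size, dtype=numpy.uint16) * 7).reshape(shape)

# Disk-backed output of the target dtype.
dst = numpy.memmap(os.path.join(tmpdir, "dst.dat"),
                   dtype=numpy.uint8, mode="w+", shape=shape)

downscale_to_uint8(src, dst, rows_per_chunk=16)
dst.flush()  # push the converted data out to the backing file
```

The chunk size trades speed against peak memory: each block needs roughly rows_per_chunk x width x 3 x 8 bytes for its float64 temporary. The temporary files are left in place here and would need to be removed once the memmaps are no longer in use.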
I finally found a solution that works after some reading of the numpy ufunc documentation.
multiplier = numpy.iinfo(numpy.uint8).max / numpy.iinfo(numpy.uint16).max
numpy.multiply(img, multiplier, out=img, casting="unsafe")
img = img.astype(numpy.uint8, order="C")
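For anyone who wants to see the casting rule in action on a small array, here is a sketch. The first call reproduces the error with the default rule; the second is a variant I am assuming is acceptable for this use case, not something taken from the code above: it passes a preallocated uint8 array as out, so the result lands in its final dtype without the separate astype step. The array contents and the name out8 are made up for the example.

```python
import numpy

# Small stand-in for the real image.
img = (numpy.arange(12, dtype=numpy.uint16) * 5000).reshape(2, 2, 3)
multiplier = numpy.iinfo(numpy.uint8).max / numpy.iinfo(numpy.uint16).max

# With the default casting rule ("same_kind"), numpy refuses to write the
# float64 result of the multiplication back into a uint16 array.
try:
    numpy.multiply(img, multiplier, out=img)
    raised = False
except TypeError:
    raised = True

# Variant: write straight into a uint8 output array. casting="unsafe"
# allows the float64 -> uint8 conversion, which truncates toward zero.
out8 = numpy.empty(img.shape, dtype=numpy.uint8)
numpy.multiply(img, multiplier, out=out8, casting="unsafe")
```

The casting keyword is accepted by ufuncs in general, so this is also the place to change the rule named in the error message. Note that "unsafe" silently discards the fractional part, which is what astype(numpy.uint8) does as well.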
I should have found this earlier, but it's not an easy read if you are not familiar with some of the technical vocabulary.