Making this C array processing code more python (and even numpy)

Question

I'm trying to get my head around the amazing list processing abilities of Python (and eventually NumPy). I'm converting some C code I wrote to Python.

I have a text data file where the first row is a header, and then every odd row is my input data and every even row is my output data. All data is space-separated. I'm quite chuffed that I managed to read all the data into lists using nested list comprehensions. Amazing stuff.

with open('data.txt', 'r') as f:
    # get all lines as a list of strings
    lines = list(f)

    # convert header row to list of ints and get info
    header = list(map(int, lines[0].split()))  # list() so indexing also works on Python 3
    num_samples = header[0]
    input_dim = header[1]
    output_dim = header[2]
    del header    

    # bad ass list comprehensions 
    inputs = [[float(x) for x in l.split()] for l in lines[1::2]]
    outputs = [[float(x) for x in l.split()] for l in lines[2::2]]
    del lines  # comprehension variables x and l do not leak on Python 3

Then I want to produce a new list where each element is a function of the corresponding input-output pair. I couldn't figure out how to do this with any Python-specific optimizations. Here it is in C-style Python:

import math

# calculate position
pos_list = []
pos_y = 0
for i in range(num_samples):
    pantilt = outputs[i]  # note: same list object, so outputs is modified in place
    target = inputs[i]

    if pantilt[0] > 90:
        pantilt[0] -= 180
        pantilt[1] *= -1
    elif pantilt[0] < -90:
        pantilt[0] += 180
        pantilt[1] *= -1

    tan_pan = math.tan(math.radians(pantilt[0]))
    tan_tilt = math.tan(math.radians(pantilt[1]))

    pos = [0, pos_y, 0]
    pos[2] = tan_tilt * (target[1] - pos[1]) / math.sqrt(tan_pan * tan_pan + 1)
    pos[0] = pos[2] * tan_pan
    pos[0] += target[0]
    pos[2] += target[2]
    pos_list.append(pos)
del pantilt, target, tan_pan, tan_tilt, pos, pos_y

I tried to do it with a comprehension or map, but couldn't figure out how to:

draw from two different lists (both input and output) for each element of the pos_list array
put the body of the algorithm in the comprehension. Would it have to be a separate function, or is there a funky way of using lambdas for this?

Would it even be possible to do this with no loops at all, just stick it in numpy and vectorize the whole thing?

Divakar · Accepted Answer

One vectorized approach using boolean indexing (masks):

import numpy as np

def mask_vectorized(inputs,outputs,pos_y):
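    # Accept nested lists as well as arrays; np.array makes a float copy,
    # so editing pantilt_2d below leaves the caller's outputs untouched
    inputs = np.asarray(inputs, dtype=float)
    pantilt_2d = np.array(outputs, dtype=float)[:, :2]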

    # Get masks corresponding to the IF conditional statements in the original code
    mask_col0_lt = pantilt_2d[:,0]<-90
    mask_col0_gt = pantilt_2d[:,0]>90

    # Edit the first column as per the statements in original code
    pantilt_2d[:,0][mask_col0_gt] -= 180
    pantilt_2d[:,0][mask_col0_lt] += 180

    # Edit the second column as per the statements in original code
    pantilt_2d[ mask_col0_lt | mask_col0_gt,1] *= -1

    # Get vectorized tan_pan and tan_tilt 
    tan_pan_tilt = np.tan(np.radians(pantilt_2d))

    # Vectorized calculation for: "tan_tilt * (target[1] .." from original code 
    V = (tan_pan_tilt[:,1]*(inputs[:,1] - pos_y))/np.sqrt((tan_pan_tilt[:,0]**2)+1)

    # Setup output numpy array
    pos_array_vectorized = np.empty((len(inputs), 3))  # size from the data, not a global

    # Put in values into columns of output array
    pos_array_vectorized[:,0] = inputs[:,0] + tan_pan_tilt[:,0]*V
    pos_array_vectorized[:,1] = pos_y
    pos_array_vectorized[:,2] = inputs[:,2] + V

    # Convert to list, if so desired for the final output
    # (keeping as numpy array could boost up the performance further)
    return pos_array_vectorized.tolist()
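
The function above indexes with inputs[:,1] and outputs[:,:2], so it expects 2-D NumPy arrays rather than the nested lists built in the question. A minimal sketch of loading the file straight into arrays, assuming the layout described in the question (a header, then alternating input and output rows). The helper name load_samples and the in-memory sample text are made up for illustration:

```python
import io
import numpy as np

def load_samples(f):
    """Read header + alternating input/output rows into two float arrays.

    Hypothetical helper; assumes the header holds
    num_samples, input_dim, output_dim in that order, as in the question.
    """
    # Drop blank lines so a trailing newline does not produce an empty row
    lines = [line for line in f.read().splitlines() if line.strip()]
    num_samples, input_dim, output_dim = (int(v) for v in lines[0].split()[:3])
    inputs = np.array([[float(x) for x in row.split()] for row in lines[1::2]])
    outputs = np.array([[float(x) for x in row.split()] for row in lines[2::2]])
    # Sanity-check the shapes against the header
    assert inputs.shape == (num_samples, input_dim)
    assert outputs.shape == (num_samples, output_dim)
    return inputs, outputs

# Tiny made-up file: 2 samples, 3 input values and 2 output values per sample
sample = io.StringIO("2 3 2\n1 2 3\n10 20\n4 5 6\n30 40\n")
inputs, outputs = load_samples(sample)
print(inputs.shape, outputs.shape)
```

With a real file, the same helper would be called inside `with open('data.txt') as f:`.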

Runtime tests

In [415]: # Parameters and setup input arrays
     ...: num_samples = 1000
     ...: outputs = np.random.randint(-180,180,(num_samples,5))
     ...: inputs = np.random.rand(num_samples,6)
     ...: pos_y = 3.4
     ...: 

In [416]: %timeit original(inputs,outputs,pos_y)
100 loops, best of 3: 2.44 ms per loop

In [417]: %timeit mask_vectorized(inputs,outputs,pos_y)
10000 loops, best of 3: 181 µs per loop
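
The timing session calls original(...), which the answer does not show; presumably it is the loop from the question wrapped in a function. Below is one way such a wrapper might look, plus a compact np.where variant of the same calculation, so the two can be checked against each other. The names original_loop and where_vectorized are made up here, and the random test data is illustrative only:

```python
import math
import numpy as np

def original_loop(inputs, outputs, pos_y):
    """The question's loop as a function (hypothetical wrapper)."""
    pos_list = []
    for target, row in zip(inputs, outputs):
        pan, tilt = float(row[0]), float(row[1])  # local copies, outputs stays intact
        if pan > 90:
            pan -= 180
            tilt *= -1
        elif pan < -90:
            pan += 180
            tilt *= -1
        tan_pan = math.tan(math.radians(pan))
        tan_tilt = math.tan(math.radians(tilt))
        z = tan_tilt * (target[1] - pos_y) / math.sqrt(tan_pan * tan_pan + 1)
        pos_list.append([z * tan_pan + target[0], pos_y, z + target[2]])
    return pos_list

def where_vectorized(inputs, outputs, pos_y):
    """Same arithmetic on whole columns, using np.where for the if/elif."""
    inputs = np.asarray(inputs, dtype=float)
    outputs = np.asarray(outputs, dtype=float)
    pan, tilt = outputs[:, 0], outputs[:, 1]
    flip = (pan > 90) | (pan < -90)  # rows where either branch fires
    pan = np.where(pan > 90, pan - 180, np.where(pan < -90, pan + 180, pan))
    tilt = np.where(flip, -tilt, tilt)
    tan_pan = np.tan(np.radians(pan))
    tan_tilt = np.tan(np.radians(tilt))
    z = tan_tilt * (inputs[:, 1] - pos_y) / np.sqrt(tan_pan ** 2 + 1)
    return np.column_stack(
        [inputs[:, 0] + z * tan_pan, np.full(len(z), pos_y), inputs[:, 2] + z]
    )

# Illustrative random data, seeded so the check is repeatable
rng = np.random.default_rng(0)
inputs = rng.random((1000, 6))
outputs = rng.uniform(-180, 180, (1000, 5))
pos_y = 3.4

expected = np.array(original_loop(inputs, outputs, pos_y))
result = where_vectorized(inputs, outputs, pos_y)
print(np.allclose(expected, result))
```

Checking a vectorized rewrite against the plain loop like this is cheap insurance before trusting the speed-up.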

Ami Tavory · Answer

Suppose you read your file into a list, like so:

with open('data.txt', 'r') as f:
    lines = f.readlines()

The header is this:

lines[0]

The even lines are:

even = lines[1:][::2]

and the odd lines are:

odd = lines[2:][::2]

Now you can pair up these two lists using itertools.izip (Python 2 only; in Python 3 the built-in zip behaves the same way):

itertools.izip(even, odd)

This is a sort of lazy, list-like thingy (you can loop over it, or just write list( ... ) around it to make it into a true list); each entry is one pair of your input-output data.
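
As a sketch of how the pairing might be used to build the position list from the question: on Python 3 the built-in zip plays the role of itertools.izip, and the per-sample calculation can live in an ordinary function called from a list comprehension. The names parse and position and the sample lines are made up for illustration; the arithmetic is copied from the loop in the question:

```python
import math

def parse(line):
    """Split one space-separated row into floats."""
    return [float(x) for x in line.split()]

def position(target, pantilt, pos_y=0.0):
    """Position for one input/output pair (hypothetical helper)."""
    pan, tilt = pantilt[0], pantilt[1]
    if pan > 90:
        pan -= 180
        tilt *= -1
    elif pan < -90:
        pan += 180
        tilt *= -1
    tan_pan = math.tan(math.radians(pan))
    tan_tilt = math.tan(math.radians(tilt))
    z = tan_tilt * (target[1] - pos_y) / math.sqrt(tan_pan * tan_pan + 1)
    return [z * tan_pan + target[0], pos_y, z + target[2]]

# Made-up file contents: header, then alternating input and output rows
lines = ["2 3 2", "1 2 3", "0 45", "4 5 6", "135 45"]
even = lines[1::2]  # input rows, same as lines[1:][::2]
odd = lines[2::2]   # output rows, same as lines[2:][::2]

# zip yields one (input row, output row) pair per sample
pos_list = [position(parse(i), parse(o)) for i, o in zip(even, odd)]
print(pos_list)
```

A named function keeps the comprehension readable; a lambda could hold a single expression, but not the if/elif branch above.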
