I currently have a class which stores a list of lists. The inner lists are not of the same length. I made the class subscriptable with the following code (possibly not the best way of doing this, and perhaps overly fancy).
class MyClass:
def __init__(self):
#
self.instructions = []
# for demo purposes
self.instructions.append([0, 1, 2])
self.instructions.append([3, 4, 5, 6])
self.instructions.append([7, 8])
def __getitem__(self, ind):
if ind >= 0:
iterator = self.instructions.__iter__()
compare = int.__gt__
inc = int.__add__
else:
iterator = reversed(self.instructions)
compare = int.__le__
inc = int.__sub__
s = 0
for tp in iterator:
L = len(tp)
if compare(inc(s, L), ind):
return tp[ind-s]
else:
s = inc(s, L)
else:
raise IndexError('index out of range')
This works. For instance
>>> x = MyClass()
>>> x[5]
5
>>> x[-5]
4
Now, I need to modify the class so it now stores two list of lists. The two lists are instructions
and annotations
, and both have the same length. But len(instructions[i])
does not have to equal len(annotations[i])
.
class NewClass:
def __init__(self):
#
self.instructions = []
self.annotations = []
# for demo purposes
self.instructions.append([0, 1, 2])
self.instructions.append([5, 6, 7, 8])
self.instructions.append([12, 13])
self.annotations.append([3, 4])
self.annotations.append([9, 10, 11])
self.annotations.append([14, 15, 16])
def __getitem__(self, ind):
pass
I want to make this subscriptable, with the order of elements oscillating between the instructions
sublists and the annotations
sublists. The demo data indicates the subscripting order. I want
>>> y = NewClass()
>>> y[9]
9
>>> y[-4]
13
What's an efficient way of doing this?
I could write a solution where I alternatively iterate through the two sublists. But I feel like I am straying far from the correct solution. I am also looking for a non-for-loop solution as I am dealing with long lists.
The standard balance between storage cost and runtime cost for random access into the (non-stored) concatenation of several arrays is to store a table of the offsets of the beginning of each list (i.e., the sum of the length of every list before it) and use binary search on that table:
import itertools
import bisect
class Index:
def __init__(self,ll):
self.ll = ll
self.oo = list(itertools.accumulate(map(len,ll), initial=0))
def __getitem__(self, i):
if i < 0:
i += self.oo[-1]
j = bisect.bisect(self.oo,i)
if not 0 < j <= len(self.ll):
raise IndexError
return self.ll[j-1][i-self.oo[j-1]]
def __iter__(self):
return itertools.chain.from_iterable(self.ll)
# Example:
i = Index(
[[9,1,7],
[3,0],
[],
[4,4,4,2]]
)
assert i[4]==0 and i[8]==2
The j-1
is because the initial 0 causes an i
of 0 to be assigned the insertion point of 1. You can omit the ,initial=0
(and in fact the last element of self.oo
as well) at the cost of more complicated code for the edge/error cases. __iter__
is provided because it is asymptotically faster than indexing with successive integers, each of which must be subjected to the binary search even though usually the same sublist will be found.
Obviously extending this to support the interleaving of two lists (of equal length) is trivial: sum the interleaved lengths and then use divmod(j-1,2)
to obtain the index into a list and the selection between the two lists (respectively).
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With