The question is, how can I remove elements that appear more often than once in an array completely. Below you see an approach that is very slow when it comes to bigger arrays. Any idea of doing this the numpy-way? Thanks in advance.
import numpy as np
count = 0
result = []
input = np.array([[1,1], [1,1], [2,3], [4,5], [1,1]]) # array with points [x, y]
# count appearance of elements with same x and y coordinate
# append to result if element appears just once
for i in input:
for j in input:
if (j[0] == i [0]) and (j[1] == i[1]):
count += 1
if count == 1:
result.append(i)
count = 0
print np.array(result)
Again to be clear: How can I remove elements appearing more than once concerning a certain attribute from an array/list ?? Here: list with elements of length 6, if first and second entry of every elements both appears more than once in the list, remove all concerning elements from list. Hope I'm not to confusing. Eumiro helped me a lot on this, but I don't manage to flatten the output list as it should be :(
import numpy as np
import collections
input = [[1,1,3,5,6,6],[1,1,4,4,5,6],[1,3,4,5,6,7],[3,4,6,7,7,6],[1,1,4,6,88,7],[3,3,3,3,3,3],[456,6,5,343,435,5]]
# here, from input there should be removed input[0], input[1] and input[4] because
# first and second entry appears more than once in the list, got it? :)
d = {}
for a in input:
d.setdefault(tuple(a[:2]), []).append(a[2:])
outputDict = [list(k)+list(v) for k,v in d.iteritems() if len(v) == 1 ]
result = []
def flatten(x):
if isinstance(x, collections.Iterable):
return [a for i in x for a in flatten(i)]
else:
return [x]
# I took flatten(x) from http://stackoverflow.com/a/2158522/1132378
# And I need it, because output is a nested list :(
for i in outputDict:
result.append(flatten(i))
print np.array(result)
So, this works, but it's impracticable with big lists. First I got RuntimeError: maximum recursion depth exceeded in cmp and after applying sys.setrecursionlimit(10000) I got Segmentation fault how could I implement Eumiros solution for big lists > 100000 elements?
np.array(list(set(map(tuple, input))))
returns
array([[4, 5],
[2, 3],
[1, 1]])
UPDATE 1: If you want to remove the [1, 1] too (because it appears more than once), you can do:
from collections import Counter
np.array([k for k, v in Counter(map(tuple, input)).iteritems() if v == 1])
returns
array([[4, 5],
[2, 3]])
UPDATE 2: with input=[[1,1,2], [1,1,3], [2,3,4], [4,5,5], [1,1,7]]:
input=[[1,1,2], [1,1,3], [2,3,4], [4,5,5], [1,1,7]]
d = {}
for a in input:
d.setdefault(tuple(a[:2]), []).append(a[2])
d is now:
{(1, 1): [2, 3, 7],
(2, 3): [4],
(4, 5): [5]}
so we want to take all key-value pairs, that have single values and re-create the arrays:
np.array([k+tuple(v) for k,v in d.iteritems() if len(v) == 1])
returns:
array([[4, 5, 5],
[2, 3, 4]])
UPDATE 3: For larger arrays, you can adapt my previous solution to:
import numpy as np
input = [[1,1,3,5,6,6],[1,1,4,4,5,6],[1,3,4,5,6,7],[3,4,6,7,7,6],[1,1,4,6,88,7],[3,3,3,3,3,3],[456,6,5,343,435,5]]
d = {}
for a in input:
d.setdefault(tuple(a[:2]), []).append(a)
np.array([v for v in d.itervalues() if len(v) == 1])
returns:
array([[[456, 6, 5, 343, 435, 5]],
[[ 1, 3, 4, 5, 6, 7]],
[[ 3, 4, 6, 7, 7, 6]],
[[ 3, 3, 3, 3, 3, 3]]])
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With