This question refers to this problem on lintcode. I have a working solution, but it takes too long for the huge testcase. I am wondering how can it be improved? Maybe I can decrease the number of comparisons I make in the outer loop.
class Solution:
    # @param strs: A list of strings
    # @return: A list of strings
    def anagrams(self, strs):
        # write your code here
        ret=set()
        for i in range(0,len(strs)):
            for j in range(i+1,len(strs)):
                if i in ret and j in ret:
                    continue
                if Solution.isanagram(strs[i],strs[j]):
                    ret.add(i)
                    ret.add(j)
        return [strs[i] for i in list(ret)]
    @staticmethod
    def isanagram(s, t):
        if len(s)!=len(t):
            return False
        chars={}
        for i in s:
            if i in chars:
                chars[i]+=1
            else:
                chars[i]=1
        for i in t:
            if i not in chars:
                return False
            else:
                chars[i]-=1
                if chars[i]<0:
                    return False
        for i in chars:
            if chars[i]!=0:
                return False
        return True
Update: Just to add, not looking for built-in pythonic solutions such as using Counter which are already optimized. Have added Mike's suggestions, but still exceeding time-limit.
Skip strings you already placed in the set. Don't test them again.
# @param strs: A list of strings
# @return: A list of strings
def anagrams(self, strs):
    # write your code here
    ret=set()
    for i in range(0,len(strs)):
        for j in range(i+1,len(strs)):
            # If both anagrams exist in set, there is no need to compare them.
            if i in ret and j in ret:
                continue
            if Solution.isanagram(strs[i],strs[j]):
                ret.add(i)
                ret.add(j)
    return [strs[i] for i in list(ret)]
You can also do a length comparison in your anagram test before iterating through the letters. Whenever the strings aren't the same length, they can't be anagrams anyway. Also, when a counter in chars reaches -1 when comparing values in t, just return false. Don't iterate through chars again.
@staticmethod
def isanagram(s, t):
    # Test strings are the same length
    if len(s) != len(t):
        return False
    chars={}
    for i in s:
        if i in chars:
            chars[i]+=1
        else:
            chars[i]=1
    for i in t:
        if i not in chars:
            return False
        else:
            chars[i]-=1
            # If this is below 0, return false
            if chars[i] < 0:
                return False
    for i in chars:
        if chars[i]!=0:
            return False
    return True
Instead of comparing all pairs of strings, you can just create a dictionary (or collections.defaultdict) mapping each of the letter-counts to the words having those counts. For getting the letter-counts, you can use collections.Counter. Afterwards, you just have to get the values from that dict. If you want all words that are anagrams of any other words, just merge the lists that have more than one entry.
strings = ["cat", "act", "rat", "hut", "tar", "tact"]
anagrams = defaultdict(list)
for s in strings:
    anagrams[frozenset(Counter(s).items())].append(s)
print([v for v in anagrams.values()])
# [['hut'], ['rat', 'tar'], ['cat', 'act'], ['tact']]
print([x for v in anagrams.values() if len(v) > 1 for x in v])
# ['cat', 'act', 'rat', 'tar']
Of course, if you prefer not to use builtin functionality you can with just a few more lines just as well use a regular dict instead of defaultdict and write your own Counter, similar to what you have in your isanagram method, just without the comparison part.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With