I have the following code, which is explained in its docstring. How do I stop it from flagging single letters with a 1, which turns a single character into two characters in the final compressed string?
For example, it currently turns AAABBBBCDDDD into A3B4C1D4, but I want it to produce A3B4CD4. I'm new at this, so any comments are greatly appreciated.
class StringCompression(object):
    '''
    Run Length Compression Algorithm: Given a string of letters, such as
    nucleotide sequences, compress it using numbers to flag contiguous repeats.
    Ex: AAABBBBCDDDD -> A3B4C1D4
    >>> x = StringCompression('AAAAbC')
    >>> x.compress()
    'A4bC'
    '''
    def __init__(self, string):
        self.string = string

    def compress(self):
        '''Executes compression on the object.'''
        run = ''
        length = len(self.string)
        if length == 0:
            return ''
        if length == 1:
            return self.string #+ '1'
        last = self.string[0]
        count = 1
        i = 1
        while i < length:
            if self.string[i] == self.string[i - 1]:
                count += 1
            else:
                run = run + self.string[i - 1] + str(count)
                count = 1
            i += 1
        run = (run + self.string[i - 1] + str(count))
        return run
Here's an alternative solution using itertools.groupby and a generator:
from itertools import chain, groupby
x = 'AAABBBBCDDDD'
def compressor(s):
    for i, j in groupby(s):
        size = len(list(j))
        yield (i, '' if size==1 else str(size))
res = ''.join(chain.from_iterable(compressor(x)))
print(res)
A3B4CD4
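If you would rather keep the explicit loop from the question, the same idea should work there too: append the count only when it is greater than 1. Below is a minimal sketch of that change, written as a standalone function. The name compress_runs is just an illustration and does not appear in the original code.

```python
def compress_runs(text):
    """Run-length encode text, omitting the count for runs of length 1."""
    if not text:
        return ''
    pieces = []
    count = 1
    # Walk over adjacent character pairs (previous, current).
    for prev, cur in zip(text, text[1:]):
        if cur == prev:
            count += 1
        else:
            # The run of `prev` just ended; only write the count if it exceeds 1.
            pieces.append(prev + (str(count) if count > 1 else ''))
            count = 1
    # Flush the final run, which the loop never closes.
    pieces.append(text[-1] + (str(count) if count > 1 else ''))
    return ''.join(pieces)

print(compress_runs('AAABBBBCDDDD'))  # A3B4CD4
```

Note that dropping the 1 only stays unambiguous when the input contains no digits, which holds for letter-only data such as nucleotide sequences.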