I am having some trouble with the piece of code below.
Input: li is a nested list as below:
li = [['>0123456789 mouse gene 1\n', 'ATGTTGGGTT/CTTAGTTG\n', 'ATGGGGTTCCT/A\n'], ['>9876543210 mouse gene 2\n', 'ATTTGGTTTCCT\n', 'ATTCAATTTTAAGGGGGGGG\n']]
Using the function below, my desired output is simply the 2nd to the 9th digits following '>' under the condition that the number of '/' present in the entire sublist is > 1.
Instead, my code gives the digits for all entries, and it gives them multiple times. I therefore assume something is wrong with my counter and my for loop, but I can't quite figure it out.
Any help is greatly appreciated.
import os
cwd = os.getcwd()

def func_one():
    outp = open('something.txt', 'w') #output file
    li = []
    for i in os.listdir(cwd):
        if i.endswith('.ext'):
            inp = open(i, 'r').readlines()
            li.append(inp)
    count = 0
    lis = []
    for i in li:
        for j in i:
            for k in j[1:]: #ignore first entry in sublist
                if k == '/':
                    count += 1
                if count > 1:
                    lis.append(i[0][1:10])
    next_func(lis, outp)
Thanks, S :-)
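Your indentation is probably wrong: the count > 1 check should sit within the for j in i loop, not within the innermost loop that visits every single character of j[1:], where it fires again for each character once the threshold has been reached. Note also that count is never reset, so it carries over from one sublist to the next, and that j[1:] drops the first character of each line rather than the first entry of the sublist.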
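As a rough sketch of how the original loop could be repaired while keeping its explicit structure: here the counter is reset for every sublist, the header line is skipped with i[1:], and the check is placed after the loop over lines so that each ID is appended at most once. The function name ids_with_multiple_slashes is made up for illustration, and the file reading and the call to next_func are left out.

```python
# Sample input copied from the question.
li = [['>0123456789 mouse gene 1\n', 'ATGTTGGGTT/CTTAGTTG\n', 'ATGGGGTTCCT/A\n'],
      ['>9876543210 mouse gene 2\n', 'ATTTGGTTTCCT\n', 'ATTCAATTTTAAGGGGGGGG\n']]

def ids_with_multiple_slashes(li):
    # Hypothetical helper name, not from the original post.
    lis = []
    for i in li:
        count = 0                # start from zero for each sublist
        for j in i[1:]:          # skip the header line of the sublist
            for k in j:          # look at every character of the line
                if k == '/':
                    count += 1
        if count > 1:            # checked once per sublist
            lis.append(i[0][1:10])
    return lis

print(ids_with_multiple_slashes(li))
```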
Also, here's a much easier way to do the same thing:
def count_slashes(items):
    # Total number of '/' characters across all strings in items
    return sum(item.count('/') for item in items)

for item in li:
    if count_slashes(item[1:]) > 1:
        print(item[0][1:10])
Or, if you need the IDs in a list:
result = [item[0][1:10] for item in li if count_slashes(item[1:]) > 1]
Python list comprehensions and generator expressions are really powerful tools. Try to learn how to use them, as they make your life much simpler. The count_slashes function above uses a generator expression, and my last code snippet uses a list comprehension to construct the result list in a nice and concise way.
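To see the two constructs at work on the sample data from the question, here is a small self-contained sketch. The counts list is only an illustration added here to show the intermediate per-sublist totals; it is not part of the snippets above.

```python
# Sample input copied from the question.
li = [['>0123456789 mouse gene 1\n', 'ATGTTGGGTT/CTTAGTTG\n', 'ATGGGGTTCCT/A\n'],
      ['>9876543210 mouse gene 2\n', 'ATTTGGTTTCCT\n', 'ATTCAATTTTAAGGGGGGGG\n']]

def count_slashes(items):
    # Generator expression: the per-string counts are fed to sum()
    # one at a time, without building an intermediate list.
    return sum(item.count('/') for item in items)

# List comprehension: builds a list of slash totals, one per sublist
# (header line excluded). Illustrative only.
counts = [count_slashes(item[1:]) for item in li]

# List comprehension with a filter: keep the ID only when the total is > 1.
result = [item[0][1:10] for item in li if count_slashes(item[1:]) > 1]

print(counts)   # [2, 0]
print(result)   # ['012345678']
```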