I am writing a hangman game in Python. To match a guessed character against the word, I am using the index function to get the character's location. For example:
word = 'COMPUTER'
user_input = raw_input('Enter a character :') # say 'T' is given here
if user_input in word:
    print "\nThe Character %c is present in the word \n" % user_input
    word_dict[word.index(user_input)] = user_input
# so the output will look like
{0: '_', 1: '_', 2: '_', 3: '_', 4: '_', 5: 'T', 6: '_', 7: '_'}
My problem arises with repeated characters.
# Another example
>>> 'CARTOON'.index('O')
4
How do I get the index of the second 'O'? Since I have already built on this 'index' logic, I would like to keep going this way.
As per the str.index docs, the signature looks like this:
str.index(sub[, start[, end]])
The second parameter is the index to start searching from. So you can pass the index of the first match plus 1 to get the index of the next match.
i = 'CARTOON'.index('O')
print 'CARTOON'.index('O', i + 1)
Output
5
The above code can be written like this
data = 'CARTOON'
print data.index('O', data.index('O') + 1)
You can even have this as a utility function, like this
def get_second_index(input_string, sub_string):
    return input_string.index(sub_string, input_string.index(sub_string) + 1)
print get_second_index("CARTOON", "O")
Note: If the substring is not found at least twice, this will raise ValueError.
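If an exception is not wanted here, one option is str.find, which accepts the same start argument but returns -1 when nothing is found. A minimal sketch (the name find_second_index is made up for this example):

```python
def find_second_index(input_string, sub_string):
    """Return the index of the second occurrence, or -1 if there is none."""
    first = input_string.find(sub_string)
    if first == -1:
        # not even one occurrence
        return -1
    # search again, starting just past the first match
    return input_string.find(sub_string, first + 1)


second_o = find_second_index("CARTOON", "O")   # 5
second_t = find_second_index("COMPUTER", "T")  # -1, 'T' occurs only once
```

The caller then has to check for -1 explicitly, which is the usual trade-off between find and index.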
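A more general version returns the index of the nth occurrence:
def get_index(input_string, sub_string, ordinal):
    # ordinal is 1-based: 1 means the first occurrence
    if ordinal < 1:
        raise ValueError("ordinal {} - is invalid".format(ordinal))
    current = -1
    for i in range(ordinal):
        # index raises ValueError if there are fewer than ordinal occurrences
        current = input_string.index(sub_string, current + 1)
    return current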
print get_index("AAABBBCCCC", "C", 4)
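Coming back to the hangman case, the same start argument can be used to walk through every occurrence of the guessed character and fill them all in at once. This is only a sketch: the function name reveal is made up here, word_dict is assumed to map each position to '_' as in the question, and the guess is assumed to be a single non-empty character.

```python
def reveal(word, guess, word_dict):
    """Write guess into word_dict at every position where it occurs in word."""
    start = 0
    while True:
        try:
            position = word.index(guess, start)
        except ValueError:
            break  # no further occurrences
        word_dict[position] = guess
        start = position + 1  # continue searching after this match
    return word_dict


word = 'CARTOON'
word_dict = dict((i, '_') for i in range(len(word)))
reveal(word, 'O', word_dict)
```

After the call, positions 4 and 5 of word_dict hold 'O' and every other position is still '_'.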
A perhaps more Pythonic method would be to use a generator, which avoids building an intermediate list of the indices found:
def find_indices_of(char, in_string):
    index = -1
    while True:
        index = in_string.find(char, index + 1)
        if index == -1:
            break
        yield index
for i in find_indices_of('x', 'axccxx'):
    print i
1
4
5
An alternative would be the enumerate built-in
def find_indices_of_via_enumerate(char, in_string):
    return (index for index, c in enumerate(in_string) if char == c)
This also uses a generator.
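One difference between the two versions is worth seeing in code: the enumerate version compares one character at a time, so it only works for single-character searches, while the find version also handles longer substrings. The sketch below repeats both functions so that it runs on its own; the sample strings are just illustrations.

```python
def find_indices_of(char, in_string):
    index = -1
    while True:
        index = in_string.find(char, index + 1)
        if index == -1:
            break
        yield index


def find_indices_of_via_enumerate(char, in_string):
    return (index for index, c in enumerate(in_string) if char == c)


# Single character: both agree.
single_find = list(find_indices_of('x', 'axccxx'))                 # [1, 4, 5]
single_enum = list(find_indices_of_via_enumerate('x', 'axccxx'))   # [1, 4, 5]

# Two-character substring: only the find version matches anything,
# because no single character of the text ever equals 'ab'.
multi_find = list(find_indices_of('ab', 'abcab'))                  # [0, 3]
multi_enum = list(find_indices_of_via_enumerate('ab', 'abcab'))    # []

# Generators are lazy, so the caller can stop after the first hit.
first_o = next(find_indices_of('O', 'CARTOON'))                    # 4
```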
I then got curious about the performance differences. I'm a year into using Python, so I'm only beginning to feel truly knowledgeable. Here's a quick test with various types of data:
import timeit

test_cases = [
    ('x', ''),
    ('x', 'axxxxxxxxxxxx'),
    ('x', 'abcdefghijklmnopqrstuvw_yz'),
    ('x', 'abcdefghijklmnopqrstuvw_yzabcdefghijklmnopqrstuvw_yzabcdefghijklmnopqrstuvw_yzabcdefghijklmnopqrstuvwxyz'),
]
for test_case in test_cases:
    print "('{}', '{}')".format(*test_case)
    print "string.find:", timeit.repeat(
        "[i for i in find_indices_of('{}', '{}')]".format(*test_case),
        "from __main__ import find_indices_of",
    )
    print "enumerate :", timeit.repeat(
        "[i for i in find_indices_of_via_enumerate('{}', '{}')]".format(*test_case),
        "from __main__ import find_indices_of_via_enumerate",
    )
    print
On my machine, this results in these timings:
('x', '')
string.find: [0.6248660087585449, 0.6235580444335938, 0.6264920234680176]
enumerate : [0.9158611297607422, 0.9153609275817871, 0.9118690490722656]
('x', 'axxxxxxxxxxxx')
string.find: [6.01502799987793, 6.077538013458252, 5.997750997543335]
enumerate : [3.595151901245117, 3.5859270095825195, 3.597352981567383]
('x', 'abcdefghijklmnopqrstuvw_yz')
string.find: [0.6462750434875488, 0.6512351036071777, 0.6495819091796875]
enumerate : [2.6581480503082275, 2.6216518878936768, 2.6187551021575928]
('x', 'abcdefghijklmnopqrstuvw_yzabcdefghijklmnopqrstuvw_yzabcdefghijklmnopqrstuvw_yzabcdefghijklmnopqrstuvwxyz')
string.find: [1.2539417743682861, 1.2511990070343018, 1.2702908515930176]
enumerate : [7.837890863418579, 7.791800022125244, 7.9181809425354]
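Each list holds several numbers because timeit.repeat runs the whole measurement more than once (three times by default in Python 2.7), and each measurement executes the statement one million times by default. The timeit documentation suggests looking at the minimum of those runs. A rough sketch of that approach, with a smaller number so it finishes quickly; the helper name best_time and the repeat/number values are choices made for this example:

```python
import timeit


def find_indices_of(char, in_string):
    index = -1
    while True:
        index = in_string.find(char, index + 1)
        if index == -1:
            break
        yield index


def find_indices_of_via_enumerate(char, in_string):
    return (index for index, c in enumerate(in_string) if char == c)


def best_time(func, repeat=3, number=10000):
    """Return the fastest of `repeat` runs, each calling func `number` times."""
    return min(timeit.repeat(func, repeat=repeat, number=number))


# Same shape as the last test case: one 'x' near the end of a long string.
text = 'abcdefghijklmnopqrstuvw_yz' * 3 + 'abcdefghijklmnopqrstuvwxyz'
t_find = best_time(lambda: list(find_indices_of('x', text)))
t_enum = best_time(lambda: list(find_indices_of_via_enumerate('x', text)))
```

Passing a callable instead of a string avoids the "from __main__ import ..." setup, at the cost of a little call overhead in every iteration, so the absolute numbers are not comparable with the ones above.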
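The enumerate() version is more expressive and more Pythonic. In these runs, string.find is faster when matches are sparse or absent, while enumerate is faster when nearly every character matches. Whether the performance differences matter depends on the actual use case.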