I have a list of tweets. They look like this:
data = [['trading $aa $BB stock market info'],
['$aa is $116 market is doing well $cc $ABC']]
I want to extract stock tickers:
['$aa', '$BB']
['$aa', '$cc', '$ABC']
I have tried this:
import re
for i in data:
    # each tweet is wrapped in a one-element list, so search the string itself
    print(re.findall(r'[$]\S*', i[0]))
But the output contains $116 as well:
['$aa', '$BB']
['$aa', '$116', '$cc', '$ABC']
Any suggestions?
Match the dollar sign, then one letter, then any run of non-whitespace characters. Requiring a letter right after the dollar sign is what excludes $116:
re.findall(r'[$][A-Za-z]\S*', i[0])
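To make that concrete, here is a minimal self-contained sketch run against the sample data from the question. The names TICKER_RE and extract_tickers are just illustrative and not from the original post, and it assumes each tweet sits in a one-element list as shown above.

```python
import re

# Sample tweets from the question; each tweet is wrapped in its own list.
data = [['trading $aa $BB stock market info'],
        ['$aa is $116 market is doing well $cc $ABC']]

# A dollar sign, one ASCII letter, then any run of non-whitespace characters.
# (TICKER_RE and extract_tickers are hypothetical names used for illustration.)
TICKER_RE = re.compile(r'[$][A-Za-z]\S*')


def extract_tickers(tweet):
    """Return every $-prefixed token in `tweet` whose first character is a letter."""
    return TICKER_RE.findall(tweet)


for row in data:
    # Pass the string itself (row[0]); calling str(row) would add the list's
    # brackets and quotes to the text being searched.
    print(extract_tickers(row[0]))
# prints ['$aa', '$BB']
# prints ['$aa', '$cc', '$ABC']
```

One thing to be aware of: because \S* accepts any non-whitespace character, a token such as "$aa," keeps its trailing comma. If your tickers are purely alphabetic, r'[$][A-Za-z]+' is probably a safer pattern.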
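I'll just leave this here for people looking for a regex that validates a bare stock ticker (no leading $): one to five letters, optionally followed by a hyphen and a one- or two-letter suffix.
re.fullmatch(r'([A-Za-z]{1,5})(-[A-Za-z]{1,2})?', symbol)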
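A quick sketch of how that check behaves on a few inputs. The helper name is_ticker and the sample symbols are only for illustration, and re.fullmatch needs Python 3.4 or later.

```python
import re

# One to five letters, optionally followed by "-" and a one- or two-letter suffix.
# (SYMBOL_RE and is_ticker are hypothetical names used for illustration.)
SYMBOL_RE = re.compile(r'([A-Za-z]{1,5})(-[A-Za-z]{1,2})?')


def is_ticker(symbol):
    """Return True if the whole string has the shape of a ticker symbol."""
    return SYMBOL_RE.fullmatch(symbol) is not None


for symbol in ['AAPL', 'BRK-B', 'aa', '116', 'TOOLONG', '$aa']:
    print(symbol, is_ticker(symbol))
# prints:
# AAPL True
# BRK-B True
# aa True
# 116 False
# TOOLONG False
# $aa False
```

Note that this only checks the shape of the string. It will happily accept any short word, so it is better suited to validating a field that is already supposed to hold a symbol than to pulling tickers out of free text.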