I have a bunch of Java source files. I need to write a Python script that walks the source files and identifies all string literals and their locations.
The problem is the strings could be in a couple of different forms such as:
I have come up with a couple of ideas to accomplish this:
Do you have any comments on the ways I suggested on doing this or another method which I have not thought about?
In case you're wondering, we're doing internationalization on our code base. That's why I am trying to automate this process.
Using the re module is the quickest solution. You can use re.finditer(), which yields a match object for each occurrence, giving you both the content and the position:
>>> import re
>>> text = "clearly and slowly"
>>> for m in re.finditer(r"\w+ly", text):
...     print('%02d-%02d: %s' % (m.start(), m.end(), m.group(0)))
...
00-07: clearly
12-18: slowly
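Applied to your problem, here is a minimal sketch of matching Java double-quoted string literals with re — the pattern, the STRING_RE name, and the sample source are my own illustration, and it deliberately ignores char literals and strings inside comments:

```python
import re

# One Java double-quoted literal: an opening quote, then any run of
# escape sequences (\" , \\ , \n, ...) or ordinary non-quote,
# non-backslash characters, then the closing quote.
STRING_RE = re.compile(r'"(?:\\.|[^"\\\n])*"')

java_source = 'String greeting = "Hello, \\"world\\"";'
for m in STRING_RE.finditer(java_source):
    print('%02d-%02d: %s' % (m.start(), m.end(), m.group(0)))
```

Note that a pure-regex approach will also report "strings" that appear inside // or /* */ comments, so you may want to strip comments first.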
Another option is PLY, a pure-Python implementation of lex/yacc written by David Beazley; he has some slides that demonstrate the functionality. This would require a BNF grammar to specify the syntax you are parsing. I'm not sure if you want to go that far.
If you don't want to use BNF, pyparsing is another choice.
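For instance, a quick sketch using pyparsing's built-in dblQuotedString expression, which already understands escaped quotes; scanString yields each match together with its start and end offsets (the sample source below is my own):

```python
import pyparsing as pp

java_source = 'String s = "first"; String t = "second";'

# scanString yields (tokens, start, end) for every literal found,
# where tokens[0] is the matched literal including its quotes.
for tokens, start, end in pp.dblQuotedString.scanString(java_source):
    print(start, end, tokens[0])
```

Like the regex approach, this will still pick up string-shaped text inside comments unless you filter those out separately.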