What is the state-of-the-art way of accessing and modifying an AST in Java? I only found old examples with a lot of deprecated code. I found several descriptions, but I am still unclear about how tools like ANTLR fit into this whole concept. I should add that I want to parse an existing program, not build an AST from scratch.
What I want to do with this AST is transform it so that business rules can easily be extracted for a rules engine. Maybe you have a good approach for that idea.
I believe what the Java community largely uses are things like the Eclipse AST interface (or access to the AST as offered by the Java compiler). This is basically tree nodes and lots of procedural computations to test node types and walk up and down trees. I don't think that is "state of the art" in general. Eclipse, I think, does offer some information about how identifiers are tied to definitions ("name resolution").
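To make the "tree nodes plus procedural walking" style concrete, here is a minimal sketch using the compiler tree API that ships with the JDK (`com.sun.source.*` and `javax.tools`). It parses a source string and collects every `if` condition together with the name of the enclosing method. The class and method names (`AstWalkSketch`, `StringSource`, `findConditions`) and the sample `Order` class are made up for illustration, and the sketch assumes it runs on a full JDK rather than a bare JRE. It only parses; it does no name resolution.

```java
import com.sun.source.tree.CompilationUnitTree;
import com.sun.source.tree.IfTree;
import com.sun.source.tree.MethodTree;
import com.sun.source.util.JavacTask;
import com.sun.source.util.TreeScanner;

import javax.tools.JavaCompiler;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;
import java.io.IOException;
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

public class AstWalkSketch {

    /** Hypothetical helper: an in-memory source file, so no files on disk are needed. */
    static final class StringSource extends SimpleJavaFileObject {
        private final String code;

        StringSource(String className, String code) {
            super(URI.create("string:///" + className + ".java"), Kind.SOURCE);
            this.code = code;
        }

        @Override
        public CharSequence getCharContent(boolean ignoreEncodingErrors) {
            return code;
        }
    }

    /** Parses the source and returns "methodName: condition" for every if statement. */
    static List<String> findConditions(String className, String source) throws IOException {
        // Returns null when no compiler is available (for example on a JRE-only install).
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        JavacTask task = (JavacTask) compiler.getTask(
                null, null, null, null, null, List.of(new StringSource(className, source)));

        List<String> found = new ArrayList<>();
        for (CompilationUnitTree unit : task.parse()) {
            // The second type parameter carries the enclosing method name down the tree.
            new TreeScanner<Void, String>() {
                @Override
                public Void visitMethod(MethodTree node, String enclosing) {
                    return super.visitMethod(node, node.getName().toString());
                }

                @Override
                public Void visitIf(IfTree node, String enclosing) {
                    found.add(enclosing + ": " + node.getCondition());
                    return super.visitIf(node, enclosing);
                }
            }.scan(unit, null);
        }
        return found;
    }

    public static void main(String[] args) throws IOException {
        String source = "class Order { double discount(double total) {"
                + " if (total > 1000) { return 0.1; } return 0.0; } }";
        findConditions("Order", source).forEach(System.out::println);
    }
}
```

Even for this small query you have to know which node types exist and override a visitor method per type, which is the procedural flavor described above.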
The ANTLR parsers will help you build ASTs, and I'm sure there's a complete Java front end for ANTLR that does that already; check their site. I think tree walking is the same as it is for the Eclipse AST. I don't think ANTLR's front end offers name resolution.
Better schemes involve attribute grammars, which allow you to build analyzers using dataflow computations across tree nodes. You can find Java implementations in Silver and JastAdd. These don't seem to be widely known in the Java community. JastAdd offers access to name resolution as well as data flow information, both of which IMHO are needed to do any interesting code analyses.
Pattern directed schemes are better yet; you describe syntax fragments of interest and corresponding actions. (Attribute grammars are kind of like pattern directed schemes limited to single tree nodes; pattern directed schemes operate on sets of tree nodes whose structure you personally don't have to know.) Program transformation systems (PTS) such as Stratego, DMS, and TXL offer these. However, none of these are coded in Java. I'm pretty sure Stratego and TXL have full Java grammars and trees off the shelf, but nothing beyond that. DMS offers attribute grammars, name resolution, flow analysis, rewrites on trees using patterns, and even data flow based pattern matchers, for a variety of languages including Java.
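As a rough illustration of the pattern-directed idea only, here is a toy rewriter in plain Java. It is not how Stratego, DMS, or TXL are implemented or used (those let you write patterns in the surface syntax of the target language); every name here (`PatternRewriteSketch`, `Node`, `Rule`, the `?x` variable convention) is an assumption made for the sketch. A rule pairs a pattern tree with a replacement tree, and labels starting with `?` are pattern variables that match any subtree.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PatternRewriteSketch {

    /** Toy tree node: a label plus children. A label starting with '?' is a pattern variable. */
    static final class Node {
        final String label;
        final List<Node> kids;

        Node(String label, Node... kids) {
            this.label = label;
            this.kids = List.of(kids);
        }

        boolean isVar() {
            return label.startsWith("?");
        }

        @Override
        public String toString() {
            if (kids.isEmpty()) return label;
            StringBuilder sb = new StringBuilder(label).append('(');
            for (int i = 0; i < kids.size(); i++) {
                if (i > 0) sb.append(", ");
                sb.append(kids.get(i));
            }
            return sb.append(')').toString();
        }
    }

    /** A rule says: wherever this pattern matches, substitute the replacement. */
    static final class Rule {
        final Node pattern;
        final Node replacement;

        Rule(Node pattern, Node replacement) {
            this.pattern = pattern;
            this.replacement = replacement;
        }
    }

    /** Matches pattern against tree, recording variable bindings in env. */
    static boolean match(Node pattern, Node tree, Map<String, Node> env) {
        if (pattern.isVar()) {
            Node bound = env.putIfAbsent(pattern.label, tree);
            // A variable used twice must match structurally equal subtrees.
            return bound == null || bound.toString().equals(tree.toString());
        }
        if (!pattern.label.equals(tree.label) || pattern.kids.size() != tree.kids.size()) {
            return false;
        }
        for (int i = 0; i < pattern.kids.size(); i++) {
            if (!match(pattern.kids.get(i), tree.kids.get(i), env)) return false;
        }
        return true;
    }

    /** Builds the replacement tree, substituting bound variables. */
    static Node instantiate(Node template, Map<String, Node> env) {
        if (template.isVar()) return env.get(template.label);
        Node[] kids = template.kids.stream().map(k -> instantiate(k, env)).toArray(Node[]::new);
        return new Node(template.label, kids);
    }

    /** One bottom-up pass: rewrite children first, then try each rule at this node. */
    static Node rewrite(Node tree, List<Rule> rules) {
        Node[] kids = tree.kids.stream().map(k -> rewrite(k, rules)).toArray(Node[]::new);
        Node current = new Node(tree.label, kids);
        for (Rule rule : rules) {
            Map<String, Node> env = new HashMap<>();
            if (match(rule.pattern, current, env)) return instantiate(rule.replacement, env);
        }
        return current;
    }

    public static void main(String[] args) {
        // Rule: not(not(?x)) -> ?x
        Rule doubleNegation = new Rule(
                new Node("not", new Node("not", new Node("?x"))),
                new Node("?x"));

        Node tree = new Node("if",
                new Node("not", new Node("not",
                        new Node("gt", new Node("total"), new Node("1000")))),
                new Node("applyDiscount"));

        System.out.println(rewrite(tree, List.of(doubleNegation)));
        // prints: if(gt(total, 1000), applyDiscount)
    }
}
```

The rule author never writes traversal code or inspects node types; the engine handles both. A real transformation system adds much more on top, such as conditions on rules, repeated application strategies, and access to name and flow information.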
You need as much analysis information as you can get to support "business rule extraction". If you think that is going to be easy, you're in for a rude surprise. While the code analysis capability is a necessary condition, to recognize business rules you need knowledge from outside the system as to the business vocabulary and actions of interest, and how they are mapped onto the code. The code doesn't contain that information.
EDIT: based on discussion in comments, OP suggests a semiautomated process implying that a person brings this extra knowledge to the process; I quite agree that is necessary. He may find this presentation on extracting business rules helpful to see why, and what one might do about that.