
Is it possible to retrieve a list from a parser-defined label in ANTLR4?

Tags: antlr, antlr4

Take this dummy ANTLR4 grammar:

grammar testingGrammar;
@header{package gen;}

dsopt_rename: 'rename' (OLDN=ID '=' NEWN=ID)+;
ID: [a-zA-Z_];

My target is Java. I want to get two lists, oldNames and newNames. This can be done as follows:

@Override
public DsOption visitDsopt_rename(Dsopt_renameContext ctx) {    
    LinkedList<String> oldNames = new LinkedList<String>();
    LinkedList<String> newNames = new LinkedList<String>();
    for (int i=0; i < ctx.ID().size(); ++i) {
        // IDs alternate OLDN, NEWN, ...: even indices are old names, odd indices are new names
        LinkedList<String> rename = (i % 2 == 0) ? oldNames : newNames;
        rename.add(ctx.ID(i).getText());
    }
    return new DsOptRename(oldNames, newNames);
}
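
Because each pair in the rule is written OLDN '=' NEWN, the flat list returned by ctx.ID() alternates old, new, old, new, so even positions hold old names. A minimal, self-contained sketch of that split follows. It uses plain strings in place of the token list, and the class name ParitySplit and the split helper are hypothetical, not part of the generated parser:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ParitySplit {
    // Returns two lists: index 0 holds the old names, index 1 the new names.
    // ids is assumed to be the flat sequence OLDN, NEWN, OLDN, NEWN, ...
    static List<List<String>> split(List<String> ids) {
        List<String> oldNames = new ArrayList<>();
        List<String> newNames = new ArrayList<>();
        for (int i = 0; i < ids.size(); i++) {
            // Even positions (0, 2, ...) are old names, odd positions are new names.
            (i % 2 == 0 ? oldNames : newNames).add(ids.get(i));
        }
        return Arrays.asList(oldNames, newNames);
    }

    public static void main(String[] args) {
        List<List<String>> result = split(Arrays.asList("a", "A", "b", "B"));
        System.out.println("old=" + result.get(0) + " new=" + result.get(1));
    }
}
```

This only works while every repetition contributes exactly two IDs, which is why it breaks down for a rule with optional elements.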

I would have preferred the following "second approach", had it worked:

@Override
public DsOption visitDsopt_rename(Dsopt_renameContext ctx) {    
    LinkedList<String> oldNames = new LinkedList<String>();
    LinkedList<String> newNames = new LinkedList<String>();
    ctx.OLDN().forEach(e -> oldNames.add(e.getText()));
    ctx.NEWN().forEach(e -> newNames.add(e.getText()));
    return new DsOptRename(oldNames, newNames);
}

Apparently, the labels ctx.OLDN and ctx.NEWN (fields, accessed without parentheses) each hold only a single token rather than the whole list, whereas ctx.ID() returns the entire list.

  1. Is it possible to fix the second snippet so that the second approach works (i.e. without touching the grammar)? Keep in mind that this example is simple enough for the first snippet to work fine, but if I had a rule like 'example: (ID ID? ID)+;' another approach would be required. Maybe it cannot be fixed because this approach is not supposed to work in the first place (the rule should have been defined differently).

  2. What is the best way to modify the grammar to get this done? I am thinking of:
grammar testingGrammar;
@header{package gen;}

dsopt_rename: 'rename' (oldn '=' newn)+;
oldn: ID;
newn: ID;
ID: [a-zA-Z_];

but this is probably error-prone, because oldn and newn could match unintentionally.

Thanks for your time!

Javier Rivera asked Sep 12 '25 04:09


1 Answer

Use the += notation to collect your tokens:

grammar testingGrammar;

dsopt_rename
 : 'rename' ( lhs+=ID '=' rhs+=ID )+
 ;

ID     : [a-zA-Z_];
SPACES : [ \t\r\n]+ -> skip;

Test it like this:

String source = "rename a = A b = B c = C";

testingGrammarLexer lexer = new testingGrammarLexer(CharStreams.fromString(source));
testingGrammarParser parser = new testingGrammarParser(new CommonTokenStream(lexer));

testingGrammarParser.Dsopt_renameContext ctx = parser.dsopt_rename();

List<Token> lhsTokens = ctx.lhs;
List<Token> rhsTokens = ctx.rhs;

System.out.printf("lhsTokens=%s\nrhsTokens=%s\n", lhsTokens, rhsTokens);

which will print:

lhsTokens=[[@1,7:7='a',<3>,1:7], [@4,13:13='b',<3>,1:13], [@7,19:19='c',<3>,1:19]]
rhsTokens=[[@3,11:11='A',<3>,1:11], [@6,17:17='B',<3>,1:17], [@9,23:23='C',<3>,1:23]]
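
To get from the collected tokens to the lists of names the question asks for, each token's text can be mapped out of ctx.lhs and ctx.rhs. A minimal sketch follows. To keep it runnable without the generated parser, it uses a hypothetical Tok interface as a stand-in for org.antlr.v4.runtime.Token (only getText() is assumed), and the texts and tok helpers are likewise illustrative:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class LabelTexts {
    // Stand-in for org.antlr.v4.runtime.Token; only getText() is used here.
    interface Tok {
        String getText();
    }

    // Hypothetical factory so the example can build tokens without a lexer.
    static Tok tok(String text) {
        return () -> text;
    }

    // Maps a list of tokens (for example ctx.lhs or ctx.rhs) to their text.
    static List<String> texts(List<? extends Tok> tokens) {
        return tokens.stream().map(Tok::getText).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Tok> lhs = Arrays.asList(tok("a"), tok("b"), tok("c"));
        List<Tok> rhs = Arrays.asList(tok("A"), tok("B"), tok("C"));
        System.out.println("oldNames=" + texts(lhs));
        System.out.println("newNames=" + texts(rhs));
    }
}
```

In a real visitor the same mapping would presumably be applied to ctx.lhs and ctx.rhs directly, with the results passed to the DsOptRename constructor from the question.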

More info: https://github.com/antlr/antlr4/blob/master/doc/parser-rules.md#rule-element-labels

Bart Kiers answered Sep 14 '25 23:09