
ANTLR4 - How do I get the token TYPE as the token text in ANTLR?

Tags: antlr, antlr4

Say I have a grammar that has tokens like this:

AND        : 'AND' | 'and' | '&&' | '&';
OR         : 'OR' | 'or' | '||' | '|' ;
NOT        : 'NOT' | 'not' | '~' | '!';

When I visualize the ParseTree using TreeViewer or print the tree using tree.toStringTree(), each node's text is the same as what was matched.

So if I parse "A and B or C", the two binary operators will be "and" / "or". If I parse "A && B || C", they'll be "&&" / "||".

What I would LIKE is for them to always be "AND" / "OR" / "NOT", regardless of which literal symbol was matched. Is this possible?

Ryan O. asked Sep 13 '25 16:09



1 Answer

This is what the vocabulary is for. Call yourLexer.getVocabulary() or yourParser.getVocabulary(), then vocabulary.getSymbolicName(tokenType) to get the text representation of the token type (for the grammar above, "AND", "OR" or "NOT"). If that returns nothing (null or an empty string, depending on the runtime), fall back to vocabulary.getLiteralName(tokenType), which returns the literal text used to define the token.
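The two-step lookup described above can be sketched as a small Java helper. This is only an illustration, not part of the ANTLR API: the class name TokenTypeNames and the method nameOf are hypothetical, and the final fallback to the numeric token type is an assumption. So that the sketch compiles without the ANTLR jar, it declares a tiny stand-in interface mirroring the two methods of org.antlr.v4.runtime.Vocabulary; in a real project you would delete the stand-in and import the real interface instead.

```java
// Stand-in for org.antlr.v4.runtime.Vocabulary (assumption: only the two
// methods used here are mirrored). Replace with the real import in a project.
interface Vocabulary {
    String getSymbolicName(int tokenType);
    String getLiteralName(int tokenType);
}

// Hypothetical helper: maps a token type to a stable, human-readable name.
final class TokenTypeNames {
    private TokenTypeNames() {}

    static String nameOf(Vocabulary vocabulary, int tokenType) {
        // First choice: the rule name from the grammar, e.g. "AND".
        String symbolic = vocabulary.getSymbolicName(tokenType);
        if (symbolic != null && !symbolic.isEmpty()) {
            return symbolic;
        }
        // Second choice: the literal the token was defined with, if any.
        String literal = vocabulary.getLiteralName(tokenType);
        if (literal != null && !literal.isEmpty()) {
            return literal;
        }
        // Assumed last resort: the numeric token type as text.
        return Integer.toString(tokenType);
    }
}
```

With the real runtime, a call might look like TokenTypeNames.nameOf(parser.getVocabulary(), node.getSymbol().getType()), for example inside a listener's visitTerminal(TerminalNode node). That returns "AND" whether the input contained "and", "&&" or "&", because all of them produce the same token type.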

Mike Lischke answered Sep 17 '25 03:09
