I want to generate an antlr lexer at runtime -- that is, generate the grammar and from the grammar generate the lexer class, and its supporting bits at runtime. I am happy to feed it into the the java compiler, which is accessible at runtime.
A lexer (often called a scanner) breaks up an input stream of characters into vocabulary symbols for a parser, which applies a grammatical structure to that symbol stream.
A lexer and a parser work in sequence: the lexer scans the input and produces the matching tokens, the parser then scans the tokens and produces the parsing result.
An ANTLR lexer creates a Token object after matching a lexical rule. Each request for a token starts in Lexer. nextToken , which calls emit once it has identified a token. emit collects information from the current state of the lexer to build the token.
In computer-based language recognition, ANTLR (pronounced antler), or ANother Tool for Language Recognition, is a parser generator that uses LL(*) for parsing.
Here's a quick and dirty way to:
.g file given a String as grammar-source,.g file,.java files,import java.io.*;
import javax.tools.*;
import java.lang.reflect.*;
import org.antlr.runtime.*;
import org.antlr.Tool;
public class Main {
    public static void main(String[] args) throws Exception {
        // The grammar which echos the parsed characters to theconsole,
        // skipping any white space chars.
        final String grammar =
                "grammar T;                                                  \n" +
                "                                                            \n" +
                "parse                                                       \n" +
                "  :  (ANY {System.out.println(\"ANY=\" + $ANY.text);})* EOF \n" +
                "  ;                                                         \n" +
                "                                                            \n" +
                "SPACE                                                       \n" +
                "  :  (' ' | '\\t' | '\\r' | '\\n') {skip();}                \n" +
                "  ;                                                         \n" +
                "                                                            \n" +
                "ANY                                                         \n" +
                "  :  .                                                      \n" +
                "  ;                                                           ";
        final String grammarName = "T";
        final String entryPoint = "parse";
        // 1 - Write the `.g` grammar file to disk.
        Writer out = new BufferedWriter(new FileWriter(new File(grammarName + ".g")));
        out.write(grammar);
        out.close();
        // 2 - Generate the lexer and parser.
        Tool tool = new Tool(new String[]{grammarName + ".g"});
        tool.process();
        // 3 - Compile the lexer and parser.
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        compiler.run(null, System.out, System.err, "-sourcepath", "", grammarName + "Lexer.java");
        compiler.run(null, System.out, System.err, "-sourcepath", "", grammarName + "Parser.java");
        // 4 - Parse the command line parameter using the dynamically created lexer and 
        //     parser with a bit of reflection Voodoo :)
        Lexer lexer = (Lexer)Class.forName(grammarName + "Lexer").newInstance();
        lexer.setCharStream(new ANTLRStringStream(args[0]));
        CommonTokenStream tokens = new CommonTokenStream(lexer);
        Class<?> parserClass = Class.forName(grammarName + "Parser");
        Constructor parserCTor = parserClass.getConstructor(TokenStream.class);
        Parser parser = (Parser)parserCTor.newInstance(tokens);
        Method entryPointMethod = parserClass.getMethod(entryPoint);
        entryPointMethod.invoke(parser);
    }
}
Which, after compiling and running it like this (on *nix):
java -cp .:antlr-3.2.jar Main "a b    c"
or on Windows
java -cp .;antlr-3.2.jar Main "a b    c"
, produces the following output:
ANY=a ANY=b ANY=c
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With