java - ANTLR4: how to detect unrecognized tokens and sentences that are invalid for the grammar

Author: Internet

I am trying to develop a new language with ANTLR. Here is my grammar file:

grammar test;

program : vr'.' to'.' e 
        ;
e: be
 | be'.' top'.' be
 ;
be: 'fg' 
  | 'fs' 
  | 'mc' 
  ;
to: 'n' 
  | 'a' 
  | 'ev' 
  ;
vr: 'er' 
  | 'fp' 
  ;
top: 'b' 
  | 'af' 
  ;
Whitespace : [ \t\r\n]+ ->skip 
           ;

Main.java

String expression = "fp.n.fss";
//String expression = "fp.n.fs.fs";
ANTLRInputStream input = new ANTLRInputStream(expression);
testLexer lexer = new testLexer(input);
CommonTokenStream tokens = new CommonTokenStream(lexer);
testParser parser = new testParser(tokens);
//remove listener and add listener does not work
ParseTree parseTree = parser.program();

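Everything works fine for valid sentences. But I want to catch unrecognized tokens and invalid sentences so that I can return meaningful messages. Here are two test cases for my problem.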

fp.n.fss => ANTLR reports "token recognition error at: 's'", but I could not handle this error. There are some example error handler classes that use BaseErrorListener, but in my case they do not work.
fp.n.fs.fs => this sentence is invalid for my grammar, but I could not catch it. How can I catch invalid sentences like this one?

Solution:

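First of all, welcome to SO, and welcome to the ANTLR section! Error handling seems to be one of the most frequently asked topics, and there is a very good thread on handling errors in Java/ANTLR4.

You will most likely want to extend DefaultErrorStrategy so that specific problems are handled in some way other than just printing an error line such as "line 1:12 token recognition error at: 's'".

To do that, you can implement your own version of the default error strategy class: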

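import org.antlr.v4.runtime.DefaultErrorStrategy;
import org.antlr.v4.runtime.InputMismatchException;
import org.antlr.v4.runtime.Parser;
import org.antlr.v4.runtime.ParserRuleContext;
import org.antlr.v4.runtime.RecognitionException;
import org.antlr.v4.runtime.Token;
import org.antlr.v4.runtime.misc.ParseCancellationException;

// Declare the generated type (not the base Parser) so that program() is visible.
testParser parser = new testParser(tokens);
// Abort on the first syntax error instead of trying to recover
// (essentially what ANTLR's built-in BailErrorStrategy does).
parser.setErrorHandler(new DefaultErrorStrategy() {

    @Override
    public void recover(Parser recognizer, RecognitionException e) {
        // Record the exception on every enclosing rule context.
        for (ParserRuleContext context = recognizer.getContext(); context != null; context = context.getParent()) {
            context.exception = e;
        }
        throw new ParseCancellationException(e);
    }

    @Override
    public Token recoverInline(Parser recognizer) throws RecognitionException {
        InputMismatchException e = new InputMismatchException(recognizer);
        for (ParserRuleContext context = recognizer.getContext(); context != null; context = context.getParent()) {
            context.exception = e;
        }
        throw new ParseCancellationException(e);
    }
});

try {
    parser.program(); // start rule of the grammar
} catch (ParseCancellationException ex) {
    // The cause is the RecognitionException wrapped by the strategy above.
    RecognitionException cause = (RecognitionException) ex.getCause();
    System.err.println("Invalid input at token: " + cause.getOffendingToken());
}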
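
One thing worth keeping in mind: the "token recognition error" message is reported by the lexer, not the parser, so an error strategy or listener installed only on the parser never sees it. The parser will also only complain about trailing input such as the extra ".fs" if the start rule ends with EOF. Below is a minimal sketch of a listener that collects messages instead of printing them. CollectingErrorListener is a made-up name for illustration, and the usage comments assume the generated classes are called testLexer and testParser as in the question.

```java
import org.antlr.v4.runtime.BaseErrorListener;
import org.antlr.v4.runtime.RecognitionException;
import org.antlr.v4.runtime.Recognizer;

import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of ANTLR): stores error messages
// so the caller can decide how to report them.
class CollectingErrorListener extends BaseErrorListener {
    private final List<String> errors = new ArrayList<>();

    @Override
    public void syntaxError(Recognizer<?, ?> recognizer, Object offendingSymbol,
                            int line, int charPositionInLine,
                            String msg, RecognitionException e) {
        // Called by both lexers and parsers that this listener is attached to.
        errors.add("line " + line + ":" + charPositionInLine + " " + msg);
    }

    public List<String> getErrors() {
        return errors;
    }

    public boolean hasErrors() {
        return !errors.isEmpty();
    }
}

// Usage sketch (assumes the lexer and parser objects from Main.java):
//
//   CollectingErrorListener listener = new CollectingErrorListener();
//   lexer.removeErrorListeners();   // drop the default console listener
//   lexer.addErrorListener(listener);
//   parser.removeErrorListeners();
//   parser.addErrorListener(listener);
//   parser.program();
//   if (listener.hasErrors()) { /* return listener.getErrors() to the user */ }
```

Attaching the same listener to both the lexer and the parser is a design choice here: it puts unrecognized characters and grammar violations into one list, which may be enough for returning a single meaningful message.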

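I would also recommend splitting the grammar into separate parser and lexer rules, if not for readability, then because many tools for analyzing .g4 files (ANTLRWorks 2 in particular) complain about implicitly declared tokens.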

For your example, it could be restructured as follows:

grammar test;

program : vr DOT to DOT e EOF   // EOF makes trailing input such as ".fs" a syntax error
        ;
e: be
 | be DOT top DOT be
 ;
be: FG 
  | FS
  | MC 
  ;
to: N
  | A 
  | EV
  ;
vr: ER 
  | FP 
  ;
top: B
  | AF
  ;
Whitespace : [ \t\r\n]+ ->skip 
           ;

DOT : '.'
    ;

A: 'A'|'a'
 ;

AF: 'AF'|'af'
 ;
N: 'N'|'n'
 ;
MC: 'MC'|'mc'
 ;
EV:'EV'|'ev'
 ;
FS: 'FS'|'fs'
 ;
FP: 'FP'|'fp'
 ;
FG: 'FG'|'fg'
 ;
ER: 'ER'|'er'
 ;
B: 'B'|'b'
 ;

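All the methods available on the DefaultErrorStrategy class are listed in its documentation; you can override any of them in your "new" error strategy implementation to handle whatever exceptions you need.

Hope this helps with your project, and good luck!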

Tags: lexer, antlr, java, parsing, antlr4
Source: https://codeday.me/bug/20191118/2026121.html