Antlr

From BITPlan Wiki
Revision as of 08:25, 13 November 2017 by Wf

ANTLR is a parser generator tool.

BITPlan has been using ANTLR in projects for a few years now, and Wolfgang Fahl has been active in improving ANTLR; see e.g. the Motivation section.

Library with helpers for ANTLR Language development com.bitplan.antlr

To simplify parser development with ANTLR, BITPlan has created a library with helper code for ANTLR language development and published it as open source at: https://github.com/BITPlan/com.bitplan.antlr

Base Class Language Parser

The abstract base class LanguageParser has some helper code that makes language development and debugging easier.

Example

Exp grammar

/**
 * Copyright 2016-2017 BITPlan GmbH
 * Author: Wolfgang Fahl
 *
 * this is an Example Antlr Grammar
 *
 *
 * it is specified using antlr syntax and uses the ANTLR V4 parser generator
 * see http://www.antlr.org
 * 
 * for Eclipse you might want to install the IDE support:
 * https://github.com/jknack/antlr4ide
 * 
 */
grammar Exp;
 /* This will be the entry point of our parser. */
eval returns [double value]
    :    exp=additionExp {$value = $exp.value;}
    ;

/* Addition and subtraction have the lowest precedence. */
additionExp returns [double value]
    :    m1=multiplyExp       {$value =  $m1.value;} 
         ( '+' m2=multiplyExp {$value += $m2.value;} 
         | '-' m2=multiplyExp {$value -= $m2.value;}
         )* 
    ;

/* Multiplication and division have a higher precedence. */
multiplyExp returns [double value]
    :    a1=atomExp       {$value =  $a1.value;}
         ( '*' a2=atomExp {$value *= $a2.value;} 
// ...
         )* 
    ;
    
/* An expression atom is the smallest part of an expression: a number. When
   we encounter parentheses, we make a recursive call back to the rule
   'additionExp'. As you can see, an 'atomExp' has the highest precedence. */

atomExp returns [double value]
    :    n=Number                {$value = Double.parseDouble($n.text);}
    |    '(' exp=additionExp ')' {$value = $exp.value;}
    ;

/* A number: can be an integer value, or a decimal value */
Number
    :    ('0'..'9')+ ('.' ('0'..'9')+)?
    ;

/* We're going to ignore all white space characters */
WS  
    :   (' ' | '\t' | '\r' | '\n')+ -> skip
    ;
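To make the precedence layering of the grammar easier to follow, here is a minimal hand-written sketch in plain Java that mirrors the rules eval, additionExp, multiplyExp and atomExp one method per rule. It is not what ANTLR generates and it is not part of com.bitplan.antlr; the class name ExpSketch and its helper methods are hypothetical. The sketch assumes input without white space and, like the listing above, only covers '*' in multiplyExp.

```java
/** Illustrative recursive-descent evaluator; one method per grammar rule (hypothetical helper). */
class ExpSketch {
  private final String input;
  private int pos = 0;

  private ExpSketch(String input) {
    this.input = input;
  }

  /** Mirrors the 'eval' entry rule and rejects trailing input. */
  static double eval(String text) {
    ExpSketch sketch = new ExpSketch(text);
    double value = sketch.additionExp();
    if (sketch.pos != text.length()) {
      throw new IllegalArgumentException("unexpected input at position " + sketch.pos);
    }
    return value;
  }

  private boolean at(char c) {
    return pos < input.length() && input.charAt(pos) == c;
  }

  private boolean atDigit() {
    return pos < input.length() && input.charAt(pos) >= '0' && input.charAt(pos) <= '9';
  }

  /** additionExp: multiplyExp (('+' | '-') multiplyExp)*  (lowest precedence) */
  private double additionExp() {
    double value = multiplyExp();
    while (at('+') || at('-')) {
      char op = input.charAt(pos++);
      double right = multiplyExp();
      value = (op == '+') ? value + right : value - right;
    }
    return value;
  }

  /** multiplyExp: atomExp ('*' atomExp)*  (only '*' is sketched here) */
  private double multiplyExp() {
    double value = atomExp();
    while (at('*')) {
      pos++;
      value *= atomExp();
    }
    return value;
  }

  /** atomExp: Number | '(' additionExp ')'  (highest precedence) */
  private double atomExp() {
    if (at('(')) {
      pos++;
      double value = additionExp();
      if (!at(')')) {
        throw new IllegalArgumentException("')' expected at position " + pos);
      }
      pos++;
      return value;
    }
    return number();
  }

  /** Number: digits, optionally followed by '.' and more digits. */
  private double number() {
    int start = pos;
    while (atDigit()) pos++;
    if (pos == start) {
      throw new IllegalArgumentException("number expected at position " + pos);
    }
    if (at('.')) {
      int dot = pos++;
      int fractionStart = pos;
      while (atDigit()) pos++;
      if (pos == fractionStart) pos = dot; // a lone '.' does not belong to the number
    }
    return Double.parseDouble(input.substring(start, pos));
  }

  public static void main(String[] args) {
    System.out.println(eval("(2+3)*(4+5)")); // prints 45.0
  }
}
```

Because additionExp only ever calls multiplyExp, and multiplyExp only ever calls atomExp, multiplication binds tighter than addition without any explicit precedence table. The grammar above encodes the same idea through its rule nesting.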

Java Source for ExpLanguage Parser

This code wraps the generated ExpParser into a class derived from LanguageParser

/**
 * example parser
 * @author wf
 *
 */
public class ExpLanguageParser extends LanguageParser {
  private ExpParser parser;
  ExpLexer lexer;
  
  public ExpParser getParser() {
    return parser;
  }
  
  @Override
  protected ParseTree getRootContext(Parser parser) {
    if (!(parser instanceof ExpParser)) {
      throw new RuntimeException("wrong parser type for getRootContext, expected ExpParser but got "+parser.getClass().getName());
    } else {
      ExpParser expParser=(ExpParser) parser;
      return expParser.eval();
    }
  }

  @Override
  protected ParseTree parse(ANTLRInputStream in, String inputText)
      throws Exception {
    lexer = new ExpLexer(in);
    parser=new ExpParser(getTokens(lexer));
    ParseTree result=super.parse(lexer,getParser());
    return result;
  }

  @Override
  public void showParseTree() {
    super.showParseTree(getParser());
  }

}

JUnit Test

With the abstract BaseTest JUnit base class, testing and debugging get easier. Take the following JUnit test:

JUnit Test Source Code

  @Test
  public void testExpressionParser() throws Exception {
    String expressions[] = { "2*3", "4+5", "(2+3)*(4+5)" };
    ExpLanguageParser exprParser = new ExpLanguageParser();
    for (String expression : expressions) {
      super.runParser(exprParser, expression, 0);
    }
  }

If you add an invalid expression to the expressions:

JUnit Test Source Code

 
   String expressions[] = { "2*3", "4+5", "(2+3)*(4+5)","(4+5)--(6-7)" };

A graphical parse tree will be shown, indicating what is wrong: Parsetree example.png

Motivation

While struggling with an issue in ANTLR 4.4, there was a need to be able to test whether a parser would time out. Support for this was implemented by adding a timed call:

  /**
   * test the given rule with the given timeout
   * 
   * @param inputText
   * @param expectedErrors
   * @param timeOutMSecs
   * @return the parser
   * @throws Exception
   */
  public LanguageParser doTestParser(final String inputText, final int expectedErrors, int timeOutMSecs)
      throws Exception {
    if (debug) {
      System.out.println(inputText);
    }
    LanguageParser result = timedCall(new Callable<LanguageParser>() {
      public LanguageParser call() throws Exception {
        return runParser(inputText, expectedErrors);
      }
    }, timeOutMSecs, TimeUnit.MILLISECONDS);
    return result;
  }
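The listing above calls a timedCall helper whose body is not shown on this page. As a rough sketch of how such a helper could work, the following uses only java.util.concurrent from the JDK. The class name TimedCallSketch is hypothetical, and the actual implementation in com.bitplan.antlr may well differ.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical stand-in for a timedCall helper; not the library's own code. */
class TimedCallSketch {

  /**
   * Run the task on a worker thread and wait at most the given time for its result.
   * A TimeoutException is thrown if the task has not finished by then.
   */
  static <T> T timedCall(Callable<T> task, long timeout, TimeUnit unit) throws Exception {
    ExecutorService executor = Executors.newSingleThreadExecutor();
    Future<T> future = executor.submit(task);
    try {
      return future.get(timeout, unit);
    } catch (TimeoutException e) {
      future.cancel(true); // ask the worker to stop by interrupting it
      throw e;
    } finally {
      executor.shutdownNow(); // do not keep the worker thread around
    }
  }

  public static void main(String[] args) throws Exception {
    // a fast task finishes well within the limit
    System.out.println(timedCall(() -> "done", 1, TimeUnit.SECONDS));
    try {
      // a slow task stands in for a parser that hangs
      timedCall(() -> { Thread.sleep(5000); return "late"; }, 50, TimeUnit.MILLISECONDS);
    } catch (TimeoutException e) {
      System.out.println("timed out");
    }
  }
}
```

One caveat of this approach: interruption is cooperative in Java, so a parser stuck in a loop that never checks the interrupt flag keeps running in the background even after the caller has received the TimeoutException.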

Another need was to be able to test hundreds of files in a list of directories. For this the SourceDirectory class was added.

This base class can then be used with the testParseFilesInDirectories() function

public List<LanguageParser> testParseFilesInDirectories(File rootDir, List<SourceDirectory> sourceDirectories,
      String[] extensions, String[] ignorePrefixes, int limit, int progressStep) throws Exception {
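Only the signature is shown here, so the following is a sketch of the file-gathering step such a method needs before it can parse anything. It rests on several assumptions: that extensions are given without the leading dot, that ignorePrefixes apply to file names, and that limit caps the number of files. The class SourceFileCollectorSketch and its methods are hypothetical and not part of com.bitplan.antlr.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical helper: gather candidate source files below a directory. */
class SourceFileCollectorSketch {

  /** Walk dir recursively and return at most 'limit' matching files in sorted order. */
  static List<File> collect(File dir, String[] extensions, String[] ignorePrefixes, int limit)
      throws IOException {
    try (Stream<Path> paths = Files.walk(dir.toPath())) {
      return paths
          .filter(Files::isRegularFile)
          .map(Path::toFile)
          .filter(f -> hasExtension(f.getName(), extensions))
          .filter(f -> !hasPrefix(f.getName(), ignorePrefixes))
          .sorted()
          .limit(limit)
          .collect(Collectors.toList());
    }
  }

  /** Assumption: extensions are given without the leading dot, e.g. "exp". */
  private static boolean hasExtension(String name, String[] extensions) {
    for (String extension : extensions) {
      if (name.endsWith("." + extension)) return true;
    }
    return false;
  }

  /** Assumption: a file is skipped when its name starts with any of the prefixes. */
  private static boolean hasPrefix(String name, String[] prefixes) {
    for (String prefix : prefixes) {
      if (name.startsWith(prefix)) return true;
    }
    return false;
  }

  public static void main(String[] args) throws IOException {
    // list up to 100 *.exp files below the current directory, skipping names starting with "_"
    for (File file : collect(new File("."), new String[] { "exp" }, new String[] { "_" }, 100)) {
      System.out.println(file.getPath());
    }
  }
}
```

Each file returned this way could then be read and handed to a parser run, for example under a timeout, so that one pathological input does not stall a batch of hundreds of files.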

Prerequisites

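Railroad diagram generation for ANTLR 4 grammar rules (rrd-antlr4) is another useful function that is linked with this library. It has to be installed locally first, as shown under Installation.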

Installation

# get the Railroad Diagrams for ANTLR 4 grammar rules. 
git clone https://github.com/bkiers/rrd-antlr4
# make it locally available
cd rrd-antlr4
mvn clean install 
# then install the local com.bitplan.antlr 
cd ..
git clone https://github.com/BITPlan/com.bitplan.antlr
cd com.bitplan.antlr
mvn install