Antlr

From BITPlan Wiki
Revision as of 14:27, 14 October 2017 by Wf (talk | contribs) (→‎Motivation)

ANTLR is a parser generator tool.

BITPlan has been using ANTLR in projects for a few years now, and Wolfgang Fahl has been active in improving ANTLR; see e.g. the Motivation section.

Library with helpers for ANTLR Language development com.bitplan.antlr

To simplify parser development with ANTLR, BITPlan has created a library with helper code for ANTLR language development and published it as open source (see the com.bitplan.antlr repository under Installation).

Base Class Language Parser

The abstract base class LanguageParser provides helper code that makes language development and debugging easier.

Example

Exp grammar

/**
 * Copyright 2016-2017 BITPlan GmbH
 * Author: Wolfgang Fahl
 *
 * this is an Example Antlr Grammar
 *
 *
 * it is specified using antlr syntax and uses the ANTLR V4 parser generator
 * see http://www.antlr.org
 * 
 * for Eclipse you might want to install the IDE support:
 * https://github.com/jknack/antlr4ide
 * 
 */
grammar Exp;
 /* This will be the entry point of our parser. */
eval returns [double value]
    :    exp=additionExp {$value = $exp.value;}
    ;

/* Addition and subtraction have the lowest precedence. */
additionExp returns [double value]
    :    m1=multiplyExp       {$value =  $m1.value;} 
         ( '+' m2=multiplyExp {$value += $m2.value;} 
         | '-' m2=multiplyExp {$value -= $m2.value;}
         )* 
    ;

/* Multiplication and division have a higher precedence. */
multiplyExp returns [double value]
    :    a1=atomExp       {$value =  $a1.value;}
         ( '*' a2=atomExp {$value *= $a2.value;} 
// ...
         )* 
    ;
    
/* An expression atom is the smallest part of an expression: a number. Or 
   when we encounter parenthesis, we are making a recursive call back to the
   rule 'additionExp'. As you can see, an 'atomExp' has the highest precedence. */

atomExp returns [double value]
    :    n=Number                {$value = Double.parseDouble($n.text);}
    |    '(' exp=additionExp ')' {$value = $exp.value;}
    ;

/* A number: can be an integer value, or a decimal value */
Number
    :    ('0'..'9')+ ('.' ('0'..'9')+)?
    ;

/* We're going to ignore all white space characters */
WS  
    :   (' ' | '\t' | '\r'| '\n') -> skip
    ;
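
The way the rule nesting encodes operator precedence can be illustrated with a small hand-written evaluator in plain Java. This is only a sketch that mirrors the rules shown above; it is not the code ANTLR generates, and the class name ExpSketch and its method names are made up for this illustration. It assumes input without white space and leaves out the alternatives that are elided in the multiplyExp rule.

```java
// Hand-written sketch mirroring the Exp grammar rules; ExpSketch is a
// hypothetical name and this is not what the ANTLR tool generates.
final class ExpSketch {
  private final String input;
  private int pos = 0;

  private ExpSketch(String input) {
    this.input = input;
  }

  /** eval: additionExp, then the whole input must be consumed */
  static double eval(String text) {
    ExpSketch p = new ExpSketch(text);
    double value = p.additionExp();
    if (p.pos != p.input.length()) {
      throw new IllegalArgumentException("unexpected input at position " + p.pos);
    }
    return value;
  }

  /** additionExp: multiplyExp (('+' | '-') multiplyExp)* (lowest precedence) */
  private double additionExp() {
    double value = multiplyExp();
    while (peek() == '+' || peek() == '-') {
      char op = input.charAt(pos++);
      double right = multiplyExp();
      value = (op == '+') ? value + right : value - right;
    }
    return value;
  }

  /** multiplyExp: atomExp ('*' atomExp)* (binds tighter than addition) */
  private double multiplyExp() {
    double value = atomExp();
    while (peek() == '*') {
      pos++;
      value *= atomExp();
    }
    return value;
  }

  /** atomExp: Number | '(' additionExp ')' (highest precedence) */
  private double atomExp() {
    if (peek() == '(') {
      pos++;
      double value = additionExp();
      if (peek() != ')') {
        throw new IllegalArgumentException("')' expected at position " + pos);
      }
      pos++;
      return value;
    }
    // Number: ('0'..'9')+ ('.' ('0'..'9')+)?
    int start = pos;
    while (isDigit(peek())) {
      pos++;
    }
    if (pos == start) {
      throw new IllegalArgumentException("number expected at position " + pos);
    }
    if (peek() == '.' && pos + 1 < input.length() && isDigit(input.charAt(pos + 1))) {
      pos++;
      while (isDigit(peek())) {
        pos++;
      }
    }
    return Double.parseDouble(input.substring(start, pos));
  }

  /** current character, or '\0' at the end of the input */
  private char peek() {
    return pos < input.length() ? input.charAt(pos) : '\0';
  }

  private static boolean isDigit(char c) {
    return c >= '0' && c <= '9';
  }
}
```

Calling ExpSketch.eval("2+3*4") evaluates the multiplication first because additionExp only ever combines complete multiplyExp results; the grammar achieves the same effect through the same nesting of rules.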

Java Source for ExpLanguage Parser

This code wraps the generated ExpParser in a class derived from LanguageParser:

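// ANTLR 4 runtime types used below; LanguageParser comes from the
// com.bitplan.antlr library, ExpLexer and ExpParser are generated from Exp.g4
import org.antlr.v4.runtime.ANTLRInputStream;
import org.antlr.v4.runtime.Parser;
import org.antlr.v4.runtime.tree.ParseTree;

/**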
 * example parser
 * @author wf
 *
 */
public class ExpLanguageParser extends LanguageParser {
  private ExpParser parser;
  ExpLexer lexer;
  
  public ExpParser getParser() {
    return parser;
  }
  
  @Override
  protected ParseTree getRootContext(Parser parser) {
    if (!(parser instanceof ExpParser)) {
      throw new RuntimeException("wrong parser type for getRootContext, expected ExpParser but got "+parser.getClass().getName());
    } else {
      ExpParser expParser=(ExpParser) parser;
      return expParser.eval();
    }
  }

  @Override
  protected ParseTree parse(ANTLRInputStream in, String inputText)
      throws Exception {
    lexer = new ExpLexer(in);
    parser=new ExpParser(getTokens(lexer));
    ParseTree result=super.parse(lexer,getParser());
    return result;
  }

  @Override
  public void showParseTree() {
    super.showParseTree(getParser());
  }

}

JUnit Test

With the abstract JUnit base class BaseTest, testing and debugging get easier. Take the following JUnit test:

JUnit Test Source Code

  @Test
  public void testExpressionParser() throws Exception {
    String expressions[] = { "2*3", "4+5", "(2+3)*(4+5)" };
    ExpLanguageParser exprParser = new ExpLanguageParser();
    for (String expression : expressions) {
      super.runParser(exprParser, expression, 0);
    }
  }

If you add an invalid expression to the expressions array:

JUnit Test Source Code

 
   String expressions[] = { "2*3", "4+5", "(2+3)*(4+5)","(4+5)--(6-7)" };

a graphical parse tree will be displayed, showing you what is wrong (image: Parsetree example.png).

Motivation

While struggling with an Issue in ANTLR 4.4 see

there was a need to be able to test whether a parser would "time out". Support for this was implemented by adding a timed call:

  /**
   * test the given rule with the given timeout
   *
   * @param inputText the text to parse
   * @param expectedErrors the number of expected errors
   * @param timeOutMSecs the timeout in milliseconds
   * @throws Exception
   * @return the parser
   */
  public LanguageParser doTestParser(final String inputText, final int expectedErrors, int timeOutMSecs)
      throws Exception {
    if (debug) {
      System.out.println(inputText);
    }
    // run the parser in a timed call so that a hanging parse is detected
    LanguageParser result = timedCall(new Callable<LanguageParser>() {
      public LanguageParser call() throws Exception {
        return runParser(inputText, expectedErrors);
      }
    }, timeOutMSecs, TimeUnit.MILLISECONDS);
    return result;
  }
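
The timedCall helper itself is not shown on this page. One common way to build such a helper is with an ExecutorService and Future.get with a timeout; the following is a minimal sketch under that assumption and not necessarily how com.bitplan.antlr implements it. The class name TimedCallSketch is made up for this illustration.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical sketch of a timed call; the real helper in the library may differ.
final class TimedCallSketch {

  /**
   * run the given task and wait at most the given time for its result
   *
   * @throws TimeoutException if the task did not finish in time
   */
  static <T> T timedCall(Callable<T> task, long timeout, TimeUnit unit)
      throws InterruptedException, ExecutionException, TimeoutException {
    // a daemon thread, so that a task ignoring interruption cannot keep the JVM alive
    ExecutorService executor = Executors.newSingleThreadExecutor(runnable -> {
      Thread thread = new Thread(runnable, "timed-call");
      thread.setDaemon(true);
      return thread;
    });
    Future<T> future = executor.submit(task);
    try {
      return future.get(timeout, unit);
    } catch (TimeoutException e) {
      // ask the task to stop; this only works if the task reacts to interruption
      future.cancel(true);
      throw e;
    } finally {
      executor.shutdownNow();
    }
  }
}
```

Note that cancelling only interrupts the worker thread. A parser stuck in a pure computation loop will not notice the interrupt, so the test reports the timeout while the thread may keep running in the background.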

Another need was to be able to test hundreds of files in a list of directories. For this the SourceDirectory class was added.

This class can then be used with the testParseFilesInDirectories() function:

public List<LanguageParser> testParseFilesInDirectories(File rootDir, List<SourceDirectory> sourceDirectories,
      String[] extensions, String[] ignorePrefixes, int limit, int progressStep) throws Exception {
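
The body of testParseFilesInDirectories() is not shown here. To illustrate the idea of collecting the files to parse, the following sketch walks a list of directories below a root directory, keeps files with matching extensions, skips files whose names start with an ignored prefix and stops at a limit. The class SourceFileCollector is hypothetical, directories are passed as plain relative path strings instead of SourceDirectory objects, and the meaning given to extensions, ignorePrefixes and limit is an assumption based on the parameter names only.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; not the implementation used in com.bitplan.antlr.
final class SourceFileCollector {

  /** collect at most limit matching files below the given directories */
  static List<File> collect(File rootDir, List<String> relativeDirs,
      String[] extensions, String[] ignorePrefixes, int limit) {
    List<File> result = new ArrayList<>();
    for (String relativeDir : relativeDirs) {
      collect(new File(rootDir, relativeDir), extensions, ignorePrefixes, limit, result);
    }
    return result;
  }

  private static void collect(File dir, String[] extensions,
      String[] ignorePrefixes, int limit, List<File> result) {
    File[] entries = dir.listFiles();
    if (entries == null) {
      return; // not a directory or not readable
    }
    Arrays.sort(entries); // deterministic order across platforms
    for (File entry : entries) {
      if (result.size() >= limit) {
        return;
      }
      if (entry.isDirectory()) {
        collect(entry, extensions, ignorePrefixes, limit, result);
      } else if (endsWithAny(entry.getName(), extensions)
          && !startsWithAny(entry.getName(), ignorePrefixes)) {
        result.add(entry);
      }
    }
  }

  private static boolean endsWithAny(String name, String[] suffixes) {
    for (String suffix : suffixes) {
      if (name.endsWith(suffix)) {
        return true;
      }
    }
    return false;
  }

  private static boolean startsWithAny(String name, String[] prefixes) {
    for (String prefix : prefixes) {
      if (name.startsWith(prefix)) {
        return true;
      }
    }
    return false;
  }
}
```

Each collected file could then be handed to a parser test such as doTestParser, with a progress message printed every progressStep files.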

Prerequisites

Installation

# get the Railroad Diagrams for ANTLR 4 grammar rules. 
git clone https://github.com/bkiers/rrd-antlr4
# make it locally available
cd rrd-antlr4
mvn clean install 
# then install the local com.bitplan.antlr 
cd ..
git clone https://github.com/BITPlan/com.bitplan.antlr
cd com.bitplan.antlr
mvn install