Monday, April 18, 2016

Right recursion in JavaCC

Being LL(k) parser generator, JavaCC does not support left recursion and productions like

A ->  a | Ax

need to be converted to 

A -> a (x)*

Apparently, right recursion also requires some care.  Suppose we need to implement a right-associative operator = to parse strings like a=b=c=d:

right -> letter = right | letter
letter -> ["a"-"z"]

Simple JavaCC grammar, like below, would not work.

SKIP :
{
  " "
| "\t"
| "\n"
| "\r"
| "\f"
}

TOKEN [IGNORE_CASE] :
{
    <EQ: "=">
  | <LETTER: ["a"-"z"]>
}

public void Right() :
{}
{
    <LETTER> <EQ> Right() | <LETTER>
}

Compiling generated Java code would produce an error:

Error:(29, 12) error: unreachable statement

try {
  switch ((jj_ntk==-1)?jj_ntk_f():jj_ntk) {
  case LETTER:{
    jj_consume_token(LETTER);
    jj_consume_token(EQ);
    Right();
    break;
    }{
    jj_consume_token(LETTER);
    break;
    }
  default:
    jj_la1[0] = jj_gen;
    jj_consume_token(-1);
    throw new ParseException();

}

Adding a LOOKAHEAD at the first choice point fixes the problem:

public void Right() :
{}
{
    LOOKAHEAD(2) <LETTER> <EQ> Right() | <LETTER>

}

Usually JavaCC issues a warning about choice conflict involving two expansions, but not in this case.