Wednesday, 16 March 2011

Circumventing the C# reference to reference problem

Recently I have been working on a simple recursive descent parser in C# (.NET 4.0), and I ran into an error I did not expect. The following is not valid C#:

Rule integer, factor, expression, term;

integer = +Parser.digit_p();
   
factor = integer
 | Parser.ch_p('(') + expression + Parser.ch_p(')')
 | (Parser.ch_p('-') + factor)
 | (Parser.ch_p('+') + factor);
term = factor * ((Parser.ch_p('*') + factor) | (Parser.ch_p('/') + factor));
expression = term * ((Parser.ch_p('+') + term) | (Parser.ch_p('-') + term));

This makes sense: a rule cannot be used before it has been initialized, and here factor, term and expression all refer to one another.
In C++ we would just take a reference or pointer to each rule until it is defined, but this does not work in C#.

Instead I had to create a simple RuleRef class, as follows:
// Mutable cell that stands in for a Rule which is defined later.
public class RuleRef
{
    // The rule this reference currently points to; null until resolved.
    public Rule Value { get; set; }

    public RuleRef()
    {
    }

    public RuleRef(Rule value)
    {
        Value = value;
    }

    // Lets a RuleRef be used wherever a Rule is expected.
    public static implicit operator Rule(RuleRef r)
    {
        return r.Value;
    }

    // a + b: a followed by b.
    public static RuleRef operator +(RuleRef lhs, RuleRef rhs)
    {
        return new Sequence(lhs, rhs);
    }

    // +a: one or more repetitions of a.
    public static RuleRef operator +(RuleRef operand)
    {
        return new OneOrMore(operand);
    }

    // a | b: either a or b.
    public static RuleRef operator |(RuleRef lhs, RuleRef rhs)
    {
        return new Or(lhs, rhs);
    }

    // a & b: same as a + b.
    public static RuleRef operator &(RuleRef lhs, RuleRef rhs)
    {
        return new Sequence(lhs, rhs);
    }

    // a * b: a followed by zero or more repetitions of b.
    public static RuleRef operator *(RuleRef lhs, RuleRef rhs)
    {
        return new Sequence(lhs, new ZeroOrMore(rhs));
    }
}

This allows me to write:
RuleRef integer = new RuleRef();
RuleRef factor = new RuleRef();
RuleRef expression = new RuleRef();
RuleRef term = new RuleRef();

integer = +Parser.digit_p();
   
Rule _factor = integer.Value.WithAction(DebugPrint)
 | Parser.ch_p('(') + expression + Parser.ch_p(')')
 | (Parser.ch_p('-') + factor)
 | (Parser.ch_p('+') + factor);

Rule _term = factor * ((Parser.ch_p('*') + factor) | (Parser.ch_p('/') + factor));

Rule _expression = term * ((Parser.ch_p('+') + term) | (Parser.ch_p('-') + term));

// Resolve the references to their correct values
factor.Value = _factor;
term.Value = _term;
expression.Value = _expression;
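
For readers who want to try the declare, compose, resolve pattern without my parser library, here is a minimal self-contained sketch. Everything in it except the name RuleRef is a hypothetical stand-in: Ch, Digits, Seq, Alt, Star and Demo are not the real Parser API. It also makes one deliberate change from the class above: RuleRef derives from Rule instead of converting to it, so a composed rule stores the reference itself and only follows Value when parsing actually happens.

```csharp
using System;

// Hypothetical minimal rule type: Parse returns the position after a match, or -1 on failure.
public abstract class Rule
{
    public abstract int Parse(string s, int pos);
    public static Rule operator +(Rule a, Rule b) { return new Seq(a, b); }
    public static Rule operator |(Rule a, Rule b) { return new Alt(a, b); }
}

public sealed class Ch : Rule // matches one specific character
{
    private readonly char c;
    public Ch(char c) { this.c = c; }
    public override int Parse(string s, int pos)
    {
        return pos < s.Length && s[pos] == c ? pos + 1 : -1;
    }
}

public sealed class Digits : Rule // matches one or more decimal digits
{
    public override int Parse(string s, int pos)
    {
        int p = pos;
        while (p < s.Length && char.IsDigit(s[p])) p++;
        return p > pos ? p : -1;
    }
}

public sealed class Seq : Rule // a followed by b
{
    private readonly Rule a, b;
    public Seq(Rule a, Rule b) { this.a = a; this.b = b; }
    public override int Parse(string s, int pos)
    {
        int mid = a.Parse(s, pos);
        return mid < 0 ? -1 : b.Parse(s, mid);
    }
}

public sealed class Alt : Rule // a, or b if a fails
{
    private readonly Rule a, b;
    public Alt(Rule a, Rule b) { this.a = a; this.b = b; }
    public override int Parse(string s, int pos)
    {
        int r = a.Parse(s, pos);
        return r >= 0 ? r : b.Parse(s, pos);
    }
}

public sealed class Star : Rule // zero or more repetitions
{
    private readonly Rule inner;
    public Star(Rule inner) { this.inner = inner; }
    public override int Parse(string s, int pos)
    {
        while (true)
        {
            int next = inner.Parse(s, pos);
            if (next <= pos) return pos; // failure or no progress: stop repeating
            pos = next;
        }
    }
}

// Forward reference. Because it is itself a Rule, composing with it captures
// the cell, and the target is looked up only when Parse is called.
public sealed class RuleRef : Rule
{
    public Rule Value { get; set; }
    public override int Parse(string s, int pos)
    {
        if (Value == null) throw new InvalidOperationException("RuleRef was never resolved.");
        return Value.Parse(s, pos);
    }
}

public static class Demo
{
    public static bool Matches(string input)
    {
        var factor = new RuleRef();
        var expression = new RuleRef();
        Rule integer = new Digits();

        // factor and expression are used here before they are resolved.
        Rule f = integer
            | (new Ch('(') + expression + new Ch(')'))
            | (new Ch('-') + factor)
            | (new Ch('+') + factor);
        Rule term = factor + new Star((new Ch('*') + factor) | (new Ch('/') + factor));
        Rule e = term + new Star((new Ch('+') + term) | (new Ch('-') + term));

        // Resolve the references to their correct values.
        factor.Value = f;
        expression.Value = e;
        return expression.Parse(input, 0) == input.Length;
    }

    public static void Main()
    {
        Console.WriteLine(Demo.Matches("(1+2)*3")); // True
        Console.WriteLine(Demo.Matches("2+"));      // False
    }
}
```

In this sketch, forgetting to resolve a reference surfaces as an exception at parse time rather than as a null rule buried inside a sequence, which makes the mistake easier to track down.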

In the next post I will describe my simple recursive descent parser, written entirely in C#/LINQ, in detail.