Thursday, April 3, 2008

Solutions to the Exercises in Episode 6

Without further ado, the solution to the exercise in Episode 6:

1. By now you should have a good idea on the implementation of scope in Squaak. We haven't implemented the for-statement yet, as it needs proper scope handling to implement. Implement this. Check out Episode 3 for the BNF rules that define the syntax of the for-statement. When implementing it, you will run into the same issue as we did when implementing subroutines and parameters. Use the same trick for the implementation of the for-statement.

First, let us look at the BNF of the for-statement:
for-statement ::= 'for' for-init ',' expression [step]

step ::= ',' expression

for-init ::= 'var' identifier '=' expression
It's pretty easy to convert this to Perl 6 rules:
rule for_statement {
'for' <for_init> ',' <expression> <step>?
'do' <statement>* 'end'

rule step {
',' <expression>

rule for_init {
'var' <identifier> '=' <expression>
Pretty easy huh? Let's take a look at the semantics. A for-loop is just another way to write a while loop, but much easier in certain cases. This:
for var <ident> = <expr1>, <expr2>, <expr3> do
corresponds to:
var <ident> = <expr1>
while <ident> <= <expr2> do
<ident> = <ident> + <expr3>
If <expr3> is absent, it defaults to the value "1". Note that the step expression (expr3) should be positive; the loop condition contains a "<=" operator. When you specify a negative step expression, the loop variable will only decrease in value, which will never make the loop condition false (unless it overflows, but that's a different issue; this might even raise an exception in Parrot; this I do not know). Allowing negative step expressions introduces more complexity, which I felt was not worth the trouble for this tutorial language.

Note that the loop variable <ident> is local to the for loop; this is expressed in the equivalent while loop by the surrounding do/end pair: a new do/end pair defines a new (nested) scope; after the "end" keyword, the loop variable is no longer visible.

Let's implement the action method for the for-statement. As was mentioned in the exercise description, we're dealing with the same situation as with subroutine parameters. In this case, we're dealing with the loop variable, which is local to the for-statement. Let's check out the rule for for_init:
method for_init($/) {
our $?BLOCK;
our @?BLOCK;

## create a new scope here, so that we can
## add the loop variable
## to this block here, which is convenient.
$?BLOCK := :blocktype('immediate'),
:node($/) );

my $iter := $( $<identifier> );
## set a flag that this identifier is being declared
## the identifier is initialized with this expression
$iter.viviself( $( $<expression> ) );

## enter the loop variable into the symbol table.
$?BLOCK.symbol($, :scope('lexical'));

make $iter;

So, just as we created a new PAST::Block for the subroutine in the action method for parameters, we create a new PAST::Block for the for-statement in the action method that defines the loop variable. (Guess why we made for-init a subrule, and didn't put in "var <ident&gt = <expression>" in the rule of for-statement). This block is the place to live for the loop variable. The loop variable is declared, initialized using the viviself attribute, and entered into the new block's symbol table. Note that after creating the new PAST::Block object, we put it onto the stack scope.

Now, the action method for the for statement is quite long, so I'll just embed my comments, which makes reading it easier.
method for_statement($/) {
our $?BLOCK;
our @?BLOCK;
First, get the result object of the for statement initialization rule; this is the PAST::Var object, representing the declaration and initialization of the loop variable.
    my $init := $( $<for_init> );
Then, create a new node for the loop variable. Yes, another one (besides the one that is currently contained in the PAST::Block). This one is used when the loop variable is updated at the end of the code block (each iteration). The difference with the other one, is that it doesn't have the isdecl flag, and it doesn't have a viviself clause, which would result in extra instructions checking whether the variable is null (and we know it's not, because we initialize the loop variable).
    ## cache the name of the loop variable
my $itername := $;
my $iter := :name($itername),
:node($/) );
Now, retrieve the PAST::Block node from the scope stack, and push all statement PAST nodes onto it.
    ## the body of the loop consists of the statements written by the user and
## the increment instruction of the loop iterator.

my $body := @?BLOCK.shift();
$?BLOCK := @?BLOCK[0];
for $<statement> {
If there was a step, we use that value; otherwise, we use assume a default step size of "1".
Negative step sizes won't work, but if you Feel Lucky, you could go ahead and try. It's not that hard, it's just a lot of work, and I'm too lazy for that now.... ehm, I mean, I leave it as the proverbial exercise to the reader.
    my $step;
if $<step> {
my $stepsize := $( $<step>[0] );
$step := $iter, $stepsize, :pirop('add'), :node($/) );
else { ## default is increment by 1
$step := $iter,
:node($/) );

The incrementing of the loop variable is part of the loop body, so add the incrementing statement to $body.
The loop condition uses the "<=" operator, and compares the loop variable with the maximum value that was specified.
    ## while loop iterator <= end-expression
my $cond := $iter, $( $<expression> ),
:name('infix:<=') );

Now we have the PAST for the loop condition and the loop body, so now create a PAST to represent the (while) loop.
    my $loop := $cond, $body,
:node($/) );

Finally, the initialization of the loop variable should go before the loop itself, so create a PAST::Stmts node to do this:
    make $init, $loop,
:node($/) );

Wow, we've done it! This was a good example of how to implement a non-trivial statement type using PAST.


rdescamps said...

Your grammar rule seems not to correspond to the action method:
In the for_statement rule you are using a block subrule but in the action method you are iterating through the statement match object, which never exist.
As a result the parsing phase seems to work but no statement is executed.

Anyway it was not clear to me in your explanation why you wanted to use two embedded blocks (the outer for the while and an other one for the statements).

kjs said...

You're right. It's fixed now. There's no need for two embedded blocks; only one is needed (to limit the scope of the loop iterator declared with 'var').
Thanks for the good catch!