Wednesday, June 11, 2008

Implementing the 'return' statement in Squaak

Recently, Patrick Michaud took the first steps toward extending the Parrot Abstract Syntax Tree (PAST) to handle return statements. In this short report, I'll show how to implement a return statement in Squaak, the PCT tutorial language we created earlier. First, however, let's quickly review why implementing 'return' is not straightforward. Parrot has these fancy calling conventions, right? So why not just use the .return directive?

The reason is that PCT implements blocks (or scopes) as PIR subroutines. Consider the following Squaak code snippet:
sub foo()
    do
        do
            return 42
        end
    end
end
The subroutine foo translates to three different blocks, or scopes: one for each do-end pair, and one for the subroutine itself. Each block is represented by a PIR subroutine, and these subroutines are nested (using the PIR :outer flag, if you're curious). I won't go into details, because I'm sure this will be explained in more detail by others later on; I just want to explain the basics here.
When Parrot executes a .return directive, control returns to the calling subroutine. As foo is compiled to three PIR subroutines, a .return emitted for the return statement above would only return from the innermost block to its outer block (which invoked it), and not to the caller of the foo subroutine.
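
To make the problem concrete, here is a rough analogy in Python rather than PIR. It assumes each Squaak block becomes its own nested function, much as PCT turns each block into a nested PIR subroutine; the names outer_block and inner_block are made up for illustration and do not correspond to anything PCT generates.

```python
# Analogy only: every block is its own callable, like PCT's nested PIR subs.
def foo():
    def outer_block():          # first do ... end
        def inner_block():      # second do ... end
            return 42           # like .return: leaves inner_block only
        inner_block()           # the 42 arrives here and is dropped
    outer_block()
    # foo falls off the end, so its caller never sees 42


result = foo()
print(result)  # None
```

The plain return hands 42 back to the enclosing block, which is exactly the behaviour described above: the value never reaches the caller of foo.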

Instead of using the .return directive, Parrot uses the exceptions subsystem to implement control constructs such as return statements. Now, as I predicted before, I'm sure someone will explain all this in much more detail, but for now, we just want to get an idea of how to actually use all this and implement a return statement. So let's not talk any longer, and look at how to do this.
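
Before turning to the grammar, the general idea of returning by way of an exception can be sketched in Python as an analogy. This is not how PCT is implemented internally; ReturnSignal, outer_block and inner_block are hypothetical names chosen for this sketch. The return statement throws, and only the block that represents the subroutine catches.

```python
class ReturnSignal(Exception):
    """Hypothetical control exception that carries the return value."""

    def __init__(self, value):
        super().__init__(value)
        self.value = value


def foo():
    def outer_block():              # first do ... end
        def inner_block():          # second do ... end
            raise ReturnSignal(42)  # 'return 42' becomes a throw
        inner_block()
    try:
        outer_block()
    except ReturnSignal as signal:  # only the subroutine's block catches it
        return signal.value         # now the value reaches foo's caller


answer = foo()
print(answer)  # 42
```

The exception unwinds through both nested blocks in one go, which is why the subroutine's block, and not the inner ones, has to be marked as the place where the return is handled.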

First, we should extend the grammar of Squaak. As the return statement is a statement, add an alternative to the statement rule, like so:
rule statement {
    ...
    | <return_statement> {*}    #= return_statement
}

rule return_statement {
    'return' <expression>
    {*}
}
In order to allow for syntax like "x = foo()", we need to extend the rule for term. Note that sub_call should come before primary. (Once Longest Token Matching is implemented, this ordering will no longer be necessary.)
rule term {
    ...
    | <sub_call> #= sub_call
    | <primary> #= primary
    ...
}
Now, that was easy, huh? Let's look at the actions. The action method for statement will just dispatch to the action method named by the key specified after #=. The action method for return_statement looks like this:
method return_statement($/) {
    my $expr := $( $<expression> );
    make PAST::Op.new( $expr,
                       :pasttype('return'),
                       :node($/) );
}
If you've read the PCT tutorial, this code should be easy to read. First, we get the result object for the expression. Then we create a new PAST::Op node, this time of the new pasttype 'return'.
Now, you might think this is all, but that's not quite it. We have to specify that the block representing the subroutine is the one doing the actual returning. This is done using the following line of code in the action method for sub_definition:
$past.control('return_pir');
So, now that we have implemented the return statement in Squaak, let's see what happens. Rebuild Squaak, and start the Squaak interpreter in interactive mode:
$ ../../parrot squaak.pbc
Now type:
sub main() return 42 end x = main() print(x)
You could also store this in a file, and then specify the file when running the Squaak compiler, but this is easier for now. After hitting return (no pun intended), you'll see:
> 42
Isn't the Parrot Compiler Toolkit a fabulous tool?
