Saturday, May 31, 2008

P6object: Perl 6 metaclasses for Parrot



A week or so ago I worked on creating a new metaclass library ("P6object") for use by Rakudo and the other compiler tools. It replaces the Protoobject.pbc and other metaclass components that those tools had previously been using. This article provides some background and details about the library.

Background

The P6object library is based heavily on the Perl 6 object model described in Synopsis 12. Perl 6's default object model looks a lot like a standard class-based model when it's used in typical programming, but it also has some important differences.

One key feature of Perl 6's OO model is the concept of "prototype objects", or "protoobjects" for short. A protoobject is an "empty" or undefined instance of a class that serves as a "generic instance" for the class as a whole. In other words, the protoobject for a class allows us to reason about and calculate what an instance of the class can do without needing a defined instance of that class. For this reason Synopsis 12 talks about the protoobject as being the "class object" -- i.e., it's the thing you use when you want to talk generically about the class. In fact, Perl 6 doesn't have a mandatory Class type; it's all done with protoobjects and metaclasses (we'll cover metaclasses in a bit).

The most common use for a protoobject is to create a new object:
    class Dog {
        method bark() { say "Woof!"; }
    }

    my $fido = Dog.new();
Another common use of protoobjects is to test 'isa' or 'does' semantics:
    if $fido ~~ Dog  { ... }
In the cases above we use the name "Dog" to indicate a class, but whereas Parrot and other languages would take "Dog" to be a Class object that defines the attributes and methods for objects in the class, in Perl 6 the "Dog" symbol above refers to an instance of the Dog class that reports itself as being undefined -- i.e., a protoobject.

If you're now thinking "all this protoobject stuff is making things complicated" -- don't worry. Most of the time a Perl 6 programmer won't have to think about protoobjects -- just do the natural thing (as in the examples above) and it all works out correctly.

We can use .WHAT on any object to obtain the protoobject for the object's type. One use for a protoobject is to get a stringified form of the (short) name of the type.
    say $fido.WHAT;         # "Dog\n"
So, now that we know something about protoobjects, what's a metaclass? Well, a metaclass is the compiler's underlying representation of a class. Synopsis 12 doesn't say a lot about how metaclasses work internally, leaving those details up to the implementation. But any time we want to manipulate the class itself, such as adding an attribute or determining the available methods, we use a metaclass to do it. We get to the metaclass of an object by using .HOW:
    $fido.HOW.methods()     # get the methods list for $fido
    Dog.HOW.attributes()    # get the attribute list for Dog objects

Using P6object

Okay, with that background in mind, let's look at the P6object implementation. From this point on I'll use PIR for my examples, because that's what I expect most people working with P6object will be writing. It's nearly all method calls and symbol table lookups, so it's relatively easy to follow, and of course one can also access the library from NQP or Rakudo.

First, to load the library one uses the load_bytecode opcode:
    load_bytecode 'P6object.pbc'
Of course, if a program has already loaded PGE or PCT, then the P6object library is already loaded. Once the library has been loaded, we can access the P6metaclass object and use it to create a new class:
    .local pmc p6meta
    p6meta = get_hll_global 'P6metaclass'
    p6meta.'new_class'('Dog', 'attr'=>'legs tail')
This creates a new class called "Dog", with attribute slots named "legs" and "tail". Methods for the new class are defined the same way as in normal PIR -- decorate a sub in the appropriate namespace with ":method":
    .namespace ['Dog']
    .sub "bark" :method
        say "Woof!"
    .end
Once the class is created, we can get its protoobject and use that to create new instances of the class. So, to do the PIR equivalent of the Perl 6
    $fido = Dog.new();
    $fido.bark();
one would write in PIR
    .local pmc dogproto, fido
    dogproto = get_hll_global 'Dog'
    fido = dogproto.'new'()
    fido.'bark'()                      # "Woof!\n"
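Putting the pieces so far together, a complete program might look like the following. This is only a minimal sketch: the 'main' sub and its ":main" flag are scaffolding added here for illustration, and it assumes that P6object.pbc can be found on Parrot's library path.

```pir
.sub 'main' :main
    # Load the library (already loaded if PGE or PCT is in use).
    load_bytecode 'P6object.pbc'

    # Create the Dog class with two attribute slots.
    .local pmc p6meta
    p6meta = get_hll_global 'P6metaclass'
    p6meta.'new_class'('Dog', 'attr'=>'legs tail')

    # Fetch the protoobject and use it to build an instance.
    .local pmc dogproto, fido
    dogproto = get_hll_global 'Dog'
    fido = dogproto.'new'()
    fido.'bark'()
.end

# Methods live in the namespace named after the class.
.namespace ['Dog']
.sub 'bark' :method
    say "Woof!"
.end
```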
Note that the new class exists as a normal Parrot class -- i.e., one can still create new Dog objects by using the new opcode, either with the class name or with the Dog Parrot class obtained via get_class. But once a decision is made to use P6object, it may be better to stick with its defined interfaces for metaprogramming operations. More on this below.
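
For comparison, the plain-Parrot route might be sketched as follows. The 'main' wrapper and the register names are chosen here for illustration only, and the operand forms accepted by the new opcode have varied between Parrot releases, so treat this as a sketch rather than a definitive recipe.

```pir
.sub 'main' :main
    load_bytecode 'P6object.pbc'
    .local pmc p6meta
    p6meta = get_hll_global 'P6metaclass'
    p6meta.'new_class'('Dog')

    # Instantiate by class name with the new opcode.
    .local pmc fido
    fido = new 'Dog'
    fido.'bark'()

    # Or look up the Parrot class object and instantiate from it.
    .local pmc dogclass, rex
    dogclass = get_class 'Dog'
    rex = new dogclass
    rex.'bark'()
.end

.namespace ['Dog']
.sub 'bark' :method
    say "Woof!"
.end
```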

To create a subclass of an existing class, simply supply a "parent" argument to the new_class method:
    ##  Perl 6:
    ##      class Beagle is Dog { ... }
    ##      $snoopy = Beagle.new();

    .local pmc beagleproto, snoopy
    p6meta.'new_class'('Beagle', 'parent'=>'Dog')
    beagleproto = get_hll_global 'Beagle'
    snoopy = beagleproto.'new'()
Classes created using new_class always have P6object as one of the ancestor classes, which defines default .WHAT and .HOW methods for all objects. Thus:
    ##  Perl 6:
    ##      say $snoopy.WHAT;

    $P0 = snoopy.'WHAT'()              # get snoopy's protoobject
    $S0 = $P0                          # stringify it
    say $S0                            # "Beagle\n"
Methods such as .isa, .can, and .does are defined on the metaclass for each object (as described in Synopsis 12).
    ##  Perl 6:
    ##      $i = $snoopy.HOW.can('bark');
    ##      $i = $snoopy.HOW.isa('Dog');

    $P0 = snoopy.'HOW'()               # get snoopy's metaclass
    $I0 = $P0.'can'('bark')            # see if snoopy can bark
    $I0 = $P0.'isa'('Dog')             # see if snoopy is a Dog
If the name of a class is segmented using double-colons, then P6object automatically places the protoobject in the appropriate Parrot namespace:
    p6meta.'new_class'('NQP::Grammar::Actions')

    $P0 = get_hll_global 'NQP::Grammar::Actions'         # wrong
    $P0 = get_hll_global ['NQP';'Grammar'], 'Actions'    # right
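As a rough sketch of how a segmented class name might be used end to end, consider the following. The class name 'Kennel::Dog' is made up for this example, and placing the method in the ['Kennel';'Dog'] namespace is an assumption based on how the protoobject lookup above is keyed.

```pir
.sub 'main' :main
    load_bytecode 'P6object.pbc'
    .local pmc p6meta
    p6meta = get_hll_global 'P6metaclass'

    # 'Kennel::Dog' is a hypothetical name used only for this sketch.
    p6meta.'new_class'('Kennel::Dog')

    # The protoobject is stored as 'Dog' inside the ['Kennel'] namespace.
    .local pmc proto, rex
    proto = get_hll_global ['Kennel'], 'Dog'
    rex = proto.'new'()
    rex.'bark'()
.end

# Assumption: methods go in the namespace matching the segmented name.
.namespace ['Kennel';'Dog']
.sub 'bark' :method
    say "Woof!"
.end
```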
So, what do we gain from all of this? First, it provides a Perl 6-like foundation for all of the objects used in the Perl 6-related compiler components, including PCT, PGE, NQP, and Rakudo. Thus, each of these tools knows that the objects coming from another component support the Perl 6 metaprogramming model, which aids consistency. Also, most metaprogramming operations are method-based, which means the tools can use a single access paradigm (method calls) to do their metaprogramming, instead of having to work with an irregular set of PIR opcodes with varying operand types.

One key aspect of P6object that has particular importance to Rakudo Perl is that P6object can add Perl 6-like roles to Parrot's existing classes and built-in PMC types. For example, Perl 6 expects to work with 'Int', 'Str', and 'Num' objects, but other libraries in Parrot might return values that are 'Integer', 'String', or 'Float' PMCs. Since checking and autoboxing every value could get cumbersome and/or expensive, P6object allows us to give Perl 6 object model behaviors to existing classes. This is done using the 'register' method on P6metaclass:
    p6meta.'register'('Float')
This line creates a protoobject for Parrot's built-in Float type, and adds .WHAT and .HOW methods to Float objects. The protoobject also gives us a .new method for building Float PMCs:
    .local pmc floatproto
    floatproto = get_hll_global 'Float'
    $P0 = floatproto.'new'()
Of course, the old way of creating Float objects in PIR still works:
    $P0 = new 'Float'
    $P0 = 6.28318
And, as mentioned above, Floats receive the .WHAT and .HOW methods that all P6objects have:
    $P1 = $P0.'WHAT'()
    $S1 = $P1
    say $S1                            # "Float\n"
The "name" option to .register and .new_class causes the protoobject to be created using a different name:
    p6meta.'register'('ResizablePMCArray', 'name'=>'List')

    .local pmc list
    list = new 'ResizablePMCArray'     # create a RPA
    $P1 = list.'WHAT'()                # get the protoobject
    $S1 = $P1                          # stringify it
    say $S1                            # "List\n"
This is useful for mapping Parrot's built-in types to HLL-specific class names.

Note that registering a class doesn't mean that new "List" will work in PIR, because Parrot still thinks of the class as "ResizablePMCArray". But using the protoobject for List will do what we want:
    .local pmc listproto, list
    listproto = get_hll_global 'List'
    list = listproto.'new'()
We can also register classes to use a specific protoobject instead of creating a new one:
    ##  create a subclass of Hash called MyHash
    p6meta.'new_class'('MyHash', 'parent'=>'Hash')

    ##  register existing Hash class as being MyHash
    .local pmc myhashproto
    myhashproto = get_hll_global 'MyHash'
    p6meta.'register'('Hash', 'protoobject'=>myhashproto)
This has the result of causing all Parrot Hash objects to report themselves as instances of "MyHash", and to return the same protoobject for 'MyHash' and 'Hash' objects.

Finally, the "parent" argument to .register causes the named parent class(es) to be added as parents of the class being registered. If the class being registered is a built-in PMC type or otherwise cannot have parent classes added, then the methods of the parent classes are composed into the class directly.

Thus the following causes Hash objects to return the MyHash protoobject and metaclass in response to .WHAT and .HOW, and adds all of the methods of MyHash to the existing Hash PMC type (unless Hash already has such methods).
    p6meta.'register'('Hash', 'parent'=>'MyHash', 'protoobject'=>'MyHash')

Summary

The P6object library provides quite a few other features for managing and manipulating classes, and new features such as roles, attribute management, and method exporters are in the works. With these features, P6object provides a robust Perl 6-like interface to Parrot's underlying object model.
