PIRC is a new implementation of the PIR language, and currently under heavy construction. While the code is documented, it does not provide an easy overview of implementation issues and design decisions. In order to solve this documentation gap, I'm writing these design decisions in a series of parrotblog.org articles. This very short article discusses the memory management of PIRC.
The current PIR compiler, IMCC, has had a lot of memory leaks, most of which have been solved by now. Although conceptually manual memory management through malloc() and free() is very simple and straightforward, things become ugly once you loose overview of when datastructures go out of scope, and as a result you forget to free the memory. In PIRC a different approach is taken.
Note that instead of using malloc() and free() directly, you should use mem_sys_allocate() and mem_sys_free(), respectively, which are provided by Parrot.
PIRC parses the PASM or PIR input, builds a data structure, and generates a Parrot Byte Code file (a so-called Packfile). (at the moment of writing, the PBC is not generated yet, but work is underway to fix this; might take a while, though). The data structures that are created represent the PASM or PIR program being parsed, and as such can be considered an Abstract Syntax Tree.
As these data structures are used throughout the compilation phase, PIRC can cheat with memory management. Instead of freeing the memory of the data structures by manually calling mem_sys_free(), PIRC keeps track of all allocated memory blocks. Only when PIRC is done with the bytecode generation will it release all resources. This is fine, because the data structures are needed up to the end anyway, so memory is not occupied longer than necessary.
Poor Man's Garbage Collector
Whenever PIRC needs memory, a block of memory is allocated through Parrot's memory allocation function, mem_sys_allocate() (or a variant, which zeroes out all bytes). Before returning a pointer to the allocated block of memory, however, PIRC stores a pointer to the block of memory as well (in a list), and only then is the pointer to the block returned.
When PIRC is done with compiling, it will go through the list of memory pointers, and release the memory pointed to by each of the pointers. In a sense, you could consider this a garbage collector, except that there's no reuse of memory (but there's no need to anyway).
"Wait a minute," you might say, where are these pointers stored then? Well, of course, these pointers are stored in an extensible list, as we don't know the number of pointers to store beforehand. Surely the list cannot be allocated and store a pointer to itself. This is fine, because the problem of keeping track of memory is now isolated to a single point in the program. The nodes in the list of allocated memory pointers are allocated directly through Parrot's memory functions. When all stored pointers are mem_sys_free()d, we only have to remember to mem_sys_free() these nodes.
In this way, there's no need to worry about when pointers should be freed: it's done automatically, as long as PIRC allocates its memory through its built-in memory management system.
List of Pointers
Obviously, we don't want to create a new node for each pointer to store. Instead, a node can store a number of pointers, currently set to 512. So, after allocating memory for 512 times, a new node is created. This number of 512 was decided upon after some experimentation, but might prove too low for real world PIR input. As it's #define'd, it's easy to change, though.
In this very short article, I described how PIRC does its memory management. PIRC cheats a bit, by storing each pointer to an allocated block of memory in a special list. Once PIRC is done with compiling, all these pointers are passed to mem_sys_free(), Parrot's free() function (one by one, obviously). This way, you don't have to worry about when to free the memory in PIRC.