Recently this piece on ECS has been on my mind. To finally kick it out, here are my notes on what can be improved.
The author uses Common LISP to define an extended language for writing programs that favour the ECS style. While it is a very nice read and it achieves what was intended, I think this approach can be improved upon. The main problems I see are the following: 1. this is not LISP anymore, 2. it reimplements redefinition of data at runtime.
The first problem is solved by accepting that this was an exploratory solution. The extended language is a prototype and not the final API to actually write programs using the ECS. The second is instead an implementation problem.
My take on both these problems would be to use the Common Lisp Object System (CLOS), as it solves both. Once we teach CLOS how we want to lay out our memory, we get data locality without having extended the language for non-control-flow reasons. Moreover, CLOS already supports redefinition at runtime, so we are not duplicating features and the second problem goes away.
A tour of ECS and CLOS ECS
Here’s an example of the current API.
This is the CLOS flavoured version of the same API. It is good to see the two side by side to spot the differences.
If this reads like normal LISP code to you[1] then we have achieved our first objective. Our programs do not need to bend too much to allow us to benefit from the ECS architecture.
But how does this work?
We define two classes, one for positions and one for images (image), with the standard syntax for classes. We also specify both as inheriting from ecs-component. This takes care of our requirement for spatial locality, because instance creation will take its memory from a pool common to the class.
The following snippet shows how we allocate two position objects.
Because they are instances of ecs-component, their allocation goes through our library-defined allocation scheme, which means we can allocate them as successive cells in an array. We have achieved data locality.
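To make the allocation idea concrete, here is a minimal sketch in portable Common Lisp. It deliberately does not use a metaclass: the pool is a plain special variable, and the names pos, *pos-pool*, pos-index, pos-x and pos-y are hypothetical, chosen only for illustration (pos rather than position, because position already names a standard function). Each instance is a small handle whose data lives in one contiguous, typed vector.

```lisp
;; Hypothetical sketch, not the library's API.
(defvar *pos-pool*
  (make-array 0 :element-type 'single-float :adjustable t :fill-pointer 0)
  "Contiguous storage shared by every POS instance: x0 y0 x1 y1 ...")

(defclass pos ()
  ((index :reader pos-index
          :documentation "Offset of this instance's x cell in *POS-POOL*.")))

;; The keyword parameters of this method become valid initargs
;; for (make-instance 'pos ...).
(defmethod initialize-instance :after ((p pos) &key (x 0.0) (y 0.0))
  (setf (slot-value p 'index) (fill-pointer *pos-pool*))
  ;; Append the two coordinates as successive cells of the pool.
  (vector-push-extend (float x 1.0) *pos-pool*)
  (vector-push-extend (float y 1.0) *pos-pool*))

(defun pos-x (p) (aref *pos-pool* (pos-index p)))
(defun pos-y (p) (aref *pos-pool* (1+ (pos-index p))))
```

Two instances created one after the other end up in adjacent cells of the same vector. A metaclass-based version would derive this storage and these accessors from the slot definitions instead of having them written by hand.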
The draw function takes a reference to the underlying storage for our positions and images, starts a loop over the two together, and calls draw-image. Before the loop it executes one thunk, and when the loop exits it executes another.
It also takes the two references as read-only, so that any attempt at modifying them or their data will signal an error.
But why does this work and why is CLOS required?
ecs-component will receive the class slots from bitmap at class definition time. It defers class creation to a metaclass.
The job of the metaclass is to prepare the underlying storage for the classes defined. This is exactly what we need to have all instances of the same component share one big contiguous array.
We have to use a metaclass because we need to know which slots are defined in order to decide how to lay out the structure for our class in memory and, from that, define all the procedures such as getters and setters.
Moreover, once we have allocated the array for the class instances, we also want to use it in our loops. As we need to reference these arrays, we can also define some more names, e.g. ecs-images, that we can reference later. This is how our macro with-component* is able to get the two underlying arrays and other metadata, e.g. the count of each component. There are various ways to implement this, and the choice is not important right now.
Finally we are at the end of our example. Data is laid out in memory as we want and we are ready to loop over it as fast as we can.
For each component passed to with-component* we get the reference to the underlying array and the metadata. Then we loop over the elements of the arrays and call draw-image for its side effects.
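Stripped of the macro, the loop itself could look something like the sketch below. map-components is a hypothetical helper, and the storage layout is an assumption: positions as a flat vector of x and y pairs, images as one entry per entity, and the count passed in explicitly rather than read from class metadata.

```lisp
;; Hypothetical sketch: walks two component arrays in lockstep.
(defun map-components (fn positions images count)
  "Call FN with (image x y) for each of the first COUNT entities.
POSITIONS is a flat vector x0 y0 x1 y1 ...; IMAGES has one entry per entity."
  (dotimes (i count)
    (funcall fn
             (aref images i)
             (aref positions (* 2 i))
             (aref positions (1+ (* 2 i))))))
```

Because both arrays are traversed front to back with the same index, each iteration touches memory right next to what the previous one used.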
I hope that this has convinced you that we can achieve the same benefits of the ECS pattern without either of the downsides I pointed out.
We don’t have to change the way we write our programs to obtain data locality, because we can extend LISP in two orthogonal directions: we can extend the syntax, and we can change the data representation without altering the syntax.
We can extend the syntax of the language when it makes sense to add
more specialized control-flow. Our
with-component* is implemented as
a macro and is not a glorified for loop but specialized control flow.
As it takes a function to execute in a loop, we also have to account for non-local control flow that can happen in the function body. Moreover, it allows us to set up the read-only feature, which can be enabled for just a subset of the components!
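One way to picture the control-flow part is the sketch below. do-components is a hypothetical, much simplified stand-in and not the actual with-component*: it only takes an index variable, a count and two optional thunks, and it leaves out the array lookup and the read-only feature. The point is the unwind-protect, which makes sure the closing thunk runs even when the body leaves the loop through a non-local exit.

```lisp
;; Hypothetical sketch: a loop macro with setup and cleanup thunks.
(defmacro do-components ((index count &key before after) &body body)
  "Run BODY with INDEX bound from 0 below COUNT.
BEFORE is called once before the loop. AFTER is called once when the
loop exits, including exits caused by THROW, RETURN-FROM or an error."
  (let ((n (gensym "COUNT"))
        (b (gensym "BEFORE"))
        (a (gensym "AFTER")))
    `(let ((,n ,count) (,b ,before) (,a ,after))
       (when ,b (funcall ,b))
       (unwind-protect
            (dotimes (,index ,n) ,@body)
         ;; Cleanup form: always runs on the way out.
         (when ,a (funcall ,a))))))

;; Usage: the body bails out early with THROW, yet :after still runs.
(defun demo-early-exit ()
  (let ((events '()))
    (catch 'stop
      (do-components (i 5 :before (lambda () (push :begin events))
                          :after  (lambda () (push :end events)))
        (when (= i 2) (throw 'stop nil))
        (push i events)))
    (reverse events)))
```

A plain function taking a callback could do the same at runtime; writing it as a macro is what would additionally let the library inspect the component names at expansion time.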
The next extension is instead in the orthogonal direction of the data representation. This is why we should use the CLOS and all its facilities as it provides fine-grained control over all the decisions about data representation.
We could have implemented SQLite storage for our instances and it would have worked the same.
Moreover, the CLOS already provides facilities for class redefinition! You don’t have to figure out how to implement component redefinition if you follow the CLOS specification.
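For readers who have not used this facility, here is a small example with ordinary classes and no ECS storage involved. The class heading and the variable *h* are made up for the illustration; update-instance-for-redefined-class is the standard generic function that CLOS calls on existing instances after their class has been redefined, and its property-list argument carries the values of the slots that were dropped.

```lisp
;; Illustrative names; only the generic function is standard.
(defclass heading ()
  ((degrees :initarg :degrees :accessor heading-degrees)))

(defvar *h* (make-instance 'heading :degrees 180))

;; Migration hook: runs once per existing instance after redefinition.
(defmethod update-instance-for-redefined-class :after
    ((h heading) added-slots discarded-slots property-list &rest initargs)
  (declare (ignore added-slots discarded-slots initargs))
  ;; PROPERTY-LIST maps discarded slot names to their old values.
  (let ((old (getf property-list 'degrees)))
    (when old
      (setf (slot-value h 'radians) (* old (/ pi 180))))))

;; Redefine the class at runtime: DEGREES is dropped, RADIANS is added.
(defclass heading ()
  ((radians :initarg :radians :accessor heading-radians)))

;; *H* was created before the redefinition; it is updated on next access.
(defun current-radians () (heading-radians *h*))
```

In a metaclass-based ECS, a hook of this kind is presumably where the pooled storage would be migrated to the new layout.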
As a bonus to this tour here is something that is missing in the original presentation: arenas.
Arenas are a specialization of this allocation scheme where the user supplies the underlying storage instead of delegating it to the ECS library. For very hot sections of your code, where you create lots of objects only to throw them away immediately, it makes sense to reuse the space or to throw it all out as a block.
This is easily implemented because our approach defines the method make-instance for the classes that inherit from our component base class, which means we can take additional parameters.
This is the holy grail of data-locality: give the user the choice to override your default when they know best.
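A rough sketch of what that could look like follows. It is a portable approximation rather than the approach described above: instead of specializing make-instance, it adds an :arena keyword through an initialize-instance :after method, whose keyword parameters are accepted as initargs by make-instance. All names (make-arena, reset-arena, vec2, *default-pool* and the accessors) are hypothetical.

```lisp
;; Hypothetical sketch of caller-supplied storage.
(defun make-arena (capacity)
  "A caller-owned block of single-floats, preallocated to CAPACITY cells."
  (make-array capacity :element-type 'single-float
                       :adjustable t :fill-pointer 0))

(defun reset-arena (arena)
  "Throw away everything in ARENA at once. Existing handles become stale."
  (setf (fill-pointer arena) 0)
  arena)

(defvar *default-pool* (make-arena 0)
  "Library-owned storage used when the caller passes no :ARENA.")

(defclass vec2 ()
  ((storage :reader vec2-storage)
   (index   :reader vec2-index)))

(defmethod initialize-instance :after ((v vec2) &key arena (x 0.0) (y 0.0))
  ;; The caller's arena wins over the library default.
  (let ((storage (or arena *default-pool*)))
    (setf (slot-value v 'storage) storage
          (slot-value v 'index) (fill-pointer storage))
    (vector-push-extend (float x 1.0) storage)
    (vector-push-extend (float y 1.0) storage)))

(defun vec2-x (v) (aref (vec2-storage v) (vec2-index v)))
(defun vec2-y (v) (aref (vec2-storage v) (1+ (vec2-index v))))
```

A hot loop could then create one arena up front, pass it to every (make-instance 'vec2 :arena arena ...) call inside the loop, and call reset-arena at the end of each iteration to reuse the same memory.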
This CLOS-flavoured approach has drawbacks. Implementing an ECS-style object system in CLOS using metaclasses is doable, but you will have to read the documentation and learn about all the possible interactions. There is a multitude of things to learn about the CLOS implementation, which will make your head spin! It may be too much of a task for exploratory programming, which is exactly why you should not start there.
[1] For the less parenthetically inclined this can also be unreadable. After all it is LISP.