Knowledge Dump Dynamic Memory Management
This page is subject to updates as further decisions are made on the direction of dynamic memory management in Epoch. Check back frequently for the latest developments. Also, be sure to check carefully for status updates and revisions.
For almost any non-trivial program, it eventually becomes necessary to manage memory (and other resources) dynamically - that is, in a way that is not completely predictable at compile time. This document sets forth the initial planning and brainstorming for how dynamic resource management will be realized in 64-bit Epoch.
For contextual background, 32-bit Epoch allocates all structure-typed objects dynamically. Unused memory is reclaimed using a mark-and-sweep garbage collector. While this was a sufficiently powerful approach for implementing the basics of 32-bit Epoch, it has major performance implications and serious drawbacks when it comes to controlling allocation behavior of a running program.
There are a number of factors in 64-bit Epoch that make this a complex challenge.
- Garbage collection should be fully opt-in by the programmer, not a mandatory language feature.
- As a result, dynamic allocation - and deallocation - must be supported explicitly.
- The runtime library should not be a standalone dynamically linked library; anything needed by an Epoch program must be statically linked or available from the OS. (GC support is an acceptable exception to this rule.)
- It is highly desirable to have a feature like C++'s "RAII" model supported by the language.
- Different types of allocation strategy (including, but hardly limited to, stack allocators, freelists, garbage collected heaps, manually reclaimed heaps, pools, pointer-bumping/block-release allocators, and so on) must be supported.
- Insofar as it is technically possible, there must be a way to either interoperate seamlessly between allocation strategies, or hand off responsibility for objects from one memory arena to another.
The remainder of this document is dedicated to maintaining a historical account of the development of Epoch's memory model.
A major motivating factor for the Epoch language was the ease of implementing programs in the presence of a reliable, precise garbage collector. In other words, we are strongly committed to having access to optional GC for the purposes of prototyping, rapid iteration, and even shipping non-performance-sensitive production code. As such, there will be at least some way to write Epoch programs and enjoy the presence of a precise GC.
It is an optional - albeit highly desirable - objective of this research to allow GC code to interoperate cleanly with manual allocation strategies. If this turns out to be infeasible, or if the penalties are too high, we may require that GC be "on" or "off" for the entirety of a single program's execution. However, we'll do our best to avoid that outcome.
Another purely optional objective is to allow resources besides memory (files, sockets, handles to other OS resources, and so on) to reliably interoperate with garbage collection.
Without a garbage collector to fall back on, there must be a way to explicitly control resource lifetimes. Ideally these features would be built in such a way that programmers can still rely on automatic correctness. In other words, it would be best to support truly manual allocation/deallocation, but have most programs rely on a layer above this that provides certain guarantees of correct behavior. In particular, we'd like to avoid exposing Epoch programmers to the headaches of dangling pointers, use-after-free bugs, double deletion, and other similar issues.
For this reason in particular, it is unlikely that a true general-purpose "pointer" will be added to the language. We strongly prefer fixed references: they can refer to dynamically allocated memory, but once bound they can't be redirected to refer to other memory (regardless of allocation method). This avoids a massive class of runtime bugs, including but not limited to buffer-overrun issues.
A major consideration in the implementation of 64-bit Epoch has been the minimization - even outright removal - of dependencies on a runtime library. Ideally, all language facilities could be implemented in terms of OS services, and all that would be needed to run an Epoch program would be the executable image itself.
This is complicated substantially by the desire to keep a GC available for Epoch programs. Indeed we have essentially resigned ourselves to shipping a sidecar library for GC implementations. The only alternative is to statically link the GC implementation with each running binary; this has numerous undesirable consequences, such as inability to address bugs in the GC with a single library update. It's also just plain difficult, since the GC implementation is unlikely to be written in Epoch itself.
Of course, this does leave us with one potential strategy: implementing the GC in Epoch and shipping it as a code module instead of a linked library. This may be a desirable approach; however, it would come at the cost of potentially introducing highly unsafe mechanisms to the core Epoch language. While this is not totally out of the question, we'd like to be exceedingly careful in any approach that affects the "path of least resistance." One of Epoch's pillars is that the easiest code to write should always be "good enough" to ship - and this must not be compromised. If there's a way to write the GC in Epoch without marring the language, we'll probably do that.
For better or worse, there are aspects of C++ that aren't horrible. The RAII philosophy is among them. RAII provides a strong guarantee of correctness since automatic destruction of resources is well-ordered and predictable. We'd like to have something akin to RAII in Epoch, but it remains unclear how best to accomplish that goal. In a perfect world, RAII could live alongside GC memory pools and other custom allocation strategies. As before, though, we'd really prefer not to make the language too error-prone in the face of such features.
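As a point of reference for the "well-ordered and predictable" guarantee, here is a minimal C++ sketch of RAII. It is not Epoch code and does not imply any particular Epoch design; the `Tracked` type and `run_scope` function are hypothetical names introduced only for illustration. The sketch records each acquisition and release in a log so the ordering can be observed.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical resource wrapper: the constructor acquires, the destructor
// releases. Both are recorded in a shared log so the order is observable.
class Tracked {
public:
    Tracked(std::string name, std::vector<std::string>& log)
        : name_(std::move(name)), log_(log) {
        log_.push_back("acquire " + name_);
    }

    ~Tracked() { log_.push_back("release " + name_); }

    // Non-copyable: each resource has exactly one owner.
    Tracked(const Tracked&) = delete;
    Tracked& operator=(const Tracked&) = delete;

private:
    std::string name_;
    std::vector<std::string>& log_;
};

// Acquires two resources inside a nested scope and returns the event log.
std::vector<std::string> run_scope() {
    std::vector<std::string> log;
    {
        Tracked file("file", log);
        Tracked socket("socket", log);
        log.push_back("work");
    }  // socket is released first, then file: reverse order of construction
    return log;
}
```

The log produced by `run_scope()` shows both resources being released in reverse order of acquisition, with no explicit cleanup call in the body of the scope.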
The design and development of Epoch have been heavily informed by the author's experiences with game development. For fields that have near-realtime performance requirements and strong limitations on memory usage, a language needs to be flexible with how resources are allocated and released. There are many different strategies for allocating memory. In order to consider Epoch a success, we'd like to support those strategies insofar as they can be expressed as Epoch programs. (In a perfect world we'd allow for arbitrary memory management schemes while retaining a degree of safety and correctness. It remains to be seen how that balance plays out.)
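To make one of these strategies concrete, the following is a minimal C++ sketch of a pointer-bumping/block-release allocator of the kind often used for per-frame data in games. It is an illustration under stated assumptions, not a proposed Epoch facility: the `BumpArena` class and its member names are hypothetical, alignment is assumed to be a power of two, and objects placed in the arena are assumed not to need destructors.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical pointer-bumping arena. Allocation only advances an offset;
// memory is reclaimed all at once by reset(), never object by object.
class BumpArena {
public:
    explicit BumpArena(std::size_t capacity) : buffer_(capacity), offset_(0) {}

    // Returns storage aligned to `align` (assumed to be a power of two),
    // or nullptr if the arena does not have enough room left.
    void* allocate(std::size_t size,
                   std::size_t align = alignof(std::max_align_t)) {
        const std::uintptr_t current =
            reinterpret_cast<std::uintptr_t>(buffer_.data()) + offset_;
        const std::uintptr_t mask = static_cast<std::uintptr_t>(align) - 1;
        const std::uintptr_t aligned = (current + mask) & ~mask;
        const std::size_t padding = static_cast<std::size_t>(aligned - current);
        const std::size_t remaining = buffer_.size() - offset_;
        if (padding > remaining || size > remaining - padding) {
            return nullptr;  // out of space; the offset is left unchanged
        }
        offset_ += padding + size;
        return buffer_.data() + (offset_ - size);
    }

    // Releases every allocation at once by rewinding the offset.
    void reset() { offset_ = 0; }

    std::size_t used() const { return offset_; }

private:
    std::vector<unsigned char> buffer_;
    std::size_t offset_;
};
```

The trade-off is typical of this family of allocators: allocation is a few arithmetic operations, but nothing can be freed individually, so any reference into the arena becomes invalid after `reset()`. That hazard is exactly the kind of issue a safe language-level design would need to rule out.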
Supporting multiple allocation strategies is all well and good, but in order for real software to work, memory references between strategies must be supported. For example, it must be possible to allocate a GC-controlled object that has references to a manually managed memory heap, and vice versa. Restricting this too much would create an undue burden on the programmer to segregate all resources perfectly. Moreover, such restriction also limits the kinds of programs that can be written in Epoch.
// Define a structure of some interesting data
// Note that it has a reference count field
structure Payload :
    string Text,
    integer RefCount

// This will be our "smart pointer" object type
structure<type T> SmartRef :
    T ref Target

// Define an acquisition strategy for the SmartRef template type
acquire<type T> SmartRef<T> ref this :
{
    AtomicIncrement(this.Target.RefCount)
}

// Define a release strategy for the SmartRef
release<type T> SmartRef<T> ref this :
{
    if(AtomicDecrement(this.Target.RefCount) == 0)
    {
        delete(this.Target)
    }
}

// Define some helpers for the SmartRef
with<type T> SmartRef<T> ref this :
{
    PeekRefCount : -> this.Target.RefCount
}

// Define some messages to operate on the payload
with Payload ref this :
{
    Print : { print(this.Text) }
}

// Now use it!
entrypoint :
{
    // Create a smart reference to a newly allocated object
    // The object will drop its reference count and be freed
    // when this reference goes out of scope, RAII style.
    SmartRef<Payload> obj = new Payload("Test", 0)

    // While the object is alive, we can send messages to it.
    obj.Target => Print()

    // We can also send messages to our smart pointer object!
    print(cast<string>(obj => PeekRefCount()))
}
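
For readers more familiar with C++, the Epoch sample above corresponds roughly to an intrusive reference-counted handle. The sketch below is a hedged analogue, not a translation endorsed by the Epoch design: the C++ names (`target`, `peek_ref_count`, and so on) are assumptions chosen to mirror the sample, the constructor and copy constructor play the role of `acquire`, and the destructor plays the role of `release`. One difference worth noting is that `fetch_sub` returns the value held before the decrement, so the check is against 1 rather than 0.

```cpp
#include <atomic>
#include <string>
#include <utility>

// Payload carries its own reference count, as in the Epoch sample.
struct Payload {
    explicit Payload(std::string t) : text(std::move(t)) {}

    std::string text;
    std::atomic<int> ref_count{0};
};

// Hypothetical C++ counterpart of SmartRef<T>. T must expose an atomic
// integer member named ref_count and must have been allocated with new.
template <typename T>
class SmartRef {
public:
    explicit SmartRef(T* target) : target_(target) { acquire(); }
    SmartRef(const SmartRef& other) : target_(other.target_) { acquire(); }
    ~SmartRef() { release(); }

    // Assignment is disabled so the handle cannot be redirected to another
    // object after it has been bound.
    SmartRef& operator=(const SmartRef&) = delete;

    T& target() const { return *target_; }
    int peek_ref_count() const { return target_->ref_count.load(); }

private:
    void acquire() { target_->ref_count.fetch_add(1); }

    void release() {
        // fetch_sub returns the previous value: 1 means this was the last
        // outstanding reference, so the object can be destroyed.
        if (target_->ref_count.fetch_sub(1) == 1) {
            delete target_;
        }
    }

    T* target_;
};
```

A handle created with `SmartRef<Payload> obj(new Payload("Test"))` reports a count of 1, copies raise the count, and the payload is deleted when the last handle goes out of scope.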