Mercerenies - Blog [RAII is Just Continuations]

RAII is Just Continuations

Posted June 8, 2019

This post is going to be a bit heavier on the theory. If you're not familiar with monads in Haskell, I recommend Learn You a Haskell. If you haven't used the continuation monad, I recommend this article from A Neighborhood of Infinity. That's about all you should need for this post. We'll mention LLVM a bit for the sake of example, but no in-depth knowledge is involved there.

Today, I found myself writing this chunk of Haskell code for a bit of llvm-hs interaction.


handleAndOutput :: FilePath -> FilePath -> IO ()
handleAndOutput inp outp =
    handleFile inp >>= \case
      Nothing -> pure ()
      Just res -> withContext $ \ctx -> do
        withHostTargetMachine $ \tm -> do
          withModuleFromAST ctx res $ \res' -> do
            writeObjectToFile tm (File outp) res'

The important part for our discussion is the bottom four lines. I'm taking a complicated data structure from llvm-hs-pure and compiling it down to an object file. In the process, I need to borrow several "non-pure" data structures, in the sense that these are data structures controlled by a C++ library and therefore are not managed by the Haskell runtime. We see this sort of pattern a lot in Haskell when interacting with foreign code. The relevant type signatures are


withContext :: (Context -> IO a) -> IO a
withHostTargetMachine :: (TargetMachine -> IO a) -> IO a
withModuleFromAST :: Context -> Module -> (Module -> IO a) -> IO a

So, once we've applied the first two arguments to withModuleFromAST, the pattern seems to be


withSomething :: (thing -> IO a) -> IO a

Each of these has the same behavior: open or allocate some resource, run the function argument, then close the resource. Like I said, we see this pattern a lot when interfacing with things outside the Haskell runtime. System.IO provides withFile, which opens a file, runs some code, and closes the file at the end. Foreign.C.String provides withCString, which marshals a Haskell string into a C string and frees the memory at the end.
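To make the open/run/close shape concrete, here is a minimal sketch of how such a function could be written on top of bracket from Control.Exception. The Resource type, withResource, and demo are all hypothetical names made up for illustration; the "resource" is just a name, and acquisition and release only append to a log.

```haskell
import Control.Exception (bracket)
import Data.IORef (IORef, modifyIORef, newIORef, readIORef)

-- Hypothetical resource: nothing but a name.
newtype Resource = Resource { resName :: String }

-- Acquire the resource, run the callback, then release the resource.
-- bracket runs the release step even if the callback throws.
withResource :: IORef [String] -> String -> (Resource -> IO a) -> IO a
withResource logRef name = bracket acquire release
  where
    acquire = do
      modifyIORef logRef (("open " ++ name) :)
      pure (Resource name)
    release _ = modifyIORef logRef (("close " ++ name) :)

-- Use one resource and return the events in the order they happened.
demo :: IO [String]
demo = do
  logRef <- newIORef []
  withResource logRef "a" $ \r ->
    modifyIORef logRef (("use " ++ resName r) :)
  reverse <$> readIORef logRef
```

Running demo yields ["open a", "use a", "close a"]: the callback is sandwiched between the acquire and release steps, which is all any of these with-functions really promise.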

This pattern isn't even unique to Haskell. Python's with statements follow the same pattern: run some entry code, do some stuff, then run some exit code. But both of these constructs suffer from a similar problem, which my first code snippet exhibits. If I need to borrow several resources in succession (a task not uncommon when interacting with foreign code), then I'm going to naturally end up nested several layers deep.

Now, C++ and Rust actually have a solution to this nesting problem. In C++ and Rust, we can simply make local variables which allocate resources and trust that deallocation will happen as soon as the local goes out of scope. This pattern is called Resource Acquisition Is Initialization, or RAII for short. The C++ code equivalent to my Haskell code from before might look something like this.


// This is example code; this is NOT compatible with the LLVM C++ library
void handle_and_output(const std::string& in, const std::string& out) {
  auto res = handle_file(in);
  // Each local acquires its resource in its constructor.
  Context ctx {};
  TargetMachine tm { get_host_target_machine() };
  Module m = module_from_ast(ctx, res);
  write_object_to_file(tm, out, m);
  // Destructors run here, in reverse order of declaration.
}

Compare that to what we started with in Haskell.


handleAndOutput :: FilePath -> FilePath -> IO ()
handleAndOutput inp outp =
    handleFile inp >>= \case
      Nothing -> pure ()
      Just res -> withContext $ \ctx -> do
        withHostTargetMachine $ \tm -> do
          withModuleFromAST ctx res $ \res' -> do
            writeObjectToFile tm (File outp) res'

Note how, in the C++ example, I simply allocate the resources and trust that they'll go out of scope at the end. There's no increase in nesting, and we're not creeping over to the right side of the screen. It sure would be nice if we could do this in Haskell.


-- Not going to compile yet, obviously.
handleAndOutputCont :: FilePath -> FilePath -> IO ()
handleAndOutputCont inp outp =
    handleFile inp >>= \case
      Nothing -> pure ()
      Just res -> block $ do
        ctx <- withContext
        tm <- withHostTargetMachine
        res' <- withModuleFromAST ctx res
        liftIO $ writeObjectToFile tm (File outp) res'

In our totally hypothetical example here, block is a function that "protects" the outer scope and frees any resources allocated within it. We wrap writeObjectToFile in a liftIO because, more than likely, we'll be writing our own home-baked monad to do all this.

But... will we? Is there a monad that already does what we want to do? If you've read the title of this post, you already know the answer, but let's pretend you didn't and take another look at that type signature.


withSomething :: (thing -> IO a) -> IO a

Aha! That looks like the continuation monad.


newtype ContT r m a = ContT { runContT :: (a -> m r) -> m r }

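Pick m = IO, rename r to a, and let the value type be thing: the field of ContT then has exactly the type (thing -> IO a) -> IO a. In fact, writing such a wrapper is almost laughably simple.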


withSomething :: (thing -> IO a) -> IO a
withSomething = -- ... Some black magic

withSomethingCont :: ContT a IO thing
withSomethingCont = ContT withSomething

Now, we need that block function from before. That, too, is surprisingly simple. When we want to free all of our resources, we just... run the continuation and unpack that layer of the monad stack.


block :: Monad m => ContT r m r -> m r
block f = runContT f pure

Of course, you could just use runContT directly. But when the block already returns the value you want, it's simpler and arguably more readable to hide the runContT behind a shorter name (in our case, since we're returning (), the point is moot anyway).
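As a rough illustration of what block buys us, the sketch below acquires two made-up resources in a flat do block and records the order of events. The names withLogged and demo are hypothetical, and withLogged deliberately skips exception handling to stay short. As an aside, the transformers package exports evalContT from Control.Monad.Trans.Cont with the same type and definition as block, so in practice you may not need to define it yourself.

```haskell
import Control.Monad.IO.Class (liftIO)
import Control.Monad.Trans.Cont (ContT (..))
import Data.IORef (IORef, modifyIORef, newIORef, readIORef)

-- Hypothetical with-style function: log the acquisition, run the
-- callback, log the release. No exception safety, for brevity.
withLogged :: IORef [String] -> String -> (String -> IO a) -> IO a
withLogged logRef name body = do
  modifyIORef logRef (("acquire " ++ name) :)
  result <- body name
  modifyIORef logRef (("release " ++ name) :)
  pure result

-- Same definition as in the post.
block :: Monad m => ContT r m r -> m r
block f = runContT f pure

-- Acquire two resources without nesting, then return the event log
-- in the order the events happened.
demo :: IO [String]
demo = do
  logRef <- newIORef []
  block $ do
    a <- ContT (withLogged logRef "a")
    b <- ContT (withLogged logRef "b")
    liftIO $ modifyIORef logRef (("use " ++ a ++ " and " ++ b) :)
  reverse <$> readIORef logRef
```

Here demo returns ["acquire a", "acquire b", "use a and b", "release b", "release a"]. The releases happen in reverse order of acquisition when the block ends, just like destructors at the end of a C++ scope.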

And that's... about all there is to it. Now the code I showed above, where we borrow all of the resources in a monad and release them at the end of the block, works as intended. As long as we're inside a block, we can be assured that every resource acquired in it will be freed at the end. You can also nest blocks (though you'll end up with multiple continuations on your monad stack, which may be annoying if for some reason you have to write an explicit type signature for something inside the block), and since each block is its own continuation monad, continuation tricks like callCC won't be allowed to escape the block, so you can't bypass the deallocation.
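Here is one possible way to nest blocks, sketched with made-up resources that only write to a log. Note the assumption: instead of stacking a second ContT layer, this version runs the inner block all the way down to IO and lifts the result into the outer block, which keeps the types simple. The names say, withLogged, and demo are hypothetical.

```haskell
import Control.Monad.IO.Class (MonadIO, liftIO)
import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.Cont (ContT (..))
import Data.IORef (IORef, modifyIORef, newIORef, readIORef)

-- Append a message to the shared log from any MonadIO.
say :: MonadIO m => IORef [String] -> String -> m ()
say logRef msg = liftIO (modifyIORef logRef (msg :))

-- Hypothetical resource, already wrapped in ContT.
withLogged :: IORef [String] -> String -> ContT r IO String
withLogged logRef name = ContT $ \body -> do
  say logRef ("acquire " ++ name)
  result <- body name
  say logRef ("release " ++ name)
  pure result

-- Same definition as in the post.
block :: Monad m => ContT r m r -> m r
block f = runContT f pure

demo :: IO [String]
demo = do
  logRef <- newIORef []
  block $ do
    outer <- withLogged logRef "outer"
    -- The inner block is run to completion in IO, so its resource
    -- is released before the outer block continues.
    lift $ block $ do
      inner <- withLogged logRef "inner"
      say logRef ("use " ++ outer ++ " and " ++ inner)
    say logRef ("use " ++ outer ++ " only")
  reverse <$> readIORef logRef
```

The resulting log is ["acquire outer", "acquire inner", "use outer and inner", "release inner", "use outer only", "release outer"], so the inner resource lives exactly as long as the inner block.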

So all of those withSomething functions you see in Haskell are really just values in the continuation monad. If you ever need to use several in rapid succession, consider operating inside the ContT monad and running the continuation at the end to free all the resources.
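As a small real-world sketch of that advice, here is withFile from System.IO wrapped in ContT and used to hold two file handles at once without nesting. The names withFileCont and copyFirstLine are made up for this example.

```haskell
import Control.Monad.IO.Class (liftIO)
import Control.Monad.Trans.Cont (ContT (..))
import System.IO (Handle, IOMode (..), hGetLine, hPutStrLn, withFile)

-- withFile path mode :: (Handle -> IO r) -> IO r, which is exactly
-- the shape that the ContT constructor wraps.
withFileCont :: FilePath -> IOMode -> ContT r IO Handle
withFileCont path mode = ContT (withFile path mode)

-- Copy the first line of src into dst. Both handles are closed once
-- the continuation has been run.
copyFirstLine :: FilePath -> FilePath -> IO ()
copyFirstLine src dst = flip runContT pure $ do
  hIn <- withFileCont src ReadMode
  hOut <- withFileCont dst WriteMode
  liftIO $ hGetLine hIn >>= hPutStrLn hOut
```

Because withFile itself closes the handle when its callback returns, both files are closed by the time copyFirstLine finishes, and adding a third or fourth resource costs one more line rather than one more level of indentation.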
