Two Great Tastes that Taste Great Together
Monads are, in a way, all about sequences of operations which depend on data living "inside" a data structure and produce instances of that data structure with new data inside. For instance, the list monad sequences operations which depend on things inside a list, and which produce lists themselves. The state monad sequences operations which depend, in a sense, on data living inside a state function and result in new state functions. The "maybe" monad sequences operations depending on values which produce "maybe" values (like (Just x) or (Nothing)).

Think about this enough and you start to look for monads whenever you deal with any kind of data structure. Recently, I've thought a lot about hygienic macros, of which the key component is "syntax", a kind of decorated representation of source code. If you write simple macros that don't do anything like variable capture, you basically don't have to worry about the distinction between syntax and datum, but when you start selectively messing up hygiene, it becomes important.
And you start to notice patterns. Specifically, you are often "unpacking" a piece of syntax, doing something to it, and then "resyntaxing" it when you are done. Can we conceptualize syntax as a monad? If we do so, do we get anything? It is useful? I suspect for many seasoned readers the first is obvious: any data structure which "holds things" has an associated monad. We'll see about the second - I actually don't know myself, and expect to find out as I write this post.
Better Monads
We'll be operating in my Scheme of Choice, Racket and we'll use my
monad library better-monads
. To define a monad in better-monads
you simply create an instance of the monad
struct, defined as:
(struct monad (bind return plus zero) #:transparent)
Plus
and zero
are optional, and you set them to #f
if you can't
provide meaningful values. That is:
(require functional/better-monads)
(define the-id-monad
(monad (lambda (v f) (f v))
(lambda (v) v)
#f #f))
Creates the identity monad. We can then use it in an mlet*
expression.
(mlet* in: the-id-monad
((x 10)
(y (+ x 11)))
(return (+ x y)))
Which will be 31, because the identity monad doesn't do anything at all. The library defines and exports some common monads:
(mlet* in: the-list-monad
((x '(a b c))
(y '(q r s)))
(return (list x y)))
Results in:
'((a q) (a r) (a s) (b q) (b r) (b s) (c q) (c r) (c s))
If you aren't down with monads, read my tutorials on using them in Lisps. I'm told they are helpful.
The Syntax Monad
The easiest place to start with a monad is with return
, since return
is our simplest monadic function. That is, there is a family of
functions which accept values and produce monadic values and return is
often one of the simplest of these functions. What is return in the
syntax monad?
The super naive but technically correct answer is:
(define (syntax-return item)
(datum->syntax (syntax _) item))
This takes an item, any kind of datum, and wraps it into a
syntax-object
with datum->syntax
. Datum->syntax
requires,
however, a "seed" syntax object from which to copy the syntactic
information that helps maintain (or violate) hygiene. This version of
return
just creates a new syntax
object with (syntax _)
as the
seed, but this isn't very useful. We are almost always trying to
create a syntax object which borrows someone else's scope information,
say to capture a variable.
If this were Haskell, we might have to live with this sub-optimal
solution (you don't really need return
as far as I can tell, its
just a handy way to put things into the monad). However, we're using
Scheme and that means we can take advantage of our greater flexibility
to write this version:
(define (syntax-return item . rest)
(match rest
[(list) (syntax-return item (syntax _))]
[(list seed)
(datum->syntax seed item)]))
This function takes an optional second argument which can specify the
seed if need be. better-monads
doesn't care about the fact that
this return has a different signature than other returns. All that
matters is that everything works out at runtime.
Bind
is often a sticking point but in the syntax monad it is very
easy to understand:
(define (syntax-bind stx stx<datum>)
(stx<datum> (syntax->datum stx)))
Here we use a little notational hint I use in my code. The <datum>
part of the symbol stx<datum>
indicates that stx<datum>
is a
function waiting for a single value, the datum
, to produce
something, namely a piece of syntax (hence the stx
). The value
stx
(without any <>
) is a syntax object. Bind
simply extracts
the value in the syntax using syntax->datum
and calls the function
stx<datum>
on that value. The rules of our monad dictate that the
result must be some syntax, but this (again) is enforced at run-time,
not by any type-checker.
We can define our monad in two ways, depending on the behavior of
return
we want. The first is:
(require functional/point-free)
(define (the-syntax-monad<stx> stx)
(monad syntax-bind (partial< syntax-return stx) #f #f))
This monad is parametrized by the syntax we want to use as the seed
for return
operations. We accomplish this by partially applying
syntax-return
on the right with the value stx
passed in during the
monad creation. partial<
means "partially apply on the right" (as
opposed to >partial
, which means "on the left") and lives in my
point-free
library.
As I understand it, this kind of "value parametrized monad" is difficult to represent in Haskell. I've read that the "set monad" is hard to come up with for just this reason. In scheme, you just parametrize the set monad by whatever predicate you want to constitute equality.
We can also write a regular old monad:
(define the-syntax-monad (monad syntax-bind syntax-return #f #f))
Here we just expect the user to provide an appropriate seed
syntax-object when they call return
. I am almost certainly sure you
can't do this in Haskell without jumping through big hoops. Not
really a Haskell programmer, though, so its possible I'm wrong.
But Will it Blend?
Ok, so this is cute. We've got a monad over syntax
but can we use
it to do anything? Well, one application that springs to mind is
that we can use it to defined syntax-oriented versions of standard
list functions:
(define (syntax-car stx)
(mlet* in: (the-syntax-monad<stx> stx)
((contents stx))
(return (car contents))))
Or, using the more convenient notion of lifting a function into a monad:
(define (syntax-car stx)
((lift1 car (the-syntax-monad<stx> stx))
stx))
For functions of two arguments, this looks like:
(define (syntax-cons hd tl)
((lift2 cons (the-syntax-monad<stx> tl))
hd tl))
We can use these functions like so:
(syntax-car (syntax a b c d)) ;-> (syntax a)
or
(syntax-cons (syntax a) (syntax b c d)) ;->
(syntax a b c d)
In the second case, though, its important to realize that the syntax
a
has its syntactic metadata replaced with the meta-data for the
tail, that is (syntax b c d )
. If we want a
to keep its own
meta-data, we should write:
(define (syntax-cons hd tl)
(mlet* in: the-syntax-monad
((tl-contents tl))
(return (cons a tl-contents) tl)))
We could define this in a point free way (if we wanted to show off), as such:
(define (syntax-cons hd tl)
((lift1 (>partial cons hd)
(the-syntax-monad<stx> tl))
tl))
There are other various and sundry things which it might be easier to do/define using the syntax monad, but without using it a great deal, I'll just say that it seems to blend ok.
Conclusions
We've shown how to define and use a syntax
monad for Scheme syntax
objects. The essence of the implementation is the realization that
syntax, as a kind of wrapper over regular Scheme data, is enough like
a container that we can write a monad over it, much like we can create
a list monad over lists.
The utility of the monad seems to be limited by the fact that syntax containers aren't just dumb boxes around pieces of data, but contain important meta-data which is intimately related to their intended use. It isn't clear to me whether, given this fact, really using the syntax monad would ever be a win, but who knows? At the very least it is an interesting example of what I guess you could call a monad parametrized over a run time value, which is an interesting idiom, anyway.
All this code is up on my git repository.
Add a comment