My recent release of
Gazelle, a Javascript Lisp dialect, has,
unsurprisingly, produced some commentary about my choice to implement
the system in Emacs Lisp. Coincidentally, this also comes on the tail
of a long (and mostly civil, I might add) discussion I had in the
Lisp sub-reddit about the merits of
including in our rhetorical ecosystem discussion about unusual Lisp
dialects like
Newlisp and
Picolisp, in which I argued
that its useful to discuss and even program in these dialects. Many
people express strong opinions to the contrary, and while some of the
objections relate to aspects of Newlisp's implementation, like
One
Reference Only Memory
Management, to which,
as a dabbler in language implementation, I cannot speak, a lot of the
distaste for these dialects, and for Emacs Lisp, seems to relate to
their use of
dynamic
scope.
Now, in general, I like my Lisps static. A lot of what I fiddle with
in Lisp has to do with emulating features I like in Haskell (poorly),
and Haskell hews pretty close to orthodoxy on the scope issue. My
favorite Lisp is probably
Racket, and the
only support it provides for dynamic-like bindings is through
parameters.
In short, I am
no fan of dynamic binding, and was delighted as
anyone when Emacs Lisp began to support lexical binding natively, and
would be happy to program without it in any language.
However, I do not view dynamic scope as the
kiss of
death like many people
seem to,
especially when programming in Lisp dialects. Of which
more later. I'd like to make the case that perhaps, just maybe, we
should think of default dynamic scope in a Lisp as a wart, rather than
language nullifying feature.
Why do I feel this way?
Well, it has something to do with
Concatenative
Languages.
What?
Consider the following diagram:
------------------------------------------------------
dynamicish lexicish
+--------+ +--------+ +-----------+ +--------+
|factor | |picolisp| |common lisp| |haskell |
| | | | | | | |
+--------+ +--------+ +-----------+ +--------+
+--------+ +---------+ +-------+
|joy | |newlisp | |scheme |
| | | | | |
+--------+ +---------+ +-------+
Where the left hand side of each box indicates the position of the
language inside on a sloppily conceived of axis going from
"dynamicish" to "lexicish", ideas which we will try to pin down as we
go. (If, by the way, any actual computer scientists are reading this,
I apologize in advance for the conceptual butchery I am no doubt in
the process of committing.)
One thing this diagram reflects is the intuitive degree to which we
can depend on the nature of binding in the language - Picolisp and
Newlisp here obviously favor dynamic binding, Common Lisp supports
both well, Scheme favors lexical bindings somewhat more than Common
Lisp, and a language like Haskell is pretty much straight lexical
binding. (Exceptions abound of course, given the power of these
languages to add features.)
What is interesting is that if we change the above diagram by relabling
the axes:
------------------------------------------------------
code is simple data code is not simple data
+--------+ +--------+ +-----------+ +--------+
|factor | |picolisp| |common lisp| |haskell |
| | | | | | | |
+--------+ +--------+ +-----------+ +--------+
+--------+ +---------+ +-------+
|joy | |newlisp | |scheme |
| | | | | |
+--------+ +---------+ +-------+
We don't really need to move the boxes around very much. This is
particularly true if we make ourselves clearer and say that code is
data when the "official" result of evaluating a piece of "code data"
that represents a piece of code is encodable in that data in a simple
form. Examples will clarify that horrible monster of a sentence,
hopefully.
Concatenative Languages and Quotations
Put aside for the moment the oddness of concatenative languages, and
questions which may be appearing in your mind about why anyone would
want to program in one. The point here isn't to convince you of their
merits. The point is that concatenative languages (except
Forth), as
it lacks quotations) provide an example of languages where "code is
data" is particularly true, and where that data is particularly simple
while still retaining a consistent meaning.
Concatenative languages are based on the composition of functions
which map one stack onto another stack (usually). Because all
functions in such languages have the same interface, one writes a
concatenative program as a
list of functions, as in this piece of
Factor code:
1 1 +
Which evaluates to "2" and is read from left to right as "take the
stack and return it with 1 pushed onto it", do that again, and then
"take the stack, return a new stack with the result of the first two
items removed and added together". Note the utter absence of variable
bindings (functions do have names, of course). This (arguably
punishing) simplicity makes evaluation unusually simple: begin at the
left with an empty stack, and call each "word" on the stack until you
run out of functions. We can represent a program as a
flat list of
lexemes, each of which denotes a function. So the representation of
code can be a
list and its meaning is simple.
Here is what is interesting about these languages to me: they lack a
notion of variable binding, and so you might think they they lack
anonymous functions, which they do, strictly speaking. However,
because code is easily represented as a list,
lists serve the same
role as anonymous functions and in languages like Factor, these
fragments of code (called quotations because they are introduced via
something like
quote in Lisps) are constantly exploited to perform
the equivalent of "functional programming" idioms in Factor. For
instance, to add one to each number in a list of numbers, one writes
in Factor:
{ 1 2 3 } [ 1 + ] map
Or, less succinctly, but more evocatively:
{ 1 2 3 } 1 [ + ] curry map
Where I have just shown that one can
curry, really, partially apply,
and
compose (not illustrated) quotations.
The question one must ask is what do we really need for functional
programming? We must be able to apply functions, compose them, and
partially apply them.
Many, but not all, uses of lexical scope in
more lexical languages reduce transparently to these operations. We
might write in Scheme:
(map (curry + 1) (list 1 2 3))
For instance, with (a simple) curry being implemented as:
(define (curry f a)
(lambda (b) (f a b)))
Where we are exploiting the lexical scope introduced by curry to stick
two values
into the
lambda.
In a dynamic Lisp, we might say instead, as I often say in Emacs Lisp
when I don't want to engage the machinery of the old
lexical-let
implementation in
cl.el:
(defun curry (f a)
`(lambda (b) (funcall #',f ,a)))
Code is Sort of Data
Factor uses a special data type called a
quotation for its, um,
quotations, but this has more to do with type safety and compiltion
than with the semantics of what a quotation is. It is still basically
a list, so let us imagine that the implementation of the
curry word
is:
: curry ( item quotation -- quotations ) cons ;
That is, effectively,
curry is
cons. It adds an element to the
front of a list.
Now, the dynamic, Emacs Lisp version
(defun curry (f a)
`(lambda (b) (funcall #',f ,a))) ; thanks to coroi on reddit
; for pointing out an error
; here.
Has something in common with the Factor version. They both operate on
and return a simple representation of code, a list. The extra
complexity of Emacs Lisp lies in its evaluator. For instance:
(funcall (curry #'+ 1) 1)
is evaluated in the following way: the call to curry is evaluated,
returning a list.
1 evaluates to 1, of course, and then
funcall
does the following. The evaluator can see that the first argument is
a lambda because its head is
lambda. That means that it extends the
dynamic environment with a binding of 1 to
b, and then recursively
evaluates the body. That means the evaluator is carrying around an
environment which it needs to resolve the meaning of code. Still, its
pretty simple: just (conceptually) a list of stacks for each possible
variable binding with the latest binding on the tops.
The Scheme code, in contrast:
(define (curry f a)
(lambda (b) (f a b)))
Doesn't deal with any simple data structure.
f is a procedure, and
its representation is complex, hidden and has a lot to do with
compiling the Scheme code, probably. The return value, as the result
of a
lambda expression, is also a procedure, and similarly
inscrutible. The compiler probably does some efficient fiddling to
encode in the lambda itself where the references
f and
a point to:
in all likelihood, the compiled code removes the mention of specific
symbols entirely. Even though the
lambda represents a piece of code
in some sense, as it is executable, its representation is now quite
complicated. We can still say "Code is data" of Scheme lambda
expressions, but it is not very transparent or interesting data.
What about Quotations?
In Factor, quotations and lambdas are the same thing, because there
are no symbols to bind in any scopes. Higher order functions and
macros are basically the same!
In Emacs Lisp quotations are pretty close to
lambdas too: if they
happen to have as their
head a
lambda, then they are, subject to
some constraints, actually lambda expressions. If not, you can still
call
eval on them and if all of their free symbols are located in
the dynamic environment that you call eval from, then you get a
"meaningful" result. In picolisp, where this philosophy is entirely
embraced, one can write higher order functions that do the same thing
as macros because they can inspect code passed in.
In Scheme, a quotation introduces
syntax, primarily because you want
to use quotation during macro construction and hygienic macros need to
know more than just the list-structure of the code they work on to
ensure hygiene. So already the representation of code in just a
quotation is more complicated. This is the primary reason that Scheme
is "more staticy/lexical" than Common Lisp, in which a quotation
really does produce just a regular list. Scheme wants to ensure macro
hygeine, and for that reason static or lexical concerns must even
enter into your primary code represantation. In Common Lisp there is
a macro system which is distinct from the function system, but allows
you to use regular list operations to build the generated code. In Scheme the macro system is somewhat more cumbersome conceptually, with
special operations on "syntax" objects to enforce lexical hygeine.
Complexity has crawled deep into the language.
Finally, in a language like Haskell, you don't really have quotations
at all. The representation of any kind of code has become fairly
complex and generally not worth fiddling with (template metahaskell aside).
What does this mean?
The point of the above discussion is twofold:
- Languages lie on a continuum between dynamic and static and that
positions on that continuum have complexity and flexibility payoffs and
penalties. The very dynamic languages pick up a lot of flexibility
(which is good and bad) and the static languages pick up a lot of
complexity, but get certain conceptual guarantees and speed in
exchange.
- Especially in Lisps, which provide facilities for
meta-programming, this distinction isn't extremely important. Heck,
even Factor has an implementation of
lambda with lexically scoped
variables that compiles down to stack code. Emacs Lisp has a
lexical-let form that gets you 99% of what you want. Specific
features of closures that you need at any given moment can easily be
built in dynamic lisps. If you want to program functionally, for
instance, you can whip up good curry and compose operations in any
of these languages and away you go.
I can't find the reference at the moment, but I recall reading from
one of the Factor maintainer's that they use lexical variables in less
than ten percent of the Factor code base. I suspect if you examine
any large Lisp project you will find that for the most part the actual
use of the semantics of lexical scope is small.
There is
absolutely an argument to be made that dynamic scope
introduces certain kinds of hard to find bugs, and I agree with that
assessment. I don't think dynamic scope is optimal by any means, but
the case against it is somewhat overstated. A dynamically scoped Lisp
is still extremely powerful and in some ways more flexible than an
efficiently compiled lexical lisp.
Actual Code
I work hard on making my Emacs Lisp code readable and clear, and so I
wanted to show a bit of the implementation of
Gazelle here, to drive
the point home that in a Lisp the exact feature set under the hood
doesn't make as big of a difference as it might seem.
(defun-match- prim:transcode ('_false)
(prim:insert "false"))
(defun-match prim:transcode ('_true)
(prim:insert "true"))
(defun-match prim:transcode ('_null)
(prim:insert "null"))
(defun-match prim:transcode ('_undefined)
(prim:insert "undefined"))
(defun-match prim:transcode ((non-kw-symbol s))
(prim:insert (prim:mangle s)))
(defun-match prim:transcode ((keyword k))
(prim:insert "\"")
(prim:insert (prim:mangle (prim:kw->symbol k)))
(prim:insert "\""))
(defun-match prim:transcode ((number n))
(prim:insertf "%s" n))
(defun prim:transcode-string (string)
(prim:insert "\"")
(loop for character in (coerce string 'list) do
(match character
(?\n (insert "\\n"))
(?\t (insert "\\t"))
(?\" (insert "\\\""))
(?\\ (insert "\\"))
(else (insert else))))
(prim:insert "\""))
(defun-match prim:transcode ((string s))
(prim:transcode-string s))
(defun-match prim:transcode ((list '_var (non-kw-symbol s) expr))
(prim:insert "var ")
(prim:transcode s)
(prim:insert " = ")
(prim:transcode expr))
(defun-match prim:transcode
((list '_return
(list (and which
(or '_for '_while '_try '_var '_=)) (tail body))))
(recur `(,which ,@body)))
(defun-match prim:transcode
((list '_return
(list
(and which (or '_throw '_continue '_break)) expression)))
(recur `(,which ,expression)))
(defun-match prim:transcode ((list '_return expression))
(prim:insert "return ")
(prim:in-parens
(prim:transcode expression)))
This is a small excerpt from the
prim module, which converts a
primitive representation of Javascript in s-expressions to Javascript
source code. That transcoder consists of one large function
definition, split up into several invocations of
defun-match (the
initial
defun-match- clears previous function bindings to the
provided function name), each of which matches its pattern against the
input and executes when the match succeeds. Matches are tried in the
order in which the
defun-match forms are compiled, in most cases
from top to bottom.
defun-match allows you to tail-recur to the same
function without growing the stack, so in many cases (two shown
above),
prim:transcode calls itself to perform the actual
transcoding.
I can only ultimately speak for myself, but I find that this code is
clear, readable (once you get used to the pattern matching) and self
documenting. If you can write code like this in Emacs Lisp, is it
really such a crime to do so?
And given that by really thinking about the different ways to program
in different Lisp and Lisp-like languages we can learn a great deal
about programming, even in our favorite dialects, is it really useful
to stifle conversation and development in alternative Lisps?
I really don't think so.