Thursday, August 11, 2011

A Survey of Syntactic Extension Techniques in the Lisp Family of Languages (Part 2)

Last time we began our survey of syntactic extension techniques in Lisp with a rather airy meditation on syntax, quotation, evaluation and lambda, all by way of picoLisp's unusual dynamic, interpreter-oriented semantics. Very briefly, we looked at the way picoLisp lets regular, first-class functions specify that certain arguments should be passed in unevaluated, which the function can then selectively evaluate. Because the language is dynamically scoped and interpreted, this gets you as far as macros, and a little farther (since even "special forms" are first class).

This time, we're going to attempt to be a little more direct in our survey. But we should still think a little bit about macros before we jump in. In the Lisps we'll talk about today, syntactic extension is handled via macros. Given the flexibility of the picoLisp approach (also known as fexprs, and also supported by newLisp, which is itself an exotic species I might write about later), you might wonder why we'd want to use macros. The principal obvious disadvantage of macros is that they are not first class objects: they can't be stored in variables or passed around, whereas picoLisp-style functions/special forms can. There are lots of reasons fexprs were abandoned by most Lisps, but one that is easy to see is that macros make it a bit easier to separate the different stages of program interpretation and execution.

Consider executing an fexpr. In order to correctly calculate the value of such a first class function, we need to remember not just the flow of values from one function to another; we also have to know the syntax of all the inputs. Much of the efficiency of compilation is in the ability to disregard this information, and reduce the program to a series of simple, primitive operations. If our language is lexically scoped, then, in addition to remembering the syntax of the inputs, we also need to know the context of that syntax in order to correctly evaluate it. I'm surely oversimplifying things here, but you can read about the problems with fexprs here (the commentary on this subject at Lambda the Ultimate is also great)1. Kazimir Majorinc, by the way, has written a rebuttal of Pitman's original concerns. I have to admit, I find newLisp in particular to be a fascinating case study in alternative approaches to Lisp and computer programming languages in general. More people should try it out!

If we stage our syntactic extensions, as in more modern/typical Lisps, after program reading, but before execution, it simplifies the design of our compiler/virtual machine/interpreter. The macro systems we'll be discussing today operate in this way, transforming an intermediate form of Lisp code, produced by the reader, before finally handing the result to the compiler. The upside is the compiler need only concern itself with the behavior of basic lisp and its small set of special forms. The downside is that macros are stuck working only on source code: they cannot, in general, know anything about runtime, as the code has not yet been run!

Transformation

The Arc Documentation has something rather insightful to say about this:

Macros live in the land of the names, not the land of the things they refer to.

So, today we'll be looking at macro systems in which the user declares that certain symbols mark code for transformation: during an intermediate processing step, the language hands the marked portion of code to a user-specified piece of code, and inserts the result back into the code representation where the macro appeared.
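To make the "land of the names" idea concrete, here is a minimal Emacs Lisp sketch. The macro describe-arg is a made-up name for illustration only, not part of any library. It returns both the unevaluated form it was handed and the value that form has once the expansion finally runs.

```elisp
;; describe-arg is a hypothetical macro used only for illustration.
;; At expansion time it sees the list (+ n 1), never the number 42.
;; It expands into (list '(+ n 1) (+ n 1)).
(defmacro describe-arg (form)
  (list 'list (list 'quote form) form))

(let ((n 41))
  (describe-arg (+ n 1))) ;-> ((+ n 1) 42)
```

The macro never learns that n is 41. It only rearranges the names it was given, and the value shows up later, when the rearranged code is evaluated.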

Business Time!

Macro syntax can be a bit confusing when you see it for the first time. It is useful, then, to consider what a macro might do rather than how it is written, first. Macros should almost always transform code into code, with no side effects along the way, and so we can understand a macro by writing the code before and then the code after macro expansion.

If you know some Lisp, you know that variables are introduced with let expressions. Each binding expression in a let is executed without any of the other bindings in scope, so:

(let ((x 10)
      (y (+ x 1)))
    (+ y x))

Will produce an error, because x is not visible until the body portion of the let expression. If you want to nest references to new variables, you need let*:

(let* ((x 10)
       (y (+ x 1)))
   (+ y x)) ; -> 21

Which does work. Let's write a let* macro as an example. Expressions like let* are all about binding, and that means we can use lambdas to express let*. We want:

A:

(my-let* ((x 10)
          (y (+ x 1)))
      (+ y x))

To expand to:

B:

(funcall (lambda (x) 
   (funcall (lambda (y) (+ y x)) (+ x 1)))
   10)

Take a second to think about it, if you don't see why. The basic insight is that lambda can be used to introduce a context where a variable is bound, and funcall binds that variable.
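If it helps, form B can be evaluated directly, with no macro involved. This is only a sanity check in Emacs Lisp; the single-binding pair at the top is an extra illustration that is not part of the original example.

```elisp
;; One binding: a let and an immediately called lambda agree.
(let ((x 10)) (* x 2))             ;-> 20
(funcall (lambda (x) (* x 2)) 10)  ;-> 20

;; Form B: the outer lambda binds x to 10, then the inner lambda
;; binds y to the value of (+ x 1), computed while x is visible.
(funcall (lambda (x)
           (funcall (lambda (y) (+ y x))
                    (+ x 1)))
         10)                       ;-> 21
```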

The transformation of some code A to some other code B is the action of all but a few exotic macros. This transformation happens, conceptually, before any code is run (in practice, because many Lisps favor interactive development, macro expansion might occur after the user has executed some code, but it is highly unusual, and probably an error, for the macro expansion itself to depend on the results of previous execution (except that the macro may use previously defined functions to effect the code transformation)).

Let's write the macro, this time in Emacs Lisp:

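(defmacro my-let* (bindings &rest body)
  (cond 
    ;; no bindings left: just run the body
    ;; (null is the built-in test for the empty list)
    ((null bindings) (cons 'progn body))
    (t
     ;; peel off the first (name value) pair and recur on the rest
     (let ((pair (car bindings))
           (leftover-bindings (cdr bindings)))
      `(funcall 
         (lambda (,(car pair))
           (my-let* ,leftover-bindings ,@body))
         ,(cadr pair))))))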

(my-let* ((x 10)
          (y (+ x 1)))
     (+ y x)) ;-> 21

We need to say a bit about quasi-quotation to really understand how this works, but there are some things we can explain right away. Emacs Lisp represents code as a list of symbols, other atoms, and lists. When my-let* is encountered, the source code after my-let* in the code is passed, as a list, into a function (implicitly defined by the defmacro statement). This list is destructured by this function using regular old argument specification (so that the first item in the list is bound to bindings and all the subsequent items are bound to body, in a list) and then the body of the defmacro is executed. Its result must be a piece of code, which is inserted wherever the macro appeared in the source code representation.

Let's focus on the first possibility, when the macro is called with an empty set of binding forms. Our code is:

(my-let* () "Hey there.")

This is slurped up into the Lisp interpreter as:

(list 'my-let* '() "Hey there.")

The interpreter sees my-let* and knows that this indicates a macro. To expand this macro, it calls a function based on the defmacro expression. This function doesn't have a name, but we can write what it would look like:

(defun -my-let*-expander- (bindings &rest body)
   (cond 
    ;; null is the built-in test for the empty list
    ((null bindings) (cons 'progn body))
    (t
     (let ((pair (car bindings))
           (leftover-bindings (cdr bindings)))
      `(funcall 
         (lambda (,(car pair))
           (my-let* ,leftover-bindings ,@body))
         ,(cadr pair))))))

The macro expanding part of the lisp system calls this function with the following arguments:

(-my-let*-expander- '() "Hey there.")

This function returns:

(progn "Hey there.")

And the macro expanding part of the lisp system inserts that expression, whole hog, so to speak, into the place where the original form starting with my-let* was located. It then moves on. Once all the macros are expanded, the code is passed to the compiler/interpreter/etc and then executed. Viola!

Viola!
Voila!
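Here is a small sketch of how you might watch this happen yourself in Emacs Lisp, using the built-in macroexpand. The macro is repeated so the snippet stands alone; it tests for the empty list with the built-in null, an assumption made here so that nothing else needs to be defined first.

```elisp
(defmacro my-let* (bindings &rest body)
  (cond
   ((null bindings) (cons 'progn body))
   (t
    (let ((pair (car bindings))
          (leftover-bindings (cdr bindings)))
      `(funcall
        (lambda (,(car pair))
          (my-let* ,leftover-bindings ,@body))
        ,(cadr pair))))))

;; Ask the expander what it would hand back, without running the result.
(macroexpand '(my-let* () "Hey there.")) ;-> (progn "Hey there.")

;; Evaluating the original form expands it first, then runs the expansion.
(my-let* () "Hey there.") ;-> "Hey there."
```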

To understand the other branches, we need to cover a feature which appears in most lisps: quasiquotation.

Macros & Quasiquotation

Lisps, more or less, represent code internally as lists or/of atoms (things like numbers, strings, symbols). The code x represents, more or less, a piece of code that evaluates to the value of the symbol x. The list (+ x 1) is the piece of code that evaluates to the value of symbol x plus one. We covered last time how to enter these unevaluated pieces of code into a running lisp session using quotation. The quote operator tells a Lisp not to evaluate whatever it is applied to. So:

(setq x 10)
x  ;-> 10
'x ;-> x
(+ x 1) ;-> 11
'(+ x 1) ;-> (+ x 1)

Quote is ok when we want to write a tiny bit of code-as-data, but when we write macros, we often want to interpolate between data and code in a more dynamic way. We could, of course, use list to create our code:

(list '+ 'x 10) ;-> (+ x 10)

Because list is a function, each argument is evaluated - we quote manually whatever arguments we want to leave unevaluated as we construct our code.

Quasiquotation is kind of the opposite of list. A quasiquoted expression, by default, doesn't evaluate its inputs except when they are unquoted. Quasiquote is indicated by the back-quote character:

`

And, within a quasiquoted form, unquotation is indicated by a ,, which kind of makes sense. Quasiquotation works pretty much identically in Emacs and Common Lisp, so fire one of them up and try it out. If you go the Common Lisp route, Quicklisp makes it quick to install libraries once you have an implementation such as SBCL.

`(+ x ,(+ 10 11 12)) ;-> (+ x 33)

Should work in either Common Lisp or Emacs. Note, however, that different Lisps handle symbols slightly differently. In SBCL, a Common Lisp implementation, 'x will evaluate to X, because the reader upcases symbol names by default, which makes symbols look case-insensitive. Symbols are case sensitive in Emacs Lisp.

Sometimes you want to interpolate the contents of a list into a piece of code, in which case you say ,@ instead of ,:

`(+ ,@(list 1 2 3)) ;-> (+ 1 2 3)
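With quasiquotation in hand, the second branch of my-let* can be tried outside of any macro. The sketch below uses two helper functions whose names (expand-with-backquote and expand-with-list) are invented for this illustration. The first builds the expansion with backquote, exactly as the macro does; the second builds the same list by hand, to show what the backquote is saving us.

```elisp
;; Hypothetical helper: the quasiquoted template from my-let*.
(defun expand-with-backquote (pair leftover-bindings body)
  `(funcall
    (lambda (,(car pair))
      (my-let* ,leftover-bindings ,@body))
    ,(cadr pair)))

;; Hypothetical helper: the same list, assembled manually.
;; append plays the role of ,@ by splicing body into the form.
(defun expand-with-list (pair leftover-bindings body)
  (list 'funcall
        (list 'lambda (list (car pair))
              (append (list 'my-let* leftover-bindings) body))
        (cadr pair)))

(expand-with-backquote '(x 10) '((y (+ x 1))) '((+ y x)))
;-> (funcall (lambda (x) (my-let* ((y (+ x 1))) (+ y x))) 10)

(expand-with-list '(x 10) '((y (+ x 1))) '((+ y x)))
;-> (funcall (lambda (x) (my-let* ((y (+ x 1))) (+ y x))) 10)
```

Nothing here is evaluated as code: my-let* appears only as a symbol inside a list, so the macro itself does not need to be defined for this to run.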

Without concerning ourselves with namespaces or packages (of which Emacs has neither) Clojure macros operate in the same way as Emacs/Common Lisp macros, but with some slight changes in the way operations are indicated:

(defmacro my-let* [bindings & body]
  (cond
   (empty? bindings) (cons 'do body)
   :else
   (let [[symbol value-expr & leftover-bindings] bindings]
     `((fn [~symbol] (my-let* ~leftover-bindings ~@body))
       ~value-expr))))

This piece of code reads just like the Common/Emacs Lisp except for a few details. The first is that the argument list to the macro is specified using a Clojure vector rather than a list. In Common/Emacs Lisp, code is conventionally written with lists and atoms alone; Clojure also uses other literal data structures as part of its code syntax. The [] syntax indicates a Clojure persistent vector. It is read into the code representation as a vector, rather than as some kind of list or atom. This is both an important and a trivial difference. Consider in Emacs Lisp:

(vector 1 2 3)

This indeed evaluates to a vector with elements 1 2 3. But it is read as a list whose head is the symbol vector. Only upon evaluation do we get a vector.

(vectorp '(vector 1 2 3)) ;-> nil
(listp   '(vector 1 2 3)) ;-> t

In Clojure, by contrast:

(vector? '[1 2 3]) ;-> true
(list?   '[1 2 3]) ;-> false

(But see this footnote:2).

(As an aside, Clojure also supports maps, its hash-table-like data structure, as part of its code representation, in keeping with its intent of expanding Lisp's philosophy to data structures other than lists.)

In Clojure, do is how you say progn, both of which introduce a form which evaluates all its parts, and returns the result of the last. Next, we follow the convention that redundant parentheses in Clojure should be elided. Because we know bindings contains a list of pairs, we just read that list by two, rather than as an actual list of pairs. You see this in Clojure's let form, which has one less layer of nesting.

Emacs:

(let* ((x 10) (y 11)) (+ x y))

Clojure:

(let [x 10 y 11] (+ x y))

(Note that picoLisp also elides redundant parentheses, but does not use vectors for binding. Also note that Clojure doesn't have let*; let has that behavior.)

Also by convention, we use vectors for the binding part of any form (although this macro doesn't check for that). We say fn instead of lambda in Clojure, and we use ~ for the unquote operation. In Clojure, ~@ is the way you say ,@.

Other than those differences, the Clojure macro has the same behavior as the Emacs/Common Lisp macro. It creates an invisible function which is used to expand code tagged with that macro name during post-read processing, and then the code is passed to the Clojure compiler. It is worth noting here that Brian Goslinga has ported Scheme-style syntax-case/rules macros to Clojure, but it isn't clear to me without further study whether they truly are hygienic. Claims of macro hygiene are often exaggerated outside of the Scheme universe.

Issues with Naive Code-Rewriting Macros

The best way to understand the motivation for Scheme hygienic macros, as well as some of the differences between more or less conventional macro systems in other Lisps, is to understand how things can go wrong. Last time we talked extensively about scope and how it affects the way we look at a piece of data that represents code. The upshot was that languages where scope is dynamic, which means variables evaluate to whatever the current binding of the variable is at the moment of evaluation, can simply represent code (and particularly the meaning of symbols) as just lists of atoms and symbols. A piece of code "means" whatever it is you get by evaluating it in the context of the current bindings from symbols to values. Period.

Lexically scoped languages, on the other hand, impose more strenuous semantics on code. In a lexically-scoped language, variables refer to the lexical environment (the environment around them "on the page") when they are evaluated. Hence, in a lexically scoped language, a "naked" piece of code, such as the code produced by quotations, is "impoverished" - it doesn't record in any way the lexical environment in which the quotation was created, and so it is, in some sense, "meaningless," at least to the extent that it contains symbols which aren't bound.

In a dynamic language, the code fragment consisting of the single symbol 'x means "Whatever value x has when you evaluate me." This representation is complete, but depends on the evaluator. In a lexically scoped language, where symbols, when they appear in code, are implicitly associated by the rules of evaluation with a lexical context they were originally created in, 'x is meaningless. It looks like a piece of code, but in a real sense it isn't quite one.

Emacs/Common Lisp/Clojure style macros, however, operate on pieces of "code" produced by ordinary quotation - that is, they operate on "hypocode," say, which just means something not quite code.

Making a Mess

Let's make a mess, by way of example. Suppose we have an object system wherein objects are collections of key -> value relations in a persistent data structure, like an association list3. We will work in Emacs/Common Lisp.

Preliminaries:

(defun empty? (x)
  (eq x nil))

(defun set-slot (obj symbol val &optional acc)
  (cond 
   ((empty? obj) 
    (cons (cons symbol val) (reverse acc)))
   (t
    (let* ((first (car obj))
           (o-key (car first))
           (rest (cdr obj)))
      (if (eq symbol o-key) 
          (append (reverse acc) (cons (cons symbol val) rest))
        (set-slot rest symbol val (cons first acc)))))))

(defun get-slot (obj symbol)
  (cond
   ((empty? obj) nil)
   ((eq (car (car obj)) symbol)
    (cdr (car obj)))
   (t
    (get-slot (cdr obj) symbol))))

(get-slot (set-slot (set-slot (set-slot '() 'x 10) 'x 11) 'y 14) 'y) ;-> 14

Methods will just be functions whose first argument is the self object.

(defun make-person (first last) 
  `((:first-name . ,first)
    (:last-name . ,last)))

(defun change-first-name (self new-first)
   (set-slot self :first-name new-first))

(defun change-last-name (self new-last)
   (set-slot self :last-name new-last))

(note the use of backquote outside of a macro.)

Our objects are "pure" in this example - the methods change-first-name and change-last-name don't modify self, they return a fresh self object with the appropriate changes. I prefer this behavior, but it makes chaining method application clumsy. We'll develop a macro to make method chaining for pure objects easier.
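As a quick check of the "pure" claim, the sketch below repeats just enough of the definitions above to run on its own in Emacs Lisp, then renames a person and looks at both the old and the new object.

```elisp
(defun empty? (x)
  (eq x nil))

;; Returns a new association list; obj itself is never modified.
(defun set-slot (obj symbol val &optional acc)
  (cond
   ((empty? obj)
    (cons (cons symbol val) (reverse acc)))
   (t
    (let* ((first (car obj))
           (o-key (car first))
           (rest (cdr obj)))
      (if (eq symbol o-key)
          (append (reverse acc) (cons (cons symbol val) rest))
        (set-slot rest symbol val (cons first acc)))))))

(defun make-person (first last)
  `((:first-name . ,first)
    (:last-name . ,last)))

(defun change-first-name (self new-first)
  (set-slot self :first-name new-first))

(let* ((ami (make-person "Ami" "Culbert"))
       (amy (change-first-name ami "Amy")))
  (list ami amy))
;-> (((:first-name . "Ami") (:last-name . "Culbert"))
;    ((:first-name . "Amy") (:last-name . "Culbert")))
```

The original object still says "Ami" after the call, which is why every method application has to be captured in a new binding or nested inside the next call.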

the Bionic Woman
I don't know the Bionic Woman, but she comes up when you google "Woman".

I know a woman who changed both her first and last names when she got married. She went from Ami Culbert to Amy Klein (she had always resented her parents for the unusual spelling of her first name). We've got to nest our method calls or use a let to effect both renamings:

(let ((new-self 
       (change-first-name (make-person "Ami"
                                       "Culbert") "Amy")))
   (change-last-name new-self "Klein"))
 ; -> ((:first-name . "Amy") (:last-name . "Klein"))

Let's write a macro which automatically threads an object through method invocation. It should expand:

(with-object o 
  (change-first-name "Amy")
  (change-last-name "Klein"))

into something like:

(let ((new-self 
       (change-first-name o "Amy")))
   (change-last-name new-self "Klein"))

People often complain, inaccurately, I think, that macros are "hard to debug" because they aren't regular functions. It is true they aren't invoked as regular functions in the ordinary course of events, but one can certainly write the "business end" of the macro as a regular function and just have the macro definition pass the quoted inputs to this function appropriately. Then you can interactively test the macro expansion, build unit tests, etc.

(defun with-object-expander (object-expr method-applications)
  `(let* ((object ,object-expr)
          ,@(mapcar 
             (lambda (methap) 
               `(object (,(car methap) object ,@(cdr methap)))) 
            method-applications))
        object))

(with-object-expander 'test-object 
 '((change-first-name "Amy")
   (change-last-name "Klein"))) ;-> 
  (let* ((object test-object) 
         (object (change-first-name object "Amy")) 
         (object (change-last-name object "Klein"))) 
     object)

If you are new to Lisp macro writing, this probably looks like it works just fine. If you are a hoary old Lisper (and I know some of you are), then this probably looks like one of those ridiculous examples from safety videos where a goofy guy climbs up a ladder, or some such, with an aerial around some high tension power lines, which is exactly what it is. Let's sally forth like Goofus, however:

(defmacro with-object (object-expression &rest method-exprs)
  (with-object-expander object-expression method-exprs))

(with-object (make-person "Ami" "Culbert")
  (change-first-name "Amy")
  (change-last-name "Klein")) ; -> 
   ((:first-name . "Amy") (:last-name . "Klein"))

Oh Goofus
Goofus probably programs in Perl. Just sayin'.

Indeed, apparent success! However, imagine if we are instead taking the new last name from an object representing Ami's fiance:

(let ((object (make-person "Jason" "Klein")))
  (with-object (make-person "Ami" "Culbert")
    (change-first-name "Amy")
    (change-last-name (get-slot object :last-name)))) ; ->
  ((:first-name . "Amy") (:last-name . "Culbert"))

What gives? Amy's last name should be "Klein" but it is still "Culbert". What happened? A quick look at the macro expansion of the entire expression would help, but how do we get such a thing? Emacs/Common/Clojure Lisps provide macro-expansion tools which take an expression and expand macros inside of it:

(macroexpand-all 
 '(let ((object (make-person "Jason" "Klein")))
  (with-object (make-person "Ami" "Culbert")
    (change-first-name "Amy")
    (change-last-name (get-slot object :last-name))))) ;->

    (let ((object (make-person "Jason" "Klein")))
      (let* ((object (make-person "Ami" "Culbert"))
             (object (change-first-name object "Amy"))
             (object (change-last-name object
                       (get-slot object :last-name))))
        object))

Now we can see what the problem is. We bound Amy's fiance to the symbol object, but our macro expansion also used that symbol internally, to represent the object being threaded through the method applications, which means it was bound to the Amy person by the time the call to get-slot was executed on object. The "new" last name was just her current last name, so the change-last-name call changed nothing!

Note that this is, essentially, an error of context or variable interpretation. Our mental model was that the object symbol used during macro expansion was unrelated to the object symbol used in the subsequent let expression. But the Lisp has no way of knowing that. Because we are manipulating naked code, a symbol is just a symbol, and the behavior of our complete expansion is just whatever behavior it looks like it would have if someone had written it by hand. Scheme's macro system is designed to resolve this problem by separating contexts more aggressively, so that macro expansion doesn't get in the way of later variable binding unless you specifically want it to.

We don't, however, have hygienic macros in the Lisps we are talking about today, so how do we avoid the problem specified above? There are at least two ways which are in common use. The first is to force the invoker of a macro to provide any names which might be used during expansion, so they are forced to acknowledge that those names will be rebound by the macro. The second is to make sure that the macro uses symbols which you are certain will never be used by the programmer who invokes the macro.

The former strategy looks like this:

(defmacro with-object-bound-to (sym obj &rest meths)
  `(let* ((,sym ,obj)
          ,@(mapcar
            (lambda (m)
              `(,sym (,(car m) ,sym ,@(cdr m)))) 
            meths))
        ,sym))

Invoked as:

(let ((object (make-person "Jason" "Klein")))
      (with-object-bound-to amy 
        (make-person "Ami" "Culbert")
        (change-first-name "Amy")
        (change-last-name (get-slot object :last-name))))

Which has the intended effect. However, if you don't intend on using the value bound to amy at any time, you'd probably prefer regular old with-object.
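To see what the caller-supplied name buys us, we can ask for the expansion. This is a sketch in Emacs Lisp with the macro repeated so it runs on its own; make-person and friends only appear inside quoted code here, so they do not need to be defined.

```elisp
(defmacro with-object-bound-to (sym obj &rest meths)
  `(let* ((,sym ,obj)
          ,@(mapcar
             (lambda (m)
               `(,sym (,(car m) ,sym ,@(cdr m))))
             meths))
     ,sym))

(macroexpand
 '(with-object-bound-to amy
    (make-person "Ami" "Culbert")
    (change-first-name "Amy")
    (change-last-name (get-slot object :last-name))))
;-> (let* ((amy (make-person "Ami" "Culbert"))
;          (amy (change-first-name amy "Amy"))
;          (amy (change-last-name amy (get-slot object :last-name))))
;     amy)
```

The threaded value is now called amy everywhere, so the reference to object inside get-slot still reaches the outer binding.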

We could implement with-object by just trying to use a crazy symbol for object, something we'd hope the users of our macro will never think to use. For instance, we could prepend the letter s to the md5 hash of the word "object": sa8cfde6331bd59eb2ac96f8911c4b666:

(defun with-object-expander (object-expr method-applications)
  `(let* ((sa8cfde6331bd59eb2ac96f8911c4b666 ,object-expr)
          ,@(mapcar 
             (lambda (methap) 
               `(sa8cfde6331bd59eb2ac96f8911c4b666
                 (,(car methap)
                  sa8cfde6331bd59eb2ac96f8911c4b666
                  ,@(cdr methap))))
            method-applications))
        sa8cfde6331bd59eb2ac96f8911c4b666))

And that would probably work. However, Lisps usually provide a facility to generate symbols guaranteed not to be used (subject to the understanding that if symbols are read at run-time, all bets are off, although it is still unlikely there will be a problem). In Common and Emacs Lisp, this is accomplished via the function gensym. In these Lisps, we'd write:

(defun with-object-expander (object-expr method-applications)
  (let ((object-name (gensym "object-")))
  `(let* ((,object-name ,object-expr)
          ,@(mapcar 
             (lambda (methap) 
               `(,object-name (,(car methap) ,object-name ,@(cdr methap)))) 
            method-applications))
        ,object-name)))

The critical point is that we've introduced a variable object-name which, at macroexpansion time, is bound to a fresh symbol. We then insert that symbol wherever object used to be by unquoting it into our expanded expression.
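Putting the pieces together, here is a self-contained Emacs Lisp sketch of the repaired macro run against the example that failed earlier. One swap to flag: it uses make-symbol, the Emacs Lisp primitive that returns a fresh uninterned symbol, in place of gensym, on the assumption that this is the safer choice across Emacs versions. The helper definitions are repeated from above.

```elisp
(defun empty? (x) (eq x nil))

(defun set-slot (obj symbol val &optional acc)
  (cond
   ((empty? obj) (cons (cons symbol val) (reverse acc)))
   (t
    (let* ((first (car obj))
           (o-key (car first))
           (rest (cdr obj)))
      (if (eq symbol o-key)
          (append (reverse acc) (cons (cons symbol val) rest))
        (set-slot rest symbol val (cons first acc)))))))

(defun get-slot (obj symbol)
  (cond
   ((empty? obj) nil)
   ((eq (car (car obj)) symbol) (cdr (car obj)))
   (t (get-slot (cdr obj) symbol))))

(defun make-person (first last)
  `((:first-name . ,first) (:last-name . ,last)))

(defun change-first-name (self new-first)
  (set-slot self :first-name new-first))

(defun change-last-name (self new-last)
  (set-slot self :last-name new-last))

(defun with-object-expander (object-expr method-applications)
  ;; make-symbol stands in for gensym here (an assumption of this sketch):
  ;; the symbol it returns is uninterned, so no symbol the caller can
  ;; type will ever be eq to it.
  (let ((object-name (make-symbol "object")))
    `(let* ((,object-name ,object-expr)
            ,@(mapcar
               (lambda (methap)
                 `(,object-name (,(car methap) ,object-name ,@(cdr methap))))
               method-applications))
       ,object-name)))

(defmacro with-object (object-expression &rest method-exprs)
  (with-object-expander object-expression method-exprs))

(let ((object (make-person "Jason" "Klein")))
  (with-object (make-person "Ami" "Culbert")
    (change-first-name "Amy")
    (change-last-name (get-slot object :last-name))))
;-> ((:first-name . "Amy") (:last-name . "Klein"))
```

The caller's object and the macro's internal symbol print with the same name if you inspect the expansion, but they are different symbols, so the inner bindings no longer shadow Jason.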

Turns out this feature is so commonly used that Clojure provides special support for it - during backquote expansion, symbols terminated with a # character are automatically bound to appropriately generated symbols. By appropriate we mean that object# will always be expanded to the same symbol within a single backquote. In this example, since some of the uses of the object symbol are generated programmatically, we'd still need to use gensym, which Clojure also provides.

Namespaces, Packages, Hygiene

All this gensym falderal is meant to help us ensure our macros are hygienic, which informally means that in the body of a macro, the code you write behaves the way you expect it to behave, modulo whatever changes in meaning the macro implies, AND that when the macro expands, it means what the macro writer meant it to mean, regardless of what symbols mean where the macro is expanded. Variable capture, the example we looked at above, is only one of many possible hygiene complications. Another possibility is that a macro (the dependent macro, let's call it) might depend on another macro, which itself is defined one way when the dependent macro is written, but might be redefined later when someone else tries to expand the dependent macro. It's very hard to write macros that are hygienic in this sense in the Lisps we've talked about so far, but some of them do provide the facilities to at least help a bit.

Emacs Lisp is the oddball in this department, however - it has neither packages nor namespaces nor any other module system4.

Both Clojure and Common Lisp provide some conception of modules or bags of symbols. In Common Lisp these are called packages, in Clojure they are called namespaces. In both cases, macro writers can use namespaces to help avoid macro expansion problems. Consider the case where some joker redefines the let* special form. Your macro wants to use let* as it is ordinarily defined by Common Lisp. He wants to use your macro, but wants to put code in it that depends on his crazy new definition of let*. If you write your macro (expander) as:

(defun with-object-expander (object-expr method-applications)
  (let ((object-name (gensym "object-")))
  `(cl::let* ((,object-name ,object-expr)
                ,@(mapcar 
             (lambda (methap) 
               `(,object-name (,(car methap) ,object-name ,@(cdr methap)))) 
            method-applications))
        ,object-name)))

You can have your cake and the user of your macro can eat it too. The key change is that we've qualified our let* symbol with the package it lives in. cl::let* means we want Lisp to use the meaning of let* as defined in the package cl or common-lisp (packages can have nicknames in Common Lisp). If our user redefines let* in his own package, then he can use our macro without any problems. Strictly speaking, the reader already resolves an unqualified let* to a package when the macro definition is read, so the explicit qualifier mainly makes that choice visible.

Clojure takes this one step further. Backquote automatically expands symbols to their namespace-qualified names, so macros automatically expand to refer to the special forms as defined in whatever namespaces they are defined in when the macro itself was defined. Material unquoted into the macro expansion is expanded appropriate to the context it was introduced in. The net result is that if someone redefines let in their own namespace, and calls your macro, it automatically expands to use the let from the base Clojure namespace:


`(let [x 10 ~'y 11] (+ x y)) ;->
 (clojure.core/let [user/x 10 y 11] 
    (clojure.core/+
       user/x 
       user/y))

I haven't specifically mentioned Arc throughout all this, but Arc macros basically follow the same design as those in Common/Emacs Lisp and Clojure. If they are more similar to any of the above than any other, it is Emacs Lisp, since Arc doesn't appear to have, at the moment, a namespace or package system.

Conclusions

Good Lisp programmers use macros sparingly, for the most part, and so it is difficult to think up examples when you'd need more macro hygiene than what you get in Lisps with gensym and packages and namespaces. In particular, Clojure seems to provide enough protection that you won't accidentally shoot yourself in the foot unless you wander off the beaten path.

However, you might wonder if we can't come up with a more formal syntactic extension scheme which really protects us in a simple to understand way. Something that makes macro hygiene as easy to conceptualize as variable hygiene for regular code? There are lots of ways to approach this problem (more than I know of, certainly) but next time we'll discuss the way Scheme does it. Scheme syntax-rules/case macros are very different compared to defmacro style forms, and their intent is to make hygiene the default behavior, so that complex macros (that, for instance, depend on one another) can be defined easily without much extra manual fiddling (gensym, packages, etc). As we'll see, the solutions are related to the difference between lexical and dynamical behaviors, a theme which keeps coming up!


1 See also Kernel.

2 I must point out that Emacs Lisp, but not Common Lisp, has a similar read syntax:

[a b c] ;-> [a b c]

Which is sort of like the vector syntax except that it reads the elements of the vector literally, as if they were quoted. The emacs evaluator will not evaluate the elements: when it encounters a vector in the code it is evaluating, it just returns the vector. The Clojure evaluator encounters a vector, and returns the vector containing the result of evaluating each element in the vector.

In emacs:

(let ((a 10) (b 11) (c 12)) [a b c]) ;-> [a b c]

In Clojure:

(let [a 10 b 11 c 12] [a b c]) ;-> [10 11 12]

I wish Emacs had the Clojure behavior (or an extensible reader, like Common Lisp).

3 An association list is a list of pairs (created with cons). Each pair consists of a symbol and a value.

4 I estimate that the emacs session I'm running right now, to write this piece, has about 25,000 functions and symbols bound. The fact that emacs works at all without namespaces, with just this giant soup of symbols, is amazing and maybe even indicative of some kind of blind spot in our understanding of software development. Maybe worse is better.

13 comments:

spacemanaki said...

This is a great series of posts! I think that commas are whitespace in Clojure though, and don't do any unquoting at all.

J.V. Toups said...

Great catch (I think it is right in most places, but I fixed the place I noticed).

spacemanaki said...

I was mostly referring to this sentence: "...and we use ~ for the unquote operation, although , also unquotes in a slightly different way, most of the time you want ~." I think you always want tilde in Clojure, as a comma will not work at all.

I feel a bit churlish even pointing out such a minor thing, since these posts are so good. So definitely keep it up. The stuff in the first post about dynamic scope and fexprs really clarified what I'd read in Lisp In Small Pieces.

J.V. Toups said...

Oh, I see what you are saying now. I meant to draw a distinction between ` and ', not ~ and ,. Ooops.

I'm envious - I can't seem to find a copy of Lisp In Small Pieces for less than 100 dollars! I've never read it.

John Cowan said...

ISLisp uses the same style of macros as Common Lisp, which is a bit anomalous considering how non-dynamic it is (redefinition is undefined, SYMBOL-VALUE and SYMBOL-FUNCTION aren't available, no run-time evaluation, etc.)

Hardly fair to Perl, by the way. Perl does its syntax extension entirely up front and supports arbitrary lexical syntax as well, very like Racket. See Lingua Romana Perligata for an extreme example (but no more so than Racket's Algol 60 implementation).

John Cowan said...

I forgot to mention that ISLisp does not have packages or namespaces, at least not in the standard, though colon is reserved in identifiers, so it is possible to provide them (and implementations tend to do so).

J.V. Toups said...

Mostly the dig was directed at Perl's tendency to accept idiosyncrasies in syntax and semantics in return for minor conveniences.

J.V. Toups said...

Of course, that isn't really fair to Perl, which is a kind of DSL for text processing. If someone organically grew a lisp dsl for text processing, it would probably end up looking somewhat like perl.

stassats said...

I don't understand this sentence "Unlike in Common/Emacs Lisp, the default Clojure reader is able to read more than lists and symbols."
Did you mean that the code is mainly represented by symbols and lists? Because the reader can read vectors, numbers, strings, etc. perfectly fine.

stassats said...

The part about redundant parentheses sounds like they're redundant in CL or ELisp, which they're not, because you can write (let ((x) y) ...)

stassats said...

Emacs works without explicit namespaces because every symbol is prefixed with the name of the package it resides in. E.g. foo-function.

stassats said...

It's better to use cl:let* instead of cl::let*, since it's exported.

stassats said...

And now that I've actually read what that cl:let* is used for, it's not needed at all. The body of DEFMACRO is read at read time, so the LET* it uses will be CL:LET*; if the macro is later expanded in a package which has a different LET*, the macro-expansion will still have CL:LET*, since there's no reading involved during macroexpansion.