Tuesday, March 18, 2008

Creating Simple Macros in Matlab

Perhaps this is inadvisable but I recently found it useful to create simple macros in Matlab. Matlab does not encourage this behavior but it is technically possible, although somewhat ugly. As we shall see, there are some interesting things about the way I chose to implement them which might inform thinking about other macro systems in other languages. The Problem (as always): Tedium The issue is that I frequently find myself implementing functions with optional parameters. This necessitates some code that looks like this:

function x=increment(val,amount)
% X=INCREMENT(VAL) increments val by one
% X=INCREMENT(VAL,AMOUNT) increments val by AMOUNT
%    AMOUNT defaults to 1 if left off;
if ~exist('amount')
   amount = 1;
end

x = val + amount;


Which is fine. Except I find myself implementing lots of optional argument functions, frequently with many optional arguments, which means writing A LOT of if ~exist(str)... statements. In LISP, God bless it, this could be abstracted away with a simple macro (except that you can do it with defun so why bother) but in Matlab we have to do cartwheels to save keystrokes. Our salvation is that modern Matlab treats function calls without parenthesis as special. Consider the function function r = f(varargin) r = varagin; Then >> f(1, 2, 3) % -> {1,2,3} %but >> f 1 2 3 % -> {'1','2','3'} Which means Matlab "quotes" the arguments before passing them to the function. We can exploit this and Matlab's evalin function to evaluate the arguments selectively in the scope calling a function, rather than the function's scope itself. (For those readers not familiar with Matlab, its neither dynamically nor lexically scoped. Different functions have different scopes, but you can examine the enclosing dynamic scope with some functions.) Getting right to business define the function default in the following way:
function default(varargin)

wb = 1;
wa = 1;
vstr = [varargin{:} ';'];
ii = find(vstr=='=');
vstr = [vstr(1:ii(1)) '1;'];
wb = who;
eval(vstr);
wa = who;
v = setdiff(wa,wb);

str = [ '' ...
'if ~exist(''%s''),'...
'  %s;,'...
'end'];
varargin{:}
str = sprintf(str,v{1},[varargin{:}]);
evalin('caller', str);

Bear with mere here - some fancy trickery is going on. First, a roadmap - we are going to figure out the name of the variable whose value we want to declare a default for by evaluating a modified form of the passed in expression in the current function's scope. This will allow us to figure out the variable name by exploiting the Matlab parser. This way even complex initializations like x(1) = 10 can be parsed. Next we will construct a string to be executed in the scope of the calling function which tests for the existence of the variable and, if it does not exist, executes the definition passed into the default function. Because I am lazy, the code looks a bit mad - but its the quickest way to implement this and also likely the smartest.

So wa and wb simply initialize values so that who reports them when we use it later. vstr simply replaces everything after the = input expression (merged into one string in the previous line) with a 1. We then record the list of symbols with the who statement, eval the vstr and record the list again, and finally apply the setdiff function to get only the new value, which should be the name we are looking for. Woe be to the user who passes in multiple expressions. They will be ignored.

Now that we know our variable name, we can construct the string to be evaluated in the caller. The expression varargin{:} just concatenates all the arguments passed in into one string. We finally use an evalin expression to insert our macro code into the calling context, roughly.

Now we can test it out on the command line:

>> default x = 10
>> x
   x =
   10
>> clear x
>> x = 121
>> default x = 0
>> x
   x = 
   121

Voila! We can now employ this cute little macro in our functions and save ourselves a bit of time while more clearly expressing our intent.

One more note: Matlab passes everything except a trailing ; to the function as strings. This makes it hard to remove the parenthesis at the default declaration to cause Matlab to output the value. Since you can never pass a trailing parenthesis to the macro, the only solution is to create a separate macro called pdefault which prints its evaluation. I leave this as an exercise for the reader. I find this interesting because of the way Matlab segregates its scopes. I have to explicitly request values from the scope above me and explicitly evaluate things there. This makes it easy to separate the logic of the macro from the results of the macro. I wonder why no Lisps do it this way. Macros are sort of like functions with quoted arguments and access to their dynamic, rather than lexical scope. Why not formalize in this direction rather than formalize in the Scheme direction? Anyone know?

Monday, March 17, 2008

Common Lisp Reader Macros Pt 2: Packages and Complications

Last time I explained the basics of reader macros in Common Lisp, defining a special syntax for vector application of functions and a short lambda declaration as a bit of syntactic sugar. It's this latter construct:

[(+ X 1)] 
;; becomes/expands to (lambda (X) (+ X 1))
;; And
[a b c -> (* a (+ b c) (- b c))] 
;; becomes/expands to (lambda (a b c) (* a (+ b c) (- b c)))
which we will discuss further today. In the last entry, I also included the hopefully not too obscure warning "A word of caution: Macros are always tricky beasts, and extending Lisp this way may only seem like a good idea. In practice, it introduces all sorts of tricky issues." Today some of those tricky issues will become apparent as we approach the subject of Common Lisp Packages. Much of the information I am going to discuss is covered in The Complete Idiot's Guide to Common Lisp Packages. This article can be thought of as an extended example. Programming, as a practical problem, is often about marshaling complexity. Most constructs in programming languages (procedures, closures, objects, etc) are all at least partly ways of encapsulating little bits of code in the hopes that as the system grows, interactions between logically separate computations remain simple to understand. To support this at levels higher than functions and objects, Common Lisp provides the concept of packages. We will begin by thinking of packages as just lumps of code and then refine our understanding as we encounter problems. We want to create a package to contain all of our utility functions so we can use them in various projects without worrying much about name collisions. In Common Lisp we create a package (which is a first order value) with the defpackage macro.

CL-USER> (defpackage 
  #:my-utils 
  (:use #:common-lisp) 
  (:use #:common-lisp-user))
#
This creates a package named my-utils and declares that inside this package, all the symbols exported by the packages common-lisp and common-lisp-user will be available (these are the default symbols available at your average REPL). What does the #: macro character do? That is what I asked myself the first time I tackled packages in Common Lisp. We will get to the answer, but for now just think of it as a special way to introduce a symbol (bearing in mind that defpackage is a macro and its arguments are quoted). Now that we have created a package lets put some functions into it. We do that using the in-package macro.

CL-USER> (in-package #:my-utils)
MY-UTILS>
Your prompt will change to indicate that you now are working inside the my-utils package. Any symbols or functions introduced here will be visible only in this package. Lets add a function to the package. Suppose we frequently want to get the CDR of a list stored in a list of lists or symbols if its a list and its CAR is a particular symbol. That is, get-tagged-list-body should do the following:

MY-UTILS> (let ((descriptor 
                  '(dog (paws 
                          front-left
                          front-right 
                          back-left 
                          back-right))))
(get-tagged-list-body 'paws descriptor))
(FRONT-LEFT FRONT-RIGHT BACK-LEFT BACK-RIGHT)
We can define such a function easily:

(defun get-tagged-list (tag lst)
  (car (member-if 
         (lambda (el) (and (listp el) (eq tag (car el)))) 
         lst)))
(defun get-tagged-list-body (tag lst)
  (cdr (get-tagged-list tag lst)))
N.B. member-if Now we can use the function inside the package.

MY-UTILS> (get-tagged-list-body 
            'dog 
            '(x (dog a b c)
              y (cat q r s)))
(A B C)
Now we want to make it available to users outside the package with the export function.

MY-UTILS> (export 'get-tagged-list-body)
We can then use it outside the package thusly:

MY-UTILS> (in-package #:common-lisp-user) ;; Go back to the CL-USER package.
CL-USER> (my-utils:get-tagged-list-body 'dog '(x (dog a b c)))
(A B C)
Note that we just prefix the function call with the package name and : (:: accesses unexported symbols). Simple enough, neglecting the business with #: and exactly how packages manage this trick. Now, though, lets go back to our short function call macro. We might easily have written get-tagged-list in the following way:

(defun get-tagged-list (tag lst)
  (car (member-if 
        [el -> (and (listp el) (eq tag (car el)))] 
        lst)))
Using our (perhaps silly) short-lambda macro. Lets assume that we defined the macro and its associated functions in the (default) CL-USER package. As a reminder, here is the code:

(defun short-lambda-split (elements)
(let ((args
  (loop while (and
         (not
     (eq (car elements) '->))
         elements)
  collect
    (pop elements))))
  (if (not elements)
  ; There is no arg list
    `(lambda (x) ,@args)
    `(lambda (,@args) ,@(cdr elements)))))

(set-macro-character #\[
(lambda (stream char)
 (let* ((lst
        (read-delimited-list #\]
     stream t)))
  (short-lambda-split lst))))

(set-macro-character #\] 
  (get-macro-character #\)))
Since we defined these in the CL-USER package and MY-UTILS uses this package (see defpackage, then we should be able to go back into the MY-UTILS package and defun the above version of get-tagged-list and be good to go. If we do this, though, something surprising happens - something which will teach us a lot about the way packages (and the reader) work. Go back into MY-UTILS and redefine get-tagged-list as above. Then switch back to CL-USER and execute:

CL-USER> (my-utils:get-tagged-list-body 'dog '(x (dog a b c)))
When I do this, Slime under SBCL complains like this:

The variable MY-UTILS::EL is unbound.
   [Condition of type UNBOUND-VARIABLE]

Restarts:
 0: [ABORT] Return to SLIME's top level.
 1: [ABORT] Exit debugger, returning to top level.

Backtrace:
  0: ((LAMBDA (X)) #)
  1: (MEMBER-IF # (X (DOG A B C)))
  2: (MY-UTILS::GET-TAGGED-LIST DOG (X (DOG A B C)))
  3: (MY-UTILS:GET-TAGGED-LIST-BODY DOG (X (DOG A B C)))
  4: (SB-INT:SIMPLE-EVAL-IN-LEXENV (MY-UTILS:GET-TAGGED-LIST-BODY (QUOTE DOG) (QUOTE (X #))) #)
  5: (SWANK::EVAL-REGION "(my-utils::get-tagged-list-body 'dog '(x (dog a b c)))
     ")
  6: ((LAMBDA NIL))
So what is going on here? It actually isn't totally obvious where the problem might be from the somewhat terse error message and stack trace, so I will save you the trouble of process of elimination that I went through. The thing to do is to expand our reader macro in the MY-UTILS package, since that is where the expansion is happening when we evaluate the function. This may give us an idea of what the issue is. Remember, reader macros are expanded by the reader before any evaluation (during the read), so we can see the expansion of a Macro with just a quote.

MY-UTILS> '[(a b c) -> (+ a b c)]
(LAMBDA (COMMON-LISP-USER::X) (A B C) -> (+ A B C))
What has happened here is that the short-lambda-split function has somehow not found the -> symbol, and has decided that the form it read, which is manifestly (a b c -> (+ a b c)), is a short-lambda without an explicit argument list. So it inserts what it believes to the body (the entire expression) into the a lambda with only one argument (here somewhat mysteriously expanded into COMMON-LISP-USER::X). Turns out that COMMON-LISP-USER::X is the real hint we need to figure out what is going on, although our initial instinct might not point us there. What is happening is that the Reader is expanding the macro with a function written in the COMMON-LISP-USER package. Since we are not in that package, Lisp is telling us in this quoted expression that the symbol X produced with a quotating and inserted into the macro list to be exported is actually a symbol which belongs to the COMMON-LISP-USER package, not the MY-UTILS one. This is the key. The code testing whether a symbol in the macro expression list is EQ to the -> symbol is actually properly understood as comparing to COMMON-LISP-USER::->. I believe this is because the quote expression has been evaluated at definition (that is, lexically) or at compile time, resulting in the symbol being interned in the package the code is compiled/defined in. (I would love it if a Lisper could tell me which). What does interning mean? To understand, you have to understand that a package is a collection of symbols, and an interned symbol is one that belongs to a package. By default, Common Lisp interns all symbols entered into the reader with a quote into the current package. So the reader is interning the -> symbol into the package CL-USER, where the split-short-lambda function is defined. Hence, EQ is detecting that COMMON-LISP-USER::-> is not the same as MY-UTIL::->, which actually makes sense - they are different - that is the entire point of packages. (N.B. the MY-UTIL:: is implicit and not printed by the printer in our examples since we are "in" the MY-UTIL package, but the whole short-lambda list is made of symbols automatically interned by the reader into the current package.) So we want to force the split-short-lambda function to intern certain symbols at evaluation time, and into the current package, rather than at "definition" time. We can do this my modifying it in the following way:

(defun short-lambda-split (elements)
(let ((args
  (loop while (and
         (not
     (eq (car elements) (intern "->")))
         elements)
  collect
    (pop elements))))
  (if (not elements)
  ; There is no arg list
    `(lambda (,(intern "X")) ,@args)
    `(lambda (,@args) ,@(cdr elements)))))
Here we use calls to the function intern to force the creation of symbols interned in the current package. Now if we execute our macro, it gives us what we expect. Note that we also interned the symbol x in the no argument case for (basically) the same reasons.

MY-UTILS> '[a b c -> (+ a b c)]
(LAMBDA (A B C) (+ A B C))
What have we learned? We've learned that macros (reader macros, in this case) are hard to debug, and that they interact with the reader and the currently defined packages in strange ways. We've also learned that the package system works by grouping symbols into different packages, and it also interacts with the reader in perhaps non-intuitive ways. Given these complexities, the question to ask is obviously is it really worth five few keystrokes every time we want to write an anonymous function? That depends on how you feel about it and the problems you are trying to solve, but at least you have the choice. One more thing! I promised we would revisit the #: macro character and now that we have an operational understanding of packages, we can figure out what is going on. If you look on the sharpsign page of the hyperspec, we read that #: indicates an "uninterned symbol". The reason we use #: instead of regular symbol is so that we don't pollute our packages with symbols, which are otherwise automatically interned as they are encountered by the reader. Using #: seems to be the conventional way of being a little more tidy about your packages. Questions, comments and (most importantly) corrections are welcome.

Monday, March 10, 2008

Common Lisp Reader Macros: A Simple Introduction

I'm a computational neuroscientist by trade and the language of choice in our lab is Matlab. Despite being occasionally bizarre, and more frequently ad-hoc, and despite lacking some of the features a real language connoisseur might like (higher order functions, closures, some kind of class system that isn't wonky, real anonymous functions), Matlab is very good at what it does, which is manipulate vectors and matrices. Since I spend so much time writing code in Matlab, I frequently end up trying to write Matlab idioms in other languages. For those not familiar with Matlab's style, the main thing is that most functions are "vectorized", which means they automatically map over vector inputs, usually in a smart way. Most of the time, though, you are just doing simple things, like

>> x = [1 2 3 4 5];
>> x.^2
ans =
[1 4 9 16 25]
>>

The most direct way to translate this into Common Lisp is:

CL-USER> (defun ^ (b e) (expt b e))
^
CL-USER> (let ((x #(1 2 3 4 5)))
(map
  'vector
  (lambda (b) (^ b 2.0))
  x))
#(1.0 4.0 9.0 16.0 25.0)
CL-USER>

Which leaves something to be desired, since in Matlab style code, you might have several such vectorized operations on a single line, chained. I've encountered more than one Common Lisper who is suspicious of brevity for the sake of brevity, but the major advantage of Lisp is supposed to be that you grow your language up to meet your problem. Today, at least for pedagogical purposes, our problem is we want to make Common Lisp act a bit more like Matlab. And we are going to do it with reader macros, since our major aim is to express as concisely as possible vector operations. (N.B. - I am no Common Lisp master - there may be conceptual or other errors, but the code at least works!). The first thing is to recognize the pattern we want to abstract away. A vectorial function applications looks like:

(map 'vector *F* *V1* .. *VN*)

What we are going to do is replace this with the following syntax:

#v(*F* *V* .. *VN*)

Which will mean "give me a vector produced by applying *F* to *V1* through *VN*, where *F* has arity N". The CLHS defines, with characteristic obliqueness, a Reader Macro thusly: reader macro n.
  1. a textual notation introduced by dispatch on one or two characters that defines special-purpose syntax for use by the Lisp reader, and that is implemented by a reader macro function. See Section 2.2 (Reader Algorithm).
  2. the character or characters that introduce a reader macro[1]; that is, a macro character or the conceptual pairing of a dispatching macro character and the character that follows it. (A reader macro is not a kind of macro.)
I can only hope that the author intended to separate the subsurreal caveat "(A reader macro is not a kind of macro)" from 2, where it does not seem to belong. The main idea here is that we are extending Common Lisp's Reader - the algorithm that takes text and converts it into data/code for the Lisp Compiler/Interpreter to work with. A reader macro takes one or two characters and sets them up to trigger a function call during the read process wherein user defined code can grab things from the input stream and transform them into an object to give to the Reader. We want to cause the #v character to trigger a function which reads an s-expression from the input, and converts (f v1 .. vn) to (map 'vector (function f) v1 .. vn). We inform Lisp of our intent using the set-dispatch-macro-character function.

(set-dispatch-macro-character
  #\# #\v
  #'vector-map-transformer)

This tells the Reader to call VECTOR-MAP-TRANSFORMER whenever it see the pair of characters #v. All that remains is to define VECTOR-MAP-TRANSFORMER:

(defun vector-map-transformer
  (stream subchar arg)
    (let* ((sexp (read stream t))
           (fname (car sexp))
           (arguments (cdr sexp)))
     `(map 'vector
           (function ,fname)
           ,@arguments)))

The reader expects a function with three arguments, stream, which will be bound to the current input stream, subchar, which is the subchar which triggered the call (in this case \v) and args, which we wont be using, but represent arguments passed to the Reader Macro. The body of vector-map-transformer is pretty straightforward. It extracts the function name from the sexp read from the input stream, grabs the cdr of the sexp to get the argument list, and uses back-quote notation to build a new list which will passed back to the reader. Note we use the function function to get the function bound to fname. This way we don't have to use a pound-quote when typing in vector functions. With these two statements, we can now invoke our reader macro at the REPL.

CL-USER> #v((lambda (x) (+ x 1))
'(1 2 3 4))
#(2 3 4 5)

Sweet. Still, I find that lambda form a bit clunky. And we are always defining small lambdas to map across bodies. Let's create another Reader Macro to expand [x y z -> (+ x y z)] to (lambda (x y z) (+ x y z)). In this case we are going to use the #\[ character as the trigger for our macro. And since we want to stop reading at the #\] character, rather than at an s-expression, we have to take extra care to make sure Lisp is up with everything we want. Instead of using set-dispatch-macro-character we use set-macro-characterthe major difference being that this Reader Macro is just one character, so we pass it a function with only two arguments, instead of three. There is no subchar.

(set-macro-character #\[
(lambda (stream char)
 (let* ((lst
        (read-delimited-list #\]
     stream t)))
  (short-lambda-split lst))))
The function short-lambda-split does most of the lifting in this Macro. Note, though, that instead of using read we use read-delimited-list and terminate the read on the character #\]. This gives us the body of the short lambda form. short-lambda-split then converts this into the sexp to return to the reader:

(defun short-lambda-split (elements)
(let ((args
(loop while (and
       (not
   (eq (car elements) '->))
       elements)
collect
(prog1
   (car elements)
   (setf elements (cdr elements))))))
(if (not elements)
; There is no arg list
  `(lambda (x) ,@args)
  `(lambda (,@args) ,@(cdr elements)))))
This function basically takes the list, finds the -> symbol which separates the argument list from the body and then constructs the lambda expression. If there is no -> the function assumes one argument, named x and that the entire list passed in represents the body of the function. We can write an arity one short lambda like [(+ x 1)] without specifying the lambda argument list. Before we can use the short-lambda reader macro we have to do something about the #\] characters polluting the input stream or they will confuse the reader. The solution is to tell lisp to treat them like #\) (closing parenthesis) characters so that it won't complain (at least I believe this is what is going on). We do that this way:

(set-macro-character #\] 
  (get-macro-character #\)))
We can now use our new short lambda in conjunction with our vector application operator to write some very concise code for vector operations.

CL-USER> (let ((x #(1 2 3 4 5)))
#v([b -> (^ b 2.0)] x))
#(1.0 4.0 9.0 16.0 25.0)
CL-USER>
A word of caution: Macros are always tricky beasts, and extending Lisp this way may only seem like a good idea. In practice, it introduces all sorts of tricky issues. In addition to the regular issues of macro expansion, most of which apply more or less to reader macros, adding new forms to the language confuses tools which might be trying to understand Lisp syntax, like Emacs. I also find that my Lisp (SBCL) can sometimes return confusing errors when reader Macros are in use. So use this information wisely. I nevertheless hope this little tutorial has helped clarify a very interesting ability of Common Lisp. Using these macros, vectorized numerical code is much shorter and easier to understand. I used these macros to write some code for a class I TA where Matlab is the language of choice. Most of the time, these simple extensions put the Common Lisp implementation on the same footing, brevity-wise, as Matlab and I got to program in Lisp, which is a lot more fun. Comments and questions welcome. PS - I apologize for any confusing indentation - Blogger is being very disrespectful with my pre and code tags.

Sunday, March 2, 2008

Suburban Locale By Night

To walk at night in a suburb is to see the mundane transformed. While I ordinarily eschew the idea that even the most ordinary of things can be almost mythical under the right eye, the suburb transformed by night may be an exception. It is true, that some elements of the experience, such as the indistinct faces peering up from the shadows away from the streetlights (stumps, sandbags, fire hydrants), the flickering willow wisp floating along a distant lot (a woman, walking her dog with a flash light) , the occasional sound of a twig snapping, are the exclusive effect of the world shrouded in darkness, and might be experience in nearly one form or another anywhere. Yet the suburbs offer stranger visions to the late night wanderer. On a cool night, the air is clear, free from the haze of summer, and the living rooms, many of them lit either fully or with the dim glow of night lights, seem surreal in their immediacy, their existence somehow undiminished by the intervening distance to the sidewalk. Objects within are distinct, and appear almost transfixed - contextualized yet frighteningly devoid of meaning. One living room was filled with balloons, hanging motionless like ghosts - guests at a party for no one. A trick of perspective conspires with the neutral color of the ceiling to make it seem as though they float, in mid air. Coat racks become, in the low light, lovers in embrace, or lonely men, leaning in sorrow against the wall. Those living rooms which have been left fully lighted by the onus of their owners present a subtly different kind of imagery. Mundane decorative flairs take on the aspect of consuming obsessions in the unforgiving clarity of the bay windows. A room with an entire wall of clocks, of every shape and size, all their faces pointing at the window, speaks of a ragged mania to the night outside. The legs of couches like the claws of animals, along with brass and stone statuary, grimace in another room. In what might be relaxing during the day one perceives an almost living tension. There is the fear that if these rooms are left too long empty, maddeningly lit, their manic aspect might cease to vanish with the day, and leak out into the world you stand in. One finds himself rounding the top of a hill, upon which the most expensive houses were perched. The street, well lit and untrafficked, is surrounded on all sides by carefully manicured lawns, lightly humming street lights, recessed lighting on brick paths winding to front doors. Then this imminently human world ends abruptly, and one is suddenly faced with a stand of trees or a forest, it is impossible to gauge the depth of the woods - beyond the first few trees the world is engulfed in darkness. One must resist the voyeuristic urges which it is impossible to avoid on these walks. Living rooms, often abandoned, are flanked by lit windows, open or covered tantalizingly with blinds or curtains. The motion of humans within is rare, late at night. More often you see the strange blue shifting light produced only by televisions. Occasionally a bedroom window reveals a face nude of expression, staring at a computer screen - the most disturbing and least erotic kind of nakedness. From outside, on the street, moving at a regular pace, the dream like nature of television, that window inside windows, becomes shockingly apparent. The images move silently on the screen with an almost terrifying disregard for continuity - and removed from sense they flit in and out of your consciousness, unable to be held. And like dreams, they vanish as soon as you pass the window showing them, waking and walking again into the night.