var link = document.getElementById("logo-link");link.setAttribute("href","https://www.google.com/reader/view/#overview-page");Soon to be replaced with an Elaborate Platinum Mechanism. Ok, but seriously the blog is about programming and technology.
var link = document.getElementById("logo-link");link.setAttribute("href","https://www.google.com/reader/view/#overview-page");I am two people - one of them is a numerical scientist who spends most of his time writing code in Matlab to do data analysis. The other is a freelance Lisp programmer, where I do regular software engineering. I've long been a Lisp enthusiast, and I do a fair amount of Lisp programming in Emacs Lisp as part of my scientific work, since Emacs is my Matlab IDE, but I didn't start really programming in Lisp for "real things" until recently. As the number of hours dedicated to Lisp increased in my life, I began to really miss the features in Lisp that I didn't have in Matlab; in particular s-expression motion and editing and metaprogramming.
I can't abandon Matlab, however. Matlab's support for numerical programming is exceptional, true, but the real value is that its plotting system is rich, well integrated and customizable. Plus, I have 7+ years of Matlab utilities which it would take significant time to reproduce in another numerical analysis language, even if it was feature complete otherwise.
So I brought the mountain to Muhammad.
Parenlab is a Lisp which compiles to Matlab by way of Emacs Lisp. What this means is that the syntax is Emacs Lisp and the Semantics is Matlab. Metaprogramming is possible via macros which transform the elisp representation of Matlab code before the code is translated to Matlab. Because I use Emacs as my Matlab IDE, this is a reasonable solution - just use the Emacs Lisp interpreter as the macro language. One day, Parenlab may be self-hosting.
The parenlab code:
(let* ((x (range 0 (* 2 pi) 100))
(y (sin x)))
(plot x y :color "g"))
Is converted to:
funcall(...
@(x)funcall(@(y)plot(x, y, 'color', 'g'), sin(x)),...
range(0, mtimes(2, pi), 100))
Or:
(defun (a b c) some-function (a b c)
"Documentation."
(setq a (++ a b c))
(setq b (++ a b c))
(setq c (++ a b c)))
Is converted to a file, called someFunction.m which contains:
function [a,b,c] = someFunction(a,b,c)
%Documentation.
'Documentation.';
a = plusplus(a, b, c);
b = plusplus(a, b, c);
c = plusplus(a, b, c);
defuns can be sprinkled throughout your script files - they will be
transcoded as they are encountered and produce no output in the
currently building function. One can also say:
(script some-script-name
<parenlab-code>)
Which indicates the code beneath should be transcoded to a script file rather than "in place."
Parenlab is still a work in progress, but it is rapidly approaching the point where any functionality you wish can be expressed in Parenlab.
Parenlab, like Parenscript, tries to make life easy by letting you write lisp-style identifiers. In doing so, it mangles names during translation. Eg, a parenlab symbol:
some-function
Will be rendered as:
someFunction
eg, dashes followed by letters are converted to camel case. Other "special" characters are transcoded according to the following table:
("+" "plus")
("-" "minus")
("*" "mtimes")
("<" "lessThan")
(">" "greaterThan")
("$" "cash")
("=" "equal")
("!" "bang")
("?" "who")
(":" ":")
("/" "divide")
("\\" "mdivide")
("#" "hash")
("@" "at"))
These choices are made to improve transcoding, so that many operators
don't need special cases. Eg (* 10 10) translates to
mtimes(10,10) - conveniently using Matlab's built in mtimes
function, which is equivalent to Matlab's * operator.
One wrinkle is that : is reserved to preserve simple matrix
construction expressions. For instance,
x:some-variable:y
transcodes to
x:someVariable:y
This syntax is limited by how the lisp reader reads symbols. For
complex generation expressions, use the (: ) macro. Eg:
(: start step stop)
Keywords, that is symbols starting with :, are transcoded to mangled
strings, so that :a-keyword becomes 'aKeyword'.
Since many matlab functions use strings as keyword arguments, this lets you use keywords for them instead.
nil transcodes to []. There are no true and false values in
Matlab.
Scope in Matlab is funny - only limited lambdas are allowed in scripts and functions, although nested functions with full definitions and full lexical scope behavior are allowed within functions (though not anonymously).
Sort of like in Python, lambdas are restricted to
single-expression-only bodies. While they do form closures over
their lexical environment, those closures are static, that is, they
do not permit side effects of any kind.
That is,
(setq y 10)
(setq f
(lambda (x)
(setq y x)))
Will compile fine in Parenlab, but produce an error, because the inner
setq is not an expression, but a statement, which isn't allowed.
Parenlab does provide a progn form, so you might try:
(setq y 10)
(setq f
(lambda (x)
(progn (setq y 10)
nil)))
This will generate Matlab code which uses eval to produce code which
expresses this intent, but you'll still get an error, saying that the
lambda tries to modify its static environment, which is not allowed.
let and let* are implemented using lambda, so they create static
lexical environments. It is conceivable that major cross compilation
tricks could be used to simulate dynamic scopes, but I'd prefer to
keep things simple.
You shouldn't be using side effects anyway.
The rule of thumb is write Matlab code with s-expressions, not Lisp code that calls Matlab functions. Parenlab tries to bridge the gap but in some ways Matlab is too limited.
If in Matlab is "flat", in that each branch is a series of statements or expressions and the if statement itself doesn't return a value. Parenlab if statements are not flat - they return the value of whatever branch is evaluated.
This rule is sometimes inconveneint for the Matlab idiom, so it can be
disabled if the branches are blocks. A statement of the form:
(if condition
(block
<some-code>)
(block
<some-other-code>))
Transcodes to:
if condition
some code
else
some other code
end
Ordinarily, if expands to
fif(@()condition, @()true-code, @()false-code)
Here we use the trick of using lambda to delay evaluation. This if
expansion cannot perform side effects, unfortunately. Use (block)
legs if you want regular matlab semantics. Both legs must be blocks.
Indexing is identical function calls, syntactically, so the index expression:
x(1:10)
Is written as
(x 1:10)
In parenlab. Matlab also has cell array indexing, which looks like this:
x{1:10}
Which is expressed like this in Parenlab:
({} x 1:10)
These are macros, not functions, and so the use of end as an
identifier inside them is fine, eg:
({} x 1:end)
Will convert to:
x{1:end}
Structure access can be written as symbol.name since the parenlab
mangler leaves dots in symbol names. Programmatic access, where a
string is used, is written with ->, eg:
(-> s :field)
Which is equivalent to s.('field'). -> supports nested access, so
you can say;
(-> s :f1 :f2 :f3)
Which is like this: s.('f1').('f2').('f3') which may not actually
be valid Matlab syntax. Parenlab implements this as a function call.
In elisp, parenlab macros can be defined with pl:def-pl-macro.
During transcoding, the form defmacro causes a new parenlab macro to
be defined and generates no output. Parenlab macros have the same
semantics as regular Macros. You get syntax passed to the macro as
arguments, you transform it into valid parenlab, and you return the
result.
For instance, a macro which introduces a lexical variable binding
called with, like this:
(with x 10 (* 2 x)) ;-> 20
Is implemented like this:
(pl:def-pl-macro with (symbol value expression)
`(funcall (lambda (,symbol) ,expression) ,value))
Or, inside some Parenlab code:
(defmacro with (symbol value expression)
`(funcall (lambda (,symbol) ,expression) ,value))
Right now you use parenlab by invoking pl:transcode on an
s-expression. This will transcode in the current buffer.
pl:transcode-to-string will transcode to a string instead.
Parenlab requires matlab-mode so that it can indent its outputted
code correctly.
I have a highly idiosyncratic Matlab setup, so I put some integration
code in auxilliary.el. If you have a Matlab process running in a
buffer called *evalshell*, then this code will let you load a
parenlab file and press C-c C-c to "compile" and execute the code
therein. This process will write any script and defun forms to files
before executing the code. C-x C-e will evaluate the last
s-expression as Matlab code in the interpreter.
Parenlab depends on Shadchen-el my pattern matching library.
Lately I've been working on an Emacs Lisp library I'm pretty proud of,
Shadchen. It implements extensible pattern matching, somewhat
like Racket's Match facility for Emacs Lisp. In this
tutorial/introduction I'll explain how to use Shadchen's various
facilities and, most importantly, how to extend Shadchen itself with
new patterns, which is an interesting subject in and of itself,
combining compile time and run-time execution in interesting ways. If
you already know about pattern matching, feel free to jump to the end,
where I talk about writing non-trivial patterns using Shadchen's
defpattern.
Shadchen solves several problems you may not know you may have. In
one sentence, Shadchen lets you concisely express both destructuring
and type checking for complex data structures. It can be thought of
as combining the features of cond, case and assert into one nice
package. My experience is that this collection of features helps me
write better code, since it encourages me to dilineate exactly the
kind of data a function or form expects before doing anything with
it.
Shadchen is also conservative - unless you provide a pattern that matches the input data, it will fail with a match error, so you know something is wrong before something strange happens.
For instance, suppose we were writing an interpreter for Lisp. It might look like:
(defun eval (form env)
(cond form
((symbolp form) (eval-symbol form env))
((listp form)
(case (car form)
(if (handle-if (cadr form)
(caddr form)
(caddr form) env))
(let (handle-let
(cadr form)
(cdddr form)))
...))))
Note that we have both a cond and a case here, and that after we
test our data, we destructure it. Here is a similar piece of code
using Shadchen:
(defun eval (form env)
(match form
((p #'symbolp s) (eval-symbol s env))
((list) nil)
((list 'if pred true-branch false-branch)
(handle-if pred true-branch false-branch env))
((list-rest 'let (list-rest pairs) body)
(handle-let pairs body env))
...))
(Notes: the pattern p passes when the predicate as its second
argument is true on the match value, and then matches against the
third argument. So (p #'symbolp s) matches only when form is a
symbol and then binds s to that symbol. list matches when the
input is a list and each pattern in the the list pattern matches each
corresponding element in the list. list-rest is similar but any
leftover parts of the list are matched against the final pattern.)
This code is more concise, and yet it is also much more explicit, both
in that it provides better naming for values and it provides more
explicit error checking. For instance, this version will only match
if with three expressions, where as the previous evaluator would
have been fine with the expression (if a b c d e f). This version
also asserts explicitly that the binding part of the let needs to be a
list. With a custom pattern we could also ensure it was a list of
symbol/expression pairs in almost the same space.
It takes all kinds, but I found that once I got used to programming with pattern matching, it was hard to go back.
Shadchen wants to let you program in a functional style. To that end,
in addition to the regular pattern matching form match, it also
provides some other, nice features. Many algorithms involve examining
some intermediate data, checking its structure somehow, and then
recursively processing the next step. Shadchen allows this kind of
thing with the Scheme-flavored match-let form.
The form match-let can be used exactly like let:
(match-let
((x 10)
(y 11))
(+ x y))
But in each "binding" pair, the symbol may be replaced with any Shadchen pattern. Eg:
(match-let
((x 10)
(y 11)
((list q r s) (list 1 2 3)))
(+ x y q r s))
Will give you 27. If any pattern fails, the form produces a match
fail error, which means you can use match-let as a let form with
tidy type checking.
Finally, a match-let form allows tail recursion. Invoking recur
in a tail position inside the form causes the match-let to be
re-entered without growing the stack. For instance:
(match-let
(((list x y) (list 0 0)))
(if (< (+ x y) 10000)
(recur (list (+ x 1) (+ x y)))
(list x y)))
Results in (141 9870) and can't blow the stack. It is an error to
invoke recur in a non-tail position, but because of limitations in
Emacs Lisp, it is difficult to enforce this statically.
The form defun-match lets you write functions which pattern match on
their arguments and split their calculations across multiple bodies,
in a bit like the style of Shen or Haskell.
For instance, suppose we have an animal simulator, where each animal is represented by a list, the first element of which is a symbol representing the animal name. We can say:
(defun-match- vocalize ((list-rest 'cat properties))
"Cat vocalization."
(message "Meow"))
(defun-match vocalize ((list-rest 'dog properties))
"Dog vocalization."
(message "Woof"))
Then:
(vocalize '(cat :name tess))
(vocalize '(dog :name bowzer))
Functions defined with defun-match can also use recur to re-enter
themselves without growing the stack. Consider a function which
causes a list of animals to vocalize:
(defun-match- vocalize-list (nil) nil)
(defun-match vocalize-list ((cons animal animals))
(vocalize animal)
(recur animals))
recur can dispatch to any of the bodies defined for the function and
it doesn't grow the stack. It must be invoked from tail position,
though non-tail calls can be affected by simply calling the function.
(N.B. defun-match- with that dangling minus sign causes previous
bodies to be expunged before defining the indicated body.)
Shadchen is an extensible pattern matching facility. We can define
new patterns much in the way we define new functions, although
patterns are more like macros than functions. Let's look at a simple
example, and then I'll guide you through a more complex example I just
added to the library using the defpattern.
A quirk of Common and Emacs Lisp is that (car nil) is nil even
though nil is not a cons cell, and so does not have a car or a
cdr. I hate this behavior, because its quite evident that (cons
nil some-list) is different from nil, but car can't tell that -
the user has to do more inspection to find this out. Bugs waiting to
happen, let me tell you.
However, I'm nothing if not accommodating, and so the cons pattern in
Shadchen will, in fact, match against nil. So:
(match nil
((cons a b) (list a b)))
Will be '(nil nil). Let's define a pattern which is like cons,
but only matches against actual cons cells, into which category
nil fails to fall.
(defpattern strict-cons (car cdr)
`(p #'consp (cons ,car ,cdr)))
A defpattern body must evaluate to a legal shadchen pattern. Each
argument to the defpattern is also a shadchen pattern. So this
pattern reads "define a new pattern strict-cons, which first checks
that the match value is a cons cell using the p pattern, and then
matches the car and cdr of that cons cell against the patterns car
and cdr.".
During the expansion of a shadchen pattern matching form, user defined
patterns are looked up and their expansions are inserted into the
macro expansion. In short, defpattern allows you to define new
patterns in terms of old patterns.
This might seem very restrictive, but Shadchen provides primitive patterns that allow you to write arbitrarily complex pattern matchers that can perform rich computations on their way to rejecting or accepting a match.
concat, a non-trivial patternI just used defpattern to implement a pretty complex pattern,
concat and it was something of a learning experience. Writing
complex patterns definitely takes some thought and practice, but
hopefully this tutorial will bootstrap users to a point where their
own patterns can be implemented without too much pain.
What is so complicated about a concat pattern? Well, we want
concat to match the concatenation of patterns which match strings.
Eg:
(concat "dog" "cat")
Should match "dogcat". Writing a pattern that has this behavior is easy:
(defpattern concat (&rest strings)
(reduce #'concat strings))
This pattern can't match subpatterns that are anything other than strings, however. We'd really like to be able to match, for instance:
(concat (and (or "dog" "cat") which) "dog")
against either "dogdog" or "catdog", binding which to whatever the
initial string contents actually are. How can we do this?
concat's semantics.We want concat to function this way:
If the initial pattern is not a string, then try matching that pattern
against larger and larger substrings until either you run out of
string to match against, or you match. If you match, then match,
again using concat with the unused patterns, against whatever is
left of the string after you've removed the part that matched.
Repeat until all patterns are exhausted and then make sure the string
has been completely consumed too.
If the initial pattern is a string, then just cleave off the same length of characters from the input, and if they match, recursively match the rest. Here is the entry point:
(defpattern concat (&rest patterns)
(cond
((length=0 patterns)
"")
((length=1 patterns)
`(? #'stringp ,(car patterns)))
(:otherwise
(cond
((stringp (car patterns))
`(simple-concat ,@patterns))
(:otherwise
`(full-concat 0 ,@patterns))))))
The :otherwise has all the meat, but we defer it to to other
helper-patterns; simple-concat and full-concat. Simple concat
looks like this:
(defpattern simple-concat (&rest patterns)
(cond
((length=0 patterns)
"")
((length=1 patterns)
`(? #'stringp ,(car patterns)))
(:otherwise
(let* ((the-string (car patterns))
(static-len (length the-string)))
`(and
(p #'stringp)
(p (lambda (s)
(>= (length s) ,static-len)))
(p
(lambda (s)
(string= (substring s 0 ,static-len) ,the-string)))
(funcall (lambda (s)
(substring s ,static-len))
(concat ,@(cdr patterns))))))))
Look at the backquoted expression. It is an and pattern, which only
succeeds if all the patterns beneath it also succeed. These patterns
are (p #'stringp), which asserts that the input is a string, (p
(lambda (s) (string= (substring s 0 ,static-len) ,the-string))) which
asserts that the input is at least long enough to contain the string
we want to match against. The next form asserts that the substring of
the input equal to the pattern string in length is equal to the
pattern. If this is true, then the pattern matches, and we use the
funcall pattern to match against the rest of the string with the
leftover patterns.
The funcall pattern takes the input to the match, applies a
function to it, and then matches the output of that function
application to the pattern provided as its third slot.
full-concat is more complex. Note that when we invoke
full-concat we provide it an numerical first argument. This number
tells the pattern how far into the string to match we've looked, so it
starts at zero. After all the first pattern could match the empty
string. full-concat looks like this2:
(defpattern full-concat (pivot &rest patterns)
(assert (numberp pivot)
()
"Pivot should be a number.")
(cond
((length=0 patterns)
"")
((length=1 patterns)
`(? #'stringp ,(car patterns)))
(:otherwise
`(and
(p (lambda (s)
(>= (length s) ,pivot)))
(or
(and (funcall
(lambda (s)
(substring s 0 ,pivot))
,(car patterns))
(funcall
(lambda (s)
(substring s ,pivot))
(concat ,@(cdr patterns))))
(full-concat ,(+ pivot 1) ,@patterns))))))
Here we use and again. We first check that the input string is long
enough to grab the substring indicated by pivot. If this isn't true,
the match fails. We then use the or pattern to indicate a branch.
Either of the or patterns might succeed, but the first to do so is
the only one that will happen. The first pattern to or uses funcall
to peel off the substring of the input from 0 to pivot. If the
initial pattern matches, then we use funcall again to get the rest of
the string, and invoke concat again.
If this match fails, then we invoke full-concat again, but increment
the pivot by one, indicating that we want to check against a larger
substring.
If this is confusing, and it is understandible if it is, remember the
following: when writing defpatterns, or is used for flow control,
and is used to assert multiple things about the input, p is used
to assert individual arbitrary conditions on the input, and funcall
is used to transform the input for further matching. Recursive
pattern expansion is used for iteration 1.
And feel free to contact me with questions, if they come up.
1 It is a lot like writing prolog, actually. Pattern matching is a significant distance from lisp to prolog.
2 After writing this I realized we can do better. If we get a match for the initial pattern, and then check the rest of the patterns, its possible they will fail because the initial match didn't consume enough of the string. It is simple to say, "if the subsequent match fails, keep increasing the pivot and trying again." I leave it as an exercise to the reader to figure out how to represent this trivial backtrackingish thing - but you can always check the source for the solution.