Tuesday, May 22, 2012

Parenlab Updates!

Nothing makes software grow like using it every day. I've been doing all my Matlab coding in Parenlab (read about it here and get it here) for the past few weeks, and the code base has developed significantly. Now that there is a functioning "require" system, it's time to document some of the developments.

setq extensions

Matlab is not very functional, whereas Lisp is. In Lisp I've come to see the use of setq as a code smell, but it's hard to avoid in Matlab. In fact, lots of my Matlab code involves long stretches of assignment. Parenlab lets me escape some of this by allowing more functional idioms, but I decided it would be best to add better support for assignment. To that end, setq can now be written as either:

(setq x 10)

or

(:= x 10)

And can contain an arbitrary number of name/value pairs, eg:

(:= x 10 y 11 z 13)

Setting occurs sequentially, from left to right. Matlab provides the deal function for parallel setting, eg:

(:= [x y z] (deal 10 11 12))

This sets x, y, and z "at the same time" with respect to the evaluation environment.
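The difference only matters when a value refers to a name set in the same form. As a rough analogy in Python (this is not Parenlab output, and the Parenlab forms in the comments are extrapolated from the examples above rather than tested), consecutive statements behave like the sequential := and tuple assignment behaves like the deal version:

```python
# Analogy only: Python stand-ins for sequential vs. parallel setting.

def sequential_swap_attempt(x, y):
    # Roughly like (:= x y y x): each assignment sees the one before it.
    x = y
    y = x  # x was already overwritten, so y ends up unchanged
    return x, y


def parallel_swap(x, y):
    # Roughly like (:= [x y] (deal y x)): both values are computed first.
    x, y = y, x
    return x, y


print(sequential_swap_attempt(1, 2))  # (2, 2)
print(parallel_swap(1, 2))            # (2, 1)
```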

Implicit Single Return Value Functions

Matlab highly encourages functions with multiple return values, and part of that idiom is forcing all returned values from a function, even if there is only one, to have a name. The function interface definition on the first line of a function file, eg:

function [a,b,c] = someFunction(q,r,s)

specifies the names of the output arguments (a, b, and c, in this case), and setting their values in the function body determines what is returned.

In parenlab, therefore, one writes defuns like this:

(defun (a b c) some-function (q r s) ...)

And in the body, uses setq to assign return values. This is fine, but feels a bit odd coming from Lisp. So now, if you write:

(defun some-function (a b c) ...)

and the last expression in the body of the function is a value-producing expression, rather than a statement, parenlab automatically handles assigning a single return value, the last value in the body. That last qualifier about value-producing expressions is a bit of a drag, since many Matlab and parenlab forms don't produce values, but I am working on resolutions to this issue. Of which more later!

Nested function definitions

At the top level, defun creates a new function in a new .m file, which is the idiom for defining functions in Matlab. Inside functions, however, one can define internal functions using the same defun syntax. These internal functions are not subject to the restrictions of regular anonymous functions: they can have full bodies and side-effect the lexically scoped environments they are created in. Parenlab now supports such nested function definitions and, via this feature, much more functional anonymous functions in this context.

For instance, using the single-return value feature above and this new capability, the following code:

(defun demo-nested (a)
 (defun add-a (b)
  (+ a b)))

transcodes to:

function [o111218] = demoNested(a)
%
o111218 = @addA;

function [o111219] = addA(b)
%
  o111219 = plus(a, b);

  end

end

Inside a function definition, a defun can appear anywhere. It is transcoded as a reference to the function being defined, whose body is emitted at the bottom of the enclosing function, where it can capture the lexical scope. Anonymous functions inside other functions are expanded into defuns rather than @-style anonymous functions, eg we could write instead:

(defun demo-nested (a)
  (lambda (b)
    (+ a b)))

And get the same effect. Parenlab will assign a gensymed name to internal anonymous functions.

Better let, let* and progn support inside functions

Parenlab used to always expand let and its brethren into Matlab lambda (@) expressions, whose bodies can contain only a single value-producing expression and which capture a static copy of their lexical scope, which rules out side effects. Inside functions this is no longer the case: lambda, and all the forms that depend on it, like let and let*, now expand into calls to full nested functions, which means they can have arbitrary bodies with side effects on their lexical scope.
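The two capture behaviours can be sketched in Python as an analogy. This is not Parenlab's generated code. The default-argument trick stands in for the way an @ expression freezes the values it captures, and the nested function with nonlocal stands in for a full nested function sharing its parent's scope. All names here are made up for illustration.

```python
# Analogy only: two ways a function can see its enclosing scope.

def snapshot_style():
    a = 1
    # The default argument copies the value of `a` when the lambda is created,
    # so later changes to `a` are invisible to add_a.
    add_a = lambda b, a=a: a + b
    a = 100
    return add_a(1)


def shared_scope_style():
    a = 1

    def bump():
        # A full nested function can have a multi-statement body and can
        # modify variables that belong to the enclosing function.
        nonlocal a
        a = a + 1
        return a

    bump()
    bump()
    return a


print(snapshot_style())      # 2
print(shared_scope_style())  # 3
```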

To remind people of this distinction, lest they be surprised that this doesn't work at top level, the bodies of lambda expressions and lets still require a progn to express the intent for multiple expressions. Matlab's somewhat onerous requirement that a value-bearing expression be used anywhere a value is needed still applies, however. You can't say:

(lambda (a)
  (setq b a))

Because setq doesn't have a return value. This turns out not to be something you frequently want to do, but bear it in mind.

require expressions

Parenlab now allows you to express simple project dependencies via require. For instance, the expression:

(require
"~/src/elisp/parenlab/monadic-parser-combinators.parenlab")

causes parenlab to transcode the contents of the specified file and add its location to the Matlab path. The file is not executed, however. Parenlab maintains a dictionary of hashes for each required file and only recompiles a file when it's been changed since the last invocation of require on it. This means that require statements are generally cheap, as compilation is only induced when needed.
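The caching idea can be sketched in a few lines of Python. This is only an illustration of hash-based recompilation, not Parenlab's Emacs Lisp implementation: the names require, file_hash, and compile_fn are hypothetical, the choice of SHA-256 is an assumption, and the step that adds the file's location to the Matlab path is left out.

```python
import hashlib
import os

# Hypothetical in-memory cache: file path -> hash of the contents last compiled.
_compiled_hashes = {}


def file_hash(path):
    """Return a hex digest of the file's current contents."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()


def require(path, compile_fn):
    """Call compile_fn(path) only if the file changed since the last require.

    Returns True if compilation ran, False if the cached hash still matched.
    """
    path = os.path.abspath(os.path.expanduser(path))
    current = file_hash(path)
    if _compiled_hashes.get(path) == current:
        return False  # unchanged since last time, so nothing to do
    compile_fn(path)
    _compiled_hashes[path] = current
    return True
```

Calling require twice on an unchanged file runs compile_fn once; editing the file between calls triggers a second compile.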

direct execution of Emacs Lisp

Parenlab macros are defined in Emacs Lisp, rather than in Parenlab itself. Complex macros often require a library of utility functions to do their work. Parenlab lets you define these Elisp functions inline via the elisp form. elisp executes its body in Emacs Lisp during compilation, so you can write:

(elisp 
  (defun valid-bindingp (o)
    (and (listp o)
         (= 2 (length o))
         (symbolp (car o)))))

(defmacro my-with (binding &rest body)
   (assert (valid-bindingp binding)
           () "Binding must be a 2 el list whose car is symbol.")
   `(funcall 
      (lambda 
        (,(car binding)) (progn ,@body)) ,(cadr binding)))

Stupid Language Tricks:

Keywords, which ordinarily transcode to strings, now behave specially if they are in the function position of an application or in a (function :x) form. When this happens, they are transcoded to struct access expressions. So, you can say:

(:x (struct :x 10)) -> 10

Or

(funcall #':x (struct :x 10)) -> 10

I admit, this is a silly feature, but remarkably convenient.
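Part of the convenience is that a field accessor becomes an ordinary function value that can be passed to other functions. A loose Python analogy, using the standard library's operator.itemgetter and dicts in place of Matlab structs (neither is part of Parenlab):

```python
from operator import itemgetter

# get_x plays the role of #':x, and a dict plays the role of a struct.
get_x = itemgetter("x")

record = {"x": 10}
print(get_x(record))  # 10

# Because the accessor is a plain function, it can be mapped over a collection.
xs = list(map(get_x, [{"x": 1}, {"x": 2}]))
print(xs)  # [1, 2]
```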

Mostly arbitrary expressions in the function position:

If something other than a symbol or keyword is encountered in the function position of an expression, it is evaluated and its value is used as a function, which you can't do in Matlab. So:

((lambda (x) (+ x 1)) 10) -> 11

All Sorts of Awesome Standard Macros

There are lots of iteration-related macros, for instance forcell:

(forcell (index value) cell-array 
         ...)

This takes some of the pain out of iterating over cell arrays with Matlab's for loop, which by default gives you a one-element cell array for each value in the array. Poke around in the code for other iteration expressions that make life more pleasant.

The macro capture collects the current environment into a struct and returns it as a value. This is handy for debugging functions. There is also now support for try and catch.

Extended standard library

Lots of handy little functions, such as directory-files.

Further refinements to parenlab-aux.el

This is still specific to my Matlab setup, but parenlab-aux now does a better job of simulating the Lisp experience. For instance, parenlab-eval-last-sexp now makes an honest attempt to print the result to the minibuffer if that makes sense. If anyone actually wants to work with parenlab, contact me and I'll help set things up.

Monadic Parser Combinators

Part of my long-term plan is to eventually make parenlab self-hosting, the largest obstacle to which is a good Lisp reader. I like writing parsers in a monadic style, so now parenlab comes with a parser combinator library, which you can use by saying:

(require
"~/src/elisp/parenlab/monadic-parser-combinators.parenlab")

This builds the library and adds it to the path. You can then write a simple vector parser like this:

(:= =vector
     (parser ((ignore (=>string "["))
              (numbers (=>zero-or-more #'=number))
              (ignore (=>string "]")))
         (cell2mat numbers)))

There is rudimentary support for parsing a subset of Lisp via the =sexpression parser in this library. S-expressions are parsed into nested cell arrays. There isn't support for quotation or sharp quotation yet, and the error messages aren't useful. I might eventually move to a different monad for better error reporting.
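For readers who haven't met the monadic style, here is a minimal sketch of the idea in Python. It is not the Parenlab library. The names unit, bind, string, zero_or_more, and number are hypothetical stand-ins for whatever the library provides, and whitespace-separated numbers are an assumption. A parser is a function from an input string to either a (value, remaining input) pair or None, and bind sequences parsers the way the parser form above does.

```python
import re

# A parser is a function: input string -> (value, remaining input), or None on failure.


def unit(value):
    """Parser that consumes nothing and succeeds with value."""
    return lambda s: (value, s)


def bind(parser, f):
    """Run parser, then pass its value to f to obtain the next parser."""
    def bound(s):
        result = parser(s)
        if result is None:
            return None
        value, rest = result
        return f(value)(rest)
    return bound


def string(expected):
    """Parser that matches a literal string."""
    def p(s):
        if s.startswith(expected):
            return (expected, s[len(expected):])
        return None
    return p


def zero_or_more(parser):
    """Parser that applies parser repeatedly and collects the values in a list."""
    def p(s):
        values = []
        while True:
            result = parser(s)
            # Stop on failure, or if the parser succeeded without consuming input.
            if result is None or result[1] == s:
                return (values, s)
            value, s = result
            values.append(value)
    return p


_NUMBER = re.compile(r"\s*(-?\d+(?:\.\d+)?)\s*")


def number(s):
    """Parser for one number, skipping surrounding whitespace."""
    m = _NUMBER.match(s)
    if m is None:
        return None
    return (float(m.group(1)), s[m.end():])


# Same shape as =vector: "[", then numbers, then "]", returning the numbers.
vector = bind(string("["), lambda _:
         bind(zero_or_more(number), lambda numbers:
         bind(string("]"), lambda _:
         unit(numbers))))

print(vector("[1 2 3.5]"))  # ([1.0, 2.0, 3.5], '')
print(vector("[1 2"))       # None
```

Returning a bare None on failure also shows why error messages are poor in this style: the failure carries no information about where or why parsing stopped.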

Conclusions

Parenlab is definitely a usable Lisp, for me: I'm already a lot more productive with it than I would be in plain Matlab, and I haven't run into any performance issues, though your mileage will vary if you use this with Octave. Please let me know if anyone starts using the library. Criticism, suggestions and contributions are more than welcome!


1 comment:

Isaiah said...

http://docs.julialang.org/en/latest/manual/metaprogramming/

JIT'd Lisp in matlab clothing, built on LLVM.