Monday, March 26, 2012

Update to Shadchen-el and question for any Users

Shadchen is my pattern matching library, simultaneously developed in Emacs and Common Lisp. I've updated quite a few things in the Emacs Version which will probably be of interest. But I also want to get some Emacs User feedback about the library.

New Features

Patterns

There are a few new patterns in the library:

(one-of PATTERN)

Will match the first element in a list which matches PATTERN.

(list% P1 ... PN)

Will match a listt of N elements, but without concern for order. All N patterns must match a unique element, but they can do so in any order. When a pattern is matched, the value is removed from the list that subsquent patterns see.

Similarly:

(list% P1 ... PN)

Matches N patterns from a list the match will succeed even if all elements from the list are not matched. This is perfect for matching an association list, where associations are operative but order is unimportant.

Eg:

(match '((:x . 10) (:y . 11))
 ((list%+ (cons :y y-value)) y-value))

Will match to 11, regardless of the order of associations in the list or how many there are.

Forms

There is one new form: match-let behaves as a let expression with two distinctions. Each place a symbol to bind can appear in a let a full Shadchen pattern can be used instead.

The second feature is that within the body of a match-let a call to recur in a tail position will result in a re-entry into the match-let expression with the new values provided. So you can write simple, recursive iterations using match-let:

(match-let 
 (((list x y) (list 0 0)))
 (if (< (+ x y) 100)
     (recur (list (+ x 1) (+ y x)))
   (list x y)))

Note that recur expects as many arguments as there are binding expressions in the let. Failure to provide them will result in a match-fail.

Questions for any Users

Question: should match-let ensure at read time that recur only appears in "tail" positions?

Match-let, unlike my library recur, uses a simple, but robust trampoline scheme to implement recursion optimization. That is, recur returns a tagged list (I'm probably going to change this to a struct-type to improve type errors) which an enclosing "trampoline" checks. If the value returned from a match-let body is any value but the specifically tagged list, that value is returned, but if the value is the tagged list, then the match-let body is re-entered.

Recur, in this case, is a local function furnished by labels. It simply returns the tagged list. That means it can be called anywhere in the body of the match-let expression. This, however, will almost certainly fail to provide a meaningful behavior.

A codewalker can check whether recur is used in the correct locations, but only at some computational expense and only after the body has been macroexpanded. This may result in subtle bugs and performance deterioration. One the other hand, the result will be more robust, especially for users not entirely clear on what "tail" position means.

Should I implement static checking for tail position?

2 comments:

Ryan said...

As long as the check happens when the macro is expanded, there should only be a one-time performance penalty, and if the match-let form is compiled, the check would happen at compile time and have no runtime penalty at all, right?

J.V. Toups said...

If you byte-compile the file, then yes, although this doesn't resolve the issue of mucking about with the order of macro expansion, which may or may not be significant.

But in regular elisp, rather shockingly, macro expansion happens repeatedly every time a form is evaluated, even if its already been read. That is, if you use a macro in a loop, it will be expanded as many times as the loop executes, or so I've been given to understand.