Wednesday, May 23, 2007

Making A Mess with Unix Pipes

Tarnation is Java ever slow - particularly if you ever need to use it over a network via X Windows. I do a lot of work with Matlab, and I like it, but I never, ever use the built-in GUI or editor. For one thing, I have a happy and long-term relationship with Vim (which is much more flexible than any proprietary editor), but besides that, I do a lot of my work from my laptop away from the office. Running the Matlab GUI on the machine itself might be OK, but you definitely can't get away with running it over a wireless internet connection in Carrboro, NC. So - just for the sake of uniformity - I find myself using Matlab almost exclusively from the command line in text mode:

matlab -nodesktop -nosplash
One nice feature of Matlab is that the interpreter is integrated with the IDE. If you are using vanilla Matlab with a UI, typing edit filename automatically identifies the file you are interested in (by searching the currently active Matlab path) and opens it for you. Anyone reading this already knows how much an exposed read-eval-print loop speeds up development, so it is easy to see how useful this feature is: if you spot something wrong with a file, you just type edit and its name and it opens up.

This does not work on the command line. Typing edit will just open up the GUI editor. You can set up the system to open a text-mode editor instead, but this blocks the interpreter session until you are done, which is not good for incremental development. What I wanted was some way of keeping the interpreter open while I edited a file. One option was to set the built-in editor to open an xterm with pico or vim in it, but this would get messy pretty fast if you wanted to have a lot of files open at once. I did this for a while because I couldn't think of any better solution.

A separate problem ultimately led to an excellent solution. Sometimes I wanted to launch a long job (simulating a 5000 neuron idiot in a hopelessly impoverished universe, for example), leave (maybe even shut down my laptop) and then come back and pick the job up later. The systems administrators suggested the utility GNU Screen. Screen is a terminal multiplexer, which means that you can use it to open several "virtual terminals" inside a screen session. The session itself is attached to one or more real terminals. Invoking screen just gives you a new shell which looks just like your old one, but you can now trigger commands by typing Ctrl-A to do all sorts of new things. Ctrl-A c creates a new terminal in your session. You can switch with Ctrl-A n and Ctrl-A p (next and previous) and kill a terminal with Ctrl-A k, although just exiting the terminal will also do the job. Pressing Ctrl-A : gives you a little command line where you can issue more complex commands.

Executing screen from a command line within a screen session does not create a new session - it just creates a new window running whatever follows the screen invocation. If you were inside screen and you typed screen vim "test" you'd get a new window with Vim editing test inside of it. Finally, pressing Ctrl-A d detaches the session. It hangs out in memory, even if you log out, until you come back and pick it up with screen -r.

Astute readers can see that Screen solves both of my problems. I could log into a computer, start a screen session, launch a job, detach the session, log out, eat dinner, log back on and re-attach the session. Just what I wanted. I eventually realized I could shadow Matlab's built-in edit command so that it would be aware of screen and let me open new Vim windows with it. I copied the edit.m file from the Matlab built-in directories and put it on my path in front of the built-in function. I then added the following lines:

function edit(varargin)
%EDIT Edit M-file.
%   EDIT FUN opens the file FUN.M in a text editor.  FUN must be the
%   name of an M-file or a MATLABPATH relative partial pathname (see
%   PARTIALPATH).
%
%   EDIT FILE.EXT opens the specified file.  MAT and MDL files will
%   only be opened if the extension is specified.  P and MEX files
%   are binary and cannot be directly edited.
%
%   EDIT X Y Z ... will attempt to open all specified files in an
%   editor.  Each argument is treated independently.
%
%   EDIT, by itself, opens up a new editor window.
%
%   By default, the MATLAB built-in editor is used.  The user may
%   specify a different editor by modifying the Editor/Debugger
%   Preferences.
%
%   If the specified file does not exist and the user is using the
%   MATLAB built-in editor, an empty file may be opened depending on
%   the Editor/Debugger Preferences.  If the user has specified a
%   different editor, the name of the non-existent file will always
%   be passed to the other editor.

%   Copyright 1984-2004 The MathWorks, Inc.
%   $Revision: 1.1.6.7 $  $Date: 2004/02/01 22:02:31 $

if desktop('-inuse') % I add this if
    % MATLAB does a lot of stuff to figure out and format the name of the file
    % associated with the arguments.
else 
    for i=1:nargin
        xedit(varargin{i}); % pass to a utility which launches vim inside screen
    end
end
This checks whether I am operating in the Desktop (I rarely do this, but when I do it seems OK), and if I am not, it dispatches the arguments given to edit to xedit, which looks like this:

function mvim(filename, editor)
% FUNCTION MVIM(FILENAME) launches a VIM window to edit the specified file
% in the current working directory in a new xterm process.

str = which(filename);

if ~exist('editor')
    editor = 'vim';
    opt = '';
else
    editor = 'emacs';
    opt = '-nw';
end
    

if strcmp(str, 'variable') | isempty(str)
    ii = find(filename=='.');
    if isempty(ii)
        str = [filename '.m'];
    else
        str = filename;
    end
    disp(sprintf('Opening %s...', str));
elseif strncmp(str, 'built-in', 8) % which() reports built-ins as 'built-in' (newer releases append the path)
    disp(sprintf('%s is a built-in function...',filename));
    return;
else
    disp(sprintf('Opening %s...', str));
end

ii = find( str == '/' );
if length(ii)>1
    ii = ii(end-1);
    tstr = str(ii:end);
else
    tstr = str;
end
%system(sprintf('xterm -font 8x16 -T "%s: %s" -e %s %s %s &', editor, tstr, editor, opt, str));
system(sprintf('mvim.py %s', str));

This code in turn calls a Python program located on my Unix path which dispatches the command to open vim. Why do this? Because Matlab is bad at handling strings, Python was available and is good at it, and I believe in using the best tool for the job when there isn't any reason not to. This script, in turn, looks like this:

#!/usr/bin/python

import sys
import os

filename = sys.argv[1];

print 'Editing ', filename
os.system('screen -t "vim:' + filename + '" vim "+syntax on" "+SpellAutoDisable " ' + filename);
os.system('screen -X windowlist');
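One weakness of gluing the command together as a single string is that a filename containing spaces or quotes gets mangled by the shell. Here is a minimal sketch of the same idea in Python 3 that hands screen an argument list instead. The function names screen_command and open_in_screen are made up for illustration, and the "+SpellAutoDisable" plugin option is left out; only the screen -t, screen -X windowlist and vim "+syntax on" pieces come from the script above.

```python
import subprocess


def screen_command(filename, editor="vim"):
    """Build the argument list that opens `filename` in a new screen window.

    Passing a list (rather than one shell string) means spaces or quotes
    in the filename need no escaping.
    """
    title = "%s:%s" % (editor, filename)   # window title, as in the original script
    return ["screen", "-t", title, editor, "+syntax on", filename]


def open_in_screen(filename):
    """Open the file in a new window, then show the window list.

    Only meaningful when called from inside a running screen session.
    """
    subprocess.run(screen_command(filename), check=True)
    subprocess.run(["screen", "-X", "windowlist"], check=True)
```

From inside a screen session, open_in_screen("my model.m") would open the file in a new window titled "vim:my model.m", which the string-concatenating version only gets right as long as the name contains no quote characters.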

This script just launches vim with the filename passed as a parameter, turns syntax highlighting on, and displays a list of the windows so I can quickly navigate to the file I just opened.

This system was almost everything I wanted for a few years. I could edit from anywhere in the world, even over poor connections, and I had access to all my terminals with a few keystrokes and they stayed put when I logged out. I wrote a little script which launched Matlab in a screen session for when I arrived in the morning and got to work.

BUT there was one feature I wanted to replicate from the GUI which I could never quite get to work. In the Matlab GUI you can highlight a portion of your source file and press a key to evaluate it in the running interpreter. My boss uses this to great effect. He almost never touches the interpreter directly: instead he keeps a scratch file and edits, evaluates and deletes as he goes.

The equivalent procedure in screen is somewhat tedious. Ctrl-A [ opens copy mode. In this mode you can copy anything that is on the screen into a buffer by moving the cursor around, pressing enter at the start of the thing you are interested in, navigating to the end of it, and then pressing enter again. Then you press Ctrl-A 0 (which switches to window 0, where I run my interpreter) and type Ctrl-A ] to paste. If you didn't copy a newline, you also have to hit enter to execute the code. Not exactly friendly. Plus, if you want to execute something that takes more than a screen's worth of text, you are out of luck.

Since my advisor left town for a few days and it's summer, I set out to finally get this working in my setup. I started to attack the problem at the level of screen. I figured others had tried to do something similar, or that screen might even have a scripting system I could use to customize its behavior more smartly. Neither seemed to be true. I thought about trying to integrate Lua into the screen source code, but a few emails to the current maintainers didn't bear fruit and the project seemed daunting without at least a few hints. I got as far as binding Ctrl-A space to paste into window zero, saving a few keystrokes, but the whole thing was still cumbersome.

I knew that I wanted to somehow take over the standard input of the Matlab process for a second, pipe some code there, and then hand control back. I never quite got there, but I did hit on a solution which is probably even better: using named pipes to create a command-line utility which evaluates whatever it reads from standard input inside a running Matlab interpreter.

Someone suggested named pipes to me. I had never used them before and didn't really get how they worked, so here is the story for people not in the know. We create a named pipe with mkfifo: mkfifo "name" creates a named pipe called "name". In many respects it behaves just like a file, except that opening it for writing blocks until some process has it open for reading, and opening it for reading blocks until some process has it open for writing. We can use redirection with it like any other file:

mkfifo mlpipe
matlab -nodesktop < mlpipe &
This snippet creates a named pipe and then starts matlab on the command line with mlpipe as its standard input. A process trying to read from a named pipe will hang until something is written to it, so here matlab will just freeze until someone opens and writes to mlpipe. Let's open a Python session in another xterm:

python
>>> f = file('mlpipe','a'); # open file for appending
>>> f.write("ls\n");
>>> f.flush()
...
When we type f.flush() above our Matlab suddenly responds:

mkfifo mlpipe
matlab -nodesktop < mlpipe &
...
>> ls
>>    . .. a_file.m
Unfortunately, if you try this you will notice that as soon as we close the file in Python, Matlab gets an EOF and exits. This is a problem, since Python closes all files upon exiting. I wanted something which I could just run with some Matlab code and which would return the result - without closing Matlab. The solution I came up with (hence the name of the entry) was to have a second process just sleep in the background and keep mlpipe open all the time. I could kill this process when I was done, or have it exit automatically when I log out or close the screen session I was using, for example. But as long as it was sleeping, the pipe would always have at least one writer and Matlab would never see an EOF. The code looks more or less like this:

import time

f = file('mlpipe','a');
while 1:
    time.sleep(1000);
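
If you want to convince yourself of the EOF behaviour without starting Matlab, here is a small self-contained sketch in Python 3 (Unix only). A reader thread stands in for Matlab, a "keeper" handle stands in for the sleeping process, and a short-lived writer stands in for a script that sends one command and exits. All the names (read_until_eof, demo, keeper) are made up for the illustration.

```python
import os
import queue
import tempfile
import threading


def read_until_eof(path, out):
    """Stand-in for Matlab: read lines from the FIFO until every writer has closed it."""
    with open(path, "r") as fifo:      # blocks until some writer opens the pipe
        for line in fifo:
            out.put(line)
    out.put(None)                      # None marks the EOF seen by the reader


def demo():
    """Return what the reader observed: a line, 'still open', then EOF (None)."""
    events = []
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "mlpipe")
        os.mkfifo(path)
        out = queue.Queue()
        reader = threading.Thread(target=read_until_eof, args=(path, out), daemon=True)
        reader.start()

        keeper = open(path, "w")       # the "sleeping" process: holds the pipe open
        with open(path, "w") as short_lived:
            short_lived.write("ls\n")  # flushed when the with-block closes the file
        events.append(out.get(timeout=5))

        try:                           # the short-lived writer is gone; EOF yet?
            events.append(out.get(timeout=0.5))
        except queue.Empty:
            events.append("still open")

        keeper.close()                 # last writer gone, so now the reader sees EOF
        events.append(out.get(timeout=5))
        reader.join(timeout=5)
    return events
```

Calling demo() shows the reader getting the line from the short-lived writer, staying alive while the keeper holds the pipe, and only reaching EOF once the keeper closes its end.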

I then, finally, wrote the following Python script (called m for brevity):

#!/usr/bin/python

import os
import time
import sys

def s(inp):
    os.system(inp);

s('cp log oldlog');

f = sys.stdin;

lines = f.readlines();

for line in lines:
    sys.stdout.write(line);

mlin = file('mlpipe','a');
for line in lines:
    mlin.write(line);


mlin.write('\n');

mlin.flush();

time.sleep(.2);


s('diff log oldlog > difflog');
fd = file('difflog');

lines = fd.readlines();
fd.close();

sys.stdout.write('%>>{ Matlab Output\n');

for line in lines:
    if line[0] == '<':
        sys.stdout.write(line.replace('<','%>>'));


sys.stdout.write('%>> ... Maybe More %}\n');

This code reads standard input until end of input. It then echoes what it has read, sends it to Matlab via mlpipe, and then reads back the output and prints that, commented up in a special way. Reading the output is pretty low-tech, and it definitely might not work for other people. The idea is that Matlab is started with an option to keep a log. I copy the log to a second file before I evaluate the code passed in, wait a moment, and then extract the differences between the old and current log (necessarily the results of evaluating the code just executed) and print them. The short sleep is the only synchronization, so output from anything slower than that will not be captured. This works fine for me. I don't redirect the standard output of Matlab, so it stays visible in window 0 of my screen session and I can see exactly what it is doing.

Finally, I wrote a set of shortcuts which let me pass a selection of text (or a line) through m in Vim. This way I can write:


disp('Hello World');

put my cursor on that line and press F2 and suddenly:


disp('Hello World');
%>>{ Matlab Output
%>> Hello World
%>> ... Maybe More %}

Note the bracketing and special comment leader. This way I can fold up the output easily in Vim or delete it entirely (I have a second set of keybindings that do just these things). Finally, I generated a Vim dictionary file of all the functions and scripts in my Matlab path so that I can use Vim autocompletion on my text files.

All in all the system is a god-awful mess which a smarter man might have implemented in a far more elegant way. But! It works and it's done! I hope this is useful to anyone out there who wants to do something similar.
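
For anyone adapting the m script: the copy-and-diff step could probably be replaced by remembering how long the log file was before the code was sent and reading only what was appended afterwards. This is a sketch in Python 3 under that assumption, not what I actually run. The function names (current_offset, read_new_output, as_matlab_comment) are invented for the example, and it still relies on waiting long enough for Matlab to write its output.

```python
import os


def current_offset(log_path):
    """Size of the log in bytes; call this before sending code to Matlab."""
    return os.path.getsize(log_path)


def read_new_output(log_path, offset):
    """Return (text appended after `offset`, new offset)."""
    with open(log_path, "rb") as log:
        log.seek(offset)
        data = log.read()
    return data.decode("utf-8", errors="replace"), offset + len(data)


def as_matlab_comment(text):
    """Wrap output in the same %>> comment block that m prints."""
    lines = ["%>>{ Matlab Output"]
    lines += ["%>> " + line for line in text.splitlines()]
    lines.append("%>> ... Maybe More %}")
    return "\n".join(lines) + "\n"
```

The sequence would be: take current_offset on the log, write the code to mlpipe, sleep briefly, then print as_matlab_comment of whatever read_new_output returns. Unlike diff, this never rereads the old part of the log, which matters once the log gets long.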

Tuesday, May 15, 2007

LPP - Lua Preprocessor

This week in reinventing the wheel I present the Lua preprocessor. This is a more or less finished project which allows you to use Lua as a text preprocessor. The basic idea is that we create a file of some kind (my preferred use is C code) and it contains <#lua#>/<#endlua#> blocks.


#include

int main( int argc, char ** argv ){
int n = 1000;
<#lua#>
    malloced_vals = { "var1", "var2", "var3" };
    map(function(str) write("double * "..str..";") end, malloced_vals);
    map(function(str) write(str.." = malloc(sizeof(double)*n);") end, malloced_vals);
<#endlua#>

/* Do lots of stuff */

<#lua#>
    map(function(str) write("free("..str..");") end, malloced_vals);
<#endlua#>
}

(Note that the function map is defined elsewhere.) This may seem like a stupid idea - but it's not. The issue I was having was that I was writing model neurons in C. Biophysically realistic neurons are constructed by measuring and characterizing the voltage and ion concentration dependence of the channels in the cell membrane. A sophisticated model might have tens of different channel types, each with its own set of variables which need to be kept track of and integrated separately. In addition to this, there are variables for synaptic currents and strengths which also must be taken care of. Vincent four years ago might have tried to represent each neuron as a struct such as:

typedef struct neuron {
    double * voltage;
    double * channel1;
    ...
    double * channelN;
 } neuron;
But using arrays of structs like this can be computationally expensive - you have to pay for those pointer dereferences. In smaller models, I just use malloced straight arrays for each value and hope to name them in a way which makes writing a custom integrator easy. But as I faced more complicated tasks the complexity threatened to overwhelm the meager abstractions I was using. I wrote the Lua preprocessor so that I could marshal the added abstraction of Lua and still control how my models compiled to C. Happily, the Lua preprocessor also makes it easy to write Matlab interfaces for C code [1].

The preprocessor is pretty well behaved. When you use the "write" function, the output is automatically indented to match the indentation of the <#lua#> block that produced it (I use only spaces for this, since I never use tabs). Additionally, it supports logging the output into a variable available to the preprocessor code. This allows you, for example, to detect function definitions and automatically generate header files. As with any macro language, it makes debugging somewhat difficult, and I plan to eventually add a feature which keeps track of the line number which generated a line of code and attaches it to each output line. To make things a little easier, LPP prints the current code block, with line numbers and error information, in the event of any error. It is a Unix command-line friendly application which reads standard input and prints to standard output in the absence of arguments, so it can be used within Vim (for example) to produce text dynamically with Lua.

Without further ado, here is the code (under the GPL). It is proof of how sweet Lua is that it is this short.


#!/usr/local/bin/lua

-- LUA Preprocessor Copyright J.V. Toups 2007
--
--This program is free software; you can redistribute it and/or modify it under
--the terms of the GNU General Public License as published by the Free Software
--Foundation; either version 2 of the License, or (at your option) any later
--version.
--
--This program is distributed in the hope that it will be useful, but WITHOUT
--ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
--FOR A PARTICULAR PURPOSE.  See the GNU General Public License for more
--details.
--
--You should have received a copy of the GNU General Public License along with
--this program; if not, write to the Free Software Foundation, Inc., 51
--Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
--


stdout = io.output();
stdin  = io.input();

i = 1;
while arg[i] and string.sub(arg[i],1,1) == '-' do
-- Parse command line arguments if we ever care about them
if arg[i] == '--noautoindent' then
    _autoindent = false;
    i = i + 1
elseif arg[i] == '--noautoendline' then
    _autoendline = false;
    i = i + 1;
elseif arg[i] == '--baseindent' then
    _baseindent = string.rep(' ',arg[i+1]);
    i = i + 2;
elseif arg[i] == '--output' or arg[i] == '-o' then
    ofname = arg[i+1];
    i = i + 2;
else
    i = i + 1;
end
end
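-- Default to true only when the flag was not given; a plain "not" test
-- would also overwrite an explicit false set by --noautoindent/--noautoendline.
if _autoindent == nil then
_autoindent = true;
end
if _autoendline == nil then
_autoendline = true;
end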

if not _baseindent then
_baseindent = '';
end

_replacements = {};
_buffer = {};
_keep_buffer = false;
_indent = '';

--print('*************************************************************')
--print('**                                                         **');
--print('** luacpp, a Lua c preprocessor copyright J.V. Toups 2006  **');
--print('** send questions or comments to toups@physics.unc.edu     **');
--print('**                                                         **');
--print('*************************************************************');

--print('\tThere are '..table.getn(arg)..' argument(s).');
if arg[i] then
print('\tInput file is '..arg[i]);
else
--print('\tInput is standard input.');
end

ifname = arg[i];
--if not ofname then
--    ofname = string.gsub(arg[i],'.luacpp','');
--    if ifname == ofname then
--        ofname = ofname..'luacpp_output'
--    end
--end

if ofname then
print('\tOutput file is '..ofname);
else
--print('\tOutput file is standard output.');
end

--print('');
--print('** Beginning Parse **');
--print('');
if ofname then
outfile= assert(io.open(ofname,'w'));
else
outfile= stdout;
end

if ifname then
infile = assert(io.open(ifname,'r'));
else
infile = stdin;
end

function clear_buffer()
_buffer = {};
end

function write(...)
if _autoindent then
    io.write(_baseindent.._indent);
end
for _,v in ipairs(arg) do
    io.write(v);
end
if _autoendline then
    io.write('\n');
end
end

io.output(outfile);

if ifname then
io.write('/* This file produced from '..ifname..' by luacpp */\n');
io.write('/* which is copy right 2006 J.V. Toups */\n');
end

lines = infile:lines();
for line in lines do
if string.find(line,'<#lua#>') then
    --print(' *** FOUND LUA BEGIN *** ');
    --print(line)
    _indent = string.sub(line,string.find(line,'%s*'));
    --print('Length of indent is '..string.len(_indent));
    execlines = '';
    numbered_lines = '';
    line = lines();
    i = 1;
    while not string.find(line,'<#endlua#>') do
        execlines = execlines..line..'\n'
        numbered_lines = numbered_lines..i..' :'..line..'\n';
        line = lines();
        i = i + 1;
    end
    exec_chunk, errmsg = loadstring(execlines);
    if exec_chunk then
        exec_chunk();
    else
        print('I found an error in this chunk:');
        print(numbered_lines);
        assert(nil,errmsg);
    end
    --execlines = assert(loadstring(execlines));
    --execlines();
else
    rline = line;
    for k, v in pairs(_replacements) do
        repstr = '([%s%p]-)'..k..'([%s%p]-)';
        print(k..': '..v);
        rline = string.gsub(rline,repstr,'%1'..v..'%2');
    end
    io.write(rline..'\n');
    if(_keep_buffer) then
        table.insert(_buffer,rline);
    end
end
end
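
For readers who don't speak Lua, the core loop is small enough to restate. The following is a rough Python 3 sketch of the same control flow, not a port of LPP: it executes Python blocks rather than Lua ones, the <#py#>/<#endpy#> markers and the name preprocess are invented for the example, and it leaves out the replacement table, the buffer and the command-line options.

```python
import io
import textwrap

BEGIN, END = "<#py#>", "<#endpy#>"    # hypothetical markers for this sketch


def preprocess(text):
    """Copy ordinary lines through; run each marked block and splice in what it writes."""
    out = io.StringIO()
    env = {}                           # shared by all blocks, like Lua's globals in LPP
    lines = iter(text.splitlines())
    for line in lines:
        if BEGIN not in line:
            out.write(line + "\n")
            continue
        # Generated lines inherit the indentation of the opening marker.
        indent = line[: len(line) - len(line.lstrip())]

        def write(*parts, _indent=indent):
            out.write(_indent + "".join(str(p) for p in parts) + "\n")

        env["write"] = write
        block = []
        for body in lines:             # consume lines up to the closing marker
            if END in body:
                break
            block.append(body)
        else:
            raise SyntaxError("block opened with %s was never closed" % BEGIN)
        exec(textwrap.dedent("\n".join(block)), env)
    return out.getvalue()
```

Because every block runs in the same environment, a list of variable names defined in one block can be reused by a later block, which is exactly what the malloc/free example relies on.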

1: I wrote this before getting into Lisp. Lisp's Macro system allows you to do exactly what I wrote this to do but in just one language.