Saturday, November 5, 2011

Understanding the Haskell IO Monad.

By Building a Toy Model in Scheme

Somehow or another, this blog ended up being about two basic things: Lisp and Functional Programming. I've talked about almost every Lisp species there is, but there is an Elephant in the Functional Programming room, which has only come up every so often, which is the programming language Haskell.

Lisp!
Lisp is the Tao, at least to awesome hippies like Mark Tarver. Image borrowed from here.

I'm quite fond of Haskell, in principle. If Lisp is the Tao -- a kind of informal, flexible cloud around the locus of the ideal programming language -- then Haskell might be a rather rigidly specified diamond right near that locus' center. Haskell trades a lot of the looseness of Lisp for a rigid semantics, but it seems there is a good argument that these semantics are much closer to the ideal of Functional Programming.

I don't have much opportunity to program in Haskell, and so I don't write a lot about it. I do program in Lisp a lot, and I think I've subconsciously been engaged in a long term project to improve my understanding of Haskell by implementing toy models of parts of it in Lisp. Hence all the posts about monads. I've recently started writing a book about pure functional programming in Scheme/Racket and initial exploratory writing has produced a few insights about Haskell's IO Monad.

The IO Monad seems to be among Haskell's most mysterious monads for new programmers. Monad tutorials and explications rarely talk about it. Haskell tutorials introduce it as an ad-hoc: they show you how to use it syntactically, but its rarely made naked. And there is a reason for this: whatever it is, precisely, that the IO Monad does isn't expressable in regular old safe Haskell.

But its pretty important, and not just because its the only kosher way for a Haskell programmer to communicate with the outside world. In a sense, a Haskell program can be viewed as a description of some IO actions. The Haskell run time can be thought of as constructing that IO action, and then, somehow, executing it. If you've heard programmers say that, even when using the IO Monad, one is still writing "pure" code, this is what they mean: the Haskell program itself simply constructs an IO action from smaller, and ultimately primitive IO actions. The actions are never run in the code itself, just constructed. Eventually the Haskell runtime executes them, but the construction of the IO commands is itself purely a matter of functional programming, without side effects.

(Note that the programmer can bail out of this paradigm using thigns like unsafePerformIO, but that is frowned upon, for the most part).

So its important, I think, to develop a mental model of what is going on here.

Enter Scheme

I Love Lisp/Scheme, because using Lisp is a great way to understand language constructs. Lisp is a kind of programming language lab, where a crappy, ad-hoc and slow implementation of a language feature is always a few hundred lines of code away. It takes serious effort to design a non-standard language on top of Lisp for an "industrial strength" application, but if you just want to understand something, Lisp can take you there. It is a great way to get to know unusual semantics.

Today we're going to use Racket, which really isn't a Scheme anymore, so much as a Scheme superset, to build a toy model of the IO monad. The bird's eye view is that we'll build a lexicon of IO actions, which will be regular old functions which take the IO state and maybe do something with it, and return a value, and then we'll build some machinery to construct more complex IO actions from simpler ones.

We'll also build an external library to execute these IO actions once they've been built, but the intent is that the user is segregated from this execution. They don't know anything about it, and all they do is build IO actions and hand them off.

Bundling Up IO

Let's start with our "low level" IO. One possibility is that we bundle up standard-input and standard-output into a type of some kind. IO actions will just, then, be functions which take this pair of standard input and output and return values. I'd like to do something a little more fun. Let's build a graphical user interface that has an input window and and output window and build an IO Monad over them, instead. This will give us a change to play around with Racket's GUI classes, which is a nice thing to get acquainted with.

Sketch

We'll write a function run-io which will take an IO action, pop up a window, and execute that action. It will have the type:

type io-action = io -> a
run-io :: io-action -> a

Where we mean by a that the function returns the type a, which can be anything. Our function takes a single value of type io-action, which we will get specific about in a bit.

This function has to do a few things. It needs to initialize a frame%, put some editor's in it with a layout of some kind, and then it needs to create the io object, pass it to the action, and then return the result of the io action. The window will be broken into an output area and an input area. The output area will have no special behavior, except, of course, that we'll be able to write to it and clear it. The input area will be a kind of "command line," - we want it to only accept one line at a time, which it detects by looking for a newline. Somehow, we want each command to end up being readable from within the monad.

Text windows are represented by two classes working in tandem in Racket, a text% class and an editor% class. We can accomplish our goals by subclassing the text% part.

(require racket/gui
     (only-in srfi/13 string-trim-both))

(define (string-contains-newline? str)
  (let loop ((chars (string->list str)))
    (match chars 
      [(list) #f]
      [(cons #\newline rest) #t]
      [(cons x rest)
       (loop rest)])))

(define input-text% 
  (class text% 
    (field (command #f))
    (define/public (text-substring start len)
      (let loop ((acc '())
                 (i 0))
        (if (< i len) (loop (cons (send this get-character (+ start i)) acc) (+ i 1))
            (list->string (reverse acc)))))
    (define/augment (after-insert start len) 
      (let ((string (send this text-substring start len)))
        (if (string-contains-newline? string) 
            (let ((a-command (send this text-substring 2 (+ start len))))
              (set! command (string-trim-both a-command))
              (send this erase)
              (send this insert "> "))
            #f)))
    (define/public (grab-command) 
      (let ((c command))
        (set! command #f)
        c))
    (super-new)
    (send this insert "> ")))

We've added a field to store the current command, with #f no command, and augmented the after-insert method to detect command entry and act appropriately. The new method grab-command fetches the current command and sets the internal command to #f again.

There is no easy way to make setting up a GUI look nice, as far as I know, so here is the gory details:

(define (setup-window)
  (let* ((frame (new frame% 
                     [label "io toy"]
                     [width 640]
                     [height 480]))
         (pane (new vertical-pane% 
                    [min-width 640]
                    [min-height 480]
                    [parent frame]))
         (output-text (new text%))
         (output (new editor-canvas% 
                      [parent pane]
                      [style '(no-hscroll)]
                      [editor output-text]
                      [min-width 640]
                      [min-height 400]))
         (input-text (new input-text%))
         (input (new editor-canvas%
                     [parent pane]
                      [style '(no-hscroll)]
                      [editor input-text]
                      [min-width 640]
                      [min-height 40]))
         (io (lambda (input)
               (match input
                 [(? string? x) (send output-text insert input)]
                 ['clear (send output-text erase)]
                 ['fetch-command
                  (let loop 
                    ((command (send input-text grab-command)))
                    (yield)
                    (if command command (loop (send input-text
                                                    grab-command))))]))))
    (send frame show #t)
    io))

The IO Toy
The IO Window in All Its Splendor

This does the business of setting up our window and constructing our "io" value, which is apparently a function which can take either arbitrary text, which it inserts into the output buffer, or the symbols clear and fetch-command. clear, straightforwardly enough, clears the output region with the erase method. fetch-command enters a loop and doesn't return until a command is entered by the user, although it does call yield in order to let the GUI thread check for new events.

With this function, we can easily define:

(define (run-io action) 
 (let ((io (setup-window)))
    (action io)))

Now our users can concern themselves entirely with IO action construction. They trust us to execute the action later. What does an action look like?

Well:

(define (clear io) (io 'clear))
(define (get-command io) (io 'fetch-command))
(define (insert text) (lambda (io) (io text)))

Are three such commands. In fact, they are going to be our "atomic" commands, from which we'll build everything else.

One wrinkle: insert isn't itself an io action, it is a function which returns an io action which inserts the indicated piece of text. insert is a simple kind of io-action builder. We'll see how to integrate io actions together in a bit.

Let's wrap this all up in a module, only exporting our low level interface:

(provide run-io clear get-command insert)

Saving the file as toy-io.rkt will create a module. You can also download this code from my github account.

You've got to place that file somewhere where Racket can get to it, and then you can fire up Racket and say:

(require blog/toy-io)
(run-io (insert "Testing, 1, 2, 3!"))

And press go. What this should do is pop up a window with an interaction area and an output area with "Testing, 1, 2, 3!" in it. Great. Now we should focus on building up IO actions.

The IO Monad

Monads always have an associated monadic value of a particular type. In this case, our monadic values are functions which accept an IO object and return any old thing. The get-command is a monadic value - it accepts the io object and returns the command received from the user. What if you wanted a get-two-commands which read two commands and returned them as a list?

(define (get-two-commands io)
  (let ((c1 (get-command io))
        (c2 (get-command io)))
    (list c1 c2)))

Simple enough. get-two-commands is, itself, a monadic value.

Now for the monad part: what if you want to define a monadic value which gets one command, and then determines what to do based on the result of that command. We can represent the choice we'll make about what to do next with a function which is waiting for the previously read command and will eventually return an io action. Think of a lambda expression as a computation with some unbound variable which will calculate a value when that variable is provided or bound. Hence, consider the function:

(define (io-bind io-action io-action-producer)
  (lambda (io)
    (let* ((result (io-action io))
           (next-io-action
            (io-action-producer result)))
     (next-io-action io))))

io-bind is our io action construction work horse. It builds up a new io action from an io action and an io-action-producer, which is waiting for the result of the previous io action before it can decide what to do. io-bind returns an io action, of course. That io action executes the first argument to io-bind when it receives the io object. It then passes the result of that action to the producer, and executes the resulting io action on the io object.

This is sufficient to build up a large family of complex io actions which depend on the results of querying the input. We need one more function to build our monad, though:

(define (io-return value)
  (lambda (io)
     value))

This function makes the simplest kinds of io actions. They don't do anything, but they do return whatever value was specified when io-return was called. io-return, in other words, inserts values into the io monad.

Building IO Actions

Let's build an io action that checks for the command "exit." If "exit" is encountered, then the io action simply returns, but if it isn't encountered, it reads another command and so on.

(define exit-flagger 
  (io-bind get-command 
           (lambda (command)
             (if (equal? command "exit") 
                 (io-return 'exit-detected)
                 exit-flagger))))

If you then execute (run-io exit-flagger), you'll get a window which will keep accepting inputs until you type "exit," at which point, control will return to the command line.

Here we use io-bind to construct a new io action directly, but Haskell provides special syntax for monadic behavior. Let's build a version of that syntax here.

"Do" Notation

We'll create a do-io form which accepts a list of expressions of two varieties. An expression must either evaluate to an io action, or it must specify that a value should be bound to the result of an io action in the following way:

(command <- get-command)

We'd rather not get into a deep discussion about macros in Scheme, so we'll use regular old syntax-rules. Not much error handling, but its pretty simple to understand.

Consider:

(define-syntax do-io 
  (syntax-rules (<-)
   [(do-io (id <- io-action) expr ...)
     (io-bind io-action 
        (lambda (id)
          (do-io expr ...)))]
   [(do-io io-action expr ...)
    (io-bind io-action 
     (lambda (_) 
       (do-io expr ...)))]
   [(do-io io-action) io-action]))

Our syntax-extension has three cases. The most specific appears first, where we detect an expression indicating a monadic binding. We pass the monadic value io-action to io-bind along with a lambda expression consisting of the rest of the do-expression in a context where the id is bound to the result of the action. When we don't have a binding expression of the form (id <- io-action), expand the macro the same way, but don't even bother creating a variable visible in the rest of the expressions. If we only have a single expression, then we just return its value.

We can now write our exit-flagger io action as:

(define exit-flagger
 (do-io 
  (command <- get-command)
  (if (equal? command "exit") 
      (io-return 'exit-detected)
      exit-flagger)))

This resembles Haskell notation, where the same thing might look something like:

exitFlagger = do
  command <- getCommand
  if equal? command "exit" 
     then
       return "exit-detected"
     else
       exitFlagger

A Simple Example

Let's write some brains into our command interpreter.

(define (read-command str)
  (with-handlers ([exn:fail? (lambda (e) 'error)])
    (with-input-from-string 
     (string-append "(" str ")")
     (lambda () (read)))))

This function converts a string into a list of values. If it doesn't receive a string which, when wrapped with "()" evaluates to a piece of Scheme code, it just returns error. We use it to make dispatching on commands easier:

(define symbolic-command
  (do-io 
    (cmd <- get-command)
    (io-return (read-command str))))

Now we can write:

(define (repeat-char n char)
  (let loop ((n n)
             (acc '()))
     (if (<= n 0) (list->string acc)
         (loop (- n 1) (cons char acc)))))

(define dispatch
  (do-io
   (c <- symbolic-command)
   (match c
     ['error 
      (do-io
       (insert "Error reading command string.")
       dispatch)]
     [(list 'stars (? number? n))
      (do-io 
       clear
       (insert (repeat-char n #\*))
       dispatch)]
     [(list 'exit) (io-return 'exit-detected)]
     [_ (do-io
         clear
         (insert "unknown command.")
         dispatch)])))

dispatch is pretty easy to understand. It is an io action which grabs a command, converts it to a list of atoms and lists, matches that list for known commands, and loops, unless the command read is (exit). Note that dispatch is defined without ever explicitely referencing the io state that is threaded through all its parts. We've liberated the programmer from thinking about what that state is or how it is managed. They simply build up io actions like any other kind of data.

dispatch in action
Dispatch in action!

Conclusions

Ok, so we built a low level interface which non-functionally handles input and output operations for us. Because input and output is inherently stateful, there is no real way to avoid side effects in this part of our code.

However, we then realized that we could "encapsulate" input and output operations in a few primitive io actions and provide a monadic context wherein the user can build more complex actions from simpler ones. Up until the moment the user calls run-io, she maintains complete purity. Constructing io actions can be done completely without side effects.

For bonus, we implemented a syntax mimicing Haskell's do notation in terms of the function io-bind. This gave us a toy model of Haskell's io monad, which we used to build some simple io actions.

Glazed over in all this discussion were the implications of Haskell's lazy semantics, type system, and error handling mechanisms. We might also begin to wonder how we combine this stateless io programming with other, apparently stateful tasks. We'll see how we can do that in a later post (or in the book, if I ever write it), and how these considerations relate to monad transformers!

This code is up on my github. If you are interested in monads, you can also check out my better-monads library on Planet Racket. It doesn't include an IO monad because that is kind of ridiculous in Scheme!

Thanks for reading!

No comments: