Monday, May 12, 2008

On Inform 7, Natural Language Programming and the Principle of Least Surprise

I've been pecking away at Inform 7 lately on account of its recently acquired Gnome front end. For those not in the know, Inform (and Inform 7) is a text adventure authoring language. I've always been interested in game programming but never had the time (or more likely the persistence of mind) to develop one of any sophistication myself. Usually in these cases one lowers the bar, and as far as interactive media goes, you can't get much lower, complexity wise, than text adventures.

Writing a game in Inform amounts to describing the world and its rules in terms of a programming language provided by Inform. The system then collects the rules and descriptions and creates a game out of them. Time was, programming in Inform looked like:

Constant Story "Hello World";
Constant Headline "^An Interactive Example^";
Include "Parser";
Include "VerbLib";
[ Initialise;
  location = Living_Room;
  "Hello World"; ];
Object Kitchen "Kitchen";
Object Front_Door "Front Door";
Object Living_Room "Living Room"
  with
      description "A comfortably furnished living room.",
      n_to Kitchen,
      s_to Front_Door,
  has light;

Which is recognizably a programming language, if a bit strange and domain specific. These days, writing Inform looks like this (from my little project):

"Frustrate" by "Vincent Toups"
Ticks is a number which varies.
Ticks is zero.
When play begins:  
Now ticks is 1.

The Observation Room is a room. "The observation room cold and
surreal. Stars dot the floor underneath thick, leaded glass, cutting
across it with a barely perceptible tilt. This room seems to have been
adapted for storage, and is filled with all sorts of sub-stellar
detritus, sharp in the chill and out of place against the slowly
rotating sky. Even in the cold, the place smells of dust, old wood
finish, and mildew. [If ticks is less than two] As the sky cuts its
way across the milky way, the whole room seems to tilt.  You feel
dizzy.[else if ticks is less than four]The plane of the galaxy is
sinking out of range and the portal is filling with the void of
space. It feels like drowning.[else if ticks is greater than 7]The
galactic plane is filling the floor with a powdering of
stars.[else]The observation floor looks out across the void of space.
You avert your eyes from the floor.[end if]"

Every turn: Now ticks is ticks plus one.
Every turn: if ticks is 10:
decrease ticks by 10.

As you can see, the new Inform adopts a "natural language" approach to programming. As the Inform 7 website puts it:

[The] Source language [is] modelled closely on a subset of English, and usually readable as such.

Also reproduced in the Inform 7 manual is the following quote from luminary Donald Knuth:

Programming is best regarded as the process of creating works of literature, which are meant to be read... so we ought to address them to people, not to machines. (Donald Knuth, "Literate Programming", 1981)

Which better than anything else illustrates the desired goal of the new system: Humans are not machines! Machines should accommodate our modes of expression rather than forcing us to accommodate theirs! If it weren't for the unnaturalness of programming languages, the logic goes, many more people would program. The creation of interactive fiction means to be inclusive, so why not teach the machine to understand natural language?

This is a laudable goal. I really think the future is going to have a lot more programmers in it, and a primary task of language architects is to design programming languages which "regular" people find intuitive and useful. For successes in that arena see Python, or Smalltalk or even Basic. Perhaps these languages are not the pinnacle of intuitive programming environments but whatever that ultimate language is, I doubt seriously it will look much like Inform 7.

This is unfortunate, because reading Inform 7 is very pleasant, and the language is even charming from time to time. Unfortunately, it's very difficult to program in [1], and I say that as something of a programming language aficionado. It's true that creating the basic skeleton of a text adventure is very easy, but even slightly non-trivial extensions to the language are difficult to intuitively get right. For instance, the game I am working on takes place on a gigantic, hollowed-out natural satellite, spinning to provide artificial gravity. The game begins in a sort of observation bubble, where the floor is transparent and the stars are visible outside. Sometimes this observation window should be pointing into the plane of the Milky Way, but other times it should be pointing towards the void of space because the station's axis of rotation is parallel to the plane of the galaxy. The description of the room should reflect these different possibilities.

Inform 7 is turn based, so it seems like it should be simple enough to create this sort of time-dependent behavior by keeping track of time, but it was frustrating to figure out how to "tell" the Inform compiler what I wanted.

First I tried joint conditionals:

  When the player is in the Observation Room and
the turn is even, say: "The stars fill the floor."

But this resulted in an error message. Maybe the system doesn't know about "evenness", so I tried:

  When the player is in the Observation Room and
the turn is greater than 3, say "The stars fill the floor."

(Figuring I could add more complex logic later).

Eventually I figured out the right syntax, which involved creating a variable and having a rule set its value each turn and a separate rule reset the value with the periodicity of the rotation of the ship, but the process was very frustrating. In Python the whole game might look, with the proper abstractions, like:


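# `game` and `player` stand in for the "proper abstractions" assumed above.
while not game.over():
    game.describe_location(player.position)
    # True on every turn that is not a multiple of 10.
    if (player.position == 'The Observation Room'
            and game.turn() % 10):
        print("The stars fill the floor.")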

Which is not perhaps as "englishy" as the final working Inform code (posted near the beginning of this article) but is much more concise and obvious.

But that isn't the reason the Python version is less frustrating to write. The reason is the Principle of Least Surprise, which states, roughly, that once you know the system, the least surprising way of doing things will work. The problem with Inform 7 is that "the system" appears to the observer to be "written English (perhaps more carefully constructed than usual)". This produces in the coder a whole slew of assumptions about what sorts of statements will do what kind of things and, as a consequence, you try a lot of things which, according to your mental model, inexplicably don't work.

It took me an hour to figure out how to make what amounts to a special kind of clock, and I had the benefit of knowing that underneath all that "natural English" was a (more or less) regular old (Prolog-flavored) programming environment. I can't imagine the frustration a non-programmer would feel when they first decided to do something not directly supported or explained in the standard library or documentation.
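
For comparison, here is a rough sketch of that "special kind of clock" in plain Python. It only mirrors the Inform rules posted near the beginning of this article: a counter that goes up by one each turn and wraps back to zero at ten, plus a description chosen from the counter. The names PERIOD, advance and sky_description are made up for illustration, and the room text is abbreviated.

```python
PERIOD = 10  # assumed rotation period in turns, matching the "if ticks is 10" rule


def advance(ticks):
    """Move the clock forward one turn, wrapping to zero after PERIOD turns."""
    return (ticks + 1) % PERIOD


def sky_description(ticks):
    """Pick the sky text using the same thresholds as the Inform room description."""
    if ticks < 2:
        return "As the sky cuts its way across the milky way, the whole room seems to tilt."
    elif ticks < 4:
        return "The plane of the galaxy is sinking out of range."
    elif ticks > 7:
        return "The galactic plane is filling the floor with a powdering of stars."
    else:
        return "The observation floor looks out across the void of space."


# Play begins with ticks set to 1, as in the Inform source.
ticks = 1
for turn in range(12):
    print(turn, ticks, sky_description(ticks))
    ticks = advance(ticks)
```

The point is not that this is shorter, but that once you know Python has integers, a modulo operator and if/elif, this is roughly the first thing you would try.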

That isn't the only problem, either. Natural English is a domain specific language for communicating between intelligent things. It assumes that the recipient of the stream of tokens can easily resolve ambiguities, invert accidental negatives (pay attention, people do this all the time in speech) and tell the difference between important information and information it's acceptable to leave ambiguous. Not only are computers presently incapable of this level of deduction/induction, but generally speaking we don't want that behavior anyway: we are programming to get a computer to perform a very narrowly defined set of behaviors. The implication that Inform 7 will "understand you" in this context is doubly frustrating. You don't want it to "understand"; you want it to do exactly what you tell it.

A lot of this could be ameliorated by a good piece of reference documentation, spelling out in exact detail the programmatic environment's behavior. Unfortunately, the bundled documentation is a big tutorial which does a poor job of delineating between constructs in the language and elements of it. It all seems somewhat magical in the tutorial, in other words, and the intrepid reader, wishing to generalize on the rules of the system, is often confounded.

Nevertheless, I will probably keep using it. The environment is clean and pleasant, and the language, when you begin to feel out the classical language under the hood, is OK. And you can't beat the built-in features for text based games. I doubt, though, that Inform 7 will seriously take off. Too many undeliverable promises.

[1] This may make it the only "Read Only" programming language I can think of.

13 comments:

Cory said...

When people say they want to program in English, I say, "Good idea! We should also paint and play music in English."

In other words, writing programs is a medium of expression which doesn't necessarily map to anything other than a programming language, and probably isn't easier in anything other than a programming language.

(To be sure, Inform 7 *is* a programming language, but it's also a little like wiring a robot up to a paint brush and using english commands to tell him how to construct your masterpiece.)

Anonymous said...

I think these natural languages are the way of the future. Everyone will be able to write programs, and the amount of programs (including useful programs) will explode.

Speed will be less and less of a concern, beauty, readability, creativity is.

I have however a few things to note here:

- I think it is overly verbose. I would like to NOT use proper english names and still create something.

- I think the system *should* be fault tolerant. Yes, that means there should be NO mistakes (at least by default, unless you set a "strict" mode, but even then I think errors should simply not EXIST. The computer always picks a default.)

- The biggest advantage will be when we use COMPONENTS: we can exchange parts of our programs and have them work.

Consider the C language right now. You bundle a lot of CRAP (which is important) into .h files, including structs, or colour codes or rgb values etc..

These things should live OUTSIDE the C language, so that other languages can use them too.

I noticed personally that if data is untied from a programming language, the data itself is what matters, what will evolve (to hopefully something better)

Anyway. Nice readup!

Anonymous said...

Btw I agree that the python version is more concise, however it is 0 fault tolerant. If you make a writing error, python will scream and force you to correct it.

This may be ok, but I think it should simply not happen for a really CREATIVE language.

Church said...

I had a similar take on Inform in the one session I spent mucking with it.

OTOH, it reminds me a bit of HyperTalk, and once I got the hang of what the syntax actually was, I was very happy using it. If you're an occasional programmer, it's easier to remember "put it into the first button."

So, possibly, the advantage is in *remembering* the syntax, rather than in learning it. That would be a considerable advantage for something like this, which very few people are going to use daily.

J.V. Toups said...

Many commentators are suggesting that fault tolerance is a valuable feature of programming languages but 1) Inform 7 is not fault tolerant and 2) most of the time doing nothing is much safer than doing the wrong thing. I would not want my programming language to decide on its own that I meant to delete all the files in my root directory because I accidentally typed a piece of code somewhere. I want it to hang/not compile/tell me.

That said, I use strongly, dynamically typed programming languages almost exclusively (Matlab, Scheme, Python) so obviously I don't want the language system to be so rigid that it takes a long time to get anything done.

I really don't see natural language programming taking off until computers are much smarter - at least smart enough to handle jobs where precision is not critical, like picking out a nature themed desktop image or sorting all articles in a folder by their semantic content.

But computers are unable to do even these simple tasks and until they can, every natural programming environment will be a frustration to program in. If you want expressive code, try Lisp, Haskell or Python. These are languages for describing programs.

Brant Sears said...

I almost completely agree with you. I have long hated the language AppleScript for precisely the reasons you cite. AppleScript and Inform 7 are so similar to English that my knowledge of English interferes with trying to program.

The redeeming thing about I7 is that when you are working on the game, there is a nice continuity between writing the game using English and debugging the game, which is also done in English. Because what I'm trying to do in my task is create beautiful grammatical English prose, I find that I7 helps rather than hinders this. And for this reason I am willing to put up with the factors you are talking about, namely that it is a very difficult language to write procedural logic with.

Anonymous said...

fully agree Dorophone -- the promise of Inform 7 is great, but the learning curve is more of a step function.

great walkthrough too.

I'm currently writing a wiki adventure game platform -- and in the comment of a recent review of existing text adventure game platforms someone asked why I didn't review Inform 7.

Other commenters chipped in with some great "Anti-Natural Language Programming Quotes"

EPIGRAMS IN PROGRAMMING (from Alan J Perlis), number 93:

"When someone says 'I want a programming language in which I need only say what I wish done,' give him a lollipop."

John McCarthy (from http://en.wikiquote.org/wiki/John_McCarthy) --

"It's possible to program a computer in English. It's also possible to make an airplane controlled by reins and spurs."

and:

"Projects promoting programming in natural language are intrinsically doomed to fail."
(Edsger Dijkstra)

cheers
lb (secretGeek.net)
(sorry for the length of the comment)

Anonymous said...

Having just finished my first game in I7, I am tremendously impressed.

Yes it's true that it's quite particular how you phrase your sentences but that's no different to a traditional language. I still find I struggle to find the right phrase in Ruby after years of programming in Java and C but when I do it's generally far more concise than Java.

I find the same with I7, it is actually *more* concise than I6 once you get the hang of it because you can pack a lot of description into an English sentence and because of the inference features.

OK it's not English but it's one of the most productive languages that I've used once you're over the learning curve.

Taradino C. said...

Don't forget that Inform 7 also features some concepts at the language level that traditional programming languages don't, like descriptive sets and relations between objects, and those concepts are common in natural language.

Thanks to the pseudo-English syntax of adjectives and relative clauses, you can write "if the player can see a woman who owns a brown dog" or "if most of the flammable things in the basement are in closed containers" and let the compiler figure out how to implement it. This tends to be useful in interactive fiction.

Confuseki said...

hmmm... I've been struggling with Inform7 too. It seems greatly readable, but as others have mentioned, there seems to be no summative overall view on planning the pieces of code. ADRIFT is a great example of a program that is just point and click and make/erase objects at your whim. However, for a multiplayer InteractiveFiction experience like GUNCHO, you need to write in Inform7.

I wonder if anyone has plans on making a "piggyback" program with the ease of use of ADRIFT that turns code into Inform7 ease of readability?

*sigh* I just want to play with objects... but the code keeps getting in the way... Have any of you written templates on Inform7 object creation?

Confuseki said...

Hey, I found a really big pdf document about Inform 7.

Located at:
http://ifarchive.jmac.org/if-archive/infocom/compilers/inform7/manuals/I7_3R85.pdf

I just Google searched "Inform 7 taxonomy" and selected the 3rd choice at the time of this post.

I hope the link isn't broken for you.

Matt Wigdahl said...

Inform 7 is definitely harder to write in than to read, and it's harder to write than natural English. That's pretty much a given. That said, I think it's highly unrealistic to expect to just jump into a new language and get it to do complicated things right away.

My experience was that for the first dozen hours or so of programming in Inform 7 I found it very slow, cumbersome, and error-prone due to exactly the type of "well, it should understand that!" errors you describe, with a lot of resultant compile problems and suboptimal algorithm implementations. After that initial learning period, my error rate drastically decreased and I found it much easier to make the language do exactly what I wanted it to do.

I'm a programmer by profession, with fairly wide experience using different languages, and Inform 7 was about the quickest I've felt like I've gotten up to speed on a new language. That may be more of a reflection on the complexity of the application domain than the ease of writing Inform 7, however.

The bottom line is that the familiarity to English made it easier for me to get started, led to greater early frustration due to incorrect default assumptions, but ultimately allowed quicker mastery than a more machine-friendly language.

Anonymous said...

Within Inform7, you can fall back to Inform6 if you need to -- this generally helps in situations where you need more control over the bits under the hood.