cl-arith: Combinatory Logic Interpreter with Arithmetical Combinator Basis

version 1.1

$Id: oame-cl.html,v 1.8 2009/11/15 04:55:49 bediger Exp $

Bruce Ediger

Table of contents

  1. Introduction
  2. Starting
  3. Using
  4. Building and installing
  5. Bibliography

Introduction

This document describes how to build and use cl-arith v1.1. cl-arith interprets a programming language with a lot of similarities to "Combinatory Logic" (CL) formal systems with a basis set of O, A,M, E combinators, or a basis set of A, M,E, N combinators. It doesn't exactly interpret any "Combinatory Logic" in that it runs on computers with finite CPU speed and a finite memory. Most or all formal systems fail to take these limits into account.

Another page documents the design and implementation of an earlier version of a combinatory logic interpreter, one that has a different set of built-in primitives. That document still holds as a good description of the internals of cl-arith.

Starting the interpreter

After building the interpreter's executable, you can start it from the command line:

 7:57AM 87 % ./cl-arith
OAME>

The interpreter uses "OAME>" or "AMEN>" as its prompts for user input. cl-arith has a strict grammar, so you must type in either a term for reduction, or an interpreter command, or a command to examine an expression or change mode.

You have to use keyboard end-of-file (usually control-D) to exit cl-arith.

Giving the interpreter a CL term causes it to execute the usual functional language interpreter's read-eval-print loop. The interpreter prints the input as a minimally-parenthesized expression, reduces it to normal form, then prints the normal form. It shows a prompt, then waits for user input.

cl-arith uses a leftmost-outermost reduction scheme.

Command line options

-B algorithmUse algorithm as default for bracket abstraction. One of amen or oame
-cenable reduction cycle detection
-ddebug reductions
-eelaborate output
-L filenameLoad and interpret a file named filename
-mon exit, print memory usage summary
-M modestart in mode, one of "oame", "amen", "stenlund", "hancock"
-N numberperform up to number reductions
-pDon't print an interactive-use prompt.
-ssingle-step reductions
-T numberevaluate an expression for up to number seconds
-ttrace reductions

The -e or -s options have no use without the -t option, but -t alone can perform handy tasks.

-L filename can occur more than one time. cl-arith will interpret the files in order. After the last (or only) file, it prints the OAME> or AMEN> prompt, then waits for input from the user.

Using the interpreter

Interactive input

I designed cl-arith for use as an interactive system, with a user typing CL expressions at a prompt. The interpreter reduces the expression to a normal form (if it has one), or hits some other limit, like a pre-set timeout, count of allowed reductions or the user's patience.

The interpreter's prompt for input is ether the string "AMEN>" or the string "OAME>". A prompt appears when after the interpreter starts, or has finished reducing whatever expression the user gave it last, or it has executed an interpreter command.

You have to type an end-of-file character (almost always control-D) to quit, as cl-arith has no built-in "exit" or "quit" command.

A keyboard interrupt (almost always control-C) can interrupt whatever long-running reduction currently takes place. The interpreter will display the "OAME>" or "AMEN>" prompt after interruption, depending on the current mode.

A keyboard interrupt at either prompt will cause the interpreter to exit.

Non-interactive input

The -p command line option causes the interpreter to not print a prompt. This probably only has use for non-interactive input. The interpreter does read from stdin and write to stdout. You can use it as a non-interactive "filter", with input and output redirection. When doing exhaustive searching, I've found it best to leave the prompt on, to distinguish inputs from outputs.

Grammar, briefly

Grammatically, expressions consist of two terms, and terms consist of either a built-in primitive or a variable or an expression. Parentheses can surround expressions.

Each built-in primitive has a name consisting of a single upper-case letter: O, A, M, and E, or A, M, E, and N. Variables (which can also serve as abbreviations) can look like C or Java style identifiers: a letter or underscore, followed by zero or more letters or underscores or digits. Non-abbreviation variables act as inert placeholders.

An abstracted variable looks like any identifier surrounded by square brackets. For example, [abc] abstracts a variable named "abc", not 3 variables "a", "b" and "c". The interpreter performs bracket abstraction upon finding an abstracted variable, and uses the resulting expression (which has the variable removed) as a term.

Parentheses

Users can parenthesize input expressions as much or as little as they desire, up to the limits of left-association and the meaning they wish to convey to the interpreter. The grammar used by cl-arith does not allow single variables or primitives inside paired parentheses. It treats inputs like "(A)" as syntax errors. You have to put at least two primitives and/or variables inside a pair of parentheses, and each parentheses must have a matching counterpart.

The interpreter treats terms as "left associative", the standard in the Combinatory Logic literature. That means that an expression like: A a b c d gets treated as though it had parentheses like this: ((((A a) b) c) d)

To apply one complex term to another, the user must use parentheses. Applying A (N N) (N N) to (N N) would look like this: (A (N N) (N N)) (N N).

The interpreter prints out normal formas with minimal parentheses. Users can cut-n-paste output into the input, as output has valid input syntax. No keyboard shortcuts exist to re-use any previous input or output.

Built-in Primitives

I built-in two basis sets of primitive combinators.

OAME Basis

O a b → b
A a b c d → a c (b c d)
M a b c → a (b c)
E a b → b a

These primitives work when the interpreter is in "oame" or "stenlund" mode. The interpreter will display "OAME>" as the prompt. This is the interpreter's default mode.

AMEN Basis

A a b c d → b c (a c d)
M a b c → b (a c)
E a b → b a
N a b → b

These primitives work when the interpreter is in "amen" or "hancock" mode. The interpreter will display "AMEN>" as the prompt. You muse use the -M amen command line switch or issue a mode amen command to use this basis.

Built-in combinators require a certain number (two to four) of arguments to cause a reduction. Without the required number, no reduction takes place.

N (for "naught", apparently) and O (for "zero") act exactly the same. They have different names, but perform the same action.

The {O, A, M, E} basis M (for "multiplication") combinator gets called B in most works on combinatory logic. The {A, M, E, N} basis has M as what Smullyan calls the "Queer Bird" in his book To Mock A Mockingbird.

{O, A, M, E} and {A, M, E, N} bases have different combinators A and M, sharing common names. Using the mode command indiscriminately can lead to great confusion.

Basis Rationale

Both bases have arithmetic in Church Numerals as their motivation. O or N can represent the number zero. O O or N N can represent the number one. A adds two Church Numerals, M multiplies, and E exponentiates.

A given number does not have a unique normal form representation using the basis combinators. For example, 6 has at least these representations (in OAME basis):
M (A (O O) (O O)) (A (O O) (A (O O) (O O)))
A (A (O O) (O O)) (A (A (O O) (O O)) (A (O O) (O O)))

I suppose this is analogous to ordinary arithmetic representations like (2 × 3) or (2 + 2 + 2). The usual rules for arithmetic expressions cause you to perform the operations, while multiple representations in an arithmetic combinator basis can have normal form - they don't reduce.

Bracket Abstraction

"Bracket abstraction" names the process of creating a CL expression from an original expression, without a specified variable. Evaluating the expression with an appropriate argument (the specified variable) gives the original expression.

The cl-arith interpreter uses the conventional square-bracket notation. For example, to create an expression that will duplicate its single argument, one would type:

OAME> [x] x x

The x constitutes a "dummy" variable: you can use any valid variable name inside the square brackets. The variable name gets abstracted away, and will not appear in the term that cl-arith calculates, even if it appears in the subject term.

Bracket abstraction creates an expression, so you can use a bracket abstractioner where ever you might use any other expression: defining an abbreviation, in a sub-expression of a much larger expression, as an expression to evaluate immediately, or inside another bracket abstraction. For example, you could create Turing's fixed-point combinator like this:

AMEN> abstraction amen
AMEN> def U [x][y] (x (y y x))
AMEN> def Yturing (U U)

The interpreter uses the expression it calculates for the bracket abstraction of x and y for the definition of the abbreviation U.

Note the use of nested bracket abstractions. The abstraction of y occurs first, then x gets abstracted from the resulting expression.

Algorithms

cl-arith offers two different bracket abstraction algorithms, one for the stenlund/oame mode, another for the hancock/amen mode. In the following, I use an x to denote any abstracted variable. Abstracted variables can have any lexically valid format.

These algorithms are of my own devising, I cannot pretend that they're the most compact possible, the most efficient, or have other desirable properties.

You can specify an algorithm in the abstraction itself: OAME> [x]amen (x (K x))

Changing mode changes the default algorithm.

cl-arith's grammar allows mixing algorithms, one for each variable abstracted. The resulting expression may not make any sense.

The K Combinator

In the OAME basis, the smallest combinator that acts like the traditional K is A E (M O). I didn't use it in the OAME bracket abstraction algorithm as A E (M O) x is already in normal form. Given a single argument, the K-action combinator M (M (E O) M) E (and many others) reduces to the three-atom form above.

In the AMEN basis, I found M E (M N) as the shortest equivalent of K by exhaustive search.

The S Combinator

The above algorithms look very similar. As a practical exercize, the standard combinator S, S a b c → a c (b c), works out like this:

AMEN> [p][q][r] p r (q r)
M (M (A E) M) (M E)
OAME> [p][q][r] p r (q r)
M (E (M M E)) (M M (M E (M (E E) A)))

I don't know why such a difference in size exists between the two bases' S-like combinator. By exhaustive search, I can say that no closed S-like combinator exists in the {O, A, M, E} basis with 8 or fewer atoms.

Again by exhaustive search, I can say that the four shortest closed S-equivalents in the {A, M, E, N} basis are:

All four of these terms end up giving me the same bracket abstraction rule for [x] P Q, x ∈ FV(P), x ∈ FV(Q) as above, when evaluated with two arguments.

Abbreviations

define and def allow a user to introduce "abbreviations" to the input. Each time the abbreviation appears in input, cl-arith makes a copy of the term so abbreviated, and puts it in the input. No matter how complex the expression, an abbreviation still comprises a single term.

def makes an easy-to-type abbreviation of define.

The reduce command actually constitutes an expression, just like a bracket abstraction. Unlike define or def, you can use reduce anywhere an expression would fit, as part of a larger expression, as part of an abbreviation, or as part of a bracket abstraction.

Information about expressions

print lets you see what abbreviations expand to, without evaluation.

The "=" sign lets you determine lexical equality. All combinators, variables and parentheses have to match as strings, otherwise "=" deems the expressions not equivalent. Note that you can include a reduce keyword in either or both of the expressions you compare with "=". Otherwise, it will compare the two expressions without any reduction.

size gives one measure of an expression, the number of atoms, variables, and application nodes in its parse tree. For instance, N (N N) has a size of 5, 3 N atoms, and two applications: (N N) and N applied to (N N).

Intermediate output and single-stepping

You can issue any of these commands without an "on" or "off" argument to inquire about the current state of the directive.

trace, debug and elaborate provide increasingly verbose output. trace shows the expression after each reduction, debug adds information about which branch of an application it evaluates, and elaborate adds to that information useful in debugging memory allocation problems.

step on causes the interpreter to stop after each reduction, print the intermediate expression, and wait, at a ? prompt for user input. Hitting return goes to the next reduction, n or q terminates the reduction, and c (for "continue") causes it to not single-step. cl-arith will carry out any remaining reductions without asking permission.

detect on only has an effect if trace on. It causes the trace output to mark reducible atoms with an asterisk.

Reduction information and control

You can turn time outs off by using a 0 (zero) second timeout. Similarly, you can turn reduction-count-limited evaluation off with a 0 (zero) count.

timer on also times bracket abstraction.

Reading files

You only have to double-quote filenames with whitespace in them. You can use absolute filenames (beginning with "/") or you can filenames relative to the current working directory of the cl-arith process.

Mode

Use the {A, M, E, N} basis:

Use the {O, A, M, E} basis:

The mode command switches between the {O, A, M, E}basis, and the {A, M, E, N} basis. That is, the M and A names get used for different combinators depending on the "mode" setting of the interpreter.

The interactive prompt reflects interpreter's current "mode", appearing as OAME> or AMEN>. The interpreter's mode also chooses a default bracket abstraction algorithm.

Example code

Directory tests.in has lots of examples of combinatory logic expressions in action, unfortunately mixed in with simpler code that merely exercizes interpreter features, or checks to ensure that a bug doesn't reappear.

Building and installing

  1. Get the source code. You can:
  2. Unpack source code. From the command line: tar xzf cl-arith-1.1.tar.gz
  3. Change directory. From the command line: cd cl-arith-1.1
  4. Compile source code. From the command line: make gnu.
    That target ("gnu") should work for almost every linux or BSD computer. Other make targets might work better. make cc uses traditionally-unix-named tools, and may work better on Solaris.
  5. At this point, you can test the newly-compiled executable. From the command line: ./runtests. Most of the tests should run quite rapidly, in under a second. At least two of the tests run for 30 seconds or so, and at least one of the tests deliberately provokes a syntax error message from the interpreter.
  6. Install the interpreter if you want, or you can execute it in-place. To install, use the cp or mv commands to move or copy the executable to where ever you want it. It does not care what directory it resides in, and it does not look for configuration files anywhere.

I have built this on x86 Slackware 12.0 Linux, SPARC/Solaris 9 using GCC in both 32- and 64-bit modes, and I've built it on a 64-bit, Opteron-based Red Hat Enterprise Linux 3 system, GCC v3.2.3 I expect it will build easily on any *BSD-based system, but I haven't actually tried that.

I did not have the patience to get it to build on a "PA-RISC" HP-UX system: the ghastly "C89" compiler hated non-ANSI-standard signals, and the "cc" compiler hated function prototypes.

Bibliography

In-print books do not make any mention of an arithmetic combinator basis for Combinatory Logic. You mostly have to scrounge for papers on the world wide web.