This document describes how to build and use `cl` v1.6. `cl` interprets
a programming language with a lot of similarities to various "Combinatory Logic" (CL)
formal systems. It doesn't exactly interpret any "Combinatory Logic" in that
it runs on computers with finite CPU speed and a finite memory.
Most or all formal systems fail to take these limits into account.

Release 1.6 fixes a few bugs, tries to improve performance by eliminating code from the most common execution path and adds a way to stop a reduction when the result matches a pattern.

I started writing this interpreter as a way to test the "graph reduction"
implementation of a lambda calculator. Along the way, I bought a copy
of Raymond Smullyan's To Mock A Mockingbird.
While working that book's problems,
I ended up making `cl` into
a reasonably complete and usable system.

Another page documents the design and implementation of
an earlier version of `cl`.
That document still holds as a good
description of the internals of `cl`.

After building the interpreter's executable, you can start it from the command line:

7:57AM 87 % ./cl CL>

The interpreter uses "`CL>`" as its prompt for user input.
`cl` has a strict grammar, so you must type either
a term for reduction, or an interpreter command,
or a command to examine a term.

You have to use keyboard end-of-file (usually control-D) to exit `cl`.

Giving the interpreter a CL term causes it to execute a read-eval-print loop. After reading and parsing the input, the interpreter prints the input in a minimally-parenthesized representation, reduces it to normal form, and prints a text representation of the normal form. It prints a prompt, and waits for another user input.

`cl` does "normal order" evaluation: the leftmost outermost redex
gets evaluated first. This seems like the standard for CL, unlike
for lambda calculus, where a lot of ink gets expended distinguishing between
"normal order"
and "applicative order".

-c | enable reduction cycle detection |

-d | debug contractions |

-L filename | Interpret a file named filename before reading user input |

-m | on exit, print memory usage summary |

-N number | perform up to number contractions |

-p | Don't print any prompt. |

-s | single-step contractions |

-T number | evaluate an expression for up to number seconds |

-t | trace contractions |

-C combinator | treat combinator as a non-primitive. Combinator one of S, K, I, B, C, W, M, T, J |

-B algorithm | Use algorithm as default for bracket abstraction. One of curry, curry2, church, tromp, grz, btmk, turner |

The `-s` option has no use without the `-t` option, but `-t` alone can perform
handy tasks.

You can use one or more `-C combinator` options to confine yourself to a given
basis, or turn off the K combinator to work in λI calculi.

`-L filename` can occur more than one time.

I designed `cl` for use as an interactive system, with a user typing
CL expressions at a prompt. The interpreter reduces the expression
to a normal form (if it has one), or hits some other limit, like a pre-set
timeout, count of allowed contractions or the user's patience.

The built-in prompt for input is the string "`CL>`".
It appears when the interpreter starts up, or has finished reducing whatever
expression the user gave it last, or it has executed an interpreter command.

You have to type an end-of-file character (almost always control-D) to quit, as it has no built-in "exit" or "quit" command.

A keyboard interrupt (almost always control-C) can interrupt whatever long-running
reduction currently takes place. A keyboard interrupt at the "`CL>`"
prompt will cause the interpreter to exit.

The `-p` command line option causes the interpreter to not print a
prompt. This might only have use for non-interactive input. The interpreter
does read from stdin and write to stdout. You can use it as a non-interactive
"filter", with input and output redirection.

Expressions consist of either a single term, or two (perhaps implicitly parenthesized) terms. Terms consist of either a built-in primitive or a variable, or a parenthesized expression.

Built-in pimitives have names consisting of a single upper-case letter. Variables (which can also serve as abbreviations) can look like C or Java style identifiers: a letter, followed by zero or more letters or underscores. You cannot define a variable (or an abbreviation) with the same name as a built-in pimitive.

The interpreter treats primitives and variables as "left associative",
the standard in the Combinatory Logic literature.
That means that an expression like this:
`I a b c d` ends up getting treated as though it had parentheses
like this: `((((I a) b) c) d)`

To apply one complex term to another, the user must use parentheses.
Applying `W (W K)` to `C W` would look like this:
`(W (W K)) (C W)`.

Users can parenthesize input expressions as much or as little as they desire,
up to the limits of left-association and the meaning they wish to convey
to the interpreter.
The grammar used by `cl` does not
allow single terms inside paired parentheses. It considers strings
like "`(I)`" as syntax errors. You have to put at least two terms
inside a pair of parentheses, and parentheses must have matches.

The interpreter prints out normal forms in minimal-parentheses style. Users have the ability to cut-n-paste output back into the input, as output has valid syntax. No keyboard shortcuts exist to re-use any previous output.

I built-in 9 primitive combinators. They contract like this:

I a → a |

K a b → a |

S a b c → a c (b c) |

B a b c → a (b c) |

C a b c → a c b |

W a b → a b b |

T a b → b a |

M a → a a |

J a b c d → a b (a d c) |

This set of built-ins lets you use `{S, K}`, `{S, K, I}`,
`{B, W, C, K}`, `{B, M, T, K}` bases for λK calculi,
or
`{J, I}`, `{B, C, W, I}` and `{I, B, C, S}` as bases for λI calculi.

Built-in combinators require a certain number (one to four) of arguments to cause a contraction. They just sit there without that number of arguments.

You can "turn off" any of the nine combinators as built-ins with a `-C X`
command line option (

"Bracket abstraction" names the process of creating a CL expression without specified variables, that when evaluated with appropriate arguments, ends up giving you the original expression with argument(s) in the place of the specified variables.

The `cl` interpreter uses the conventional square-bracket
notation. For example, to create an expression that will duplicate
its single argument, one would type:

`CL> [x] x x`

You can use more than one variable inside square brackets, separated with commas:

`CL> [a, b, c] a (b c)`

The above square-bracketed expression ends up working through three
bracket abstractions, abstracting `c` from `a (b c)`,
`b` from the resulting expression, and `a` from
that expression. You can nest bracket abstractions: `[a][b][c] a (b c)`
should produce the same expression as the example above.

A bracket abstraction makes an expression, so you can use them where ever you might use any other simple or complex expression, defining an abbreviation, a sub-expression of a much larger expression, as an expression to evaluate immediately, or inside another bracket abstraction. For example, you could create Turing's fixed-point combinator like this:

CL> def U [x][y] (x(y y x)) CL> def Yturing (U U)

Note the use of nested bracket abstractions. The abstraction of `y`
occurs first, then `x` gets abstracted from the resulting expression.

You could express `[x][y] (x(y y x))`
as `[x,y] (x(y y x))`. The same nested abstraction occurs.

`cl` offers seven bracket abstraction algorithms:

`curry`- classic, minimalistic three-rule system for {S,K,I} basis.`curry2`- four-rule system for {S,K,I} basis. Hindley and Seldin's Definition 2.18`tromp`- John Tromp's nine-rule system for compact results in the {S,K,I} basis.`turner`- Simplified Turner algorithm, using S,B,C,K,I combinators.`grz`- the Grzegorczyk algorithm for {B,W,C,K} basis.`btmk`- algorithm for {B,T,M,K} basis.`church`- algorithm for {I,J} basis.

You can set a "default" algorithm with the this command:
`abstraction name`. For

You can specify the abstraction algorithm next to the abstracted variable:
`CL> [x]btmk (x (K x))`

`cl's` grammar allows mixing algorithms, one for each
pair of square brackets. The resulting expression may not make any sense.

- [x] x →
**I** - [x] N →
**K**N*x does not appear in N* - [x] M N →
**S**([x] M) ([x] N)

- [x] x →
**I** - [x] N →
**K**N*x does not appear in N* - [x] M x → M
*x does not appear in M* - [x] M N →
**S**([x] M) ([x] N)

Hindley & Seldin call this "Definition 2.18" in the 2008 edition of
*Lambda Calculus and Combinators: An Introduction*.
This algorithm has the distinct advantage of giving back a single combinator
when you ask for that combinator's definition as a bracket abstraction.
`[p,q,r] p r (q r)` works out to **S**, and so forth.

I put in an algorithm that uses **B**, **T**,
**M** and **K** combinators. I devised it,
but presumably someone
else has come up with this same algorithm in the past.
The {B, T, M, K} basis is not unknown.

- [x] x →
**B (T M) K** - [x] Z →
**K**Z*x not appearing in Z* - [x] Q x → Q
*x not appearing in Q* - [x] Q P →
**B**Q ([x] P)*x appears only in P, not in Q* - [x] Q P →
**B**(**T**P) ([x] Q)*x appears only in Q, not in P* - [x] Q P →
**B**(**T**(**B**(**T**[x]P) (**B****B**[x]Q))) (**B****M**(**B****B****T**))*x appears in both Q and P*

`cl` includes an implementation of
M. Price and H. Simmon's reconstruction of Grzegorcyzk's "g" algorithm, using
**B**, **C**, **K**, **W**, **I** combinators.

- [x] x →
**I** - [x] Z →
**K**Z*x not appearing in Z* - [x] Q x → Q
*x not appearing in Q* - [x] Q P →
**B**Q ([x] P)*x appears only in P, not in Q* - [x] Q P →
**C**([x]Q) P*x appears only in Q, not in P* - [x] Q P →
**W**((**B**(**C**([x]Q)))([x]P))*x appears in both P and Q*

If you consider **W K** as a replacement for **I**,
you can get away with using **B**, **C**, **K**
and **W**.

Abstraction for I,J basis from A. Church,
*A Proof of Freedom From Contradiction*,
Proceedings of National Academy of Sciences, Vol 21, 1935, page 275.

- [x] x →
**I** - [x] U V →
**J**(**J****I****I**) ([x]V) (U**I****J**)*where x appears only in V* - [x] U V →
**J**(**J****I****I**) V [x]U*where x appears only in U* - [x] U V →
**J**(**J****I****I**) (**J****I****I**) (**J****I**(**J**(**J****I****I**) (**J****I****I**) (**J**(**J****I****I**) [x]U (**J**(**J****I****I**) [x]V**J**))))*where x appears in U and V.*

The {I,J} basis constitutes a "λI" basis. All well-formed abstractions
contain the abstracted variable in the "body" of the abstraction.
The above algorithm will not work given input like `[x,y]x`,
where the abstracted variable `y` does not appear in the expression
out of which `y` gets abstracted.
In fact, that input crashes the interpreter.

John Tromp created a nine rule algorithm
(in Binary Lambda Calculus and Combinatory Logic)
which optimizes for the minimum number of
**S** and **K** combinators in the abstracted
expression. Tromp wants to get a small self-interpreter, and he confines
himself to just **S** and **K** combinators.
For example, the "S K M → S K" rule could change to "S K M → K I",
if you have an **I** primitive.

- [x](
**S****K**M) →**S****K***(For all M)* - [x] M →
**K**M*(x not appearing in M)* - [x] x →
**I** - [x] M x → M
*(x not appearing in M)* - [x] x M x → [x] (
**S****S****K**x M) - [x] (M (N L)) → [x] (
**S**([x] M) N L)*(M, N combinators)* - [x] ((M N) L) → [x] (
**S**M ([x] L) N)*(M, L combinators)* - [x] ((M L) (N L)) → [x](
**S**M N L)*(M, N combinators)* - [x] M N →
**S**([x] M) ([x] N)

Phrases like *M, N combinators* in rules 6, 7 and 8
mean "M, N have no variables, free or otherwise". This abstraction algorithm has the
goal of smallest resulting expression. For Tromp's purposes, any variable exists
for the purpose of abstracting it out of the expression. These rules don't produce
the smallest resulting expression if you abstract another variable from them.

I implemented rule 8 by doing a depth-first traversal of parts of parse trees to determine if the potential "L" portions of the rule equate. A faster method of doing that may exist.

- [x] x →
**I** - [x] N x → N
*x not appearing in N* - [x] N →
**K**N*x not appearing in N* - [x] M N →
**C**([x]M) N*x appears only in M, not in N* - [x] M N →
**B**M ([x] N)*x appears only in N, not in M* - [x] M N →
**S**([x]M) ([x]N)*x appears in both M and N*

`define`*name**expression*`def`*name**expression*`reduce`*expression*

`define` and `def` allow a user to introduce "abbreviations"
to the input. Each time the abbreviation appears in input, `cl` makes a copy
of the expression so abbreviated, and puts that copy in the input.
No matter how complex the expression, an
abbreviation still comprises a single term. Effectively, the interpreter puts
the expanded abbreviation in a pair of parentheses.

`def` makes an easy-to-type abbreviation of `define`.

A name in interpreter input has the same format as a Java or C identifier: a letter, followed by zero or more letters, digits or underscores.

The `reduce` command produces an expression, just
like a bracket abstraction. Unlike `define` or `def`,
you can use `reduce` anywhere
an expression would fit, as part of a larger expression, as part of an
abbreviation, or as part of a bracket abstraction.

`size`- print the number of atoms in*expression**expression*.`length`- print the number of atoms plus number of applications in*expression**expression*.`print`- print human-readable representation, with abbreviations expanded, but without evaluation.*expression*`printc`- print canonical representation, with abbreviations substituted, but without evaluation.*expression*`redexes`- print a count of possible contractions in*expression**expression*, regardless of order of evaluation.- determine lexical equivalence of any two expressions, after abbreviation substitution, but without evaluation.*expression*=*expression*

`print` lets you see what abbreviations expand to, without evaluation, as does `printc`.
The "=" sign lets you determine *lexical* equality. All combinators, variables and parentheses
have to match as strings, otherwise "=" deems the expressions not equivalent.
You can put in explicit `reduce` commands on both sides of an "=",
otherwise, no evaluation takes place.

`size` and `length` seem redundant, but authorities
measure CL expressions different ways. These two methods should cover the
vast majority of cases.

`trace on|off``debug on|off``step on|off`

You can issue any of these commands without an "on" or "off" argument to inquire about the current state of the directive.

`trace` and `debug` make for
increasingly verbose output. `trace` shows the expression
after each contraction, `debug` adds information about which
branch of an application it evaluates.

`detect` causes `trace` to also print a count
of possible contractions (not all of them normal order reductions),
and mark contractable primitives with an asterisk.

`step on` causes the interpreter to stop after each contraction,
print the intermediate expression, and wait, at a `?` prompt
for user input. Hitting return goes to the next contraction, `n`
or `q` terminates the reduction, and `c` causes it
to quit single-stepping.

`timer on|off`- turn on/off per-reduction elapsed time output.`timeout 0|N`- stop reducing after`N`seconds.`count 0|N`- stop reducing after`N`contractions.`cycles on|off``detect on|off`

You can turn time outs off by using a 0 (zero) second timeout. Similarly, you can turn contraction-count-limited-evaluation off with a 0 (zero) count.

`timer on` also times bracket abstraction.

Some CL expressions end up creating a cycle: `M M` or `W W W`
or `B (W K) (W (W K)) (B (W K) (W (W K)))`. After a certain number of
contractions, the interpreter ends up with an expression it has already encountered.
If you issue the `cycles on` command, the interpreter keeps track of
every expression in the current reduction, and stops when it detects a cyclical
reduction.

`detect on` causes the interpreter to count and mark primitives eligible
for contraction (with an asterisk),
regardless of reduction order. It does "normal order" reduction, but ignores that
for the contraction count. This only has utility with `trace on`.

Turning cycle detection on will add time to an expression's reduction, as will possible contraction detection.

`cl` v1.6 adds a way to control reduction: stop when the (partially)
reduced expression matches a pattern.
`cl` can perform pattern matching after each contraction, when
the user sets a pattern using the `match` command.
`unmatch` destroys the internal state created by `match`,
so the user can't recall a previous `match` except by complete
re-entry of the command.

`match`- stop reduction if and when*pattern**pattern*appears.`unmatch`- relieve the interpreter of checking for a pattern match during reduction.

Patterns look like expressions with one exception. An asterisk ('*') acts as a wildcard, matching any expression. Any other variable or built-in primitive ocurring in the pattern matches itself literally.

For example, issuing the command `match S K K` would cause the reduction of
`S (I K) (S K) K` to stop after a single contraction.

`load "`*filename*"

You have to double-quote filenames with whitespace or
non-alphanumeric characters in them.
You can use absolute filenames (beginning with "/") or you can
filenames relative to the current working directory of the `cl`
process.

The `printc` command, and cycle detection output
use a canonical form of representing a CL expression.

In this case, "canonical" means: pre-order, depth-first, left-to-right traversal of the parse tree, output of a period (".") for each application, and a space before each combinator or variable.

A simple application (`K I`, for example) looks like this: `.K I`

A more complex application (`P R (Q R)`) looks like this: `..P R.Q R`

The advantage to this sort of notation is that every application appears explicitly, and variant, semantically-equivalent parentheses don't appear.

I developed this program on Slackware and Arch Linuxes. I used versions of the C compilers GCC, LCC, PCC, TCC and Clang. I tried to make it as ANSI-C as possible by using various compilers, not allowing any compiler warnings when building, and using as many warning flags for GCC compiles as possible.

I expect it will build on any *BSD-based system, but I haven't tried that.

- Get the source code. You can:
- Download via your browser.
- Download using wget:
`wget http://www.stratigery.com/cl-1.6.tar.gz`

- Unpack source code. From the command line:
`tar xzf cl-1.6.tar.gz` - Change directory. From the command line:
`cd cl-1.6` - Compile source code. From the command line:
`make gnu`.

That target ("gnu") should work for almost every linux or BSD computer. Other make targets might work better.`make cc`uses traditionally-unix-named tools, and may work better on Solaris. Traditional, AT&T derived`lex`will require a minor edit to file`lex.l` - At this point, you can test the newly-compiled executable.
From the command line:
`./runtests`. Most of the tests should run quite rapidly, in under a second. At least two of the tests run for 30 seconds or so, and at least one of the tests provokes a syntax error message from the interpreter. - Install the interpreter if you want, or you can execute it in-place.
To install, use the
`cp`or`mv`commands to move or copy the executable to where ever you want it. It does not care what directory it resides in, and it does not look for configuration files anywhere.