One way to avoid the tricky interaction between free and bound names in the substitution operator is to pick a representation for expressions that doesn’t have any names at all! Intuitively, we can think of a bound variable is just a pointer to the that binds it. For example, in , the points to the first and the points to the second .
So-called de Bruijn notation uses this idea as the representation for expressions. Here is the grammar for expressions in de Bruijn notation: Variables are represented by integers that refer to (the index of) their binder while lambda-abstractions have the form . Note that the the variable bound by the abstraction is not named--i.e., the representation is nameless.
As examples, here are several terms written using standard notation and in de Bruijn notation:
Standard
de Bruijn
To represent a -expression that contains free variables in de Bruijn notation, we need a way to map the free variables to integers. We will work with respect to a map from variables to integers called a context. As an example, if maps to 0 and to 1, then the de Bruijn representation of with respect to is 0 1, while the representation of with respect to is . Note that in this second example, because we have gone under a , we have shifted the integers representing and up by one to avoid capturing them.
In general, whenever we work de Bruijn representations of expressions containing free variables (i.e., when working with respect to a context ) we will need to modify the indices of those variables. For example, when we substitute an expression containing free variables under a , we will need to shift the indices up so that they continue to refer to the same numbers with respect to after the substitution as they did before. For example, if we substitute 0 1 for the variable bound by the outermost in we should get , not . We will use an auxiliary function that shifts the indices of free variables above a cutoff up by : The cutoff keeps track of the variables that were bound in the original expression and so should not be shifted as the shifting operator walks down the structure of an expression. The cutoff is 0 initially.
Using this shifting function, we can define substitution as follows: Note that when we go under a we increase the index of the variable we are substituting for and shift the free variables in the expression up by one.
The rule for terms in de Bruijn notation is as follows: That is, we substitute occurrences of 0, the index of the variable being bound by the , with shifted up by one. Then we shift the result down by one to ensure that any free variables in continue to refer to the same things after we remove the .
To illustrate how this works consider the following example, which we wrote as in standard notation. We will work with respect to a context where and . which, in standard notation (with respect to ), is the same as .
Combinators
Yet another way to avoid the issues having to do with free and bound variable names in the -calculus is to work with closed expressions or combinators. It turns out that just using two combinators, S, K, and application, we can encode the entire -calculus.
Here are the evaluation rules for S, K, as well as a third combinator I, which will also be useful: Equivalently, here are their definitions as closed -expressions: It is not hard to see that I is not needed—it can be encoded as S K K.
To show how these combinators can be used to encode the -calculus, we have to define a translation that takes an arbitrary closed -calculus expression and turns it into a combinator term that behaves the same during evaluation. This translation is called bracket abstraction. It proceeds in two steps. First, we define a function that takes a combinator term possibly containing free variables and builds another term that behaves like , in the sense that for every term : Second, we define a function that maps a -calculus expression to a combinator term: As an example, the expression is translated as follows: We can check that this behaves the same as our original -expression by seeing how it evaluates when applied to arbitrary expressions and . and