Function Officium

[**Part One**] / [Part Two]

William F. Barnes

Copyright © 2014-2024 by William F. Barnes. All rights reserved. Unauthorized retention, duplication, distribution, or modification is not permitted. [Legal]

A *function* is an abstract "action" that receives an input and produces an output. Depending on one's path of education, the notion of "functions" tends to first arise in one of two disciplines: mathematics or computer programming.

In mathematics, one learns early about graphing equations, such as straight lines:

y = mx + b

or circles:

x^2 + y^2 = r^2

It's all candy and cupcakes until the teacher introduces the dreaded **f(x)** notation, where the notion of a "function" **f** replaces the **y**-variable in *some* equations, such as that of a line, but not *all* equations, such as that of a circle. In the notation **f(x)**, the quantity **x** is input, also known as the *argument*. The *output* of the function is just **f(x)**.

When it comes to sorting out just which mathematical equations correspond to functions, always stick to the definition: *one output for one input*. This means all non-vertical lines are functions:

if (m <> +/- infinity):

y = f(x) = mx + b

A circle, on the other hand, fails this test. To see why, solve the circle's equation for **y**, giving:

y = +/- sqrt(r^2 - x^2)

Note the **+/-** symbol, which splits the **y**-equation into two branches, one positive and the other negative. This means that for a given **x** on the circle, there are *two* **y**-values, so **y** as written cannot be a function. To deal with this, the circle must be described by two functions, one for each side of the **x**-axis:

f(x) = +sqrt(r^2 - x^2)

g(x) = -sqrt(r^2 - x^2)
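As a quick check in code (a Python sketch, not part of the original QB64 discussion), each half-circle function passes the one-output-per-input test on its own:

```python
import math

def f(x, r=1.0):
    # upper half of the circle: exactly one output per input
    return math.sqrt(r**2 - x**2)

def g(x, r=1.0):
    # lower half of the circle
    return -math.sqrt(r**2 - x**2)

print(f(0.0), g(0.0))  # → 1.0 -1.0
```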

Inside the function, **s** is called a *parameter*. (Recall that outside the function, the value that **s** represents was an *argument*.) The body of the function uses QB64 language primitives (also functions) to perform operations on **s** (without editing it!), storing the result in the output string. In practice, one may use the above function to write **h$("hello")** to receive "**Hello!**" back.
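The QB64 listing for **h$** is not reproduced here, but a rough Python analogue (hypothetical name **h**) of such a pure string function might read:

```python
def h(s):
    # Pure: s is read but never modified; a brand-new string is returned.
    return s.capitalize() + "!"

print(h("hello"))  # → Hello!
```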

Another structure for code reuse is the "subroutine", or "subprogram". Using a *call* statement, control is passed away from the main program, into the subprogram, and then back to the main program under the call statement. The main program should call the subroutine more than once to truly claim "code reuse"; however, subroutines also appeal to code organization even if this isn't the case.

The similarity between functions and subroutines stops here, however, or at least it should. It is tempting (but wrong) to believe, whether by one's own devices or through institutional misinformation, that functions should be used just like subroutines, especially when you want a value returned. On the contrary, functions should be regarded as additional "keywords" that the programmer adds to the language.

Going back to the mathematical view of a function, **f(x)** is considered a *pure function*, which is to (re-)state the fact that the same input always corresponds to the same output. It must not matter how long the calculation takes, what else is going on in the program, which phase the moon is in - nothing.

A pure function is logically equivalent to a lookup table. Such a lookup table would be infinitely tall if the inputs **x** are unconstrained, but nonetheless we can imagine a pure function as addresses (**x**-values) pointing to data (**f(x)**-values).

It's hard to even conceive of an *impure function* in a mathematical framework, but this is a no-brainer in computer programming. (Emphasis on "no-brainer". We will develop a bias against impure functions as we go on.) Starting off easy, the simplest example of an impure function is one that returns the time:

The function **t$()**, which simply returns the system time, is not a pure function. The input is defined as always empty (no problem yet), however the output is always changing. If this function were in a pipeline of otherwise pure functions, the entire pipeline loses purity. Impure functions can get a lot worse than this, too. Time to talk about "state".
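A Python sketch of such a clock function (the QB64 original is not shown here) makes the impurity obvious:

```python
import time

def t():
    # Impure: the (empty) input never changes, yet the output does.
    return time.strftime("%H:%M:%S")

print(t())  # a different string every second
```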

The *state* of a program refers to everything the program has in memory. Every variable, function, array, stack, heap, and so on, is part of the state. Now we can make an important distinction: in computer programming, a pure function *cannot* change the program state. On the other hand, an impure function *might* change program state. Let's see a real impure function:

In the above we have **c** defined as a global variable. Inside the function however, the line **c = x * y** assigns a new value to **c**. After the function closes, **c** remains changed. Any other part of the program depending on **c** needs to contend with **c** changing whenever **impure(x,y)** is called. The function **impure(x,y)** thus changes program state, which is to say the function has *side effects*, in technical jargon. While plenty of BASIC programmers use side effects as a feature, we pursue a discipline that classifies side effects as spaghetti code.

To convert the above to a pure function, simply remove any reference to **c** from the function body and replace the helper variable with something privately-scoped:

Soon we'll step into the weeds by working with *high-order functions*. By definition, a high-order function is a function that (i) takes a function as input, and/or (ii) produces a function as its output. This seems innocent enough, perhaps mundane or odd, until you try to think of an example.

In ordinary algebra, it's awkward to think of passing a function to a function. For instance, writing **f(cos)** doesn't seem to mean anything at all. You'd rather see **f(cos(x))** with a numerical argument. In fact, the first mathematical application of "passing functions to functions" occurs in the *calculus of variations*, beyond the scope of this study.

Passing around a construction like **f(cos)** is slightly more natural in computer programming, but not in all languages. No BASIC-family language would know what to do with **f(cos)**. For the remainder of this study, in fact, we must leave QB64 behind and embrace more abstract means and methods. Let us even set computer languages aside for a moment and think abstractly.

Consider a function **f** that receives an object (synonym for "piece of data") **x** as an input, and outputs **f(x)** as the same *type* of object. Calling the output object **y**, we may write:

y = f(x)

Since **y** is an object of the type that **f** "wants" as input, we may calculate **f(y)** to produce a new output **z**:

z = f(y)

If we wanted **z** all along, it turns out **y** was just a middleman, so it's perfectly valid to write **z** in terms of the original **x** by layering two calls of **f** as shown:

z = f(f(x))

Kicking things up a notch, we now introduce the *compound function*:

g(x) = f(f(f(x)))

h(x) = f(f(g(x)))

The function **g(x)** is the same as above. By substituting **g(x)** for its equivalent in layers of **f(x)** (namely three), we discover **h(x) = f(f(f(f(f(x)))))**, which is *five* applications of **f** to the argument **x**.

Continuing the above example, it's easy to see that the function **h(x)** can also be made by swapping the order of calls to **f** and **g**. For instance, an equivalent representation is **h(x) = f(g(f(x)))**, or just as equivalently, **h(x) = g(f(f(x)))**. Each of these resolves to **h(x) = f(f(f(f(f(x)))))**.
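To see the layer-counting at work, here is a small Python sketch (using **+1** as a stand-in pure function, an assumption made purely for illustration, so that each application is countable):

```python
def f(x):
    return x + 1          # stand-in pure function: each layer adds 1

def g(x):
    return f(f(f(x)))     # three layers of f

def h(x):
    return f(f(g(x)))     # g's three layers plus two more: five in all

# All orderings resolve to five applications of f:
print(h(0), f(g(f(0))), g(f(f(0))))  # → 5 5 5
```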

To put a name to what we're doing, let's say that functions like

k(x) = g(g(g(x))) = f(f(f(f(f(f(f(f(f(x)))))))))

Recall the compound function **g(x) = f(f(f(x)))**. While this construction for applying **f** three times is efficient enough, wouldn't it be nice to build in the "three-ness" some other way? What if we need (far) more than three layers of **f( )**? To begin handling all of this, let me propose a new way to write **g(x)**:

g(x) = e(f,x) = f(f(f(x)))

The function **e(f,x)** is a high-order function. The second argument sent to **e** just carries **x** into the function. The first argument is *only* a function with no **(x)**-appendage. In practice, this means the **f**'s appearing in **f(f(f( )))** are treated as a variable, and may be swapped out for another function. To see the power of this, watch how easily we write **k(x)**:

k(x) = e(g,x) = g(g(g(x))) = f(f(f(f(f(f(f(f(f(x)))))))))

The function **e(f,x)** breaks the linearity ceiling. To keep going, we find **e(k,x)** contains **27** applications of **f**. The complexity is growing in powers of three:

e(k,x) = f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(x)))...)
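A Python sketch of **e** (again letting **+1** stand in for **f**, so the layer counts show up as integers):

```python
def f(x):
    return x + 1              # each application of f adds 1

def e(fn, x):
    # High-order: the first parameter is a bare function, no (x)-appendage.
    return fn(fn(fn(x)))

def g(x): return e(f, x)      # 3 applications of f
def k(x): return e(g, x)      # 9 applications of f

print(e(f, 0), e(g, 0), e(k, 0))  # → 3 9 27
```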

Needless to say, **f** really ought to be a pure function, otherwise the side effects will get out of hand.

To solve a real problem with this apparatus, consider the function:

f(x) = x - tan(x)

... along with the high-order function:

e(f,x) = f(f(f(x)))

Now, watch what happens when we evaluate **e(f,2)** (the **2** came out of nowhere, it's only a guess).

e(f,2) = f(f(2-tan(2)))

e(f,2) = f(2-tan(2)-tan(2-tan(2)))

e(f,2) = 2-tan(2)-tan(2-tan(2))-tan(2-tan(2)-tan(2-tan(2)))

e(f,2) = 3.26618627756911

The **f**- and **e**-functions used together apply a recursive algorithm for approximating digits of **pi** by Newton's method. To home in on more accuracy, we could feed the result **3.26618627756911** back into **e(f,x)**, but that's old news. Let's do it in one line by taking the **e**-function, and sending it **g** (where **g(x) = e(f,x)**) instead of **f**:

e(g,2) = g(g(g(2)))

e(g,2) = f(f(f(f(f(f(f(f(f(2)))))))))

e(g,2) = 3.141592653589793
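The whole pipeline fits in a few lines of Python (a sketch; **math.tan** stands in for the tangent):

```python
import math

def f(x):
    return x - math.tan(x)    # one Newton step toward a root of sin(x)

def e(fn, x):
    return fn(fn(fn(x)))      # three applications of fn

def g(x):
    return e(f, x)            # f applied three times

print(e(f, 2.0))  # ≈ 3.26618627756911
print(e(g, 2.0))  # ≈ 3.141592653589793
```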

We can't lean on QB64 *as such* to test these constructions. A programming language must qualify as *functional* to properly handle high-order functions.

There is a saying among programmers that goes something like "*style* is the part of your program ignored by the computer". While this statement is literally true, I point out that the computer doesn't let *you* ignore the consequences of style, particularly bad style. The programmer is welcome to make his own mess and eat it too.

Certain languages foresee bad coding style and "engineer it out" to a certain limit. For instance, the Java language attempts to quell spaghetti code by *not* allowing the **goto** statement among its keywords. Or, the Haskell language avoids side effects by *not* allowing variables to be changed once they're defined. Meanwhile, BASIC-family languages are notorious for installing no safeguards, allowing all kinds of bad habits to be compiled and shipped by a clueless programmer.

While it's useful to classify individual languages on the spectrum of "pureness", we can't always gravitate to the purest languages to write the best programs. Sometimes, especially from an engineering perspective, you simply "need" to work in a paradigm that supports mutable objects, pointer math, or *goto*. There isn't always room for eggheaded purism. The language itself can't always protect us, so the burden falls doubly back on the programmer to write the best code possible, despite having potential to make a huge mess.

One way to avoid making a mess is by employing *functional style*, which is the main "political leaning" in this study. This means to treat functions as the most important, and perhaps the *only* building blocks of a program. Complexity, if it must exist, should arise from compounding small functions rather than bulking up one large function. Each function should do one job, and do it perfectly well with no side effects.

To build the non-negative integers from scratch, we need only (i) the number **zero** as the initial input, and (ii) the notion of the "**+1**" operation, which adds **one** to any integer. This means we begin with:

x_0 = 0

f(x) = x + 1

Wrapping **x_0** inside **n** layers of **f(x)**, the result is the integer **x_n**:

x_1 = f(0) = 1

x_2 = f(f(0)) = 2

x_3 = f(f(f(0))) = 3

x_4 = f(f(f(f(0)))) = 4

x_5 = f(f(f(f(f(0))))) = 5

While **f(x)** alone gets the job done, one quickly sees the problem with writing all those **f**'s, especially for high integers. In fact, this approach is old news (linearly complex), and doesn't take advantage of any functional trickery. To update this, introduce the high-order function:

e(f,x) = f(f(x))

Also, let's write down a flurry of ordinary functions that employ **e**:

g(x) = e(f,x) = f(f(x))

h(x) = e(g,x) = f(f(f(f(x))))

i(x) = e(h,x) = f(f(f(f(f(f(f(f(x))))))))

j(x) = e(i,x) = ... (16 layers)

k(x) = e(j,x) = ... (32 layers)

As we list the functions **g**, **h**, **i**, etc., the number of **f**-layers doubles with each iteration. This is surely because the original high-order **e** function contains two calls to **f**, and we can expect a certain "two-ness" feeling as we proceed.
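In Python (again letting **+1** stand in for **f**, an assumption for illustration, so the layer count shows up as an integer):

```python
def f(x):
    return x + 1            # each layer of f adds 1

def e(fn, x):
    return fn(fn(x))        # two layers this time

def g(x): return e(f, x)    # 2 layers
def h(x): return e(g, x)    # 4 layers
def i(x): return e(h, x)    # 8 layers
def j(x): return e(i, x)    # 16 layers
def k(x): return e(j, x)    # 32 layers

print([g(0), h(0), i(0), j(0), k(0)])  # → [2, 4, 8, 16, 32]
```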

Now we're ready to re-create the list of integers. Carefully puzzling out each case, the following list emerges (stopping at 40):

Remarkably, the **f**'s, **g**'s, **h**'s, etc. represent the corresponding integer **x_n** in binary!

Here, functions are doing the job of *data*. The way the "data" is represented is strongly dependent on the function(s) beneath. For example, if instead of **e(f,x)=f(f(x))** with two layers we used **e(f,x)=f(f(f(x)))** with three layers, the resulting set of integers would have a base-three representation in the symbols **f**, **g**, **h**, etc.

Now we jump into some deep water. So far, we've witnessed the utility of high-order functions in their capacity to condense data and algorithmic complexity into a unified notation. The next thing we might wonder is just how extreme this kind of thinking can go. That is, can we write a set of functions that can accommodate *any* data structure, and represent *any* algorithm? How many functions would be needed to do this?

It was this kind of thinking that ultimately gave rise to Turing machines, lambda calculus, and the whole field of computation as we know it. Before all of that though, another universal model of computation was proposed, called the *S and K combinators*, also known as the *SK calculus*. An excellent video presentation on the history of the S and K combinators was given by Stephen Wolfram on Dec 7, 2020, the 100th anniversary of their introduction. The corresponding article *Combinators: A Centennial View* explores this topic using Wolfram Language constructions.

The S and K combinators, in their full majesty, occur as follows:

s[x][y][z] = x[z][y[z]]

k[x][y] = x

Looking at the **k** combinator first: **k** is a function that receives two arguments **x** and **y**, and all **k** does is "forget" about **y**, returning only **x**. Note that unlike the ordinary **f(x,y)** notation, multiple arguments are not enclosed in one set of brackets. The quantity **k[x][y] = x** is not equivalent to **k[[x][y]] = x**.

The symbols **x**, **y**, **z** can stand for *anything*: numbers, strings, trees, functions, objects, entire programs - really anything. The square brackets aren't like ordinary parentheses in the way they simplify. Never assume that something like **[[x]]** is equivalent to **[x]** in the same way that **((7))** simplifies to **(7)**.

The **s** combinator takes three arguments, and returns an awkward combination **x[z][y[z]]**.
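In Python, the two combinators can be sketched as curried lambdas, where each bracketed argument becomes one call (an illustration, not part of the original text; the probe function is hypothetical):

```python
s = lambda x: lambda y: lambda z: x(z)(y(z))
k = lambda x: lambda y: x

a, b = object(), object()

# k[x][y] = x: the second argument is forgotten
assert k(a)(b) is a

# s[x][y][z] = x[z][y[z]], checked with a probe that pairs its arguments
pair = lambda u: lambda v: (u, v)   # hypothetical probe function
assert s(pair)(k(b))(a) == (a, b)   # pair[a][ k[b][a] ] = (a, b)
print("s and k behave as advertised")
```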

...And that's it. Amazingly, this setup is all we need to compute *anything* imaginable!... Well, this is technically true, but keep in mind we're going down this road for conceptual purposes. Only a real madlad would actually use this stuff to build something.

Some introductions to SK calculus actually call it *SKI* calculus, where **I** stands for *Identity*. Indeed, there is a third combinator called the "identity", taking its name from linear algebra. The Identity is the combinator that simply returns its argument in the same way it was received. It's the abstract version of "multiplying by one" to not change a number.

The Identity combinator is in fact not fundamental, but can be expressed in terms of **S** and **K** combinators. (This had better be the case!) It turns out that the **I** combinator, or *identity combinator*, is written:

i = s[k][k]

Testing this out on the enclosed object **[a]**, we see the **I** combinator does its job:

i[a] = s[k][k][a]

i[a] = k[a][k[a]]

i[a] = a
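A quick Python check (curried lambdas as before, an illustration only) confirms the reduction:

```python
s = lambda x: lambda y: lambda z: x(z)(y(z))
k = lambda x: lambda y: x

i = s(k)(k)        # the identity, built from s and k

a = object()
assert i(a) is a   # i[a] = s[k][k][a] = k[a][k[a]] = a
print("identity holds")
```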

We shall explore the utility of S and K combinators by retracing an example detailed by Wolfram (link above). Suppose we are handed the abstract object **[a][b][c]**. Without specifying anything else about this object, S and K combinators allow us to build any other object made from **a**, **b**, and **c**. To be definite, suppose we start with **[a][b][c]** and need to end up with **a[b[a][c]]**.

To achieve this, take a look at the rather ugly operator:

o = s [s[k[s]][s[k[k]][s[k[s]][k]]]] [s[k[s[s[k][k]]]][k]]

... which has the qualitative format **s[x][y]**. Since there is no third argument, the operator **o** will remain static until something is appended to the right.

The plan now is to append **[a][b][c]** to the right of **o**, and simplify according to the S and K rules. Letting a machine do the actual work, the computation goes as follows:
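Since the step-by-step listing is not reproduced here, a small Python reducer (a sketch: terms as nested tuples, reduced leftmost-outermost) can confirm the end result:

```python
S, K = "s", "k"

def app(f, *args):
    # left-associative application: app(S, K, S) means s[k][s]
    for a in args:
        f = (f, a)
    return f

def nf(t):
    """Normal form via k[x][y] -> x and s[x][y][z] -> x[z][y[z]]."""
    spine = []
    while isinstance(t, tuple):          # unwind the application spine
        spine.append(t[1])
        t = t[0]
    if t == K and len(spine) >= 2:       # k[x][y] -> x
        return nf(app(spine[-1], *reversed(spine[:-2])))
    if t == S and len(spine) >= 3:       # s[x][y][z] -> x[z][y[z]]
        x, y, z = spine[-1], spine[-2], spine[-3]
        return nf(app((x, z), (y, z), *reversed(spine[:-3])))
    return app(t, *(nf(a) for a in reversed(spine)))

# o = s[s[k[s]][s[k[k]][s[k[s]][k]]]][s[k[s[s[k][k]]]][k]]
o = app(S,
        app(S, app(K, S), app(S, app(K, K), app(S, app(K, S), K))),
        app(S, app(K, app(S, app(S, K, K))), K))

result = nf(app(o, "a", "b", "c"))
assert result == ("a", (("b", "a"), "c"))   # a[b[a][c]]
print("o[a][b][c] reduces to a[b[a][c]]")
```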

In case the above is hard to follow, Wolfram offers this visual representation of each step:

The computation began with the string **[a][b][c]**, and then the operator **o** was applied to the left. After applying the S and K combinator rules recursively, the string boils all the way down to **a[b[a][c]]**. As a side comment, it seems that the meaning of the word "combinator" can be lifted from this example: the S and K combinators generate different "combinations" of the input data. With the right operations, the S and K combinators can in fact represent any data structure or any algorithm.

Let us conceive of some way to represent the integers using S and K combinators.

Consider the object **s[k][s][k]**, which readily reduces to **k[k][s[k]]**, and then to simply **k**. For reasons that will become clear(ish) momentarily, let us interpret the number **zero** as the concatenation of the "zero-prefix" **s[k]** and the suffix **[s][k]**.

0 = s[k][s][k] = k

For the number **one**, let us anticipate **1=s[k]** as a final form. To build this form, we begin with:

1 = p[s[k]][s][k]

The above shows a new operator **p** acting on the so-called zero-prefix, and then the whole quantity is suffixed by **[s][k]**. The obvious question, of course, is what **p** should look like. This takes some time to figure out, but with enough persistence, one finds:

p = s[s[k[s]][k]]

The operator **p=s[s[k[s]][k]]** will be recycled for each new integer we make. In other words, **p** is analogous to the **+1** operation in a traditional paradigm.

Starting from **1=s[k]**, we build **2=s[s[k]]** by applying the increment operator **s[s[k[s]][k]]** to the one-prefix **p[s[k]]**, and then appending **[s][k]** to the right:

2 = p[p[s[k]]][s][k]

This calculation takes twelve steps to settle down, but it works:

Below, a space is inserted before each **[s][k]** as a visual aid to separate the number-prefix from the (universal) suffix.

1 = p[s[k]] [s][k]

2 = p[p[s[k]]] [s][k]

3 = p[p[p[s[k]]]] [s][k]

4 = p[p[p[p[s[k]]]]] [s][k]

5 = p[p[p[p[p[s[k]]]]]] [s][k]

...

where p = s[s[k[s]][k]]

Once simplified, the above reduces down to:

0 = k

1 = s[k]

2 = s[s[k]]

3 = s[s[s[k]]]

4 = s[s[s[s[k]]]]

5 = s[s[s[s[s[k]]]]]

...
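These constructions can be verified mechanically with a small toy reducer in Python (a sketch: terms as nested tuples, reduced leftmost-outermost):

```python
S, K = "s", "k"

def app(f, *args):
    for a in args:                        # left-associative application
        f = (f, a)
    return f

def nf(t):
    # normal form via k[x][y] -> x and s[x][y][z] -> x[z][y[z]]
    spine = []
    while isinstance(t, tuple):
        spine.append(t[1])
        t = t[0]
    if t == K and len(spine) >= 2:
        return nf(app(spine[-1], *reversed(spine[:-2])))
    if t == S and len(spine) >= 3:
        x, y, z = spine[-1], spine[-2], spine[-3]
        return nf(app((x, z), (y, z), *reversed(spine[:-3])))
    return app(t, *(nf(a) for a in reversed(spine)))

p = app(S, app(S, app(K, S), K))          # increment operator s[s[k[s]][k]]

def numeral(n):
    t = K                                 # 0 = k; n = s[s[...s[k]...]] (n layers)
    for _ in range(n):
        t = (S, t)
    return t

prefix = app(S, K)                        # zero-prefix s[k]
assert nf(app(prefix, S, K)) == numeral(0)
for n in range(1, 6):
    prefix = app(p, prefix)               # 1 = p[s[k]], 2 = p[p[s[k]]], ...
    assert nf(app(prefix, S, K)) == numeral(n)
print("numerals 0..5 check out")
```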

The sum of two integers is calculated by handing both number-prefixes to an addition operator:

a + b = (add. op.)[a prefix][b prefix][s][k]

The addition operator would be very hard to guess on the first try, so here it is:

(add. op.) = s[k[s]][s[k[s[k[s]]]][s[k[k]]]]

As a test case, take **1+2**, which ought to simplify to **3=s[s[s[k]]]**. Plugging everything into the formula above, we have, after 30 steps:

The result is **s[s[s[k]]]** as expected. It follows that any two integers can be added this way, but the number of steps grows in turn. To calculate **3+4=7** requires 50 steps, and to calculate **30+40=70** requires 365 steps.
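The **1+2** case can be checked mechanically with a small toy reducer in Python (a self-contained sketch):

```python
S, K = "s", "k"

def app(f, *args):
    for a in args:                        # left-associative application
        f = (f, a)
    return f

def nf(t):
    # normal form via k[x][y] -> x and s[x][y][z] -> x[z][y[z]]
    spine = []
    while isinstance(t, tuple):
        spine.append(t[1])
        t = t[0]
    if t == K and len(spine) >= 2:
        return nf(app(spine[-1], *reversed(spine[:-2])))
    if t == S and len(spine) >= 3:
        x, y, z = spine[-1], spine[-2], spine[-3]
        return nf(app((x, z), (y, z), *reversed(spine[:-3])))
    return app(t, *(nf(a) for a in reversed(spine)))

p = app(S, app(S, app(K, S), K))              # increment operator

# (add. op.) = s[k[s]][s[k[s[k[s]]]][s[k[k]]]]
add = app(S, app(K, S),
          app(S, app(K, app(S, app(K, S))), app(S, app(K, K))))

one_prefix = app(p, app(S, K))                # p[s[k]]
two_prefix = app(p, one_prefix)               # p[p[s[k]]]

three = nf(app(add, one_prefix, two_prefix, S, K))
assert three == app(S, app(S, app(S, K)))     # 3 = s[s[s[k]]]
print("1 + 2 = s[s[s[k]]]")
```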

The product of two integers can be calculated using the same setup as the sum calculation:

a * b = (prod. op.)[a prefix][b prefix][s][k]

The product operator is much simpler than the addition operator, though still not trivial, so here you go:

(prod. op.) = s[k[s]][k]

As a test case, take **2*3**, which ought to simplify to **6=s[s[s[s[s[s[k]]]]]]**. Plugging everything into the formula above, we have, after 50 steps:

S and K combinators can deal with exponents in a similar spirit:

a ^ b = (exp. op.)[a prefix][b prefix][s][k]

The exponent operator is:

(exp. op.) = s[k[s[s[k][k]]]][k]

As a test case, take **3^2**, which ought to simplify to **9=s[s[s[s[s[s[s[s[s[k]]]]]]]]]**. Plugging everything into the formula above, we have, after 116 steps:
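Both the product and exponent operators can be checked with a small toy reducer in Python (a self-contained sketch):

```python
S, K = "s", "k"

def app(f, *args):
    for a in args:                        # left-associative application
        f = (f, a)
    return f

def nf(t):
    # normal form via k[x][y] -> x and s[x][y][z] -> x[z][y[z]]
    spine = []
    while isinstance(t, tuple):
        spine.append(t[1])
        t = t[0]
    if t == K and len(spine) >= 2:
        return nf(app(spine[-1], *reversed(spine[:-2])))
    if t == S and len(spine) >= 3:
        x, y, z = spine[-1], spine[-2], spine[-3]
        return nf(app((x, z), (y, z), *reversed(spine[:-3])))
    return app(t, *(nf(a) for a in reversed(spine)))

p = app(S, app(S, app(K, S), K))                    # increment operator

def prefix(n):
    t = app(S, K)                                   # zero-prefix s[k]
    for _ in range(n):
        t = app(p, t)                               # n applications of p
    return t

def numeral(n):
    t = K
    for _ in range(n):
        t = (S, t)                                  # n = s[s[...s[k]...]]
    return t

prod = app(S, app(K, S), K)                         # (prod. op.) = s[k[s]][k]
expo = app(S, app(K, app(S, app(S, K, K))), K)      # (exp. op.)

assert nf(app(prod, prefix(2), prefix(3), S, K)) == numeral(6)   # 2*3 = 6
assert nf(app(expo, prefix(3), prefix(2), S, K)) == numeral(9)   # 3^2 = 9
print("2*3 = 6 and 3^2 = 9 as s-towers")
```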