Friday, April 27, 2018

Symbolic Computation

So what is symbolic computation? To better understand what it is, we need to understand the two terms that make up the name: symbolic and computation.

The use of the word symbolic is ancient -- from the Latin symbolicus, Greek συμβολικός meaning "of, or belonging to a symbol"; and thence to symbol (Ancient Greek συμβάλλω), meaning "a sign by which one infers something". And, most interestingly, its meaning has not changed! To quote from Wiktionary:

Any object, typically material, which is meant to represent another (usually abstract) even if there is no meaningful relationship.
     The dollar symbol has no relationship to the concept of currency or any related idea.
In the same way, the symbol π (pi) denotes a particular real number, but has no concrete relation to that number, other than through convention. It is important to note that "2" and "53" are likewise symbols, as are "+", "*" and so on. They do not have intrinsic meaning, even though it is harder for people trained in mathematics (and computer science) to remember so.

Computation is actually significantly harder to define. For our purposes, it suffices to say that it is something implementable (as a procedure) on a computer. (Procedure is used here even for what functional programmers would call 'pure functions', to separate them from 'functions' which, for the purposes of this blog post, are reserved for 'mathematical functions')

So symbolic computation is computation in which symbols feature prominently.

On one level, as most everything done on a computer involves at least one level of encoding, one could claim that all computation is symbolic. While strictly true, this isn't necessarily a very useful way to look at things. What we want to focus on here are those computations that involve symbols in a crucial way.

So an example is in order. Assuming that x is a symbol denoting some real number, we know that sin(x)^2+cos(x)^2=1. This equation is true for all values of x. Note that there are way too many real numbers -- the vast majority of them cannot be written down. And yet this equation holds. It is a prime example of a symbolic equation. So what do we really mean by it? We mean that "given any actual real number r, if we substitute r for x in sin(x)^2+cos(x)^2=1, then evaluate the resulting expression, the result will be a tautology". Evaluation is often pictured as a form of computation. However, as used above, it is different, as that computation cannot actually be carried out on a computer, as arbitrary real numbers are non-representable. But it is perfectly fine as a evaluation that happens in the field of real numbers. So we have evaluation, which happens in the space of denotations of our expression (here, the field of reals), and computation, which happens over (symbolic!) expressions.

In a sense, symbolic computation is symbol shuffling. If that's really all it was, it wouldn't be very useful. Most symbolic computation, however, obeys a very strict set of rules. Those rules are usually rather far from arbitrary. But if that is so, where do those rules come from? Denotational semantics! In other words, our symbols have meaning, and the objects in the space of denotations obey some rules. And, somewhat miraculously, we can write down many of these rules. We give them names and render them in phrases such as "addition is commutative" and "multiplication is associative".

So symbolic computation is really symbol shuffling with a purpose, and that purpose is to respect the rules that hold in the space of meanings we intend. When you attempt to compute a closed-form for the derivative of an expression e, what you are really asking is for the meaning of the resulting expression to denote a function, which is related to the input expression e by differentiation. The derivative of e is usually computed by various elaborate symbol shuffling operations, while derivatives are defined via limits. What is desired is that the symbolic manipulation preserves the meaning of 'derivative'.

One of the biggest difficulty with properly understanding symbolic computation stems from several issues:

  1. All human mathematics is written down, and so we cannot easily see the difference between syntax (symbols) and semantics (their denotation).
  2. We never write down 'denotation brackets' in our equations.
  3. We treat functions and the name of the function as being largely equivalent.
  4. We silently accept unevaluated functions as symbols, but are happy to evaluate them when we can.
That last point needs elaborated on: no one will blink if sin(0) is reduced, without mention, to 0, but sin(1/3) can stay 'as is' without difficulty. But this is incoherent: either sin is a function and it should reduce in all cases, or it is a symbol, in which case doing 'reduction' is an active process that should be clearly indicated.

In any case, the point is that symbolic computation is computing with symbols, which come from a specific language, and those computations are supposed to model a particular meaning. In other words, a particular computation on symbols is supposed to reflect the evaluation of a function on the space of denotations of that language.

One last remark on why this is rather non-trivial:

  1. Some spaces of denotations (like all functions from the reals to the reals) are unbelievably larger that the corresponding space of terms.
  2. Some spaces of denotations (like all boolean-valued functions on 2 boolean variables, i.e. of size 4) are quite a bit smaller than the corresponding space of terms (countably infinite if given 'and', 'or', 'not' and 2 variable names)
  3. Some questions, like "factor this polynomial", have answers which involve concepts outside the natural domain of denotation: a polynomial has a bag of factors, which requires us to enlarge the space of denotations quite a bit to answer them. This is because, in a sense, factoring is a naturally syntactic operation, whose meaning is destroyed by semantic operations. (x-1)(x+1) = (x^2-1) is a tautology if our semantics for polynomials is pair (d, v) of a degree and a vector v (of length d) of coefficients with the caveat that the first be non-zero.
Thus the correspondence between syntax and semantics is necessarily quite complex. This seems to be largely under-appreciated.