November 6, 2013

An imaginary tale (Mathematical perversions, part i)

The imaginary unit is something that always perplexed me through high school. Not because of the concept of a "number" that yields a negative number when squared, but rather because the definition seemed ambiguous. If i is defined as the number x that satisfies x² = -1, that seems to imply i = -i ⇒ 2i = 0 ⇒ i = 0, since -i also satisfies the equation. It seemed kind of like defining the identity map on the real numbers as the unique differentiable ℝ→ℝ function f for which f' : x ↦ 1. How can it be the definition when it doesn't uniquely define the object?

It's a fair question for someone who is just learning about complex numbers, and probably something a lot of people struggle with at that level. I guess you could invalidate it by saying that the term 'definition' is really just an informal notion describing a category of axioms and there's no such fundamental limitation to what an axiom can and can't say, but when introducing this new and alien concept it's still reasonable to expect a more precise specification of what the imaginary unit is. It seems, however, that when confronted with this question the default response is to avoid giving a real answer and instead say that it's not important because i and -i are essentially the same thing - as in, you could replace i by -i everywhere from the ground up and still get the same results - which is certainly an interesting property, but not a satisfactory answer to the question.

A less obvious but perhaps more relevant question would be: what is -i? What is 2i, or 1 + i, or (-i)²? We have an equation defining i, but we haven't defined any other complex number, or any operation on i other than squaring. Can we really say that -i exists before we define it? Come to think of it, have we even really defined i, or did we just define squaring on it? Now we're getting to the root of the issue: we're trying to define both i and how an operation acts on i at the same time, but to define how an operation acts on an element, that element has to exist first!

There are two basic ways to go about defining numbers: as sets or as so-called "urelements" (elements that are not sets). I'm mentioning urelements to clarify that my criticism of the conventional definition of i and complex numbers in general is not just a petty complaint based on the fact that it doesn't assign i to any particular set. See, pretty much all of modern mathematics is built on the foundations of ZFC, the Zermelo-Fraenkel set theory with the axiom of choice, wherein all objects (numbers, tuples, functions, relations...) are sets. But the properties of objects such as numbers when considered as sets (i.e. what elements they contain) are not important, and the fact that they are sets is mostly ignored because it's irrelevant to their usage. Sets are usually thought of as just another type of object (as in, there's a fundamental difference between {0} and 1 because one is a set and the other's a number).

The set-theoretic model of numbers is implemented by first defining natural numbers, then integers, then rational numbers, then real numbers and then finally complex numbers. There are many ways to do it; I'll describe the most common here. 0 is defined as the empty set Ø, and then we introduce a successor function and define each natural number other than 0 as the successor of the number before it.

(There were supposed to be some fancy LaTeX-style formulas here via MathJax, but apparently Google hates its customers and does its best to prevent every attempt to improve their services, so you can thank them for disabling that and making everything look like shit.)

0 := Ø
S: n ↦ n ∪ {n}

This yields 1 := S(0) = {Ø}, 2 := S(1) = {Ø, {Ø}}, 3 := S(2) = {Ø, {Ø}, {Ø, {Ø}}}, and so on (in general n = {0, 1, 2, ..., (n-1)}). Then we define addition on the natural numbers by

m + 0 := m
m + S(n) := S(m) + n

which yields 7 + 2 = 7 + S(1) = S(7) + 1 = 8 + 1 = 8 + S(0) = S(8) + 0 = 9 + 0 = 9. Then, calling the set of natural numbers ℕ and the set of 2-tuples of natural numbers ℕ², we assign to each natural number m the set {z ∈ ℕ² | ∃n ∈ ℕ: z = (m + n, n)}, and call this new set the integer m. But wait, is the integer m not the same as the natural number m? Is the set of natural numbers not a subset of the integers? These are questions that might seem natural to ask, but they're not important; they're just linguistic detail. You could say that the set of elements we just defined is a different model of the natural numbers that are also integers, or you could say that what we called natural numbers above are just a bunch of sets we used to define integers (and later we replace the integers in the same way by rational numbers). As I said, it's not relevant to their usage.
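The successor construction and the recursive addition above can be tried out directly. This is only a sketch: it uses Python's frozenset as a stand-in for sets, and the names ZERO, succ, nat, pred and add are my own, not anything standard.

```python
# Sketch of the natural numbers as nested sets, using frozenset as "set".
# ZERO, succ, nat, pred and add are illustrative names, not standard ones.

ZERO = frozenset()  # 0 := Ø


def succ(n):
    """The successor function S: n ↦ n ∪ {n}."""
    return n | frozenset([n])


def nat(k):
    """Build the set that encodes the ordinary Python integer k >= 0."""
    n = ZERO
    for _ in range(k):
        n = succ(n)
    return n


def pred(n):
    """Predecessor of a nonzero natural: its element with the most elements."""
    return max(n, key=len)


def add(m, n):
    """m + 0 := m and m + S(n) := S(m) + n, unrolled into a loop."""
    while n != ZERO:
        m, n = succ(m), pred(n)
    return m


# In general n = {0, 1, ..., n-1}
assert nat(3) == frozenset([nat(0), nat(1), nat(2)])
# The worked example from the text: 7 + 2 = 9
assert add(nat(7), nat(2)) == nat(9)
```

Note that add never touches Python's own + operator; it only moves successors from one argument to the other, which is exactly what the chain 7 + 2 = 8 + 1 = 9 + 0 does.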

For each natural number m we also define the integer -m as {z ∈ ℕ² | ∃n ∈ ℕ: z = (n, m + n)}. For example, 3 = {(3,0), (4,1), (5,2), ...} and -4 = {(0,4), (1,5), (2,6), ...}, where the numbers within the brackets are the original natural numbers. For addition on integers we have to do a case-by-case definition; I'll demonstrate the case for two positive integers here: m+n is the integer that contains the element (m+n,0), where the plus sign in the parentheses denotes the original natural number addition. Subtraction is defined by m-n = m+(-n). In the same fashion as integers, rational numbers are defined such that, for example, the rational number 5 (or 5/1) is equal to {(5,1), (10,2), (15,3), ...}, and 7/2 = {(7,2), (14,4), (21,6), ...}, where the numbers within the brackets are the elements we introduced as integers above. Real numbers are defined as proper nonempty subsets of the set ℚ of rational numbers that are closed downward and contain no greatest element. For example, the real number 7/2 is the set of rational numbers q that are strictly less than the rational number 7/2 (I won't go into detail on how the relation < is constructed, but for example for natural numbers, 3 < 5 because there exists a non-zero natural number n such that 3 + n = 5), and π is, in an intuitive sense, the set of rational numbers that are less than π (the precise definition is more complicated).
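Here is a rough sketch of these three constructions. Several things in it are my assumptions rather than part of the definitions above: the infinite sets can't be stored, so each integer or rational is handled through a single representative pair plus a test for "same set"; integer addition is done componentwise on representatives (a common alternative to the case-by-case definition); ordinary Python ints and Fractions stand in for the naturals and rationals; and the √2 example is an addition of mine.

```python
from fractions import Fraction

# Integers: the pair (a, b) stands for the whole set containing it,
# i.e. for the integer usually written a - b.


def same_integer(p, q):
    """(a, b) and (c, d) belong to the same integer iff a + d == b + c."""
    (a, b), (c, d) = p, q
    return a + d == b + c


def int_add(p, q):
    """Componentwise addition on representatives (an assumed shortcut)."""
    (a, b), (c, d) = p, q
    return (a + c, b + d)


def int_neg(p):
    """-(a, b) is (b, a)."""
    a, b = p
    return (b, a)


def same_rational(p, q):
    """(a, b) and (c, d) with b, d != 0 belong to the same rational iff ad == bc."""
    (a, b), (c, d) = p, q
    return a * d == b * c


def cut_of(r):
    """The real number r, for rational r, as a membership test for {q ∈ ℚ : q < r}."""
    return lambda q: q < r


assert same_integer((3, 0), (5, 2))                       # both lie in the integer 3
assert same_integer(int_add((3, 0), (0, 4)), (0, 1))      # 3 + (-4) = -1
assert same_integer(int_add((7, 0), int_neg((2, 0))), (5, 0))  # 7 - 2 = 7 + (-2)
assert same_rational((7, 2), (21, 6))                     # both lie in 7/2

seven_halves = cut_of(Fraction(7, 2))
assert seven_halves(Fraction(3))          # 3 < 7/2, so 3 is in the set
assert not seven_halves(Fraction(7, 2))   # 7/2 itself is not: no greatest element


def sqrt_two(q):
    """An irrational cut (my example): the rationals q with q < 0 or q² < 2."""
    return q < 0 or q * q < 2


assert sqrt_two(Fraction(7, 5)) and not sqrt_two(Fraction(3, 2))
```

The point of the last example is that the set is specified without ever mentioning √2 itself, which is what lets the construction produce numbers that weren't there before.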

Finally, complex numbers are defined as 2-tuples of real numbers. For example, i := (0,1), and (5,-2) is the element we usually think of as 5 - 2i. The representation of the complex number (5,-2) as 5 - 2i could in fact be considered as a less simplified expression than (5,-2), just like 1 + 2 is less simplified than 3, as what it actually means is "the element which is returned when the operation - acts on the elements 5 and 2i", where by 5 we actually mean the complex number (5,0) and by 2i we mean the element which is returned when the operation × acts on 2 and i, where 2 is the complex number (2,0), and the result of all these operations is (5,-2). The same ambiguity is present when we write rational numbers as, for example, 7/2 - the / sign usually denotes an operation, but we don't think writing the number like that is like representing 3 as 1 + 2, because we don't have a more compact way of writing it. Anyway, we define addition, multiplication and so on on the tuples of real numbers, and then we define exponentiation by integers as repeated multiplication, which implies i² = -1.
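To make the tuple definition concrete, here is a small sketch with the usual componentwise rules. Python ints stand in for real numbers, the function names are mine, and c_pow only handles non-negative integer exponents.

```python
# Complex numbers as 2-tuples (a, b), with Python ints standing in for reals.
# c_add, c_sub, c_mul and c_pow are illustrative names.


def c_add(z, w):
    return (z[0] + w[0], z[1] + w[1])


def c_sub(z, w):
    return (z[0] - w[0], z[1] - w[1])


def c_mul(z, w):
    """(a, b) × (c, d) := (ac - bd, ad + bc)"""
    (a, b), (c, d) = z, w
    return (a * c - b * d, a * d + b * c)


def c_pow(z, n):
    """Exponentiation by a non-negative integer as repeated multiplication."""
    result = (1, 0)
    for _ in range(n):
        result = c_mul(result, z)
    return result


I = (0, 1)  # i := (0,1)

# i² = -1 is now something we can check, not something we have to postulate
assert c_pow(I, 2) == (-1, 0)

# "5 - 2i" unpacked: the operation - acting on (5,0) and on (2,0) × i
two_i = c_mul((2, 0), I)
assert two_i == (0, 2)
assert c_sub((5, 0), two_i) == (5, -2)
```

Here i² = -1 comes out as a consequence of the definitions of i and of multiplication, rather than serving as the definition of i.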

The definition based on urelements requires an axiom to assert the existence of each new type of number you introduce. For example, 0 exists and is a natural number, and for each natural number its successor exists, and for each natural number n, -n exists, and for each pair of integers m, n such that n ≠ 0, m/n exists and so on, and i exists, and for each pair of real numbers a and b, bi and a + bi exist. Obviously you need to be more precise than that, but that's the gist of it.

In order to truly grasp the problem with the conventional definition of i, you must also understand what a function (or operation, which is the same thing) really is. According to the simplest set-theoretic model, an element f is a function if all its elements are 2-tuples and for all elements x such that there exists some y such that (x,y) ∈ f, there exists no z ≠ y such that (x,z) ∈ f (by this definition, y is the element we denote as f(x)). If we don't want functions to be sets, there's the less formal, more intuitive model that says a function consists of a set (the domain of definition) and a rule that maps each element x of the set to another element y.
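The set-theoretic model of a function is simple enough to check mechanically. The following is a sketch for finite sets only, with is_function and apply as made-up names:

```python
# A function as a plain set of 2-tuples (finite sets only).
# is_function and apply are illustrative names.


def is_function(f):
    """True if every element of f is a 2-tuple and no x is paired with two different y."""
    seen = {}
    for pair in f:
        if not (isinstance(pair, tuple) and len(pair) == 2):
            return False
        x, y = pair
        if x in seen and seen[x] != y:
            return False
        seen[x] = y
    return True


def apply(f, x):
    """Return the y with (x, y) ∈ f, i.e. the element we denote f(x)."""
    for a, b in f:
        if a == x:
            return b
    raise KeyError(x)


double = {(0, 0), (1, 2), (2, 4)}
assert is_function(double)
assert apply(double, 2) == 4

# Not a function: 0 is paired with both 1 and 2
assert not is_function({(0, 1), (0, 2)})
```

Notice that the domain is not given separately in this model; it is just the set of first components, so a function cannot "act on" an element that isn't already there.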

The problem now becomes apparent. Going by the set-theoretic definitions, the set (0,1) already exists before we define i, so that's not the issue. But we already have a logical, natural definition of i and the complex numbers as tuples of real numbers, and even if we for some reason want to clumsily circumvent that, we have to define the function of exponentiation that we refer to in the definition i^2 = -1. It's not clear exactly which function the ^ sign refers to, but let's say the exponent is fixed to 2 and the operation is just squaring (denoted by ^2, not just ^). It doesn't matter, because (a,b)^2 refers to the element (a² - b², 2ab) in either case. Then we define i as the unique element (a,b) ∈ ℝ² (where ℝ is the set of real numbers) such that (a,b)^2 = (-1,0). But such an element does not exist, as both (0,1) and (0,-1) have this property. Should we define i as an element (a,b) such that (a,b)^2 = (-1,0) and never specify which one we're referring to? That's absurd, and there's no reason to do so. It seems that to get a proper definition of i out of the equation i^2 = -1 we have to say that ^2 refers to the operation {((0,1),(-1,0))}, i.e. the operation defined on the set {(0,1)} that acts on the single element (0,1) and maps it to (-1,0). That's also absurd, and incredibly contrived. It's like defining 1 not as {0} but as the element f(0) where f is the function {(0,{0})} that maps the single element 0 to {0}. It accomplishes the same thing, but artificially inflates the amount of description needed for the definition for no reason. At that point, your only option is to give up and define i the proper way.
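The failure of uniqueness is easy to see by brute force. This sketch only searches a small grid of integer pairs, which is my simplification; over all of ℝ² the same two solutions fall out of solving a² - b² = -1 and 2ab = 0 by hand.

```python
# Which pairs (a, b) satisfy (a,b)^2 = (-1,0)?
# Only a small integer grid is searched; that restriction is an assumption
# made to keep the example finite.


def square(z):
    """(a, b)^2 := (a² - b², 2ab)"""
    a, b = z
    return (a * a - b * b, 2 * a * b)


candidates = [(a, b) for a in range(-2, 3) for b in range(-2, 3)]
roots = [z for z in candidates if square(z) == (-1, 0)]

# Two solutions, so "the unique element such that..." picks out nothing
assert roots == [(0, -1), (0, 1)]

# The contrived fix: shrink ^2 to an operation defined on (0,1) alone
tiny_square = {((0, 1), (-1, 0))}
solutions = [x for (x, y) in tiny_square if y == (-1, 0)]
assert solutions == [(0, 1)]
```

The second half works, but only because the answer (0,1) was written into the operation beforehand, which is the sense in which it defines nothing.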

With the urelement approach, you can't define i by i^2 = -1 as the function that the ^ sign refers to must exist to make that definition, and for that function to exist, the set that is to be the domain of definition must exist, and for that set to contain i, i must exist (and then there's also the issue we ran into above). Recall that by this approach, i doesn't exist until you postulate that it does. In other words, the definition of i is given by the axiom "there exists an element i that is an urelement but doesn't belong to any category of urelements that we have previously introduced".

Now, you might say this is all highly technical and it's a waste of time to explain this to high school students because there are more practical matters to attend to, but that's a terrible way of viewing education. Teaching people the whats but not the whys is probably a good way to create office drones and construction workers, but if you look at the big picture you'll just end up with a population of mindless, soulless beings that know how to obey, that cling to outdated morals and worldviews without knowing why, that lack creativity and critical thinking. I think the most important role of education is not to be a factory that creates products that can perform simple tasks, not to teach people what to think, but how to think. To create critical, analytical, skeptical minds that can carry our civilization forwards scientifically and culturally. Avoiding discussion on the definition of i does nothing to that end; rather, it teaches people who question it or have a vague feeling that the definition is insufficient that it's best to just accept how things are, and it does nothing to further anyone's understanding of mathematics and directs all focus toward mechanical arithmetic skills.

Looking around the world, my impression is that no matter where you go, no matter how high a country scores in international tests and studies on mathematical skills, people don't actually understand anything of substance about math. Explaining the problem with the conventional definition of i can be done in one or a few sentences, and giving a pedagogic demonstration of how to define numbers in terms of sets can be done in an hour or two. But we neglect to do this, and people's understanding of math remains what it has always been: