Polynomials

Philip J. Erdelsky

March 16, 2009

Please e-mail comments, corrections and additions to the webmaster at pje@efgh.com.

1. Introduction

Polynomials are used in many branches of pure and applied mathematics.

A polynomial in one variable over a field F is a linear combination of a finite number of nonnegative integral powers of an independent variable, with coefficients taken from F. The terms are often written in order of ascending or descending powers:

p(x) = p_n xⁿ + p_n-1 x^n-1 + ... + p₁ x + p₀.

The order in which the terms are written is irrelevant, because addition is presumed to be associative and commutative. If a coefficient is zero, the corresponding term may be omitted. The last term, which is called the constant term, is actually p₀ x⁰, but x⁰ is omitted because its value is always 1. If the coefficient of any other term is 1, the coefficient may be omitted.

The polynomial with all coefficients equal to zero is called the zero polynomial and is exceptional in many ways. A polynomial with at least one one nonzero coefficient is called a nonzero polynomial.

The polynomial p(x) over the field F is also a function p(x):F⟶F, and if a is an element of F,

p(a) = p_n aⁿ + p_n-1 a^n-1 + ... + p₁ a + p₀.

Two polynomials are considered equal if they have equal coefficients of corresponding powers of the independent variable, after like terms are combined. If two polynomials are equal in this sense, then they are equal as functions; i.e., they give equal results for equal values of the independent variable. The converse is false for some fields. For example, x and x² do not have equal coefficients, but they are equal for both values in the tiny field Z₂. It is true for infinite fields, such as the field of real numbers, but this is a theorem that requires proof.

The degree of a nonzero polynomial is the highest exponent with a nonzero coefficient (after like terms are combined). For example, x - 7, x³ + x + 1 and 3x¹⁰ are polynomials of the first, third and tenth degrees, respectively. A nonzero polynomial containing only a constant term has degree zero. There are special names for polynomials of low degree. Here are some of them:

degree name

0 constant

1 linear

2 quadratic

3 cubic

4 quartic

5 quintic

degree	name
0	constant
1	linear
2	quadratic
3	cubic
4	quartic
5	quintic

The zero polynomial has no degree.

The terms "constant" and "linear" are actually a little more general than this table indicates. The term "constant" includes the zero polynomial, and the term "linear" includes the zero polynomial and constant polynomials. Hence a nonconstant polynomial is a nonzero polynomial of degree at least 1, and a nonlinear polynomial is a nonzero polynomial of degree at least 2.

The set of all polynomials over a field F with independent variable x is usually written as F[x].

Addition and multiplication of polynomials are defined to be the usual elementary algebraic operations. For example, two polynomials are added by adding the coefficients of corresponding powers of the independent variable, and two polynomials are multiplied by multiplying all pairs of terms and combining like terms in the product. When addition and multiplication are defined in this way, they produce valid results for all values of the independent variable; i.e. (p+q)(x) = p(x) + q(x) and (pq)(x) = p(x)q(x) for any polynomials over the same field and any value of x in the field.

Multiplication of a polynomial by a constant value from the field can be defined most readily as multiplication of the polynomial by a constant polynomial.

With these definitions, we can show that the polynomials over a field F constitute a vector space over F. It has no dimension, but the set of all polynomials of degree n or less, together with the zero polynomial, constitute a subspace of dimension n+1. The polynomials 1, x, x², ..., xⁿ are a basis, which is often called the canonical basis.

The following properties of polynomial addition and multiplication can be derived directly from the operations on coefficients. Here p(x), q(x) and r(x) are polynomials over the same field, and c is an element of the field.

(p(x)q(x))r(x) = p(x)(q(x)r(x)) (associative law)
(p(x) + q(x)) + r(x) = p(x) + (q(x) + r(x)) (associative law)
p(x)q(x) = q(x)p(x) (commutative law)
p(x) + q(x) = q(x) + p(x) (commutative law)
p(x)(q(x)+r(x)) = p(x)q(x) + p(x)r(x) (distributive law)
c(p(x)q(x)) = (cp(x))q(x)

In many cases, a nonzero polynomial can be multiplied by any nonzero constant and still serve the same purpose. It is customary to use the polynomial with the coefficient of its highest-degree term equal to 1. Such a polynomial is called a monic polynomial. The product of two monic polynomials is monic. If a monic polynomial can be factored into the product of two or more nonconstant polynomials, then there is a factorization in which all factors are monic.

2. Division

The product of two nonzero polynomials is obviously nonzero (and its degree is the sum of the degrees of the factors), a property that the polynomials have in common with the integers. In fact, the polynomials over a field have many properties that are analogous to those of the integers.

The division algorithm for polynomials is as follows.

Theorem 2.1 Let p(x) (the dividend) and d(x) (the divisor) be polynomials over the same field, where d(x) is nonconstant. Then there are unique polynomials q(x) (the quotient) and r(x) (the remainder) over the same field such that p(x) = q(x) d(x) + r(x) and r(x) is zero or of lower degree than d(x).

Proof. We use an induction argument that amounts to long division. If p(x) is zero or the degree of p(x) is less than that of d(x), then q(x) is zero and r(x) is p(x). In other cases, let m be the degree of d(x), let n be the degree of p(x), and write:

d(x) = d_m x^m + t(x),
p(x) = p_n xⁿ + s(x),

where t(x) and s(x) are polynomials of degrees less than m and n, respectively (or perhaps zero polynomials).

Then by inductive hypothesis,

s(x) - (p_n/d_m) x^n-m t(x) = u(x) d(x) + r(x),

where r(x) is zero or of degree less than m.

Then the required decomposition is

p(x) = ((p_n/d_m) x^n-m + u(x)) d(x) + r(x).

Now suppose we have two decompositions:

p(x) = q(x) d(x) + r(x),
p(x) = q'(x) d(x) + r'(x).

Subtract these two equations to obtain:

0 = (q(x) - q'(x)) d(x) + r(x) - r'(x),
(q(x) - q'(x)) d(x) = r'(x) - r(x).

If q(x) - q'(x) were nonzero, the degree of the left member would be greater than the degree of the right member. Hence q(x) = q'(x) and clearly r(x) = r'(x), so the decomposition is unique. █

The division algorithm could be extended to cases in which the divisor is a nonzero constant polynomial, but it would be rather trivial; the remainder would always be zero.

The polynomial p(x) is said to be divisible by the nonzero polynomial d(x), or d(x) is said to divide p(x), or d(x) is said to be a divisor of p(x), if p(x) = q(x)d(x) with remainder zero. Many of the properties of divisibility for integers also apply to polynomials. For example, if two or more polynomials are divisible by d(x), then their sum and product are also divisible by d(x).

The following two theorems, called the remainder theorem and the factor theorem, respectively, are immediate consequences of the division algorithm.

Theorem 2.2 If a polynomial p(x) is divided by the linear polynomial x-a, the remainder is p(a).

Proof. By the division algorithm, p(x) = q(x)(x-a) + r for some constant r. Substitution of a for x gives p(a) = r. █

Corollary 2.3 The polynomial p(x) is divisible by x-a if, and only if, p(a) = 0.

The greatest common divisor of two nonzero polynomials is a common divisor that is divisible by every other common divisor. It is obviously not unique in most cases, but there is a unique monic greatest common divisor, which is often written as gcd(p(x),q(x)).

The following theorem is the Euclidean algorithm for polynomials. It shows that two nonzero polynomials have a greatest common divisor, and it also exhibits a practical way to compute it.

Theorem 2.4 Let p(x) and q(x) be two nonzero polynomials over the same field. Then they have a monic greatest common divisor g(x), and there are polynomials s(x) and t(x) such that

s(x) p(x) + t(x) q(x) = g(x).

Proof. Clearly, the polynominals can always be scaled (multiplied by a nonzero constant) so g(x) is monic.

If one polynomial divides the other, the result is obvious. The divisor is also the greatest common divisor, and s(x) and t(x) are 0 and 1.

The rest of the proof proceeds by induction on the sum of the degrees of p(x) and q(x). If the sum is 0 or 1, one divides the other.

In other cases, arrange the polynomials so the degree of p(x) does not exceed the degree of q(x) and use the divison algorithm to decompose q(x) as follows:

q(x) = u(x) p(x) + r(x),

where r(x) is zero or of degree less than the degree of p(x).

Since p(x) does not divide q(x), r(x) must be nonzero.

Any common divisor of p(x) and q(x) is also a common divisor of p(x) and r(x), and vice-versa. Hence the greatest common divisor, if any, is the same in each case. By inductive hypothesis, p(x) and r(x) have a monic greatest common divisor g(x), which is also the monic greatest common divisor of p(x) and q(x), and there are polynomials v(x) and w(x) such that

g(x) = v(x) p(x) + w(x) r(x).

But r(x) = q(x) - u(x) p(x), so

g(x) = v(x) p(x) + w(x) (q(x) - u(x) p(x)) = (v(x) - u(x)) p(x) + w(x) q(x),

which is the desired result. █

Two polynomials are relatively prime if their monic greatest common divisor is 1.

Modular arithmetic on polynomials is also very similar to modular arithmetic on integers.

Let M(x) be a nonzero polynomial. Then two polynomials p(x) and q(x) (over the same field) are congruent with respect to the modulus M(x) if their difference p(x) - q(x) is divisible by M(x). It is easily shown that congruence is an equivalence relation, and that sums, differences and products of congruent polynomials (with respect to the same modulus) are congruent.

3. Irreducible Polynomials

A polynomial p(x) is irreducible or prime if it is nonconstant and not divisible by any nonconstant polynomial of lower degree. It is reducible if it can be factored into two (or more) nonconstant polynomials.

Clearly, all nonconstant linear polynomials are irreducible.

The irreducible polynomials have many properties analogous to those of prime numbers. For example, every nonconstant polynomial is either irreducible or a product of irreducible polynomials.

Theorem 3.2 If the product v(x)w(x) of two polynomials is divisible by an irreducible polynomial p(x), the either v(x) or w(x) (or both) is divisble by p(x).

Proof. Assume, for purpose of contradiction, that p(x) does not divide either v(x) or w(x). Then the monic greatest common divisor of p(x) and v(x) is 1, so by Theorem 2.4 there are polynomials s(x) and t(x) such that

1 = s(x) p(x) + t(x) v(x).

Similarly, there are polynomials e(x) and f(x) such that

1 = e(x) p(x) + f(x) w(x).

Multiplication of these two equations gives

1 = s(x) p(x) e(x) p(x) + s(x) p(x) f(x) w(x) + t(x) v(x) e(x) p(x) + t(x) f(x) v(x) w(x).

The left member is not divisible by p(x), but every term in the right member is divisible by p(x), which is the desired contradiction. █

Corollary 3.3 If an irreducible polynomial divides the product of any number of polynomials, it must divide at least one factor.

The following theorem is analogous to the fundamental theorem of arithmetic. However, it is not called the fundamental theorem of polynomials!

Theorem 3.4 Every monic nonconstant polynomial can be factored into monic irreducible polynomials in a way that is unique apart from the order of the factors.

Proof. The proof is by induction on the degree of the polynomial. A linear polynomial is irreducible, so its factorization is trivial.

In other cases, if the polynomial is irreducible, its factorization is trivial. Otherwise, it can be factored into two monic polynomials of lower degree, each of which has monic irreducible factors by the inductive hypothesis. Therefore, the original polynomial has monic irreducible factors.

To prove uniqueness, assume that a single polynomial has two factorizations into monic irreducible polynomials:

p₁(x) p₂(x) ... p_n(x) = q₁(x) q₂(x) ... q_m(x)

We prove, by induction on the number n of polynomials in the left member, that m = n and the right member contains the same polynomials, perhaps in a different order.

If n = 1 then

p₁(x) = q₁(x) q₂(x) ... q_m(x)

Since p₁(x) is irreducible, m = 1 and and q₁(x) = p₁(x).

Now if n > 1 then p₁(x) divides the left member, so it must also divide the right member. By Corollary 3.3 p₁(x) divides one of the polynomials in the right member. Since the polynomial is irreducible, it must be equal to p₁(x). Rearrange the right member so this polynomial comes first, and divide both sides by p₁(x) to obtain

p₂(x) p₃(x) ... p_n(x) = q₂(x) q₃(x) ... q_m(x)

By inductive hypothesis, the two sides contain the same polynomials, perhaps in different orders. This completes the proof. █

4. Roots

If p(x) is a polynomial over a field, then a field element r such that p(r) = 0 is called a root (or zero) of p(x). Corollary 2.3 states that r is a root of p(x) if, and only if, p(x) is divisible by x-r.

If p(x) can be factored as p(x) = s(x) t(x), then the roots of p(x), if any, are those of the factors s(x) and t(x).

Theorem 4.1 A nonzero polynomial of degree n cannot have more than n roots.

Proof. This is easy to show by induction on n. A nonzero constant polynomial (of degree 0) obviously has no roots, and a polynomial of degree 1 obviously has one root.

If r is a root of the polynomial p(x) of degree n+1, then p(x) = q(x) (x-r), where the degree of q(x) is n. Any other root of p(x) must be a root of q(x), which by inductive hypothesis can have no more than n roots. Hence p(x) has at most n+1 roots. █

Corollary 4.2 A polynomial which is zero or of degree no greater than n with more than n roots must be the zero polynomial (when like terms are combined).

Corollary 4.3 Two polynomials which are zero or of degree no greater than n which agree in more than n places must be identical (when like terms are combined).

Corollary 4.4 Two polynomials over an infinite field which define the same function must be identical (when like terms are combined).

A nonzero polynomial of degree n may have fewer than n roots. For example, the polynomial (x-1)² has only one root, although in this case it is sometimes said that the two roots are equal, or that the root is a double root.

If r is a root of the nonzero polynomial p(x), then the degree (or multiplicity) of the root is the highest value of k such that (x-r)^k divides p(x).

It is easily shown that the sum of the multiplicities of the roots of a nonzero polynomial cannot exceed the degree of the polynomial.

5. Field Extensions

A nonlinear polynomial may have no roots in the field from which its coefficients are taken, but it appears that such a polynomial may in some cases have a root in some larger field. For example, x² - 2 has no roots in the field of rational numbers, but it does have two roots in the field of real numbers. Similarly, x² + 1 has no roots in the field of real numbers, but it has two roots in the field of complex numbers.

If F is a field, then an extension of F is a field which contains F (or a field isomorphic to F) as a subfield. For example, the complex number field is an extension of the real number field.

A nonlinear irreducible polynomial can always be made reducible by extending the field. This can be done in several (equivalent) ways, but the simplest is the process of adjoining a root. It is a generalization of the process by which the field of complex numbers is constructed by adjoining a root of x² + 1.

Let p(x) be an irreducible polynomial of degree greater than 1 with coefficients in the field F. The proposed extension field is the set of polynomials over F of degree less than that of p(x), together with the zero polynomial. Addition in this field is defined in the usual way, but multiplication is taken modulo p(x); i.e., if s(x) and t(x) are in the extension field, then their product is the remainder when s(x) t(x) (computed in the usual way) is divided by p(x). It is fairly easy to show, using the divison algorithm, that addition and multiplication defined in this way obey all the field axioms except the existence of a multiplicative inverse for nonconstant polynomials. To prove that any nonconstant polynomial s(x) has an inverse, we use the Euclidean algorithm to find polynomials u(x) and v(x) such that

u(x) s(x) + v(x) p(x) = 1,
u(x) s(x) = (-v(x)) p(x) + 1.

If the degree of u(x) is less than that of p(x), it is the required inverse. Otherwise, use the division algorithm to find polynomials q(x) and r(x) such that

u(x) = q(x) p(x) + r(x),

where the degree of r(x) is less than that of p(x). Then a straightforward substitution shows that r(x) is the required inverse.

(q(x) p(x) + r(x)) s(x) = (-v(x)) p(x) + 1,
r(x) s(x) = (-v(x) - q(x)) p(x) + 1.

The constant polynomials constitute a subfield isomorphic to F.

Moreover, the linear polynomial z(x) = x is a root of p(x) in this field. Hence p(x) is divisible by x - z in the extended field, so it is no longer irreducible.

The extension field constructed in this way is also minimal, in the sense that no proper subfield can contain F and also a root of p(x). Moreover, it is unique up to isomorphism; i.e., it is isomorphic to any other minimal extension, and the isomorphism maps each element of the original field to itself. The required isomorphism is the mapping which carries the polynomial u(x) into the value u(z), where z is the constructed root.

If the polynomial is factored in the extended field, it may have nonlinear irreducible factors. We can use the same technique to extend the field further so they, too, are reducible.

The process must stop when an extension field is found in which p(x) can be factored into linear factors. This is called a splitting field of the polynomial. Every nonconstant polynomial has at least one. (If the polynomial is not irreducible, it is factored before beginning the process of adjoining roots. The splitting field of a linear polynomial, or of one that can already be factored into linear factors, is the original field.) The minimal splitting field of a polynomial is unique, up to isomorphism, and the isomorphism maps each element of the original field to itself.

The sum of the multiplicities of the roots of a nonconstant polynomial in its splitting field is equal to the degree of the polynomial.

Another feature of field extensions, whether they are obtained by adjoining roots or not, is slightly surprising at first glance. The greatest common divisor of two polynomials is not affected by extending the field, even though each polynomial may acquire additional divisors. However, it is obvious from the Euclidean algorithm that this is so -- the process by which the greatest common divisor is found involves only operations on the original polynomial coefficients, and cannot wander into an extension. In particular, two polynomials that are relatively prime remain so even if the field is extended. The converse, that two polynomials that are not relatively prime remain so if the field is extended, is fairly obvious.

Theorem 5.1 Let a be an element of an extension of the field F which is a root of a nonzero polynomial over F, but a ∉ F. Then there is a unique monic irreducible polynomial over F which has a as a root, and every polynomial over F which has a as a root is divisible by that unique polynomial.

Proof. By factoring the polynomial of which a is a root, we can show that a is a root of at least one monic irreducible polynomial p(x) over F. Assume, for purpose of contradiction, that it is also a root of a different polynomial q(x) over F which is not divisible by p(x). Since p(x) is irreducible, the two polynomials are relatively prime, and by the Euclidean algorithm there are polynomials r(x) and s(x) over F such that

r(x)p(x) + s(x)q(x) = 1.

Subsitution of a for x gives 0 = 1, which is the desired contradiction.

The uniqueness of p(x) is fairly obvious. Any other monic irreducible polynomial over F with a as a root would have to be divisible by p(x), which is possible only if it is identical to p(x). █

6. Derivatives

We cannot take derivatives of polynomials in the usual sense if the field is discrete. However, we can define a formal derivative using the formulas from regular differentiation:

p(x) = p_n xⁿ + p_n-1 x^n-1 + ... + p₁ x + p₀,
p'(x) = n p_n x^n-1 + (n-1) p_n-1 x^n-2 + ... + 2 p₂ x + p₁.

It should be noted that multiplication by the integers 2, 3, ..., n-1, n is defined as repeated addition. The integers themselves may not be part of the field, so regular field multiplication cannot be used.

Formal differentiation has most of the properties of regular differentiation:

(p(x) + q(x))' = p'(x) + q'(x)
(c p(x))' = c p'(x)

(p(x) q(x))' = p'(x) q(x) + p(x) q'(x)

Some other properties of regular differentiation are not preserved. For example, polynomials with the same derivative do not necessarily differ by a constant; e.g. in a field of characteristic 2, x² and 0 have the same derivative.

A nonconstant polynomial is called separable if its roots (in its splitting field) are all distinct.

A nonconstant linear polynomial is obviously separable. The following results are trivial for linear polynomials.

Theorem 6.1 A nonconstant polynomial is separable if, and only if, it and its derivative are relatively prime.

Proof. No generality is lost by assuming the polynomial is monic.

In the splitting field, we can write the polynomial as (x - r₁)(x - r₂) ... (x - r_n), where r₁, r₂, ..., r_n are its roots, arranged so multiple roots, if any, come first.

If the polynomial has multiple roots, it can be written as (x - r₁)² q(x), and its derivative is 2 (x - r₁) q(x) + (x - r₁)² q'(x), which is not relatively prime to the original polynomial in the splitting field because both are divisible by x - r₁. Hence it is also not relatively prime to the original polynomial in the original field.

If the roots are distinct, then (x - r₁)(x - r₂) ... (x - r_n) is its decomposition into irreducible factors.

Consider any factor x - r_i. The polynomial can be written as (x - r_i) s(x), where x - r_i does not divide s(x). Its derivative is s(x) + (x - r_i) s'(x), which is not divisible by x - r_i. Since none of the factors of the original polynomial divides its derivative, the polynomial and its derivative are relatively prime. █

Corollary 6.2 An irreducible polynomial is separable.

Corollary 6.3 A nonconstant divisor of a separable polynomial is separable.

7. Polynomials in Two or More Variables

A polynomial in two or more independent variables can be defined as a linear combination of a finite number of products of powers of the independent variables.

The following result for two variables is the well-known binomial theorem.

Theorem 7.1 When like terms are gathered together, the product (x+y)ⁿ is

C(n,0) xⁿ + C(n,1) x^n-1y + C(n,2) x^n-2y² + ... + C(n,n-2) x²y^n-2 + C(n,n-1) xy^n-1 + C(n,n) yⁿ.

where the binomial coefficient C(n,k) is given by
n (n-1) (n-2) ... (n-k+1) C(n,k) = -------------------------, (1) (2) ... k
C(n,0) = C(n,n) = 1, and all the coefficients are integers.

Proof. The proof is by induction on n. For n=1 the result is obvious.

We then assume the theorem for particular value of n and multiply the expanded form by x+y to get the expanded form of (x+y)ⁿ⁺¹.

The coefficient of x^n+1-ky^k in the product is C(n,k) + C(n,k-1) except when k=0 or k=n+1. Standard manipulation of fractions shows that this is C(n+1,k), and it is an integer because it is the sum of two integers. The first and last coefficients can be verified separately. █

A noteworthy property of the binomial coefficients is that all but the first and last are divisible by n when n is prime.

Another property of rational and real polynomials is useful for quick proofs of a number of polynomial identities. Consider two rational or real polynomials which agree for all values in an interval [a,b], where a < b:

p(x) = q(x) for a ≤ x ≤ b.

The interval contains an infinite number of values, so Corollary 4.3 shows that p(x) and q(x) must be identical, when like terms are combined (and they must also be of the same degree).

This can easily be extended to rational or real polynomials of two or more variables by induction on the number of variables.

Suppose

p(x₁,x₂,...,x_n) = q(x₁,x₂,...,x_n) for a₁ ≤ x₁ ≤ b₁, a₂ ≤ x₂ ≤ b₂, ..., a_n ≤ x_n ≤ b_n .

By appropriate algebraic manipulations, each polynomial can be converted into a polynomial in one variable x_n with coefficients as polynomials in the other variables. For example, the polynomial xyz + xz + x can be converted to (xy + x) z + x, which is a polynomial in the variable z with coefficients as polynomials in x and y.

Then the result for one variable shows that the each coefficient in one polynomial is equal to the corresponding coefficient in the other polynomial for all values of the independent variables in their respective ranges. By inductive hypothesis, they are identical, when like terms are combined. This extablishes the result for n variables.

Clearly, a similar result holds for complex polynomials, although in this case the variable ranges are two-dimensional.

Now consider the polynomial identity

(p(x) q(x))' = p'(x) q(x) + p(x) q'(x)

The two members are equal for all values of the independent variable and all values of the coefficients, at least when real polynomials are involved. Now interpret each side of the identity as a polynomial in both x and the coefficients of p(x) and q(x). Since the identity holds as an equation for all values of all the independent variables, the two sides must be identical when like terms are combined.