Math 344 Notes and Summaries

Martin Flashman's Courses MATH 344 Linear Algebra Fall, 2003 Class Notes and Summaries


week1	8-25	8-27	8-29
week 2	9-1	9-3	9-5
week 3	9-8	9-10	9-12
week 4	9-15	9-17	9-19
week 5	9-22	9-24	9-26
week 6	9-29	10-1	10-3
week 7	10-6	10-8	10-10
week 8	10-13	10-15	10-17 Exam
week 9	10-20	10-22	10-24
week 10	10-27	10-29	10-31
week 11	11-3	11-5	11-7
week 12	11-10	11-12	11-14
week 13	11-17	11-19	11-21
week 14 NoClasses	11-24	11-26	11-28
week 15	12-1	12-3	12-5
week 16	12-8	12-10	12-12

8-27-03

Other texts in Linear Algebra:

Finite Dimensional Vector Spaces

Linear algebra

Linear Algebra

Applied linear algebra

Linear Algebra and Its Applications

Topics in Abstract Algebra

HSU Library Holdings

On Proofs:

How to Read and Do Proofs

The Keys to Advanced Mathematics

How to Solve It

Motivational Question I:

What can we say about powers of a square 2 by 2 matrix A and AX , where X is (x,y)^tr∞
What is the geometric interpretation of AX when X is (1,0)^tror (0,1)^tr?

(


0	1
1	0

)

(


1	1
0	1

)

(


r	0
0	s

)

r = 1/2 s= 1/2
r= 2, s = 1/2
r=1, s= -1

Complex numbers: C = { z = a+bi, where a and b are real numbers}

C. ||z|| = sqrt(a^2 + b^2)

z = ||z||(cos(t) +i sin(t)) [called the "polar form" of the complex number]

The geometry of complex arithmetic:

If z = a+bi = ||z||(cos(t) +i sin(t)) and w = c+di = ||w||(cos(s) +i sin(s)) then

z+w = (a+c)+(b+d)i which corresponds geometrically to the "vector " sum of z and w in the plane, and

zw = ||z||(cos(t) +i sin(t)) ||w||(cos(s) +i sin(s))= ||z|| ||w|| (cos(t) +i sin(t))(cos(s) +i sin(s))
= ||z|| ||w|| (cos(t) cos(s) - sin(t)sin(s) + (sin(t) cos(s) + sin(s)cos(t)) i)
= ||z|| ||w|| (cos(t+s) + sin(t+s) i)

So you use the product of the magnitudes of z and w to determine the magnitude of the product and use the sum of the angles to determine the angle of the product.

8-29-03

Powers of complex numbers.

Based on the previous result on products, zⁿ = ||z||ⁿ (cos(nt) + sin(nt) i)

If ||z||<1, then the powers of z will spiral into 0. If ||z||>1, then the powers of z will spiral outward without bound. If ||z||=1 and z is not 1, the the powers of Z will oscillate around the circle of radius 1 without having any limit.

Notation: cos(t) + i sin(t) is somtimes written as cis(t).
Note: If we consider the series for e^x = 1 + x + x²/2! +x³/3! + ...
then e^ix = 1 + ix + (ix)²/2! +(ix)³/3! + ... = 1 + ix - x²/2! - ix³/3! + ...
... = cos(x) + i sin(x)
Thus e^pi= cos(p) + i sin(p)= -1. Thus ln(-1) = p i.

Matrices with complex number entries.
If r and s are complex numbers in the matrix A, then as n get large if ||r|| < 1 and ||s|| < 1 the powers of A will get close to the zero matrix , if r=s=1 the powers of A will always be A, and otherwise the powers of A will diverge .

Polynomials with complex coefficients.
Because multiplication and addition make sense for complex numbers, we can consider polynomials with coefficients that are complex numbers and use a complex number for the variable, making a complex polynomial a function from the complex numbers to the complex numbers.
This can be visualized using one plane for the domain of the polynomial and a second plane for the co-domain, target, or range of the polynomial.

The Fundamental Theorem of Algebra: If f is a non constant polynomial with complex number coefficients then there is at least on complex number z* where f(z*) = 0.

Fields: The structure used for "solving linear equations," finding inverses for matrices, and doing other parts of matrix algebra.

Fields.

R, Q, C, Z₂, Z_p, where p is a prime number, F₄, {f where f is a rational function defined on all but a finite number of real numbers}, algebraic numbers = {z: z is a complex number which is the root of a polynomial with integer coefficients}.

Polynomials with coefficients in a field. [We will study these in some detail later].
Matrices with entries in a field.

9-3-03

The class began with a discussion of the properties of a field with 4 elements- from the homework assignment.
A few more properties of Fields:

Suppose F is a field.

If a and b are in F and ab = 0 then either a =0 or b=0.

= c

a =

If a, b and c are in F, a+b = a+c implies b=c.

a+b

a+c

If a, b and c are in F with a not equal to 0, ab = ac implies b=c.

Let n·1 stand for 1 + 1 + 1 + ... + 1 with n summands. Either (i) for all n, n·1 is not 0 in which case we say the field has "characteristic 0" , or (ii) for some n, n·1=0. In the case n·1=0, there is a smallest n for which n·1=0, in which case we say the field has "characteristic n". For example: R, Q, and C all have characteristic 0, while Z₂, Z_p, where p is a prime number, and any finite field such as F₄, all have a non zero characteristic.
If the characteristic of F is not zero, then it is a prime number.

a =

ab=

Motivation Question II

Initial Sample: out of 1000 people.... 600 liked coke, 400 liked pepsi. ... v₀= (600,400)

₁

B. Alternative Resample: 400 of 600 stayed loyal to Coke, 200 switched to Pepsi.
300 of 400 stayed loyal to Pepsi , 100 switched to Coke. so v₁= (500,500)

Example A. v₀TA = v₁where

TA=

(


5/6	1/6
1/4	3/4

)

Example B.

v₀TB = v₁where

TB=

(


2/3	1/3
1/4	3/4

)

If the pattern in case B continues with the proportions of those switching remaining the same:

what can we say about v_n when n is large?

Does the distribution of v_n when n is large depend on v_0?

Is there some initial distribution that would remain unchanged under the pattern of switching in B?

₀

₁

(c*, 1000-c*) TB = (c*, 1000-c*).

How are these questions related to Motivation Question I?

v₀Tⁿ = v_n.

Vectors and vector spaces.

Traditional vectors: vectors in R², R², and Rⁿ.

Lists. Definition: A list is an ordered collection of a finite number of n-objects, separated by comma and surrounded by parentheses.

A list is also called an n-tuple.

Examples: (3,5) is a list of 2 real numbers, (3,4,5,3,5) is a list of 5 real numbers. ((2,3,5,2), (2,3), ( )) is a list of 3 lists.

The length of a list is the number n of places for objects in the list.

If v and w are lists, we say v = w if v and w have the same length n, v = (v₁ , v₂ , ..., v_n ), w = (w₁ ,w₂ , ..., w_n ) and v_i= w_i for each i, where i = 1,2,...,n.

Example. Visualize R², R³, and R⁴.

Definition: If S is a set, Sⁿ= {lists of length n with entries from the set S}

= { v = (v₁ , v₂ , ..., v_n ) where v_iis a member of S for each i, where i = 1,2,...,n..}

Definition: Suppose F is a field. We define an internal operation on Fⁿ, +, and an external operation between F and Fⁿ, * , by the following:

If v = (v₁ , v₂ , ..., v_n ) and w = (w₁ ,w₂ , ..., w_n ) are in Fⁿ and a is in F then v + w = (v₁ + w1, v2+ w₂ , ..., v_n+w_n ) and

a*v = (av₁ , av₂ , ..., av_n ) using the addition and multiplication of F for the operations on the individual elements of F.

Proposition: With the operations just defined on Fⁿ the following properties are true:

If v, w, and z are in Fⁿ and a and b are in F, then v+w and a*v are in Fⁿ [closure] ;
(v+w)+z = v+(w+z) [Associativity 1]
If we let 0 denote (0,0,...0) then 0+v=v+0 = v [Additive identity]
For each v, there is a list in Fⁿ, v', so that v+v' = v' + v =0. [Additive inverse]
v+w=w+v [commutativity]
1*v = v
a*( v + w) = (a*v) + (a*w) [distributive 1]
(a+b)*v = (a*v) + (b*v) [distributive 2]
(ab)*v = a*(b*v) [associativity 2]

Abstract Vector Spaces

Definition: Suppose V is a non empty set, F is a field, and there is an internal operation on V, +, and an external operation between F and V, * .

We say that V is a vector space over F, " V is a vs/F", and we call the elements of V, "vectors", if the following properties are true:

If v,w,and z are in V and a and b are in F, then v+w and a*v are in V [closure] ;
(v+w)+z = v+(w+z) [Associativity 1]
If there is an element of V, denoted 0 so that 0+v=v+0 = v [Additive identity]
For each v, there is an element of V, v', so that v+v' = v' + v =0. [Additive inverse]
v+w=w+v [commutativity]
1*v = v [scalar identity]
a*( v + w) = (a*v) + (a*w) [distributive 1]
(a+b)*v = (a*v) + (b*v) [distributive 2]

(ab)*v = a*(b*v) [associativity 2]

9-5-03

Vectors in a more general context:

Functions: Suppose S is a set. We define V(S,F) = { f:S -> F, the functions with domain S and target F}.

Note: When S = {1,2,...,n} we can give a 1 to 1 correspondence between V(S,F) and Fⁿby letting f (k) = v_k when v=

(v₁ , v₂ , ..., v_n ) is in Fⁿ .

Note: If S is the empty set, then V(S,F) is the empty set.

Definition: Suppose S is a non empty set and F is a field. We define an internal operation on V(S,F), +, and an external operation between F and V(S,F), * , by the following:

If v and w are in V(S,F) and a is in F then v + w is the function defined by (v+w)(s) = v(s) + w(s) and

a*v is the function defined by (a*v)(s) = a v(s) using the addition and multiplication of F for the operations on the elements of F as appropriate.

Proposition: With the operations just defined on V(S,F) the following properties are true:

If v,w,and z are in V(S,F) and a and b are in F, then v+w and a*v are in V(S,F) [closure] ;
(v+w)+z = v+(w+z) [Associativity 1]
If we let 0 denote the function 0(s) = 0 for all s in S then 0+v=v+0 = v [Additive identity]
For each v, there is a function v' in V(S,F), v', so that v+v' = v' + v =0. [Additive inverse]
v+w=w+v [commutativity]
1*v = v [scalar identity]
a*( v + w) = (a*v) + (a*w) [distributive 1]
(a+b)*v = (a*v) + (b*v) [distributive 2]
(ab)*v = a*(b*v) [associativity 2]

Examples:

S = N, the natural numbers = {0,1,2,3,...}. V(S,F) = F^∞ can be identified with infinite sequences of elements from the field.
S= { (i,j): i=1,2,...n; j= 1,2,...,m} then we can give a one to one correspondence between V(S,F) and the collection of n by m matrices with entries in the field F. A matrix M coresponds to the function f where f ((i,j)) = M _i,jthe entry in row i, column j of M.
S = F. Then V(S,F) = V(F,F).
S=F=R. Then V(S,F) = V(R,R)= {f:R->R}
P(F) = { f in V(F,F) where f (x) = a₀ + a₁x + a₂x² + ...+ a_nxⁿ} where a_i is in F for each i} = the polynomial functions from F to F.
C⁰(R) = {f in V(R,R) where f is continuous}.
C¹(R) = {f in V(R,R) where f is continuously differentiable}.
C^∞(R) = {f in V(R,R) where f is n times differentiable for any n}.

Abstract Vector Spaces (Again)

Definition: Suppose V is a nonempty set, F is a field, and there is an internal operation on V, +, and an external operation between F and V, * .

We say that V is a vector space over F, " V is a vs/F", and we call the elements of V, "vectors", if the following properties are true:

If v,w,and z are in V and a and b are in F, then v+w and a*v are in V [closure] ;
(v+w)+z = v+(w+z) [Associativity 1]
If there is an element of V, denoted 0 so that 0+v=v+0 = v [Additive identity]
For each v, there is an element of V, v', so that v+v' = v' + v =0. [Additive inverse]
v+w=w+v [commutativity]
1*v = v [scalar identity]
a*( v + w) = (a*v) + (a*w) [distributive 1]
(a+b)*v = (a*v) + (b*v) [distributive 2]
(ab)*v = a*(b*v) [associativity 2]

Simple properties of Vector spaces

0 is unique.
-v is unique.
0v= 0.
a*0 = 0.
(-1)v = -v.
If v+z = w + z then v=w. [cancellation]
If a*v = 0, then either a = 0 or v = 0.

Subspaces.

Definition.W is a subspace of V, W < V, if W is a subset of V and using the operations of V, W is also a vector space.
Simple testing. Is W a subspace of V?

Hard way: check all the axioms again! :(
Easier: Since the operation is the same, you don't need to check scalar identity, associativity, commutativity, or distributivity.

Fact: If W<V and 0v is the zero for V and 0w is the zero for W then, 0v=0w. Proof: 0v+0w = 0w but 0w+0w = 0w, so 0v=0w.

Test 1: (3 steps) If W is non empty and closed under addition and scalar multiplication, then W < V.

Proof: (i) Since w is not empty, choose w in W. Then 0w= 0, so 0 is in W and so W satisfies the additive identity property.

IRMC.

Easiest: (2 step) If W is non empty and if for any w and z in W and a in F, w+az is W [super closure] then W<V.

Proof: (i) Use a=1, so W is closed under addition.

0 is in W.

IRMC

9-8-03

We began class with a discussion of two problems from the assignment relating to subspaces.

In particular we showed that if U and W are subspaces of V and U union W is also a subspace of V, then either U < W or W < U.

Comment: Make correction on remarks made about countability last class.

Do Examples

F[X] = { f in F^∞, where f(n) = 0 for all but a finite number of n.} < F^∞

Note: When F = R or C, F[X] can be identified with P(F).

F^∞,

F. We need to show

f+ a

F^∞ . So consider

f+ ag

f+ a

Suppose A is an n by k matrix with entries in F.

N(A) ="the null space of A" = { v in Fⁿ where vA= (0,0,...0) in F^k } < Fⁿ

N(A) can be considered the subspace of solutions to the system of k homogenous linear equations in n unknowns determined by vA=(0,0,...,0 ).
Proof: Clearly... 0A = 0, so 0 is in N(A). Suppose v and w are in N(A) and a is in F, then consider (v+aw)A = vA+(aw)A=vA+a(wA) = 0 +0 = 0, so (v+aw) is in N(A).

R(A) = "the range space of A" = { w in F^k where for some v in Fⁿ, vA= w } < F^k

Proof: Clearly... 0A = 0, so 0 is in N(A).Suppose z and w are in R(A) and a is in F, then there are v and u in in F^k where vA=w and uA=z. So (v+au)A = vA+(au)A=vA+a(uA) =w+az, so w+az is also in R(A).
R(A) can be considered the collection of k dimensional vectors, w, for which the system of k linear equations in n unknowns determined by vA=w has a solution.

9-10-03

{f in C^∞(R) where f ''(x) - f(x) = 0} < C^∞(R).

Fact: If U<W and W <V, then U<V.

Proof:The operation on U is the same operation for W and V.

(Internal) Sums , Intersections, and Direct Sums of Subspaces

Suppose U1, U2, ... , Un are all subspaces of V.
Definition: U1+ U2+ ... + Un = {v in V where v = u1+ u2+ ... + un for uk in Uk , k = 1,2,...,n} called the internal sum of the subspaces.

Facts: (i) U1+ U2+ ... + Un < V.
(ii) Uk < U1+ U2+ ... + Un for each k, k= 1,2,...,n.
(iii) If W<V and Uk < W for each k, k= 1,2,...,n, then U1+ U2+ ... + Un <W.
So ...
U1+ U2+ ... + Un is the smallest subspace of V that contains Uk for each k, k= 1,2,...,n.
Examples:

U1 = {(x,y,z): x+y+2z=0} U2 = {(x,y,z): 3x+y-z=0}. U1 + U2 = R³.
Let Uk = {f in P(F): f(x) = a_kx^k where a_kis in F} . Then U0+ U1+ U2+ ... + Un = {f : f (x) = a₀ + a₁x + a₂x² + ...+ a_nxⁿ where a₀ ,a₁ ,a₂,...,a_nare in F}.
Definition: U1^U2^ ... ^ Un = {v in V where v is in Uk , for all k = 1,2,...,n} called the intersection of the subspaces.

Facts:(i) U1^ U2^ ... ^ Un < V.
(ii)   U1^U2^ ... ^ Un < Uk for each k, k= 1,2,...,n.
(iii) If W<V and W < Uk for each k, k= 1,2,...,n, then W<U1^ U2^ ... ^ Un .
So ...
U1^ U2^ ... ^ Un is the largest subspace of V that is contained in Uk for each k, k= 1,2,...,n.
Examples: U1 = {(x,y,z): x+y+2z=0} U2 = {(x,y,z): 3x+y-z=0}. U1 ^ U2 = {(x,y,z): x+y+2z=0 and 3x+y-z=0}= ...
Let Uk = {f in P(F): f(x) = a_kx^k where a_kis in F} then Uj^Uk = {0} for j not equal to k.
...

9-12-03
Direct Sums: Suppose U1, U2, ... , Un are all subspaces of V and U1+ U2+ ... + Un = V, we say V is the direct sum of U1, U2, ... , Un if for any v in V, the expression of v as v = u₁+ u₂+ ... + u_n for u_k in Uk is unique, i.e., if v = u₁'+ u₂'+ ... + u_n' for u_k' in Uk then u₁ = u₁', u₂=u₂', ... , u_n=u_n'. In these notes we will write V = U1# U2# ... # Un.
Examples:Uk = {v in Fⁿ: v = (0,... 0,a,0, ... 0) where a is in F is in the kth place on the list.} Then U1# U2# ... # Un = V.
Theorem: (Prop 1.8) V = U1# U2# ... # Un if and only if (i)U1+ U2+ ... + Un = V AND 0=u₁+ u₂+ ... + u_n for u_k in Uk implies u₁=u₂=...=u_n=0.
Theorem: (Prop 1.9) V = U#W if and only if V = U+W and U^W={0}.

Examples using subspaces and direct sums in appplications:
Suppose A is a square matrix (n by n) with entries in the field F.
For c in F, let W_c= { v in Fⁿ where vA = cv}.
9-15-03
Fact: For any A and any c, W_c< Fⁿ. [Comment: for most c, W_c= {0}. ]
Definition: If W_c is not the trivial subspace, then c is called an eigenvalue or characteristic value for the matrix A and nonzero elements of W_care called eigen vectors or characteristic vectors for A.

Application 1 : Consider the coke and pepsi matrices:

Example A.   vA = cv?where

A= (

5/6 1/6
1/4 3/4
)

Example B.   vB = cvwhere

B= (

2/3 1/3
1/4 3/4
)

Questions: For which c is W_c non-trivial?
To answer this question we need to find (x,y) [not (0,0)] so that

Example A

(x,y) (

5/6 1/6
1/4 3/4
) = c(x,y)

Example B

(x,y) (

2/3 1/3
1/4 3/4
) = c(x,y)

Is R² = W_c1 + W_c2 for these subspaces? Is this sum direct?

Focusing on Example B we consider for which c will the matrix equation have a nontrivial solution (x,y)?
We consider the equations: 2/3 x +1/4 y = cx and 1/3 x+3/4 y = cy.
Multiplying by 12 to get rid of the fractions and bringing the cx and cy to the left side we find that
(8-12c)x + 3 y = 0 and 4x + (9-12c)y = 0

Multiplying by 4 and (8-12c) then subtracting the first equation from the second we have
((8-12c)(9-12c) - 12 )y = 0. For this system to have a nontrivial solution, it must be that
((8-12c)(9-12)c - 12 ) = 0 or 72 - (108+96) c+144c^2 -12 = 0 or 60 -204c +144c^2 = 0.
Clearly one root of this equation is 1, so factoring we have (1-c)(60-144c) = 0 and c = 1 and c = 5/12 are the two solutions... so there are exactly two distinct eigenvalues for example B,
c= 1 and c = 5/12 and two non trivial eigenspaces W₁ and W_5/12 .

General Claim: If c is different from k, then W_c ^ W_k = {0}
Proof:?
generalize?
What does this mean for v_n when n is large?
Does the distribution of v_n when n is large depend on v₀?

9-17-03
Application 2: For c a real number let
W_c = {f in C^∞(R) where f '(x)=c f(x)} < C^∞(R).
What is this subspace explicitly?
Let V={f in C^∞(R) where f ''(x) - f(x) = 0} < C^∞(R).
Claim: V = W₁ # W_-1
Begin?   We'll come back to this later in the course!

If c is different for k, then W_c ^ W_k= {0}
Proof:...

Back to looking at things from the point of view of individual vectors:
Linear combinations:
Def'n. Suppose S is a set of vectors in a vector space V over the field F. We say that a vector v in V is a linear combination of vectors in S if there are vectors u₁, u₂, ... , u_n in S and scalars a₁, a₂, ..., a_n in F where v = a₁u₁+ a₂u₂+ ... + a_nu_n .
Comment: For Axler: S is a finite set.

Def'n. Span (S) = {v in V where v is a linear combination of vectors in S}
Span (S) = V we say that S spans V. "finite dimensional" v.s.

Linear Independence.
Def'n. A set of vectors S is linearly dependent if there are vectors u₁, u₂, ... , u_n in S and scalars a₁, a₂, ..., a_n NOT ALL 0 in F where 0 = a₁u₁+ a₂u₂+ ... + a_nu_n .
A set of vectors S is linearly independent if it is not linearly dependent.

Other ways to characterize linearly independent.
A set of vectors S is linearly independent if whenever there are vectors u₁, u₂, ... , u_n in S and scalars a₁, a₂, ..., a_n in F where 0 = a₁u₁+ a₂u₂+ ... + a_nu_n , the scalars are all 0, i.e. a₁= a₂ = ... =a_n = 0 .

9-19-03
Examples: Suppose A is an n by m matrix: the row space of A= span ( row vectors of A) , the column space of A = Span(column vectors of A).
Relate to R(A)
Recall R(A) = "the range space of A" = { w in F^k where for some v in Fⁿ, vA= w } < F^k.
w is in R(A) if and only if w is a linear combination of the row vectors, i.e., R(A) = the row space of A.
If you consider Av instead of vA, the R*(A) = the column space of A.

"Infinite dimensional" v.s. examples: P(F), F^∞, C^∞ (R)
P(F) was shown to be infinite dimensional. [ If p is in SPAN(p1,....,pn) then the degree of p is no larger than the maximum of the degrees of {p1,...pn}. So P(F) cannot equal SPAN(p1,...,pn) for any finite set of polynomials- i.e, P(F) is NOT finite dimensional.

Some Standard examples.

2.4 Linear dependence Lemma : Suppose S is a finite linearly dependent set indexed by 1,2,.. , n and v1 is not 0, then for some index j, vj is in the span(v1,...v(j-1)) and Span (S) = Span(S -{vj}).
Proof: See LA p25.

2.6 Theorem: Suppose S is a finite set of vectors with V = Span (S) and T is a linearly independent set of vectors in V. Then T is also finite and n( T)< = n(S)
Proof: See LA p25-26.

Comments:

(2.4) shows how to constuct a basis for a non trivial finite dimensional v.s., V. Start with a finite set of vectors S that span V. We can assume S has some non-zero vector in it. Put that element first.

If S is linearly independent you are done. If not, apply (2.4) repeatedly until the resulting set of vectors is linearly independent. This must happen since at worst you will be left with v1 which was not 0. Thus we have proven

Theorem 2.10: Every finite spanning list in a vector space can be reduced to a basis.
and the Corollary (2.11). Every finite dimensional vector space has a basis.

Comment:The proof of Theorem 2.6 also shows that given T, a linearly independent subset of V and S, a finite set where SPAN(S) = V, one can step by step replace the elements of S with elements of T at the beginning of the list of vectors, so that eventually you have a new set S' where Span(S') = V and T contained in S'. Now by applying repeatedly the Lemma to S', one will arrive at a set B that is a basis for V with T contained in B. This proves

Theorem 2.12: Every Linearly independent subset of a finite dimansional vector space can be extended to a basis of the vector space.

Prop: A Subspace of a finite dimensional vs is finite dimensional.

Problem 2.12: Suppose p₀,...,p_m are in P_m(F) and p_i(2) = 0 for all i.
Prove {p₀,...,p_m} is linearly dependent.

Proof: Suppose {p₀,...,p_m} is linearly independent.
Notice that by the assumption for any coefficients

(a₀p₀+..+a_mp_m )(2) = a₀p₀(2)+..+a_mp_m(2) = 0

and since u(x)= 1 has u(2) = 1, u (= 1) is not in the SPAN(p₀,...,p_m).
Thus SPAN(p₀,...,p_m) is not P_m(F).
But SPAN ( 1,x, ..., x^m) = P_m(F) .
By repeatedly applying Lemma 2.4 to these two sets of m+1 polynomials as in Theorem 2.6, we have SPAN (p₀,...,p_m)=P_m(F), a contradiction. So {p₀,...,p_m} is not linearly independent.
End of proof.

Bases- def'n.

Definition: A set B is called a basis for the vector space V over F if (i) B is linearly independent and (ii) SPAN( B) = V.

Prop. If V is finite dimensional vs and B and B' are bases for V, then n(B) = n(B').

Proof: fill in ... based on 2.6.

Define dimension of a finite dimensional v.s. over F.

9-22: Filled in much above on Bases and the proofs of theorems about bases.

9-24
Discuss dim({0}).
What is Span of the empty set? Characterize SPAN(S) = the intersection of all subspaces that contain S. Then Span (empty set) = Intersection of all subspaces= {0}.

The empty set is linearly independent!... so The empty set is a basis for {0} and the dimension of {0} is 0!

2.8: bases and representation of vectors in a f.d.v.s.

Suppose B is a finite basis for V with its elements in a list, (u₁, u₂, ... , u_n) . If v is in V, then there are unique vectors scalars a₁, a₂, ..., a_n in F where v = a₁u₁+ a₂u₂+ ... + a_nu_n . The scalars are called the coordinates of v w.r.t. B, and we will write v = [a₁, a₂, ..., a_n]_B.

Examples: In R², P₄(R).

Connect to Coke and Pepsi example: find a basis of eigen vectors using the B example for R². [Use the on-line technology]

Example B

(x,y) (

2/3 1/3

1/4 3/4
) = c(x,y)

We considerd the equations: 2/3 x +1/4 y = cx and 1/3 x+3/4 y = cy and showed that
there are exactly two distinct eigenvalues for example B,
c= 1 and c = 5/12 and two non trivial eigenspaces W₁ and W_5/12 .
Now we can use technology to find eigenvectors in each of these subspaces.
Matrix calculator, gave as a result that the eignevalue 1 had an eigenvector (1,4/3) while 5/12 had an eigenvector (1,-1). These two vectors are a basis for R².

Dimension Results: Suppose Dim(V) = n, S a set of vectors with N(S) = n. Then
(1) If S is Linearly independent, then S is a basis.
(2) If Span(S) = V, then S is a basis.
Proof: (1) S is contained is a basis, B. If B is larger than S, then B has more than n elements, which contradicts that fact that any basis for V has exactly n elements. So B = S and S is a basis.
(2) S contains a basis, B. If B is smaller than S then B has less than n elements, which contradicts that fact that any basis for V has exactly n elements. So B = S and S is a basis.
IRMC

9-26
2.18: If U, W <V are finite dimensional, then so is U+W and
dim(U+W) = Dim(U) + Dim(W) - Dim(U^W).
Proof: (idea) build up bases of U and W from U^W.... then check that the union of these bases is a basis for U+W.

Linear Transformations: V and W vector spaces over F.
Definition: A function T:V -> W is a linear transformation if for any x,y in V and in F, T(x+y) = T(x) + T(y) and T(ax) = a T(x).

Examples: T(x,y) = (3x+2y,x-3y) is a linear transformation T: R2 -> R2.
G(x,y) = (3x+2y, x^2 -2y) is not a linear trasnformation.
G(1,1) = (5, -1) , G(2,2) = (10, 0)... 2*(1,1) = (2,2) but 2* (5,-1) is not (10,0)!
Notice that T(x,y)can be thought of as the result of a matric multiplication

(x,y) (

3
1

2
-2

)

So the two key properties are the direct consequence of the properties of matrix multiplication.... (v+w)A= vA+wA and (cv)A = c(vA).
For A a k by n matrix : T_A (left argument) and _AT (right) are linear transformations on F^k and Fⁿ.
T_A  (x) = x A for x in F^k and _AT(y) = A[y]^tr for y in Fⁿ and [y]^tr indicates the entries of the vector treated as a one column matrix.

Monday 9-29.

The set of all linear transformations from V to W is denoted L(V,W).

More notes on Chapter 1 and 2

1.9:V = U # W if and only if V = U+W and U^W={0}.
Proof: => suppose v is in U^W, then v=u in U and v=w in W, so 0 = u-v. But since V= U#W, this means u=w = 0 so v=0, so U^W={0}.
Note: This argument extends to V as the direct sum of any family of subspaces.

<= Suppose u is in U and w is in W and u+w = 0. Then, u = -w so u is also in W, and thus u is is U^W={0}. So u=0 and then w= 0 . Since V=U+W, we have by 1.8, V=U#W. EOP

2.19 If V is f.d.v.s. and U1, ...Un are subspaces with V = U1 +...+ Un and
dim(V) = dim(U1)+...+ dim(Un) then V = U1 #...# Un

Proof outline: Choose bases for U1, ..., Un and let B be the union of these setes. Since V = U1 +...+ Un every vector in v is a linear combination of elements from B. But B has exactly dim(U1)+...+ dim(Un) = dim(V) elements in it, B is a basis for V. Now suppose 0=u₁+ u₂+ ... + u_n for u_k in Uk. Then each u_i=can be expressed as a linear combination of the basis vectors for Ui, and the entire linear combination is 0 implies that each coefficient is 0 because B is a basis. So u₁=...=u_n=0 and V = U1 #...# Un. EOP

How do you find a basis for the SPAN(S) in Rⁿ?
Outline of use of row operations...

Back to linear transformations:
Consequences of the definition: If T:V->W is a linear transformation, then for any x and y in V and a in F,

(i) T(0) = 0.

(ii) T(-x) = -T(x)

(iii) T(x+ay) = T(x) + aT(y).

Quick test: If T:V->W is a function and (iii) holds for any x and y in V and a in F, then the function is a linear transformation.

Visualize with Winplot?

Why this called a "linear" transformation:
The geometry of linear: A line in R2 is {(x,y): Ax +By = C where A and B are not both 0} = {(x,y): (x,y) = (a,b) + t(u,v)}= L, line through (a,b) in direction of (u,v).

Suppose T is a linear transformation :
Let T(L) = L' = {(x'y'): (x',y')= T(x,y)}
T(x,y) = T(a,b) + t T(u,v).
If T(u,v) = (0,0) then L' = T(L) = {T(a,b)}.
If not then L' is also a line though T(a,b) in the direction of T(u,v).

Coke/Pepsi example B: T(x,y) =(2/3 x +1/4 y, 1/3 x+3/4 y)
T(v₀) = v₁, T(v₁) = v₂.... T(v_k)=T(v_k+1).
T(v*)=v* means a nonzero v* is an eigenvector with eigenvalue 1. T(1, 4/3) = (1,4/3). Also T(3/7, 4/7) = T[(3/7)(1,4/3)] = 3/7T(1,4/3) =3/7(1,4/3) =(3/7,4/7).
T(1,-1) =(5/12,-5/12 )= (5/12)(1,-1) means that (1,-1) is an eigenvector with eigenvalue 5/12.

D... Differentiation: on polynomials, on ...
Example: (D(f))(x) = f' (x) or D(f) = f'.

T(f)(x) = f''(x) - f(x) or T(f) = DD(f) - f = (DD-Id) f.

Wednesday 10-1
Theorem: T : V->W linear, B a basis, gives S(T):B ->W.
Suppose S:B -> W, then there is a unique linear transformation T(S):V->W such that S(T(S))=S.
Proof: Let T(S)(v) be defined as follows: Suppose v is expressed (uniquely) as a linear combination of elements of B, ie. v = a₁u₁+ a₂u₂+ ... + a_nu_n ... then let T(v) = a₁S(u₁)+ a₂S(u₂)+ ... + a_nS(u_n) ....
This well defined since the representation of v is unique. Left to show T is linear. Clearly... if u is in B then S(T(S))(u) = S(u).
Example: T: P(F) -> P(F).... S(xⁿ) = nx ^n-1.
Or another example: S(xⁿ) = 1/(n+1) x ⁿ⁺¹.

Algebraic stucture on L(V,W)
Definition of the sum and scalar multiplication:
T, U in L(V,W), a in F, (T+U)(v) = T(v) + U(v).
Fact:T+U is also linear.
(aT)(v) = a T(v) .
Fact:aT is also Linear.
Check: L(V,W) is a vector space over F.

Composition: T:V -> W and U : W -> Z both linear, then UT:V->Z where UT(v) = U(T(v)) is linear.

Note: If T':V-> W and U':W->Z are also linear, then U(T+T') = UT + UT' and (U+U') T = UT + UT'. If S:Z->Y is also linear then S(TU) = (ST)U.

Key focus: L(V,V) , the set of linear "operators" on V.... also called L(V).
If T and U are in L(V) then UT is also in L(V). This is the key example of what is called a "Linear Algebra"... a vector space with an extra internal operation usually described as the product. That satisfies the distributive and associative properties.

Key Spaces related to T:V->W
Null Space of T= kernel of T = {v in V where T(v) = 0 [ in W] }= N(T) < V
Range of T = Image of T = T(V) = {w in W where w = T(v) for some v in V} <W.

Recall definition of "injective" or "1:1" function.

Theorem: T is 1:1 (injective) if and only if N(T) = {0}
Proof: => Suppose T is 1:1. We now that T(0)=0 , so if T(v) = 0, then v = 0. Thus 0 is the only element of N(T) or N(T) = {0}.
<= Suppose N(T) = {0}. If T (v) = T(w) then T(v-w) =T(v)-T(w) = 0 so v-w is in N(T).... ok, than must mean that v-w = 0, so v=w and T is 1:1.

Friday 10-3
More details to follow on this lecture:
he first part of the lecture discussed the importance of the Null Space of T, N(T) is undertanding what T does in general.

Example 1. D:P(R) -> P(R)... D(f) = f'. Then N(D) = { f: f(x) = C for some constant C.} [from calculus 109!]
Notice: If f'(x) = g'(x) the f(x) = g(x) + C for some C.
Proof: consider D(f(x) - g(x)) = Df(x) - Dg(x) = 0, so f(x) -g(x) is in N(T).

Example 2: Solving a system of homogeneous linear equations. This was connected to finding the null space of a linear trasnformation connected to a matrix. Then what about a non- homogeneous system with the same matrix. Result: If z is a solution of the non- homogeneous system of linear equations and z ' is another solution, then z' = z + n where n is a solution to the homogeneous system.

General Proposition: T:V->W. If b is a vector in W and a is in V with T(a) = b, then T^-1(b} = {v in V: v = a +n where n is in N(T)} = a + N(T)

Comment: a + N(T) is called the coset of a mod N(T)...these are analogous to lines in R². More on this later in the course.

Major result of the day: Suppose T:V->W and V is a finite dimensional v.s. over F. Then N(T) and R(T) are also finite dimensional and Dim(V) = Dim (N(T)) + Dim(R(T)).
Proof: Done in class- see text: Outline: start with a basis C for N(T) and extend this to a basis B for V. Show that T(B-C) is a basis for R(T).

Next: Monday. Oct.6. Matrices and Linear transformations. (with Dr. B).

Oct.8
Footnote on notation for Matrices: If the basis for V is B and for W is C and T:V->W,
the matrix of T with respect to those bases can be denoted M_B^C(T). The matrix for a vector V is denoted M_B(v). Thus M_C(T(v))=M_B^C(T)M_B(v).

The function M : L(V,W) -> Mat (m,n; F) is a linear transformation.

Invertibility of Linear Transformations:
Def'n: T:V -> W is invertible if and only if
there is a linear transformation S :W -> V where TS = Id_W and ST = Id_V .

Fact: If T is invertible then the S :W->V used in the definition is also invertible!
S is unique: If S' satsifies the same properties as s, then

S = S Id = S(TS') =(ST)S' = Id S' = S'

S is called "the inverse of T".
Prop (3.17): T is invertible iff T is 1:1 and onto (injective and surjective).
(i) => Assume S... (a)show T is 1:1. (b) show T is onto.
(ii) <= Assume T is 1:1 and onto. Define S. Show S is linear and TS =I and ST = I

Def: If there is a T:V->W that is invertible, we say W is isomorphic with V. (V=_T W)
Comments: (i)V=_Id V (ii)If V=_T W then W=_S V (iii)If V=_T W and W=_U Z then V=_UT Z.
[Start 10-10 here!]
Theorem (3.18) Suppose V and W are finite dimensional v.s.
Then V=_T W if and only if dim(V) = dim(W).
Proof: Done! See text!
Theorem (3.19) Suppose B= (v₁,...,v_n) and C=(w₁,...,w_m) are finite bases (lists) for V and W respectively. The linear transformation M: L(V,W) -> Mat(m,n,F) is an isomorphism.
Proof: Show injective by Null(M)= {Z} -where Z is the zero transformation.
Show M is onto by giving TA where M(TA) = A based on knowing A.

Cor. Prop 3.20: Dim L(V,W) = Dim(V) Dim(W).

Prop 3.21. V a f.d.v.s. If T is in L(V) then the following are equivalent:
(i) T is invertible.
(ii) T is 1:1.
(iii) T is onto.
Proof: (i) =>(ii). Immediate.
(ii)=>(iii) . Dim V = Dim(N(T)) + Dim(R(T)). Since T is 1:1, N(T)={0}, so Dim(N(T))= 0 and thus Dim V = Dim (R(T)) so R(T) = V and T is onto.
(iii) =>(i) Dim V = Dim(N(T)) + Dim(R(T)) Since T is onto, R(T) = V... so Dim(N(T)) = 0. ... so N(T) = {0} and T is 1:1, so T is invertible.

End of material for midterm exam

10-13 Connection to square matrices:
A is invertible is equivalent to....Systems of equations statements.

Look at Coke/Pepsi example B: T(x,y) =(2/3 x +1/4 y, 1/3 x+3/4 y)= (x,y)A
T(v₀) = v₁, T(v₁) = v₂.... T(v_k)=T(v_k+1).
v₂=T(v₁) = TT(v₀);... T(v_k)=T^k(v₀) = (x₀,v₀)A^k.
We considerd the equations: 2/3 x +1/4 y = cx and 1/3 x+3/4 y = cy and showed that
there are exactly two distinct eigenvalues for example B,
c= 1 and c = 5/12 and two non trivial eigenspaces W₁ and W_5/12 .
Now we can use technology to find eigenvectors in each of these subspaces.
Matrix calculator, gave as a result that the eignevalue 1 had an eigenvector (1,4/3)=v1 while 5/12 had an eigenvector (1,-1)=v2. These two vectors are a basis for R².
B=(v1,v2)
What is the matrix of T using this basis.

M_B^B(T)=

(


1	0
0	5/12

)

Using this basis and matrix makes it easy to see what happens when the transfomation is applied repeatedly:

M_B^B(Tⁿ)=[M_B^B(T)]ⁿ=

(


1	0
0	5/12

)ⁿ


1	0
0	(5/12)ⁿ

10-15 Change of basis:
So ... What is the relationship between this very nice matrix for T that results from using the basis B of eigenvectors and the matrix for T that uses the standard basis, E = (e1,e2)?

M_E^E(T)=

(


2/3	1/4
1/3	3/4

)

The key to understanding the relationship between these is the identity map!
We consider the matrix for the identity operator using B for the source and E for the target.

M_B^E(Id)=

(


1	1
4/3	-1

)

And for the identity operator using E for the source and B for the target, M_E^B(Id).
Notice that M_E^B(Id) M_B^E(Id) =M_B^B(Id* Id)=M_B^B(Id)= I_n the n by n identity matrix, and similarly M_B^E(Id) M_E^B(Id) =M_E^E(Id) = I_n . Thus both these matrices are invertible and each is the inverse of the other!
Furthermore:

M_B^B(T)=

(


1	0
0	5/12

)

M_B^E(Id)M_B^B(T)M_E^B(Id)=M_E^E(IdTId)=M_E^E(T)
and
M_E^B(Id)M_E^E(T)M_B^E(Id)=M_B^B(IdTId)=M_B^B(T).

If we let P =M_E^B(Id) and L = M_B^E(Id) = P^-1, then we have

LM_B^B(T)P =P^-1M_B^B(T)P=M_E^E(T)
or
M_B^B(T)P= PM_E^E(T)
and
PM_E^E(T)L=M_B^B(T).

10-20
Change of Basis, Invertibility and similar matrices.
The previous example works in general:
The Change of Basis Theorem:
Suppose V is a f.d.v.s over F, dim(V) = n, and B and E are two bases for V. Suppose T:V -> V is a linear operator, then

M_B^E(Id)M_B^B(T)M_E^B(Id)=M_E^E(T)
and
M_E^B(Id)M_E^E(T)M_B^E(Id)=M_B^B(T).

If we let P =M_E^B(Id) and L = M_B^E(Id) = P^-1,
[LP =M_B^E(Id)M_E^B(Id) = M_E^E(Id)= I_n ]
then we have

LM_B^B(T)P =P^-1M_B^B(T)P=M_E^E(T)

M_B^B(T)P= PM_E^E(T)

and likewise PM_E^E(T)L=M_B^B(T).

Def'n: We say that two matrices A and B are similar if there is an invertible matrix P so that

B = P^-1AP.

Cor.Suppose V is a f.d.v.s over F, dim(V) = n, and B and E are two bases for V. Suppose T:V -> V is a linear operator, then M_B^B(T) and M_E^E(T) are similar matrices.

There is a "converse" to the theorem based on the following
Proposition: If P is an invertible n by n matrix, then T_P:Fⁿ -> Fⁿ defined by the matrix P where M_E^E(T_P) =P maps every basis B of Fⁿ to a basis, T_P(B)= B' .

Eigenvectors, Eigenvalues, Eigenspaces, Matrices, Diagonalizability, and Polynomials!

Definition:Suppose T is a linear operator on V, then a is an eigenvalue for T if there is a non-zero vector v where T(v) = av. The vector v is called an eignevector for T.
Proposition: a is an eignevalue for T if and only if Null(T-aId) is non-trivial.]

Def'n:T is diagonalizable if V has a basis of eigenvectors.
T is diagonalizable if and only if M(T) is similar to a diagonal matrix, i.e., a matrix A where A_i,j=0 for indices i, j where i is not equal to j.
10-22
Fact: If T is diagonalizable with distinct eigenvalues a₁,...,a_k , then S = (T-a₁Id)(T-a₂Id).... (T-a_kId) = 0.
Proof: It suffices to show that for any v in a basis for V, T(v) = 0. Choose a basis for V of eigenvectors, and suppose v is an element of this basis with T(v) = a_j v. Then S(v)= (T-a₁Id)(T-a₂Id).... (T-a_kId)(v) = (T-a₁Id)(T-a₂Id).... (T-a_jId)... (T-a_kId)(T-a_jId)(v) = 0.

What about the Converse? If there are distinct scalars a₁,...,a_k where S(v) = (T-a₁Id)(T-a₂Id).... (T-a_kId)(v) = 0 for any v in V, is T diagonalizable? we will return to this later....!

10-24
A Quick trip into High School/Precalculus Algebra and Formal Polynomials: Recall... F[X]
F[X] = { f in F^∞, where f(n) = 0 for all but a finite number of n} =
{ formal polynomials with coefficients in F using the "variable" X}< F.
X = (0,1,0,0,....). example: 2+X + 5X² +7 X⁴ = (2,1,5,0,7,0,0,...)

Notice: F[X] is an algebra over F... that is it has a multiplication defined on its elements... in fact it has a multiplicative unity, 1 =(1,0,0,0...), and furthermore, this algebra has a commutative multiplication: if f,g are in F[X] the f*g = g*f.

Notice:If f is not 0, then deg(f) = ...., and
Theorem: If f and g are not 0 = (0,0,0...), then f*g is also non-zero, with deg(f*g) = deg(f) + deg(g).

If A is any algebra over F with unity, and f is in F[X],
f= (a1,a2,...an,0,0,...), then we have a function, f :A->A defined by
f (t) = a₁ I + a₂t+ ... +a_ntⁿ where I is the unity for A.

In particular (i)A can be the field F itself, so f is in P(F).
Example: F = Z₂. f = X² + X in F[X]. Then f is not (0,0,0....) but f(t) = 0 for all t in F.

(ii) A can be L(V) where Vis a finite dimensional vector space over F.
Then f (T) is also in L(V).
(iii) A can be M(n;F), the n by n matrices with entries from F.
Then f (M) is also in M(n;F).

The Division Algorithm, [proof?]
If g is not zero, for any f there exist unique q , r in F[X] where f = q*g +r and either (i) r = 0 or (ii) deg(r) < deg(g).
The Remainder and Factor Theorems [Based on the DA]
Suppose c is in F, for any f there exist unique q , r in F[X]
where f = q*(X-c) +r and r = f(c).
Suppose c is in F ,then f (c) = 0 if and only if f = (X-c)*q for some q in F[X].

10-27
Roots and degree.
If c is in F and f (c) = 0, then c is called a root of f.
If f is not 0, and deg(f) = n then there can be at n distinct roots of f in F.
Factoring polynomials. A polynomial in F[X] is called reducible if there exist polynomials p and q, with deg(p)>0 and deg(q)>0 where f=p*q.
If deg(f )>0 and f is not reducible it is called irreducible (over F).
Example: X² + 1 is irreducible over R but Reducible over C.

(I)The FTof Alg for C[X].
Theorem: If f is non-zero in C[X] with deg(f)>0, then there is a complex number r where f (r) = 0

(II) The FT of Alg for R[X].
Theorem: If f is non-zero in R[X] and f is irreducible, then deg(f)= 1 or 2.

Proof of II assuming (I): If f is in R[X] and deg(f)>2, then f is in C[X].
If r is a root of f and r is a real number then f is reducible by the factor theorem.
If r=a +ib is not a real number, then because the complex conjugate of a sum(product) of complex numbers is the sum (product) of the conjugates of the numbers, and the complex conjugate of a real number is the same real number, we can show that f(a+bi) =0 = f(a-bi). Now by the factor theorem (applied twice)

f = (X-(a+bi))*(X-(a-bi))*q=((X-a)² + b² )*q

and deg(q) = deg(f ) -2 >0. Thus f is reducible.

Back to Linear Algebra, Eigenvalues and "the Minimal Polynomial for a Linear Operator":
10-29
Theorem: Suppose V is nontrivial f.d.v.s / C. T in L(V). Then T has an eigenvalue.
Comment: First we considered this with the Coke/Pepsi example B:
T(x,y) =(2/3 x +1/4 y, 1/3 x+3/4 y).
Consider ( e₁= (1,0), T(e₁) = (2/3,1/3), T(T(e₁ ))= (4/9+1/12, 2/9+3/16) ). This must be linearly dependent because it has 3 vectors in R². This gives some coefficients in R not all zero, where a₀ Id(v) + a₁T(v) + a₂T²(v)=0. Thus we have f in R[X] , f = a₀ + a₁X + a₂X²
and f(T)(e₁) = 0. In fact, we can use f =(X-1)(X-5/12) . f(T)(e₁)=(T-Id)(T-5/12Id)(e₁)= (T-Id)((T-5/12Id)(e₁))=(T-Id)(2/3-5/12,1/3) = 0 Thus we find that (T-Id) (1/3,1/3)= 0, so (1/3,1/3) is an eignvector for T with eigemvalue 1. Now here is a
Proof (outline): Suppose dim V = n >0.
Consider v, a nonzero element of V, and the set (v, T(v), T²(v),T³(v)....Tⁿ(v)).
Since this set has n+1 vectors it must be linearly dependent. ...
...
This means there is a non-zero polynomial, f, in C[X] where f (T)(v) = 0.
Let m = deg(f ).
Using the FT of Alg for C we have that f = a (X-c₁)... (X-c_m).
Now apply this to T as a product and ....
for some i and w (not 0), (T-c_iId) (w) = 0. Thus T has an eigenvalue.

Theorem:V, T as usual. Then there is some non-zero polynomial f in F[X] where f (T) = 0, i.e., for all v in V, f (T)(v)= 0.
Proof (outline). Suppose dim(V)=n. Consider the set (Id, T, T²,T³....T^n*n).
Since this set has n*n+1 vectors in L(V) where dim(L(V))= n*n, so it must be linearly dependent. ...
...
This means there is a non-zero polynomial, f, in C[X] where f (T) = 0, i.e., f(T)(v) = 0 for all v in V.

10-31
Definition: Ann(T)={f in F[X] : f (T) = 0}. The previous Theorem shows Ann(T) has a non trival element.
Prop: f, g in Ann(T), h in F[X] then f+g and h*f are in Ann(T).

Theorem: There is a non zero monic polynomial in Ann(T) of smallest degree. This polynomial is unique and any polynomial in Ann(T) has this polynomial as a factor.

Proof: The previous theorem has shown Ann(T) has a nonzero polynomial element. Considering the degrees of the non-zero polynomials in Ann(T) there is a smallest positive degree, call it m and a polynomial g in Ann(T) with deg(g) = m. If g = bX^m +....terms of lower degree, where b is not 0,
then f = 1/b*g is also in Ann(T) and f is a monic polynomial.

Now suppose h is also in Ann(T) , then by the division algorithm, h = q*f + r where either r = 0 or deg(r)< m. But since h and f are in Ann(T), h - q*f = r is also in Ann(T). Since deg(f )=m, which is the lowest degree for an element of Ann(T), it must be that r = 0, so h =q*f. Now if h is also monic and deg(h) = m, then deg(q) = 0, and since h and f are both monic, q = 1, and h = f. Thus the non zero monic polynomial in Ann(T) of smallest degree is unique!

Prop. Suppose m in F[X] is the mininal polynomial for T. Then T has eigenvalues c if and only if X-c is a factor of m.
Proof: => Suppose c is an eigenvalue for T. Then W_c =Null(T-c) is a nontrivial subspace of V and for w in W_c
T(w) = cw is also in W_c . Let S(w)=T(w) for w in W_c .Notice that S is in L(W).
As a linear operator on W_c ,(X-c)(S) = 0, so X-c is the minimal polynomial for S.
But for any w in W_c (and thus in V), m(S)(w) = m(T)(w) = 0, so m is in Ann(S), and X-c is a factor of m.

<= Suppose X-c is a factor of m. m= (X-c)*q. Since m is the minimal polynomial for T and deg(q)= deg(m)-1, q(T) is not the 0 operator. Thus there is some v in V where w =q(T)(v) is not 0.
But m(T)(v)=...=(T-cId)(q(T)(v))=(T-cId)(w) = 0. So c is an eigenvalue for T. EOP

Cor. T is invertible if and only if the constant for m = m(0) is not 0.
Cor. If T is invertible then T^-1 = -1/m(0) (( m-m(0))/X)(T))=

11-3
Invariant Subspaces: V, T as usual.
W is called an invariant subspace for T if for all w in W, T(w) is also in W... i.e. T(W)<W.
If W is an invariant subspace of T, then T:W->W is a linear operator as well, denoted T|_W.
If W is an invariant subspace of T, then the minimal polyonmial of T|_W is a factor of the minimal polynomial of T.
Invariant Subspaces and Block Matrices:
If V is a fdvs / F and T is in L(V) with W₁ ,W₂ ,...,W_k invariant subspaces for T and V = W₁ #W₂ #...# W_k .
and if the basis for V is B is composed of bases for each W₁ ,W₂ ,... ,W_k in order, then M(T) is composed of matrix blocks - each of which is M(T|_Wi). Furthermore if m₁ ,m₂ ,... , m_k is the minimal polynomial for T restricted to W₁ ,W₂ ,... ,W_k then the minimal polynomial for T is the lowest common multiple of the polynomials m₁ ,m₂ ,... , m_k.

Prop. Suppose m in F[X] is the mininal polynomial for T.
T is diagonalizable if and only if there are distinct c₁,...,c_m in F where m = (X-c₁)... (X-c_m)
Proof : => Suppose T is diagonalizable and T has eigenvalues c₁,...,c_m. By the preceeding Proposition, for each c₁,...,c_m , each (X-c₁),...,(X-c_m) is a factor of the minimal polynomial, and by our previous work, S=(X-c₁)*...*(X-c_m) is in Ann(T) so m = S.

<=[ Modified from proof in Hoffman and Kunze- Linear algebra 2nd Ed]
Suppose there are distinct c₁,...,c_m in F where m = (X-c₁)*...*(X-c_m).
Let W be the subspace spanned by all the characteristic vectors of T. So W = W₁ +W₂ +...+ W_m where W_k = Null(T-c_k).

We will show that V = W indirectly. Suppose V is not W.

Lemma: There is a vector v* not in W and a characteristic value c_j where w*=(T-c_j Id)(v*) is in W. [Proof is below.]

Now express w* in terms of vectors in W_k.
Then for any polynomial h,

h(T)(w*) = h(c₁)w₁ + ... +h(c_k) w_k.
Now m = (X-c_j )q and q-q(c_j ) = (X-c_j )h.

THUS...

q(T)(v*)-q(c_j )(v*) = h(T)(T-c_j Id)(v*)= h(T)(w*) which is in W!

BUT 0 = m(T)(v*)=(T-c_j Id)(qT)(v*).... so q(T)(v*) is in W.

Thus q(c_j )(v*)=q(T)(v*)- h(T)(w*) is in W.

But we assumed v* is not in W, so q(c_j ) = 0. So the factor (X-c_j ) appeared twice in m! A Contradiction!

11-5
Proof of Lemma: We must find a vector v* not in W and a characteristic value c_j where w*=(T-c_j Id)(v*) is in W.

Suppose b is in V but not in W.
Consider C={ all polynomials f where f (T) (b) is in W}.
[There are some non-trivial elements of C since m(T)(b) = 0, so m is in C.]
Of all the elements of C, there is a unique non-zero monic polynomial of least degree which we will call g. [Proof left as exercise for Friday]
Then g is a factor of m. [Proof left as exercise for Friday.] Since b is not in W, g is not a constant, and so deg( g ) >0.
Since we know all the factors of m, for some c_j ,

(X- c_j ) is a factor of g.

So g= (X- c_j ) * h, and because g was of minimal degree for polynomials where f (T) (b) is in W,

h(T)(b)=v* is not in W.

But w* = g(T)(b) = (T-c_j Id)h(T)(b) = (T-c_j Id)(v*) is in W.
End of lemma's proof.

Remark: every polynomial is the minimal polynomial for some linear operator: STILL To be filled in here.
Example: f = (X-2)(X-1)^2. = (X-2) (X^2 -2X +1) = X^3 -4x^2 ...
Define

Nilpotent Operators.

Now what about operators where the minimal polynomial splits into powers of linears? or where the minimal polynomial has non-linear irreducible factors?

First Consider the case of powers of linear factors.
The simplest is just a power of X. (or (X-c)).

Example: D: P₃(R) -> P₃(R), the derivative. Then the minimal polynomial for D is X⁴.

Definition: In general, an operator N is called nilpotent if for some k>0, N^k =0. The smallest such k is called the index of nilpotency for N, and if k is the index of nilpotency for N, then the minimal polynomial for N is X^k .

11-7

Proposition: If N is nilpotent of index k and dim V = k, then there is a basis for V , (b₁,b₂,...b_k) where N(b_k)= 0 and N(b_i) = b_i+1 for all i <k.
Proof: Since the minimal polynomial for N is X^k, there is some vector v* in V where N^k-1 (v*) is not zero but N^k (v*)=0. Let b₁ = v* and b₂ = N(b₁), b₃ = N(b₂), ...,b_k = N(b_k-1).
Then clearly N(b_k )= N^k (v*)=0. It suffices to show that (b₁,b₂,...b_k) is linearly independent. Suppose a₁b₁ + a₂b₂+...+ a_kb_k= 0. Then a₁v* + a₂N(v*)+...+ a_kN^k-1 (v*)= 0. Now apply N to obtain N^k-1(a₁b₁ + a₂b₂+...+ a_kb_k)= 0 or a₁N^k-1b₁ + a₂N^k-1b₂+...+ a_kN^k-1b_k= 0. But N^k-1b_j =0 for j>1, so a₁N^k-1b₁=0. But N^k-1 (v*) is not zero, so a₁= 0. Now by using N^k-2(a₂b₂+...+ a_kb_k)= 0 a similar analysis shows that a₂ =0. Continuing we can show that a₃ =0, ..., a_k-1
= 0. But that leaves a_k b_k = 0 and thus a_k = 0, so (b₁,b₂,...b_k) is linearly independent.

Alternatively: Let f = (a₁, a₂,...,a_k) the polynomial of degree k-1 with a₁, a₂,...,a_kfor coefficients. If f is not the zero polynomial and f(N)(v*)=0 then f must be a factor of X^k . But the assumption is that N^k-1 (v*) is not zero, so f must be the zero polynomial and all of the coefficients are 0. Hence, (b₁,b₂,...b_k) is linearly independent.

Example: Find the basis for D: P₃(R) -> P₃(R).
Comments:

Using the basis (b₁,b₂,...b_k) the matrix for N, M(N) has 0 everywhere except for 1's that are below the main diagonal.
If T is an operator on a vector space V with dim V= k and the minimal polynomial for T is (X-c)^k .
Then let N = T-cId and so N^k =0. So N is nilpotent with index of nilpotency k. Using the basis for V determined in the last proposition we have M(T-cId ) = M(T) - cM(Id), So the matrix of T , M(T) = M(T-cId) + cM(Id). This matrix is called the Jordan Block matrix of dimension k for the characteristic value c. J(c;k)
11-10
The BIG Picture.
Theorem (Jordan Canonical Form): Suppose T is a linear operator on V and m is the minimal polynomial for T.
If m = (X-c₁)^p1... (X-c_m)^pm then
there are T- invariant subspaces W1,....,Wq with V = W1#... #Wq. Furthermore the minimal polynomial for T restricted to each subspace Wi is of the form (X-c_j)^riwhere dim(Wi) =ri and ri=<pj and ri= pj for at least one i and each j.
Thus there is a basis for V so that using that basis, M(T) is a block matrix where each block has the form of the J(c_j ; ri). This matrix is called the Jordan Canonical Form for T.
Fill in examples of possible matrices for T when m = (X- 2)³(X+3)²X³ (X-1)^{where dim (V) =12. Discuss uniqueness.}
By the Fundamental Theorem of Algebra, If V is a finite dimensional vector space over C, then any linear operator on V has a basis for which M(T) is a Jordan Canonical Form.
Corollary: For T a linear operator on V, a finite dimensional vs over R or C, the degree of the minimal polynomial for T is not greater than the dimension of V.
Proof: By the Jordan Theorem, the degree of m is p1 + p2+ ....+ pm which is no greater than the r1+r2+ ... +rq = dim(V).
11-12
Application to powers of matrices:
Suppose A is a n by n matrix with complex coefficients. Then there is a linear operator T on Cⁿ that has A for it's matrix using the standard basis for Cⁿ. By the Jordan Theorem, there is a basis for Cⁿ where M(T)=J is in Jordan form. Thus there is an inverible matrix S where A =S-1 J S.
Now consider Aⁿ = (S^-1 J S )ⁿ= S^-1J ⁿ S.
So we can determine the behavior of Aⁿ by studying powers of the Jordan blocks.
Do an example with J( 1/2;3), J( 1;3), J(2;3) , and J( i ;3).
What conclusions can we infer about the convergence of Aⁿ for large n based on the eigenvalues of A?
If any eigenvalue ,c, has |c|>1 then the sequence diverges. If |c| = 1 and c is not 1, then the sequence will not converge, c= 1 and the block is J(1;k) with k > 1, then the sequence diverges.
Otherwise, the sequence will converge!
Look at J(c;3) and J(c;4) to see with examples... using technology.
Application to Markov Chains:
A markov chain consist of a finite number, n, of states and a matrix T where T(i,j) = the probability that something currently in state j will move to state i after one period of transition. Thus 0 <= T(i,j) <= 1 for all 1 <= i,j <= n. It is assumed the only possible changes of state from j to i are listed, so T(1,j) + T(2,j) +... +T(n,j) = 1 for any j. If v = (v₁, v₂, ... ,v_n) gives the initial distribution of objects in the various states, then T(v) is the likely distribution of objects after one period of transition.
T(v) _i = T(i,1)v₁ + T(i,2)v₂ +... +T(i,n)v_n , so T gives a linear operator on Rⁿand Cⁿ. Notice that using the standard basis, M(T) = T. If we assume that the probabilities of T remain the same for every period of transition, then T is called a Markov transition matrix. Given the initial distribution v, the distribution after k periods of transition is T(...(T(v))...) = T^k(v). Treating T as a linear operator on Cⁿ, there is a basis for Cⁿ that is composed of Jordan blocks. Thus, the question of what happens to the distribution between the states in the long run is determined by the eigen values of T. T is called a regular Markov process, if for some k, all the entries of T^kare positive.
DO example to illustrate how this works on graph.
Proof on Monday 11-17 for the following
Theorem: If T is a regular Markov process, then 1 is an eigenvalue for T and all other eigenvalues of T have magnitude less than 1. Furthermore: Dim Null(T-Id)=1 and the power of X-1 in the minimal polynomial of T is 1.
Corollary: For k large, T^k(v) is close to the unique vector v*in Cⁿwhere

T(v*)=v* and v₁ + v₂ +... +v_n =v*₁ + v*₂ +... +v*_n.

Illustrate this with example... and technology.
Use the result to find long run for a 5 state markov process using graph of transitions and technology to find the limit as the solution to T(v*)=v* or (T-Id)(v*)=0.
Proof (outline ..based in part on Friedberg, Insel, Spence): Suppose T is a regular Markov process and let J be the Jordan matrix for T.
Notice: If c is an eigenvalue for T, then c is an eigenvalue for the transpose of T, T^t.
[ f (T)^t = f(T^t), so T and T^t have the same minimal polynomial, hence the same eigenvalues. Write up the details of this as an exercise!]
Lemma 1: If c is an eigenvalue for T, then |c| ≤ 1.
Proof of Lemma 1: Consider c as an eigenvalue for T^t. Suppose x be an eigenvector with eigenvalue c for T^t.
Then T^t(x)_i=T(1,i)x₁ + T(2,i)x₂ +... +T(n,i)x_n = cx_i for each i. Let b = Max{|x₁|,|x₂|,...,|x_n|}
Then for some k,

|c| b = |c x_k| = |T(1,k)x₁ + T(2,k)x₂ +... +T(n,k)x_n|
≤|T(1,k)x₁|+ |T(2,k)x₂ |+... +|T(n,k)x_n|
≤|T(1,k)||x₁| + |T(2,k)||x₂| +... +|T(n,k)||x_n|
≤|T(1,k)|b + |T(2,k)|b +... +|T(n,k)|b
≤ [|T(1,k)| + |T(2,k)| +... +|T(n,k)|]b = b
since T(1,j) + T(2,j) +... +T(n,j) = 1 and 0≤ T(i,j).
Lemma 2: 1 is an eigenvalue for T.
Proof: 1 an eigenvalue for T^t: T(1,j)1 + T(2,j)1 +... +T(n,j)1 = 1 so T^t has an eigenvector of (1,...,1) = u.
Lemma 3: J has single blocks for c=1, of the form J(1;1).
Powers of T have real number entries that are never larger than 1. [Prove as exercise.] But if there is a Jordan block in J with J(1;k) with k>1, the powers will get much larger than 1.
Start here on 11-19.
Lemma 4: If c is an eigenvalue for T with |c|=1, then c = 1 and Dim(Null(T-Id))= 1.
Proof: Since T is a regular Markov process, we may assume that all entries of T are positive.
First, since |c| = 1, all the previous inequalities in Lemma 1 are actually equalities for any eigenvector x for the eigenvalue c.

b =|c| b = |c x_k| = |T(1,k)x₁ + T(2,k)x₂ +... +T(n,k)x_n|
=|T(1,k)x₁|+ |T(2,k)x₂ |+... +|T(n,k)x_n| (*)
=|T(1,k)||x₁| + |T(2,k)||x₂| +... +|T(n,k)||x_n|
=|T(1,k)|b + |T(2,k)|b +... +|T(n,k)|b (**)
=[|T(1,k)| + |T(2,k)| +... +|T(n,k)|]b = b .
And we already have that T(1,k) + T(2,k) +... +T(n,k) = 1
Notice that for any complex numbers a and b, |a+b| = |a| + |b| only if a and b are "colinear", that is, there is some complex number z with |z| = 1 and real numbers r and s where a = rz and b = sz.
Using this fact for (*) there must be some complex number z* with |z*| = 1 and real numbers r_j where

r_j z* =T(j,k)x_jfor all j.
From (**) , since b > 0, either T(j,k) = 0 or |x_j|=b, but the assumption of regularity is that T(j,k)>0, so |x_j|= b > 0 for all j. But then r_j z*/T(j,k)=x_jso
b =|x_j| = |r_j z* /T(j,k)| = r_j |z*| /T(j,k) = r_j/T(j,k) for all j. Thus b z* =x_jfor all j.

SO.... x = bz*(1,1,...,1) and c = 1. EOP
Finish Proof of Theorem and corollary.

Do Example. We looked at a 3 by 3 example of a regular Markov chain matrix modelling the movement of rental cars between SF, LA, and Vegas. We assumed there are 400 cars in the fleet. To find the v* suggested by the Theorem, we need to solve Tv* = v* and 400=v*₁ + v*₂ +v*₃ . This is equivalent to solving the system of equations (T-I)v* = 0 with 400=v*₁ + v*₂ +v*₃ . Since the columns of T add to 1, the matrix T-I has linearly dependent rows (they add to 0). replacing the final row with 1 1 1 and 0 with (0,0,400) the equations had a unique solution, 400(3/11,6/11, 2/11). Obviously, if the total number of cars were N, then the result would be N(3/11,6/11, 2/11). The proportions are fixed then by the characteristic vector (3/11,6/11, 2/11). Furthermore, this result is independent of the starting distribution, so if all cars started in SF, the result would be the same, and similarly for LA or Vegas. So for n large Tⁿ is approximately the matrix with each column equal to the transpose of (3/11,6/11, 2/11).
Notice that T ⁿ -> T* where each column of T* is v* with v*₁+... + v*_n = 1

12-1
Inner Products: Unit vectors ("normal"). Orthogonal vectors.
V... finite dimensional i.p.s. Notation: If w = x+ yi then use ~w to denote the conjugate of w, so ~w = x - yi.
Orthonormal basis: ||v_j||=<v_k,v_k> = 1. <v_k,v_j> = 0 for k and j different.
Suppose B = (v₁,v₂ ... v_n) is an orthonormal basis for Fⁿ. Let A be the matrix that has each row determined as the coordinates of the basis B in terms of the standard basis. Let A* be the matrix that is the conjugate of the transpose of A. Then AA* = I.
If B = (v₁,v₂ ... v_n) is an orthonormal basis for V then v = sum <v,v_j>v_j .
Proof: v = Σ a_jv_j so <v,v_i> = <Σ a_jv_j , v_i> = Σ a_j <v_j,v_i> = a_i .
Proposition: If S is an orthonormal set of vectors then S is linearly independent.
Proof: If 0= Σ a_jv_j then 0=<0,v_i> = <Σ a_jv_j , v_i> = Σ a_j <v_j,v_i> = a_i for each i, so the vectors are linearly independent..
How to find an orthonormal basis for a finite dimensional vector space.
Convert a basis C= (w₁,w₂ ... w_n). [Called the Gram-Schmidt process]: Start: v₁= w₁ / ||w₁||.

Next: Let z₂= w₂ - <w₂,v₁>v₁, so <z₂,v₁ > =0. Now let v₂= z₂ / ||z₂||. Then <v₁,v₂ > = 0 and Span(v₁, v₂ ) = Span (w₁,w₂ ).
Continue....Let z₃= w₃ - (<w₃,v₁>v₁ + <w₃,v₂>v₂), so <z₃,v₁ >=<z₃,v₂> =0. Now let v₃= z₃ / ||z₃||. Then <v₃,v₁ >=<v₃,v₂ > = 0 and Span(v₁,v₂,v₃>) = Span (w₁,w₂ ,w₃). and so on.... until we are done.
An Alternative: v₃= w₃ - (<w₃,v₁>/ ||v₁||² v₁ + <w₃,v₂>/||v₂||² v₂) , etc.
Do an example in R³; discuss the geometry of this.
12-3
Miscellaneous topics:
Orthogonal Linear Operators:
Def. T:V-> V in L(V) is called an orthogonal if ||T(v)|| = ||v|| for all v in V.
Relation to Inner Products, angles:
Prop: T is orthogonal if and only if <T(v),T(w)> =<v,w> for all v and w in V.
An Orthonormal operator is 1:1 and "preserves the angles between vectors", in particular it transforms a set of orthonormal vectors to a set of orthonormal vectors.
An Orthonormal operator will transform an orthonormal basis to an orthonormal basis.
O(n): If V = Rn with the usual inner product, then O(n) = { T in L(V): T is orthonormal}
Prop: If T and S are in O(n) then , so TS and T^-1 are also in O(n).
Comment: Since Multiplication of operators is associative, and clearly, Id is in O(n), O(n) forms a "group".... called the "orthogonal group."
12-5
Relation to matrices: A square matrix defines a linear operator by multiplication. If T is the operator from the matrix M then using the standard basis, M(T) = M. Then T is orthogonal if and only if M^t*M = I. Thus O(n) = {n by n matrices M | M^t*M = I}
The determinant: An axiomatic approach with multilinearity.

Definition: V a fdvs over F. A Function T: Vⁿ -> F is called a multilinear function on V if
T(v₁,..., v_n) is a linear function for variable vector, i.e. for all i,
T(v₁,. ,av_i+v'_i,.., v_n)= aT(v₁,.., v_i,..., v_n) + T(v₁,..,v'_i,..., v_n)
Prop.: If T is multilinear then T(v₁,...,0,..., v_n)=0.
T is called alternating if
T(v₁,...,v_i ,..., v_j, ..., v_n) = -T(v₁,... ,v_j,...,v_i,..., v_n)
Prop: If 1+1 is not 0, then T is alternating if and only if T(v₁,..,v_i ,...,v_i,.., v_n)= 0
Assume for the remainder of this topic that 1+1 is not 0 in F.
12-8
If T is alternating and multilinear then (v₁,. ,v_i,...,av_i+v_j,.., v_n)= T(v₁,..,v_i ,...,v_j,..., v_n)
Theorem: There is a unique alternating multilinear function D on Fⁿ satisfying the property that D(e1,...,en) = 1.
Furthermore when considered as a function on the n by n matrices with coefficients in F,
D(M) = 0 if and only if M is singular.
Start of Proof: Every square matrix is row equivalent to an upper triangular matrix. [Math 241.]
Lemma: If M is an upper triangular matrix, then D(M) is the product of the entries along the main diagonal.
Proof:Let the product be denoted p. Then D(M) = pD(M') where M' is upper triangular with 1's along the main diagonal. Then since D is alternating and multilinear, D(M') = D(I) = 1.
D(MP) = D(M)D(P) ..... Proof : If D(P) = 0 then P is singular and so is MP, so D(MP) = D(M)*D(P).
Suppose D(P) is not 0. Let D* be defined on M by D*(M) =1/D(P)* D(MP). Show D* is multilinear, alternating and D*(I)=1.
Cor. If M is invertible with inverse P then D(P) = 1/D(M).
Cor. If T is a linear operator on a fdvs / F then D(T) = D(M(T)) is well defined using any basis.

12-10

Prop: If N is the transpose of M, then D(M)=D(N).
Proof: Let D* be defined on M by D*(M) = D(M^t). Then show D* is multilinear, alternating and D*(I)=1.
Cor. : If M is in O(n), then D(M)= +1 or - 1.
Construction of D using permutations.
What is a permutation?
What is the sign of a permutation?
If p is a permutation and t is a transposition then sign(tp) = - sign(p)
Definition. D(M) = Σ sign(p) M_1,p(1) * M_2,p(2) * ...*M_n,p(n)
D is multilinear, alternating and D(I) = 1.
Applications of deteminants:
D is a multivariable polynomial on the n² entries of a matrix.
The collection of non-invertible matrices correspond to points in an n²dimensional euclidean space where a single polynomial equation D(....) = 0. These matrices form a "hypersurface" in n² euclidean space.
Thus the invertible matrices are an n² dimensional .."manifold"... in n² euclidean space, composed of all those points not on the hypersurface of non-invertible matrices.
These matrices are also a group under multiplication... in fact with a little more exploration of the properties it can be shown that the multiplication is a continuous (differentiable) function and so this group (called the "general linear group" GL(n) ) is a special type of group called a Lie Group. [Pronounced "Lee" named after Sophus Lie.]
Area ,Volume, and generalized volume.
The rows of M span P= {v in V where v is a linear combination of the row vectors of R with coefficients in [0,1]}, a parallelogram, parallelpiped, ... in R², R³, .....
Then |D(M)| = area P or volume of P...
Geometry of the row operations and its effect on area and volume, etc
The cross product in R³ and the determinant. Notice that the definition of D makes sense as long as the multiplication does, so we can use one row of vectors and still compute D. Use (e1,e2, e3) for the first row and we can define v× w using the matrix with v and w in the second and third rows.
Generalized "cross product" of n-1 vectors in Rⁿ.
What about eigenvalues, eigenvectors and determinants?
If c is an eigenvalue then A- cI or T-cId is singular.
Thus c is an eigenvalue for A (or T) if and only if D(A-cI) = 0 or D(T-cId) = 0
The characteristic polynomial! Notice that the definition of D make sense as long as the terms in the matrix can be multiplied. So this will work for polynomials in F[X]. In particular, D(A-XI) makes sense and is a polynomial of degree n. This polynomial is called the characteristic polynomial of A or T. Sometimes this uses the letter χ.
The characteristic values of A or T are precisely the roots (zeroes) of its characteristic polynomial.
Theorem: ( Cayley -Hamilton for C or R) For any A (or T) , χ(A) = 0.
Proof: For Complex matrices. Use the Jordan Theorem and the Jordan matrix. Then the minimal polynomial of A is a factor of D(A-XI). IRMC! For a matrix with real entries the result follows by considering the matrix as a matrix with complex entries.
Comment: The minimal polynomial is a factor of the characteristic polynomial and has exactly the same roots-- which makes it a little easier to find the minimal polynomial.
Upper triangular representations. The following gives an example of a proof by mathematical induction on the dimension of a vector space. The proof of the Jordan theorem is another result that usually is proven by some form of mathematical induction.

Theorem 5.13 Vfdvs / C. T in L(V) then V has a basis B=(v₁,..., v_n) where T(v_k) is in Span(v₁,..., v_k). Thus M(T) is a matrix with M(T)_ij = 0 for j<= i.

Proof: By induction on the dimension of V. When n = 1, the statement is true obviously. (!)

Now assume that dim(V) = n > 1 and the statement is true for all vector spaces of dimension less than n.
Since V isa complex vector space, T has an eigenvalue c. Consider S = T -cId. Then dim(Null(S))>0 making dim(R(S)) = m <n. Let U = Range(S) and consider T(u) where u is in U. T(u) = T(u) - cId(u) + cu= (T-cId) u + cu and clearly (T-cId) (u) is in U and cu is in U so T(u) is in U. Now we can apply the induction hypothesis to U to find a basis B'= (u₁,..., u_m) for U so that T(uk) is in Span(u₁,..., u_m). Extend B' to a basis for V, B =(u₁,..., u_m, v₁,...v_n-m) . Then T(v_k) = T(v_k) - cId(v_k) + cv_k= (T-cId) v_k + cv_k . But (T-cId)v_k is in U so T(v_k) is in the span(u₁,..., u_m ,v₁,...v_k).

Cor. (5.16)V fdvs/ C, t in L(V). T is invertible iff all entries of the diagonal of an upper triangular matrix representing T are non-zero.