Math 210: Day By Day, Part 2

Math 210, Spring 2001: Day By Day Part 2

A day by day outline of some topics what we cover.

These are not intended to be balanced or comprehensive notes. They are to supply motivation and fill gaps. Thus, standard material easily available in traditional texts will be skipped, while brief (quick and dirty) notes about computer or Internet issues may be here.
In principle, the Wednesday classes will involve more computer issues. But real life is never that tidy.

Contents

Week of Jan. 16	Week of Jan. 23	Week of Jan. 30	Week of Feb. 6
Week of Feb. 13	Week of Feb. 20	Week of Feb. 27	Week of March 6
Week of March 20	Week of March 27	Week of April 3	Week of April 10
Week of April 17	Week of April 24

Tues. Feb. 13

(Markov Chains (continued)

Wed. Feb. 14

Computer Type Faces. We have seen F page 1and F page 2 how to use matrices to change the appearance of the letter F. One can stretch, slant, rotate, or reflect each letter. Computers regularly implement this to modify a type face. See the article New Font Technology

Rigid Motions. Rigid motions are perhaps the simplest transformations in graphics. A rigid motion F(X) simply preserves the distance between any two points:

|| F(X) - F(Y) || = || X - Y ||, where we use Euclidean geometry to measure distance. Three familiar examples of rigid motions in the plane are
translations, such as (x₁, x₂) --> (x₁ - 3, x₂ + 1), which translates every point by (-3, 1).
rotations, such as the +90 degree rotation: (x₁, x₂) --> (-x₂, x₁).
reflections, such as across the horizontal axis: (x₁, x₂) --> (x₁, -x₂).

Thurs. Feb. 15

The transition matrices M = (a_ij) that arise in Markov chains have the properties that that i) a_ij >= 0 and the sum of each column is 1. Since each column of M - I sums to zero one sees that the homogeneous equations (M - I)P = 0 have a solution P, where P is not the zero vector. This means that 1 is an eigenvalue of M with corresponding eigenvector P. While there may be other eigenvectors with eigenvalue 1 (a primitive case is for M = I, when every vector is an eigenvector with eigenvalue 1), there is at least one eigenvector whose elements are >= 0 and at least one element is positive. By multiplying this eigenvector by a constant, we obtain a probability eigenvector P₁.

In the notes Markov Chains. we show that for these transition all of the eigenvalues of M satisfy |eigenvalue| <= 1. Moreover, if all the elements of M are strictly positive, then all the eigenvectors other than those of the form cP₁ satisfy |eigenvalue| < 1 and the matrices M^k converge as k gets large.
This convergence is easy to prove if one can diagonalize M.

back to table of contents

Tues. Feb. 20

One of last week's homework problems was to show that linear maps take parallel lines to parallel lines. This is one way to show this. We start by thinking of a straight line as describing the motion of a particle P(t) at time t, assuming it travels with constant (non-zero) velocity V and begins at X₁. Then P(t) = X₁ + Vt. A parallel Q(t) line passing through X₁ can be written as Q(t) = X₂ + Vt. We look at their images LP(t) and LQ(t). Now LP(t) = L(X₁ + Vt) = L(X₁) + (LV)t = Y₁ + Wt, where we let Y₁ = L(X₁ is the image of X₁ and W = LV is the image of V. In particular, this shows that under a linear map L the image of a straight line is another straight line. Similarly, LQ(t) = L(X₂ + Vt) = L(X₂) + (LV)t = Y₂ + Wt, where Y₂ = L(X₁ is the image of X₂. These formulas LP(t) = Y₁ + Wt and LQ(t) = Y₂ + Wt and show that the lines LP(t) and LQ(t) are parallel.

Leonteif input-output matrix This is an important application in economics. Here one also meets matrices all of whose elements are non-negative. See the printed lecture notes from the book by Strang that I distributed in class (it also has more on Markov chains).

Data. We begin a new segment of the course discussing some aspects of data. The first example is Sex Bias in Graduate Admissions? . It illustrates Simpson's paradox, which arises when summarizing data too hastily. The key question to ask is, "does the summary weigh the component data adequately?".

Wed. Feb. 21

After discussing some of the homework problems, we thought More on Rigid Motions. We observed that any rigid motion F(X) can be written as a translation plus a motion that leaves the origin fixed: F(X) = X₀ + G(X), where X₀ = F(0) is the image of the origin and G(X) is a rigid motion leaving the origin fixed. Since the translation part is a rigid motion is obvious, we need only study rigid motions G(x) leaving the origin fixed.
Some obvious examples of rigid motions from the plane R² to itself that leave the origin fixed are rotations and reflections. In fact, these are the only possibilities. For maps of R² to itself (and also in higher dimensions), it is less obvious what a "rotation" means. [continued next Wednesday].

Thurs. Feb. 22

Projections: P² = P. This is an alternate approach to that done in class. Since P is not 0, there is a vector X so that V := PX is not zero. But then PV = P²X = PX = V, as desired. Similarly, since P is not the identity, there is a vector Y so that PY is not Y. Let U := Y - PY. Then U is not zero, but PU = PY -P²Y = PY - PY = 0.
We can write any vector X as X = PX + (I-P)X. Let W = PX and Z = (I-P)X. Clearly PW =P²X = PX = W and PZ = P(I-P)X =(P - P²)X = 0.

Rat in a two room maze. Say that Room 1 opens to the outside, and if a rat enters that room, it simply leaves the maze. If the rat is in Room 2, the door to Room 1 opens every hour and there is a 70% chance that the rat will stay Room 2. If the rat starts in Room 2, what is the expected waiting time until it goes to Room 1?
To solve this, for simplicity let p = .70. We compute the expected time spent in Room 2. Since the rat starts in Room 2, then it certainly spends the first hour there, but there is only a probability of p that it spends the second hour in Room 2. Similarly there is a probability of p² that it spends the third hour there, etc. Consequently, the expected time T spent in Room 2 is

T = 1 + p + p² + p³ + ... This is a geometric series with ratio p = .7, so its sum is T = 1/(1-p) = 10/3 = 3 1/3. Thus the average waiting time gets to Room 1 (and freedom) is T = 3 1/3 hours.
There is another way to reach the same conclusion. Again, let T be the expected waiting time to go to Room 1. The rat starts in Room 2. With probability p the rat will be in Room 2 for the second hour. But then the process is back at its starting point and the rat will wait T additional hours before entering Room 1. Thus T = 1 + pT. Solving this for T we have T = 1/(1-p) = 10/3.

Data (continued) Say at times t₁, t₂,...,t₁₃ we get the 13 pieces of data q₁, q₂,...,q₁₃. To understand the data we plot it -- and observe that it looks like it approximately fits a straight line q = at + b. How can we find the values of the coefficients a and b that best fit this data? A standard approach is to use the method of least squares. For this we measure the error

E(a,b) := (at₁ + b - q₁)² + (at₂ + b - q₂)² + ... + (at₁₃ + b - q₁₃)². Now find a and b that minimize E(a,b), say by using calculus.

back to table of contents

Tues. Feb. 27

Least Squares. These notes use two approaches, calculus and orthogonal projections, to investigate least squares. Notes on Vectors and Least Squares

Wed. Feb. 28

Rigid Motions (continued). Today we studied rigid motions G(X) that leave the origin fixed, so ||G(X) - G(Y)|| = ||X - Y|| for all X and Y, and G(0) = 0. The following properties are not difficult to prove, although they do take some effort:

G(X) = RX for some matrix R.
||RX - RY|| = ||X - Y|| for all X and Y (obvious from the Property of G(X)).
||RX|| = ||X || (the special case Y = 0 from above).
<RX, RY>= <X, Y> (square both sides of Property 2, then use ||Z||² = <Z, Z> and the algebraic properties of the inner product).
R^*R = I, that is, R^* = R^-1. This is proved using Property 4 and the definition of the adjoint, R^*.
The columns of R are unit vectors and they are mutually orthogonal. To show this, one uses, for instance, that the 2nd column of R is just Re₂, where e₂ = (0,1,0,...,0). Then use Property 4 where X and Y are the vectors e₁, e₂, etc.
If c is an eigenvalue of R, then |c| = 1. (Proof: Say RV = cV for come constant c and some non-zero vector V. Then by Property 3, ||V|| = ||RV|| = ||cV|| = |c| ||V||, so |c| = 1.)
det(R) = +1 or -1. (Proof: We use that for square matrices A, B then det(AB) = det(A) det(B) and det(A^*) = det(A). Then 1 = det(I) = det(R^*R) = det(R^*) det(R) = [det(R)]². Thus det (R) = +1 or -1.)

These matrices R describing rigid motions leaving the origin fixed are called orthogonal matrices.

We show that in 2 dimensions, every 2 × 2 orthogonal matrix with det R = +1 is a rotation, so

To begin, by Property 6 the first column of R is a unit vector, it must have the form (a,b) with a² + b² = 1. This agrees with the first column above. Again by Property 6 The second column must be a unit vector perpendicular to the first column, giving either (-b,a) or -(- b, a). This too agrees with what we have, except for the choice of sign in this column. Since det R = +1, we are forced to use the above choice of sign.
If det R = -1, the same holds if we switch the signs in one column, say the last column. Geometrically this just includes a reflection.

In 3 dimensions, by using a preliminary reflection we may assume that det R = +1. Then by a brief argument one shows that +1 is an eigenvalue of R. This implies there is some (non-zero) vector V with RV = V. Thus, this vector is unchanged by R so R only acts in the plane perpendicular to V. Since this plane is two dimensional, by the previous paragraph we conclude that R is simply a rotation of this plane leaving V fixed, so R is a rotation with axis V.

These rigid motions are essential is any computer graphics. As an example, you may find it illuminating to look at the following Maple program that seeks understanding of how the number of hours of daylight on earth depends on the day of the year and on your latitude. The key ingredient is that the earth rotates around the sun roughly every 365 days, and that its axis is tilted from the vertical plane defined by the earth's orbit around the sun. Hours of Daylight (uses Maple).

Thurs. March 1

back to table of contents

Tues. March 6

Expected Value and Standard Deviation
Let X and Y be random variables. Here are some standard properties.

E(X+c) = E(X) + c
E(aX) = aE(X)
E(X+Y) = E(X) + E(Y)
Var(X) = E[(X-E(X)]²) = E(X²) -E(X)²
SD(X) = sqrt[Var(X)]
Var(X+c) = Var(X)
Var(aX) = a²Var(X) so SD(aX) = |a|SD(X).
Var(X+Y) = Var(X) +Var(Y) +2[E(XY) - E(x)E(Y)]
In particular, if X and Y are independent, then E(XY) = E(x)E(Y) so
Var(X+Y) = Var(X) +Var(Y)
As a special case, if X and Y have the same standard deviation =Sd, then Var(X+Y) = 2Sd² and SD(X+Y) = sqrt{2}Sd
More generally, if X₁, ... , X_k are independent and have the same standard deviation = Sd, then,
Var(X₁ + ... + X_k) =kSd² so
SD(X₁ + ... + X_k) =sqrt(k)Sd
Thus by property 7 above, we have the following square root law for the standard deviation of an average of independent variables that have the same standard deviation Sd:
SD((X₁ + ... + X_k)/k) = Sd/sqrt{k}

Example: One die has 6 possible faces. Let the random variable X assign to each throw of the die the number that appears. Then the expected value E(X) = 7/2 = 3.5 and Var(X) = 2.9166. Then SD(X) = sqrt{2.9166} = 1.71.
If you throw two die, let the random variable W be the sum of the numbers, W = X₁ + X₂ of each die. Since we just computed the expected value and variance if the X₁ and X₂ using X in the previous example, we can use the above properties to find them for W: E(W) = E(X₁ + X₂) = 2(7/2) = 7, while similarly, using Property 9 then Var(W) = 2Var(X) = (2.9166) = 5.8332 and SD(W) = sqrt(2)SD(X) = sqrt(5.8332) = 2.42. If we let Y be the average of the sum on these two dice, then Y = (1/2)W so SD(Y)= sqrt(1/2)SD(X) = 1.21.
Example: Say you have a box with only two different numbers, j of them are a"s and k of them are b's. If you take one random number from the box and let the random variable X be the number you get, what is its expected value? What is the standard deviation?
Solution: Since there are j+k numbers, the expected value is m := E(X) = (ja + kb)/(j+k). The variance is
Var(X) = [j(a-m)² + k(b-m) ²]/(j+k).
After substituting the value of m and collecting terms you find that Var(X) =(b-a)²[p(1-p)], where p = j/(j+k) is the percentage of a's in the box = the probability of getting an "a", while 1-p = k/(j+k) is the probability of getting a "b". Consequently,
SD(X) = |b-a|sqrt{p(1-p)}.

Wed. March 7

How does make a moving graphic on a Web page? Click here for an example: Prof DeTurck's Home Page View the page source to get added insight.
The essence is a gif image that consists of several different images "glued" together as several frames of a single graphic.
There are several steps to make such images:
0. Turn off the computer and think through your design. What do you want create? Ideally, what should it look like? Make sketches. This design phase is the most difficult and creative step.
1. Turn on the computer. Use a "paint" program to create each of the separate frames in your design.
2. Use some utility to collect these frames into a single moving image.
3. Use a utility to have this image play in a loop as many times as you wish.
4. Insert your image into a web page.

Thurs. March 8

Mid-Semester Exam. Closed book -- but you may use 1 page with notes. No calculators.

back to table of contents

Tues. March 20

We discussed some examples. These are from the book Statistics by Freedman, Pisani, and Purves, Chapters 26 and 28.

Wed. March 21

We continued the discussion of March 7 and as an example made a web page with an animated graphic. Login to your account on johnny.sas.upenn.edu and look in the directory /home/~kazdan/html/210/notes-misc/arrows/ for the raw ingredients.
Voting on the Web.How can you arrange for a group of people to vote on something using the Web?

Thurs. March 22

Differential Equations We discussed the simplest equations.
dx/dt = cx
If c > 0 one has exponential growth, while if c < 0 exponential decay. One can measure the constant c experimentally say, by measuring when one has double (or half) the initial amount.
dx/dt = cx(M - x), c>0 and M>0 are constants [logistic equation] Although one can solve this explicitly by "separation of variables" integrating the result using partial fractions, one gets more insight into the qualitative behavior by understand the function f(x) = cx(M - x), plotting y = f(x). Note that the points x_j where f(x) = 0 are equilibrium solutions of the equation.
There are three qualitatively different cases to understand: x(0) <0, 0 < x(0) < M, and x(0) > M.

back to table of contents

continued on next page