- Tues. Feb. 13
- (Markov Chains (continued)
- Wed. Feb. 14
- Computer Type Faces. We have seen
F page 1and F page 2
how to use matrices to change the appearance of the letter F.
One can stretch, slant, rotate, or reflect each letter. Computers
regularly implement this to modify a type face. See the article
New Font Technology
Rigid Motions. Rigid motions are perhaps the simplest
transformations in graphics. A rigid motion F(X) simply preserves the
distance between any two points:
|| F(X) - F(Y) || = || X - Y ||,
where we use Euclidean geometry to measure distance. Three familiar
examples of rigid motions in the plane are
translations, such as (x1, x2) -->
(x1 - 3, x2 + 1), which translates every point by
(-3, 1).
rotations, such as the +90 degree rotation:
(x1, x2) --> (-x2, x1).
reflections, such as across the horizontal axis:
(x1, x2) --> (x1, -x2).
- Thurs. Feb. 15
- The transition matrices M = (aij) that arise in Markov chains
have the properties that that i) aij >= 0 and the sum of each
column is 1. Since each column of M - I sums to zero one sees that the
homogeneous equations (M - I)P = 0 have a solution P, where P is not the
zero vector. This means that 1 is an eigenvalue of M with
corresponding eigenvector P. While there may be other eigenvectors with
eigenvalue 1 (a primitive case is for M = I, when every vector is an
eigenvector with eigenvalue 1), there is at least one eigenvector whose
elements are >= 0 and at least one element is positive. By
multiplying this eigenvector by a constant, we obtain a probability
eigenvector P1.
In the notes
Markov Chains.
we show that for these transition all of the eigenvalues of M satisfy
|eigenvalue| <= 1. Moreover, if all the elements of M are strictly
positive, then all the eigenvectors other than those of the form
cP1 satisfy |eigenvalue| < 1 and the matrices
Mk converge as k gets large.
This convergence is easy to prove if one can diagonalize M.
- Tues. Feb. 20
- One of last week's homework problems was to show that linear maps
take parallel lines to parallel lines. This is one way to show this.
We start by thinking of a straight line as describing the motion of a
particle P(t) at time t, assuming it travels with constant (non-zero)
velocity V and begins at X1. Then
P(t) = X1 + Vt.
A parallel Q(t) line passing through X1 can be written as
Q(t) = X2 + Vt.
We look at their images LP(t) and LQ(t). Now
LP(t) = L(X1 + Vt) = L(X1) + (LV)t
= Y1 + Wt,
where we let Y1 = L(X1 is the image of
X1 and W = LV is the image of V. In particular, this shows
that under a linear map L the image of a straight line is another
straight line. Similarly,
LQ(t) = L(X2 + Vt) = L(X2) + (LV)t
= Y2 + Wt,
where Y2 = L(X1 is the image of X2.
These formulas LP(t) = Y1 + Wt and LQ(t) = Y2 + Wt and
show that the lines LP(t) and LQ(t) are parallel.
Leonteif input-output matrix This is an important application in
economics. Here one also meets matrices all of whose elements are
non-negative. See the printed lecture notes from the book by Strang
that I distributed in class (it also has more on Markov chains).
Data. We begin a new segment of the course discussing some
aspects of data. The first example is
Sex Bias in Graduate Admissions? .
It illustrates Simpson's paradox, which arises when summarizing data
too hastily. The key question to ask is, "does the summary weigh the
component data adequately?".
- Wed. Feb. 21
- After discussing some of the homework problems, we thought
More on Rigid Motions. We observed that any rigid motion F(X) can
be written as a translation plus a motion that leaves the origin fixed:
F(X) = X0 + G(X),
where X0 = F(0) is the image of the origin and G(X) is a
rigid motion leaving the origin fixed. Since the translation part is a
rigid motion is obvious, we need only study rigid motions G(x) leaving
the origin fixed.
Some obvious examples of rigid motions from the plane
R2 to itself that leave the origin fixed are rotations and
reflections. In fact, these are the only possibilities. For maps of
R2 to itself (and also in higher dimensions), it is less
obvious what a "rotation" means. [continued next Wednesday].
- Thurs. Feb. 22
- Projections: P2 = P. This is an alternate
approach to that done in class. Since P is not 0, there is a vector X
so that V := PX is not zero. But then PV = P2X = PX = V,
as desired. Similarly, since P is not the identity, there is a vector
Y so that PY is not Y. Let U := Y - PY. Then U is not zero, but
PU = PY -P2Y = PY - PY = 0.
We can write any vector X
as X = PX + (I-P)X. Let W = PX and Z = (I-P)X. Clearly PW
=P2X = PX = W and PZ = P(I-P)X =(P - P2)X = 0.
Rat in a two room maze.
Say that Room 1 opens to the outside, and if a rat enters that room,
it simply leaves the maze.
If the rat is in Room 2, the door to Room
1 opens every hour and there is a 70% chance that the rat will stay
Room 2. If the rat starts in Room 2, what is the expected waiting
time until it goes to Room 1?
To solve this, for simplicity let p = .70. We compute the expected
time spent in Room 2. Since the rat starts in Room 2, then it
certainly spends the first hour there, but there is only a probability
of p that it spends the second hour in Room 2. Similarly there is a
probability of p2 that it spends the third hour there, etc.
Consequently, the expected time T spent in Room 2 is
T = 1 + p + p2 + p3 + ...
This is a geometric series with ratio p = .7, so its sum is T = 1/(1-p) =
10/3 = 3 1/3. Thus the average waiting time gets to Room 1 (and
freedom) is T = 3 1/3 hours.
There is another way to reach the same conclusion.
Again, let T be the expected waiting time to go to Room 1. The rat
starts in Room 2. With probability p the rat will be in Room 2 for
the second hour. But then the process is back at its starting point
and the rat will wait T additional hours before entering Room 1. Thus
T = 1 + pT.
Solving this for T we have T = 1/(1-p) = 10/3.
Data (continued) Say at times
t1, t2,...,t13 we get the 13 pieces of
data q1, q2,...,q13. To understand
the data we plot it -- and observe that it looks like it approximately
fits a straight line q = at + b. How can we find the values of the
coefficients a and b that best fit this data? A standard approach is
to use the method of least squares. For this we measure the
error
E(a,b) := (at1 + b - q1)2
+ (at2 + b - q2)2
+ ... + (at13 + b - q13)2.
Now find a and b that minimize E(a,b), say by using calculus.
- Tues. Feb. 27
- Least Squares. These notes use two approaches, calculus and
orthogonal projections, to investigate least squares.
Notes on Vectors and Least Squares
- Wed. Feb. 28
- Rigid Motions (continued). Today we studied rigid motions
G(X) that leave the origin fixed, so ||G(X) - G(Y)|| = ||X - Y|| for all
X and Y, and G(0) = 0. The following properties are not difficult to prove,
although they do take some effort:
- G(X) = RX for some matrix R.
- ||RX - RY|| = ||X - Y|| for all X and Y (obvious from the Property
of G(X)).
- ||RX|| = ||X || (the special case Y = 0 from above).
- <RX, RY>= <X, Y> (square both sides of Property 2, then use
||Z||2 = <Z, Z> and the algebraic properties of the inner
product).
- R*R = I, that is, R* = R-1. This
is proved using Property 4 and the definition of the adjoint,
R*.
- The columns of R are unit vectors and they are mutually orthogonal.
To show this, one uses, for instance, that the 2nd column of R is just
Re2, where e2 = (0,1,0,...,0). Then use
Property 4 where X and Y are the vectors e1, e2, etc.
- If c is an eigenvalue of R, then |c| = 1. (Proof: Say RV = cV for
come constant c and some non-zero vector V. Then by Property 3, ||V|| =
||RV|| = ||cV|| = |c| ||V||, so |c| = 1.)
- det(R) = +1 or -1. (Proof: We use that for square matrices A, B
then det(AB) = det(A) det(B) and det(A*) = det(A). Then
1 = det(I) = det(R*R) = det(R*) det(R) =
[det(R)]2. Thus det (R) = +1 or -1.)
These matrices R describing rigid motions leaving the origin fixed are
called orthogonal matrices.
We show that in 2 dimensions, every 2 × 2 orthogonal matrix
with det R = +1 is a rotation, so
To begin, by Property 6 the first column of R is a unit vector, it
must have the form (a,b) with a2 + b2 = 1. This
agrees with the first column above. Again by Property 6 The second
column must be a unit vector perpendicular to the first column, giving
either (-b,a) or -(- b, a). This too agrees with what we have, except
for the choice of sign in this column. Since det R = +1, we are
forced to use the above choice of sign.
If det R = -1, the same holds if we switch the signs in one column,
say the last column. Geometrically this just includes a reflection.
In 3 dimensions, by using a preliminary reflection we may assume that
det R = +1. Then by a brief argument one shows that +1 is an
eigenvalue of R. This implies there is some (non-zero) vector V with
RV = V. Thus, this vector is unchanged by R so R only acts in
the plane perpendicular to V. Since this plane is two dimensional, by
the previous paragraph we conclude that R is simply a rotation of this
plane leaving V fixed, so R is a rotation with axis V.
These rigid motions are essential is any computer graphics. As an
example, you may find it illuminating to look at the following Maple
program that seeks understanding of how the number of hours of daylight
on earth depends on the day of the year and on your latitude. The key
ingredient is that the earth rotates around the sun roughly every 365
days, and that its axis is tilted from the vertical plane
defined by the earth's orbit around the sun.
Hours of Daylight (uses Maple).
- Thurs. March 1
-
- Tues. March 6
- Expected Value and Standard Deviation
Let X and Y be random variables. Here are some standard properties.
- E(X+c) = E(X) + c
- E(aX) = aE(X)
- E(X+Y) = E(X) + E(Y)
- Var(X) = E[(X-E(X)]2) = E(X2) -E(X)2
- SD(X) = sqrt[Var(X)]
- Var(X+c) = Var(X)
- Var(aX) = a2Var(X) so SD(aX) = |a|SD(X).
- Var(X+Y) = Var(X) +Var(Y) +2[E(XY) - E(x)E(Y)]
- In particular, if X and Y are independent, then E(XY) = E(x)E(Y) so
Var(X+Y) = Var(X) +Var(Y)
As a special case, if X and Y have the same standard deviation =Sd,
then Var(X+Y) = 2Sd2 and SD(X+Y) = sqrt{2}Sd
- More generally, if X1, ... , Xk are
independent and have the same standard deviation = Sd, then,
Var(X1 + ... + Xk) =kSd2 so
SD(X1 + ... + Xk) =sqrt(k)Sd
Thus by property 7 above, we have the following square root law for
the standard deviation of an average of independent variables that have
the same standard deviation Sd:
SD((X1 + ... + Xk)/k) = Sd/sqrt{k}
Example: One die has 6 possible faces. Let the random variable X
assign to each throw of the die the number that appears. Then the
expected value E(X) = 7/2 = 3.5 and Var(X) = 2.9166. Then SD(X) =
sqrt{2.9166} = 1.71.
If you throw two die, let the random variable W be the sum of the
numbers, W = X1 + X2 of each die. Since we just
computed the expected value and variance if the X1 and
X2 using X in the previous example, we can use the above
properties to find them for W: E(W) = E(X1 + X2)
= 2(7/2) = 7, while similarly, using Property 9 then Var(W) = 2Var(X) =
(2.9166) = 5.8332 and SD(W) = sqrt(2)SD(X) = sqrt(5.8332) = 2.42.
If we let Y be the average of the sum on these two dice, then Y =
(1/2)W so SD(Y)= sqrt(1/2)SD(X) = 1.21.
Example: Say you have a box with only two different numbers,
j of them are a"s and k of them are b's. If you take one random number
from the box and let the random variable X be the number you get, what is its
expected value? What is the standard deviation?
Solution: Since there are j+k numbers, the expected value is
m := E(X) = (ja + kb)/(j+k). The variance is
Var(X) = [j(a-m)2 + k(b-m) 2]/(j+k).
After substituting the value of m and collecting terms you find
that Var(X) =(b-a)2[p(1-p)], where p = j/(j+k) is the
percentage of a's in the box = the probability of getting an "a", while
1-p = k/(j+k) is the probability of getting a "b". Consequently,
SD(X) = |b-a|sqrt{p(1-p)}.
- Wed. March 7
- How does make a moving graphic on a Web page? Click here for an
example:
Prof DeTurck's Home Page
View the page source to get added insight.
The essence is a gif image that consists of several different images
"glued" together as several frames of a single graphic.
There are several steps to make such images:
0. Turn off the computer and think through your design. What do
you want create? Ideally, what should it look like? Make sketches.
This design phase is the most difficult and creative step.
1. Turn on the computer. Use a "paint" program to create each
of the separate frames in your design.
2. Use some utility to collect these frames into a single moving
image.
3. Use a utility to have this image play in a loop as many times as
you wish.
4. Insert your image into a web page.
- Thurs. March 8
- Mid-Semester Exam. Closed book -- but you may use 1 page with notes.
No calculators.
- Tues. March 20
- We discussed some examples. These are from the book
Statistics by Freedman, Pisani, and Purves, Chapters 26 and 28.
- Wed. March 21
- We continued the discussion of March 7 and as an example made a web
page with an animated graphic. Login to your account on
johnny.sas.upenn.edu and look in the directory
/home/~kazdan/html/210/notes-misc/arrows/ for the raw ingredients.
Voting on the Web.How can you arrange for a group of people
to vote on something using the Web?
- Thurs. March 22
- Differential Equations We discussed the simplest equations.
dx/dt = cx
If c > 0 one has exponential growth, while if c < 0 exponential
decay. One can measure the constant c experimentally say, by
measuring when one has double (or half) the initial amount.
dx/dt = cx(M - x), c>0 and M>0 are constants [logistic equation]
Although one can solve this explicitly by "separation of variables"
integrating the result using partial fractions, one gets more insight
into the qualitative behavior by understand the function f(x) = cx(M - x),
plotting y = f(x). Note that the points xj where f(x) = 0
are equilibrium solutions of the equation.
There are three qualitatively different cases to understand: x(0)
<0, 0 < x(0) < M, and x(0) > M.