St1norm.1  Prof Vinod  Stat 1
Normal dist. word problems. Important hints in doing these right.

Step 1: Begin by drawing one long vertical line and two horizontal lines.
Name the upper horizontal line the x axis and the lower horizontal line the z axis.
Get the bearings on the graph for x:
Upper limit for the bell curve = mean of x + 4 times the std. dev. of x
Lower limit for the bell curve = mean of x - 4 times the std. dev. of x
For example, if the mean is 100 and the standard deviation is 10, the limits are 100-40=60 to 100+40=140.
Normal distribution x~N(mean mu, variance sigma squared) is upstairs, with range (mu MINUS 4*sigma) to (mu PLUS 4*sigma).
Standard Normal z~N(0,1) is downstairs, with range -4 to +4.
Note that in this notation ~ means “is distributed as” and N means Normal; we give the two parameters, mean and variance, in parentheses separated by a comma. This notation is used in the statistical literature.
The bearings for z are: lower limit at -4, center at 0 and upper limit at 4.
Draw two bell-shaped curves. If you can’t draw well, it is ok.
Let us call the two bell-shaped curves the UPSTAIRS / Downstairs graphs.

Step 2: Translate the probability asked into an area under the Normal curve and shade the area on a graph (usually the upstairs graph showing x on the horizontal axis). (If you can’t draw well, it is ok.)

Step 3: Map the problem to the z distribution.
The bearings for z are: lower limit at -4, center at 0 and upper limit at 4.
Shade the desired area under the z curve.

Step 4: Use Tables to find the answer. In using the Normal dist. table remember that:
It is symmetric, centered at zero.
Area is always positive.
(Area to the left of the middle) = (area to the right of the middle) = 0.50
The table only gives the area between 0 and a positive number.
If you want a tail area you must subtract the table area from 0.5.
If you want two equal tails together, you can double one tail.

To get full credit in the exam, you must show the graphs.
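As a way to check hand work on Steps 1, 3 and 4, here is a minimal Python sketch. The function names (plot_limits, z_score, table_area) are made up for this illustration and are not part of the course material; the table area is computed from the error function in Python's standard math module rather than read from the printed table.

```python
import math

def plot_limits(mean, sd, k=4.0):
    """Step 1: bearings for the upstairs (x) graph, mean -/+ k std. devs."""
    return mean - k * sd, mean + k * sd

def z_score(x, mean, sd):
    """Step 3: map a point on the x axis (upstairs) to the z axis (downstairs)."""
    return (x - mean) / sd

def table_area(z):
    """Step 4: area under the standard Normal curve between 0 and |z|,
    i.e. the kind of number a '0 to z' table lists (symmetry handles z < 0)."""
    return 0.5 * math.erf(abs(z) / math.sqrt(2.0))

# Example from Step 1: mean 100, standard deviation 10.
low, high = plot_limits(100, 10)
print("x axis runs from", low, "to", high)

# Tail area to the right of x = 120 (z = 2): subtract the table area from 0.5.
z = z_score(120, 100, 10)
print("z =", z, " table area =", round(table_area(z), 4),
      " right tail =", round(0.5 - table_area(z), 4))
```

The computed areas should agree with a four-decimal table up to rounding in the last digit.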

Normal Distribution word problems
AREAs under the Normal:
Be sure to have a copy of a table of "area of a standard Normal distribution" (inside cover page of your text). Bring a xerox copy to class every day from now on, unless you plan to bring the text. You may not be admitted to class if you do not have the table!

x is a Normally distributed r. v. with mean mu=200 and variance=100.
For example, x = number of dozens of eggs sold in a week.
Find the probability that xL < x < xU, where xL is the lower limit and xU is the upper limit: P( xL < x < xU ).
This probability does not change when we make a z transformation of all terms of the inequality. The probability remains exactly the same. This is what justifies the mapping from upstairs to downstairs.
To solve these problems, draw a picture of the Normal curve for x, show where the mean lies for x, and find xL and xU on the figure.
Compute the std.dev = sq. root of variance = 10 here.
Now z transform:
zL = ( xL - mean ) / std.dev.
zU = ( xU - mean ) / std.dev.

There are several different types of problems possible:

1) Right tail, zL>0, zU is positive infinity.
Find the prob. of more than 220 dozen eggs.
xL=220, xU=+infinity, zL = (220-200)/10 = 20/10 = 2
zL=2, zU=infinity means right tail from 2.
The table gives the area from 0 to a positive number like 2 (=0.4772).
Tail area to the right of 2 is 0.5 minus 0.4772 (Ans=0.0228)

2) Left of a positive zU number: xL = negative infinity, xU = positive number.
Find the prob. of less than 220 dozen eggs.
zL = neg infinity, zU=2, Ans = 0.5 + 0.4772 = 0.9772
Why add the half here? Because the whole left half is included.

3) Both zL=1 and zU=2 are positive.
Find the prob. of selling between xL=210 and xU=220 dozen eggs.
zL = (210-200)/10 = 10/10 = 1. Area from the table between 0 and 1 is 0.3413.
zU = (220-200)/10 = 2. (Table area 0 to 2 is 0.4772 as before.)
Subtract the smaller tabulated value 0.3413 from 0.4772. Ans=0.1359

4) Both zL and zU are negative.
Find the prob. of selling between xL=180 and xU=190 dozen eggs.
zL = (180-200)/10 = (-20/10) = -2, prob. is 0.4772 from the table (use symmetry)
zU = (190-200)/10 = (-10/10) = -1, prob. is 0.3413
Ans as in (3) above: 0.4772 - 0.3413 = 0.1359

5) zL is negative, zU is positive.
Find the prob. of selling between xL=180 and xU=210 dozen eggs.
zL=-2, zU=1. Add the tabulated areas: 0.4772+0.3413 = 0.8185

6) zL is neg infinity and zU is a negative number.
Find the prob. of less than xU=180 dozen eggs sold. zU=-2
We want the left tail area, which equals the right tail area (symmetry).
Right tail is 0.0228 as in (1) above. Ans=0.0228

7) Two tail areas.
Find the prob. of less than 180 or more than 210 dozen eggs.
Find P( x<180 ) + P( x>210 ), that is,
prob(xL = neg infinity, xU=180) plus prob(xL=210, xU = positive infinity).
Here there are two areas to be found, with two separate xL and xU.
zU is -2, left tail is 0.5-0.4772 = 0.0228
zL=1, right tail is 0.5-0.3413 = 0.1587
The answer is 0.0228+0.1587 = 0.1815

8) Similar to (5), zL is negative and zU is positive, but the absolute magnitudes (irrespective of sign) are the same.
The complication arises because the mean and std.dev are unknown numbers! No matter though. We can still answer the question.
Find the prob. of selling within one standard deviation of the mean.
zL=-1, zU=1, Ans = 0.3413 + 0.3413 = 0.6826
Another question: find the prob. of selling within 1.96 standard deviations of the mean.
zL=-1.96 and zU=1.96. Area from the table (0 to 1.96) is 0.4750.
Ans = 0.4750+0.4750 = 0.95

9) Negation of (8).
Find the prob. of not selling within 1.96 std. deviations of the mean.
Ans = 1 - 0.95 = 0.05 = two tail areas.
Remember the key phrase "within so many standard deviations of the mean".

SEC. 2: SAMPLING DIST OF MEANS PROBLEMS
Here the random variable is xbar (or mean) ~ N(mu, sigma squared/n).
Thus the sampling distribution has the additional wrinkle that the standard deviation of the mean is sigma divided by the square root of n.
Otherwise these are very similar to the word problems for the Normal.
What is a sampling distribution?
It is a probability distribution of a statistic like the mean, variance, standard deviation, range, median, etc.
Recall that things we calculate from a sample are called statistics.
A sampling distribution is defined over all possible values of the statistic computed from all possible samples.
If the population size is N and the sample size is n, there are NCn or "N choose n" ways of selecting the sample.
E.g. N=5, n=2, NCn = 5C2 = 5!/(2!*3!) = 10
These are all the possible ways of selecting a sample of size 2 from a population of size 5.
They make up the sample space S over which the sampling distribution is defined.
The sampling distribution is somewhat hard to get exactly.
Fortunately great statisticians have proved that if x is Normal (mean mu, variance sigma^2) then the sample mean xbar is also Normal, with the same mean and variance sigma^2/n; hence the standard deviation of xbar is sigma/(square root of n).

EXAMPLE
Average score is unknown (mu is unknown), standard dev. is 130, and a sample of 460 is obtained (n=460).
sd(xbar) = sd/square-root(n) = 130/sqrt(460) = 6.06
Find the prob. that the sample mean will “differ” from the unknown population mean (mu) by less than 12.
Here we use the absolute value | | to recognize that it can “differ” by being smaller or by being larger.
P( | xbar MINUS mu | < 12 ) = ?
z = ((mu+12) - mu) / 6.06 = 12/6.06 = 1.98
(mu cancels, so the answer does not depend on the unknown mean.)
Prob. 0 to 1.98 is 0.4761; prob. -1.98 to zero is also 0.4761.
Ans = 0.4761 + 0.4761 = 0.9522

SEC. 3: Approximating the Binomial dist by the Normal
Think of the integers 0, 1, 2, .., n of the Binomial as pillars centered at the integers.
First translate the prob. problem into a shaded area.
Then find p, the probability of one success in one trial. Let q=1-p.
Before starting, we need to know if the Normal distribution can approximate the Binomial. This involves two tests:
TEST1) np > or = 5 AND
TEST2) nq > or = 5
Both tests must be satisfied if the Normal is to be a good approximation to the Binomial problem.
CORRECTION FOR CONTINUITY is needed since the Binomial is a discrete and the Normal is a continuous distribution.
What is this correction?
You need to split the difference between neighboring integers, allocating 0.5 to the left and 0.5 to the right pillar.
For example, if the desired pillars are 1 to 3, pillar 1 begins at 0.5 and pillar 3 ends at 3.5, so the shaded area runs from x=0.5 to x=3.5.
Now use steps 2, 3, 4 above.
Remember that the MEAN OF the BINOMIAL is np and the VARIANCE IS npq = np(1-p).
The std. dev. is the square root of npq; don’t forget the square root.
These are needed in finding the right variable x ~approx~ N(np, npq).

CENTRAL LIMIT THEOREM
What is the central limit theorem?
This is a powerful result (the name is due to Polya in the 1920's) showing that even if x is not Normal, the process of averaging is so helpful that it yields approximate Normality of xbar when n is large (rule of thumb: n>30).

NORMAL DISTRIBUTION IN Microsoft EXCEL software INSTEAD OF THE TABLES
The function name is NORMDIST. It has 4 inputs:
The first input is x = the value of x for which we want the cumulative probability
The second input is the mean
The third input is the standard deviation
The fourth input is TRUE if you want the cumulative probability or FALSE if you want the height of the probability curve
Note that the cumulative probability is the whole area to the left of x, not the 0-to-z area given in the table.
(NORMSDIST is the standard Normal version: its single input is z and it returns the cumulative probability.)
The NORMINV function computes the inverse of NORMDIST. It has 3 inputs:
The first input is the cumulative probability
The second input is the mean
The third input is the standard deviation
(NORMSINV is the standard Normal version: its single input is the cumulative probability.)
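
If neither the table nor Excel is at hand, the cumulative probabilities can also be cross-checked with a few lines of Python. This is only a sketch under stated assumptions: normal_cdf and prob_between are names invented here, and the cumulative area is computed from math.erf in the standard library. It re-works some of the egg problems (mean 200, std. dev. 10) and the sampling distribution example (sd 130, n=460).

```python
import math

def normal_cdf(x, mean=0.0, sd=1.0):
    """Cumulative probability P(X <= x) for X ~ N(mean, sd^2)."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def prob_between(x_low, x_high, mean, sd):
    """P(x_low < X < x_high); use -math.inf / math.inf for open-ended tails."""
    return normal_cdf(x_high, mean, sd) - normal_cdf(x_low, mean, sd)

MU, SD = 200.0, 10.0  # egg example: variance 100, so std. dev. 10

answers = {
    "1) more than 220":        prob_between(220, math.inf, MU, SD),
    "2) less than 220":        prob_between(-math.inf, 220, MU, SD),
    "3) between 210 and 220":  prob_between(210, 220, MU, SD),
    "5) between 180 and 210":  prob_between(180, 210, MU, SD),
    "7) below 180 or above 210": 1.0 - prob_between(180, 210, MU, SD),
    "8) within 1.96 std. dev.":  prob_between(MU - 1.96 * SD, MU + 1.96 * SD, MU, SD),
}
for label, p in answers.items():
    print(f"{label}: {p:.4f}")

# Sampling distribution example: mu is unknown, so work with (xbar - mu),
# which has mean 0 and standard deviation sd / sqrt(n).
se = 130 / math.sqrt(460)
p_sample = prob_between(-12, 12, 0.0, se)
print(f"sd(xbar) = {se:.2f}, P(|xbar - mu| < 12) = {p_sample:.4f}")
```

The printed values can differ from the table answers by about 0.0001, because the table areas are rounded to four decimals before they are added or subtracted.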