Prof. Vinod, Stat I, Descriptive Statistics

A complete example of the descriptive statistics covered.

Original Unclassified Data
50 98 82 23 46 40 63 52 92 54

xbar or mean = 60
----------------------------------------------
  i     Xi-xbar     (Xi-xbar)^2
----------------------------------------------
  1       -10           100
  2        38          1444
  3        22           484
  4       -37          1369
  5       -14           196
  6       -20           400
  7         3             9
  8        -8            64
  9        32          1024
 10        -6            36
----------------------------------------------
Check: sum of deviations from the mean = 0
Sum of squared deviations = 5126
Denominator for sample variance = n-1 = 9
Unclassified sample variance = 569.556
Unclassified std dev = 23.8654
Sum of absolute deviations = 190
Mean absolute deviation = mad = 19
Coefficient of Variation = 100*(std dev)/xbar = 39.7756

Sorted data
23 40 46 50 52 54 63 82 92 98
The Range = Xmax - Xmin = 75
xbar or mean = 60
Median = 53

Now Skewness from the unclassified mean and median:
Since mean > median, the underlying frequency distribution is skewed to the right!

Now compute the 10% trimmed mean
(k = trimming proportion times n, n' = n - 2k)
k = 1    int(k) = 1    frac(k) = 0    n' = 8
10% trimmed mean = 59.875

Now the 13% trimmed mean
k = 1.3    int(k) = 1    frac(k) = 0.3    n' = 7.4
y[nlo] = 40    (1-frack) = 0.7
y[nlo+1:nup] = 46 50 52 54 63 82
(1-frack)*y[nlo] = 28
(1-frack)*y[nup+1] = 64.4
13% trimmed mean = 59.3784

Now report the percentiles for a Notched Box Plot:
 5%     = 23
10%     = 31.5
Q1=25%  = 46
Mi=50%  = 53
Q3=75%  = 82
90%     = 95
95%     = 98
IQR = Interquartile range = 36

Outlier detection limits
Q1-1.5*IQR = -8
Q3+1.5*IQR = 136
No outlier on the left side is present
No outlier on the right side is present

Number of classes in which to classify the data is given to be: 3
Width of the class intervals should be at least 25 (the Range divided by the number of classes)
Chosen lower limit of 1st class interval = 20, chosen width = 30
First class interval's midpoint = M1 = 35
Lower limits: 20 50 80
-------------------------------------------------------------------
  j     Low     Up     Mj     fj     Mj*fj
-------------------------------------------------------------------
  1      20     50     35      3      105
  2      50     80     65      4      260
  3      80    110     95      3      285
-------------------------------------------------------------------
Total frequency = Summation of fj column = 10
Sum of last column (Mj*fj) = 650, total freq. = 10
Mean from classified data = 65
Compare the above to the earlier computed mean from unclassified data = 60
Max of frequencies = 4
Mode for classified data is the midpoint of the class interval
containing the largest number of frequencies = 65

Now begin computation of the variance for classified data.
First we need to extend the above table with 3 more columns
(here xbar is the classified data mean, 65):
-----------------------------------------------------
  j     (Mj-xbar)     (Mj-xbar)^2     fj*(Mj-xbar)^2
-----------------------------------------------------
  1       -30             900              2700
  2         0               0                 0
  3        30             900              2700
-----------------------------------------------------
 Sums       0            1800              5400
-----------------------------------------------------
Denominator for sample variance = n-1 = 9
Sample variance for classified data = 600
Classified data std dev = 24.4949
For comparison, the unclassified data std dev above was = 23.8654

To find the classified data median graphically, plot the Less-Than-Ogive.
Also plot the More-Than-Ogive; the point where the two ogives intersect gives the Median.
A dummy class is one without any frequency, added so that the frequency polygon can be plotted.
One dummy class is needed at the beginning and another at the end of the classes.
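
The arithmetic in the example above can be checked with a short script. What follows is a minimal Python sketch, not the program that produced the listing. The function names `trimmed_mean` and `percentile` are hypothetical. The percentile rule used here (average the two neighbouring sorted values when n*p/100 is a whole number, otherwise round the position up) is an assumption inferred from the reported values, not a rule stated in the handout.

```python
import math

data = [50, 98, 82, 23, 46, 40, 63, 52, 92, 54]
n = len(data)

# --- Unclassified data ---
xbar = sum(data) / n                          # mean
ss = sum((x - xbar) ** 2 for x in data)       # sum of squared deviations
var = ss / (n - 1)                            # sample variance (denominator n-1)
sd = math.sqrt(var)                           # standard deviation
mad = sum(abs(x - xbar) for x in data) / n    # mean absolute deviation
cv = 100 * sd / xbar                          # coefficient of variation

y = sorted(data)


def trimmed_mean(y, prop):
    """Trimmed mean of sorted y, trimming prop*n values from each end.

    int(k) values are dropped outright at each end; the next value at each
    end gets weight (1 - frac(k)); the effective count is n' = n - 2k.
    """
    n = len(y)
    k = prop * n
    ik = int(k)
    frac = k - ik
    core = y[ik:n - ik]                       # values left after whole trimming
    total = (1 - frac) * (core[0] + core[-1]) + sum(core[1:-1])
    return total / (n - 2 * k)


def percentile(y, p):
    """Percentile of sorted y for 0 < p < 100 (assumed rule, see lead-in)."""
    pos = len(y) * p / 100
    if pos == int(pos):                       # whole position: average neighbours
        i = int(pos)
        return (y[i - 1] + y[i]) / 2
    return y[math.ceil(pos) - 1]              # otherwise round the position up


q1, q3 = percentile(y, 25), percentile(y, 75)
iqr = q3 - q1
limits = (q1 - 1.5 * iqr, q3 + 1.5 * iqr)     # outlier detection limits

# --- Classified data: 3 classes, lower limit 20, width 30 ---
lower, width, nclass = 20, 30, 3
freq = [0] * nclass
for x in data:
    freq[(x - lower) // width] += 1           # class j covers [Low, Up)
mids = [lower + width * j + width / 2 for j in range(nclass)]

mean_c = sum(m * f for m, f in zip(mids, freq)) / n
mode_c = mids[freq.index(max(freq))]          # midpoint of the fullest class
var_c = sum(f * (m - mean_c) ** 2 for m, f in zip(mids, freq)) / (n - 1)
sd_c = math.sqrt(var_c)

print("unclassified:", xbar, round(var, 3), round(sd, 4), mad, round(cv, 4))
print("trimmed means:", trimmed_mean(y, 0.10), round(trimmed_mean(y, 0.13), 4))
print("percentiles:", [percentile(y, p) for p in (5, 10, 25, 50, 75, 90, 95)])
print("classified:", freq, mean_c, mode_c, var_c, round(sd_c, 4))
```

Each class is treated as closed on the left and open on the right, so the observation 50 falls in the second class; this matches the frequencies 3, 4, 3 in the table.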