Prof. Vinod, Stat I, Descriptive Statistics

Given the following numbers:
  14 79 62 102 80 20 6 6 30 81
Find the coefficient of variation and MAD for the above data.
Classify the data into four intervals, starting at 5 as the lower limit
of the first interval. Find the Mode, Median (graphically) and standard
deviation of the classified data.

xbar or mean= 48
----------------------------------------------
   i     Xi-xbar     (Xi-xbar)^2
----------------------------------------------
   1       -34          1156
   2        31           961
   3        14           196
   4        54          2916
   5        32          1024
   6       -28           784
   7       -42          1764
   8       -42          1764
   9       -18           324
  10        33          1089
----------------------------------------------
Check sum of deviations from the mean=0: 0
Sum of squared deviations= 11978
Denominator for sample variance= n-1 = 9
Unclassified sample variance= 1330.89
Unclassified std dev= 36.4813
Sum of absolute deviations= 328
Mean absolute deviation=mad= 32.8
Coefficient of Variation=100*(std dev)/xbar= 76.0028

Sorted data
  6 6 14 20 30 62 79 80 81 102
The Range= Xmax-Xmin= 96
xbar or mean= 48
Median= 46

Now Skewness from Unclassified mean and median
Since mean>median, underlying frequency distribution is skewed to the right!
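
The unclassified-data figures above can be checked with a short script. This is only a sketch in plain Python, not the program that produced the listing; the variable names are illustrative. It follows the listing in dividing the sum of squared deviations by n-1 and the sum of absolute deviations by n.

```python
import math
import statistics

data = [14, 79, 62, 102, 80, 20, 6, 6, 30, 81]
n = len(data)

xbar = sum(data) / n                          # arithmetic mean
deviations = [x - xbar for x in data]         # Xi - xbar column
sum_sq = sum(d * d for d in deviations)       # sum of squared deviations

variance = sum_sq / (n - 1)                   # sample variance, denominator n-1
std_dev = math.sqrt(variance)

# Mean absolute deviation about the mean, denominator n as in the listing
mad = sum(abs(d) for d in deviations) / n

# Coefficient of variation in percent
cv = 100 * std_dev / xbar

median = statistics.median(data)              # average of the two middle values
data_range = max(data) - min(data)

print("mean", xbar, "median", median, "range", data_range)
print("variance", round(variance, 2), "std dev", round(std_dev, 4))
print("mad", mad, "cv", round(cv, 4))
```
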

Now compute 10% trimmed mean
    k     intk    frack     n'
    1      1       0        8
Trimmed mean= 46.5

Now 13% trimmed mean
    k     intk    frack     n'
   1.3     1      0.3      7.4
y[nlo]~(1-frack)
   6   0.7
y[nlo+1:nup]'~((1-frack)*y[nlo])~(1-frack)*y[nup+1]
   14 20 30 62 79 80    4.2    56.7
Trimmed mean= 46.7432

Now report the percentiles for a Notched Box Plot:
   5%     = 6
  10%     = 6
  Q1=25%  = 14
  Mi=50%  = 46
  Q3=75%  = 80
  90%     = 91.5
  95%     = 102
IQR= Interquartile range= 66
Outlier detection Limits (Q1-1.5*IQR)~(Q3+1.5*IQR)
  -85   179
No outlier on the left side is present
No outlier on the right side is present

Number of classes in which to classify the data is given to be: 4
Width of ultimate class intervals should be at least 24
Chosen Lower limit of 1st class interval and width = 5 25
First class interval's midpoint= M1= 17.5
Lower limits: 5 30 55 80
Total frequency=Summation of fj= 10
-------------------------------------------------------------------
   j     Low      Up      Mj      fj     Mj*fj
-------------------------------------------------------------------
   1       5      30     17.5      4      70
   2      30      55     42.5      1      42.5
   3      55      80     67.5      2     135
   4      80     105     92.5      3     277.5
-------------------------------------------------------------------
Total frequency=Summation of fj column = 10
Sum of last column= 525, total freq.= 10
Mean from classified data= 52.5
Compare above to the earlier computed mean from unclassified data= 48
Max of frequencies= 4
Mode for classified data is the midpoint of the class interval
containing the largest frequency= 17.5

Now begin computation of the variance for classified data.
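
As an aside before the variance table, the trimmed means and box-plot percentiles reported above can be reproduced with the sketch below. The trimming follows the quantities shown in the listing (k, intk, frack, n'): intk values are dropped from each end, and the next value at each end gets weight (1-frack). The listing does not say which percentile rule the program uses; the rule in `percentile` is an assumption, chosen only because it reproduces the listed values. Both function names are hypothetical.

```python
import math

data = [14, 79, 62, 102, 80, 20, 6, 6, 30, 81]

def trimmed_mean(values, percent):
    """Trim `percent` of the observations from each end, with fractional weights."""
    y = sorted(values)
    n = len(y)
    k = n * percent / 100            # number to trim per side, may be fractional
    intk = math.floor(k)             # values dropped completely from each end
    frack = k - intk                 # fractional part still to be trimmed
    lo, hi = intk, n - intk - 1      # 0-based positions of the partly weighted values
    if hi <= lo:
        raise ValueError("trimming percentage too large for this sample")
    total = sum(y[lo + 1:hi]) + (1 - frack) * (y[lo] + y[hi])
    return total / (n - 2 * k)       # n' = n - 2k is the effective sample size

def percentile(values, percent):
    """Assumed rule: average two neighbours when n*p/100 is whole, else round up."""
    y = sorted(values)
    n = len(y)
    pos = n * percent / 100
    if pos == int(pos):
        i = int(pos)
        return (y[max(i - 1, 0)] + y[min(i, n - 1)]) / 2
    return y[math.ceil(pos) - 1]

q1, q3 = percentile(data, 25), percentile(data, 75)
iqr = q3 - q1
low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = [x for x in data if x < low_fence or x > high_fence]

print("10% trimmed mean", trimmed_mean(data, 10))
print("13% trimmed mean", round(trimmed_mean(data, 13), 4))
print("Q1", q1, "Q3", q3, "IQR", iqr, "fences", low_fence, high_fence)
print("outliers", outliers)
```

The guard on `hi <= lo` is a design choice for this sketch: it refuses trimming levels that would leave fewer than two weighted observations rather than guessing at a convention.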
First we need to extend the above table with 3 more columns
(xbar here is the classified mean, 52.5):
-----------------------------------------------------
   j    (Mj-xbar)    (Mj-xbar)^2    fj*(Mj-xbar)^2
-----------------------------------------------------
   1       -35          1225            4900
   2       -10           100             100
   3        15           225             450
   4        40          1600            4800
-----------------------------------------------------
 Sums       10          3150           10250
-----------------------------------------------------
Denominator for sample variance= n-1 = 9
Sample variance for classified data= 1138.89
Classified data std dev= 33.7474
For comparison, the unclassified data std dev. above was= 36.4813

To find the classified data median graphically, plot the Less-Than-Ogive.
Also plot the More-Than-Ogive; the point of intersection gives the Median.
A dummy class is one without any frequency, designed to plot the freq. polygon.
We need one dummy class at the beginning and another at the end of the classes.
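
The classified-data steps can be sketched in Python as follows. Two assumptions are made here that the listing does not state outright. First, each class includes its lower limit and excludes its upper limit, which is inferred from the frequencies (30 is counted in the second class, 80 in the fourth). Second, the graphical median is computed by linear interpolation on the Less-Than-Ogive at height n/2, which is where the two ogives cross because their heights always add up to n.

```python
import math

data = [14, 79, 62, 102, 80, 20, 6, 6, 30, 81]
lower, width, n_classes = 5, 25, 4

# Frequency count; class j covers [lower + j*width, lower + (j+1)*width)
freq = [0] * n_classes
for x in data:
    freq[int((x - lower) // width)] += 1

mids = [lower + j * width + width / 2 for j in range(n_classes)]   # Mj
n = sum(freq)

mean_c = sum(m * f for m, f in zip(mids, freq)) / n     # classified mean
mode_c = mids[freq.index(max(freq))]                    # midpoint of the fullest class

# Extended table: fj * (Mj - xbar)^2, using the classified mean as xbar
ss = sum(f * (m - mean_c) ** 2 for m, f in zip(mids, freq))
var_c = ss / (n - 1)
sd_c = math.sqrt(var_c)

# Less-Than-Ogive: cumulative frequency at each class boundary
bounds = [lower + j * width for j in range(n_classes + 1)]
cum = [0]
for f in freq:
    cum.append(cum[-1] + f)

# Interpolate within the first class whose cumulative frequency reaches n/2
half = n / 2
for j in range(n_classes):
    if cum[j + 1] >= half:
        median_c = bounds[j] + (half - cum[j]) / freq[j] * width
        break

print("freq", freq, "mean", mean_c, "mode", mode_c)
print("variance", round(var_c, 2), "std dev", round(sd_c, 4))
print("ogive points", list(zip(bounds, cum)), "median", median_c)
```

For this table the interpolation lands at 55, the boundary shared by the second and third classes, because the cumulative frequency reaches exactly n/2 = 5 there.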