Author: H. D. Vinod,

Dates: Noted in the software itself

All the code on this page is provided gratis, without any guarantees or warranties.

Part A has R code, Part B has GAUSS code, and Part C has some math typing tricks in MS Word.

Proprietary modifications of this code are not permitted.

Please make appropriate attribution if you use the code in a research project.

PART A: R code

# It is a good idea to clean out old objects from R memory and record the date

#__________________________ Cut here ____________________________

objects()
# these objects are already in memory

rm(list=ls())
#this cleans them

ls()
#this lists what is left

options(prompt="R>")
#this changes the prompt

print(paste("Following executed on", date()))

#__________________________ Cut here ____________________________

# I have written the following function to get outliers automatically

# First copy and paste all lines of the following "function" in R

get.outliers = function(x) { #this left curly brace begins function
  #function to compute the number of outliers automatically
  #author H. D. Vinod, Fordham University,
  #revised
  # input: a column vector of values,
  # output: various quantities used in outlier detection
  # such as interquartile range, limits and
  # xnew= revised vector after outliers are deleted
  if (ncol(as.matrix(x))>1) {
    print("Error: input to get.outliers function has 2 or more columns")
    return(0)}
  xnew=x #initialize xnew, the vector left after removal of outliers
  su=summary(x)
  iqr=su[5]-su[2] #interquartile range
  dn=su[2]-1.5*iqr #dn denotes lower limit for outlier detection
  up=su[5]+1.5*iqr #up denotes upper limit for outlier detection
  LO=x[x<dn] #vector of values below the lower limit
  nLO=length(LO)
  UP=x[x>up] #vector of values above the upper limit
  nUP=length(UP)
  print(c("Q1-1.5*(inter quartile range)=",
    as.vector(dn),"number of outliers below it=",as.vector(nLO)),quote=F)
  or=1:length(x) #sequence numbers of the observations
  if (nLO>0){
    print(c("Actual values below the lower limit are:", LO),quote=F)
    print(c("sequence number(s) of outlier(s) for possible deletion are:",
      or[x<dn]),quote=F)
  } #this right curly brace ends the if statement
  print(c("Q3+1.5*(inter quartile range)=",
    as.vector(up),"number of outliers above it=",as.vector(nUP)),quote=F)
  if (nUP>0){
    print(c("Actual values above the upper limit are:", UP),quote=F)
    print(c("sequence number(s) of outlier(s) for possible deletion are:",
      or[x>up]),quote=F)
  } #this right curly brace ends the if statement above
  out=c(or[x<dn],or[x>up]) #sequence numbers of all outliers, low and high
  if (length(out)>0) xnew=x[-out] #the minus means remove those observations
  #now outputs from the function are ready for extraction
  # with the use of the dollar symbol and are listed as follows
  list(below=LO,nLO=nLO,above=UP,nUP=nUP,low.lim=dn,up.lim=up,
    xnew=xnew)} #this right curly brace ends the function formally

#TEST Example
#x=c(1,-4,3,4,5,55)
#xx=get.outliers(x)
#xx$xnew extracts xnew=revised x without outliers
#xx$b extracts actual values below the lower outlier limit and so on
# the "$b" is an abbreviation for "$below"
# b alone works since nothing else in the "list" has b at the start

# = = = = = = function ends here = = = = = =

# WARNING on xnew for regression! It will not work!
# If you are removing outliers in a regression be sure to remove
# the complete matched set of observations for all variables.
# e.g., if the fifth observation is an outlier in y but not in x or z and
# if lm(y~x+z) is used, remove the fifth observation from x, y and z
# This will have to be done manually rather than by using xnew above
# xnew works only if the model has only one variable

#now assuming x, y and z are already in memory, type

xx=get.outliers(x)

xx=get.outliers(y)

xx=get.outliers(z)
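
The matched removal described in the warning above can be sketched as follows. This is only an illustration under stated assumptions: flag.outliers is a hypothetical helper (it is not part of get.outliers), the data are made up, and the quartiles come from quantile() with its default settings.

```r
# Hypothetical helper (an assumption, not from the original page):
# TRUE where a value lies below Q1-1.5*IQR or above Q3+1.5*IQR
flag.outliers = function(v) {
  q = quantile(v, c(0.25, 0.75), names = FALSE)
  iqr = q[2] - q[1]
  v < q[1] - 1.5*iqr | v > q[2] + 1.5*iqr
}

# made-up illustrative data; the fifth y value is an outlier
y = c(2.1, 3.9, 6.2, 8.1, 40, 12.2, 13.8, 16.1)
x = c(1, 2, 3, 4, 5, 6, 7, 8)
z = c(5, 3, 6, 2, 7, 4, 8, 1)
dat = data.frame(y, x, z)

# a row is dropped if ANY of its variables is flagged,
# so y, x and z stay matched observation by observation
bad = flag.outliers(dat$y) | flag.outliers(dat$x) | flag.outliers(dat$z)
clean = dat[!bad, ]
reg = lm(y ~ x + z, data = clean)
```

Keeping the variables in one data frame means a single row index removes the whole observation, which avoids the mismatch that separate xnew vectors would create.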

#__________________________ Cut here ____________________________

summary2=function(x) {
  #object is to also provide greater digits in mean and sd and info about length
  xx=as.matrix(x)
  #print("Means")
  #print(apply(xx,2,mean,na.rm=T))
  print("standard deviations")
  print(apply(xx,2,sd,na.rm=T))
  print("Lengths")
  print(apply(xx,2,length))
  sumx=summary(x)
  return(sumx)
}

#summary2(x)

#_____________________Cut here ____________________________

get.skewkurt = function(x)
{
  #object: compute third and fourth powers of deviations from mean
  #INPUT x =data
  # OUTPUT
  # sum3= sum of cubes of deviations from the mean
  # sum4= sum of fourth powers of deviations from the mean
  # devfromm=vector of deviations from the mean
  #new variance is (1+a)^2 times var(x)
  #new range is (1+a) times old range max(x)-min(x)
  xb=mean(x)
  n=length(x)
  devfromm=rep(1,n)
  sum3=0
  sum4=0
  i=1
  while (i<=n) {
    devfromm[i]=x[i]-xb
    sum3=sum3+(x[i]-xb)^3
    sum4=sum4+(x[i]-xb)^4
    i=i+1 }
  list(sum3=sum3, sum4=sum4, devfromm=devfromm)

}
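
The sums of third and fourth powers are the raw ingredients of skewness and kurtosis coefficients. The following is a minimal sketch under assumptions, not the author's code: skew.kurt.coef is a hypothetical name, and it uses the moment-based definitions that divide by n throughout (other definitions use different divisors).

```r
# Hypothetical helper (an assumption, not from the original page):
# moment-based skewness and kurtosis, dividing by n throughout
skew.kurt.coef = function(x) {
  n = length(x)
  dev = x - mean(x)        # deviations from the mean
  m2 = sum(dev^2)/n        # second central moment
  sum3 = sum(dev^3)        # same quantity as sum3 in get.skewkurt
  sum4 = sum(dev^4)        # same quantity as sum4 in get.skewkurt
  list(sum3 = sum3, sum4 = sum4,
       skewness = (sum3/n)/m2^1.5,
       kurtosis = (sum4/n)/m2^2)
}

res = skew.kurt.coef(c(1, 2, 3, 4, 5))
res$skewness   # zero for this symmetric sample
res$kurtosis
```

The vectorized form gives the same sums as the while loop above without the explicit counter.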

#_____________________Cut here ____________________________

sort.matrix = function(x,j)
{
  #sort matrix x by column j
  # and carry along the remaining columns
  #author H. D. Vinod, June 14, 2006.
  y=0 #returned unchanged if x has no dimensions (i.e., is not a matrix)
  dd=dim(x)
  if (is.numeric(dd[1])){
    #print("Error in sort.matrix function")
    oo=order(x[,j]) #row order that sorts column j
    fn=function(x,oo) {y=x[oo]; return(y)}
    y=apply(x,2,fn,oo=oo) #reorder every column with the same row order
  }
  return(y) }

#example

#x=round(matrix(rnorm(12),4,3),2)

#sort.matrix(x,2)
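
The same row sort can also be sketched with direct indexing by order(). This is an alternative illustration, not the author's function: the matrix below is made up, and sorted is a hypothetical name.

```r
# made-up 3 x 2 matrix; matrix() fills column by column,
# so column 1 is (3, 1, 2) and column 2 is (30, 10, 20)
x = matrix(c(3, 1, 2,
             30, 10, 20), nrow = 3, ncol = 2)

j = 1                                   # column to sort by
sorted = x[order(x[, j]), , drop = FALSE]
sorted                                  # rows now ordered by column 1
```

Using drop = FALSE keeps the result a matrix even when x has a single row.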

#_____________________Cut here ____________________________

cen.moments = function(x)
#object: compute 4 sample central moments and cumulants
{
  n=length(x)
  m=mean(x)
  m2=sum((x-m)^2)/(n-1) #WARNING dividing by n-1 not n here
  m3=sum((x-m)^3)/n
  m4=sum((x-m)^4)/n
  k1=m
  k2=m2
  k3=m3
  k4=m4-3*(m2^2)
  list(m2=m2, m3=m3, m4=m4, k1=k1, k2=k2, k3=k3, k4=k4)

}

#_____________________Cut here ____________________________

PART B: GAUSS code

1) Code for testing the numerical accuracy of any software, written in GAUSS.

http://www.american.edu/academic.depts/cas/econ/gavussres/utilitys/utilitys.htm

This link has useful GAUSS procedures for computing accurate mean and variance.

2) The following simple proc reshapes the data without requiring the number of rows.

@The following test program should be run to understand what it does.
Note that since 8 is not divisible by 3, it ignores the last two
data points if you want to reshape into 3 columns.
Of course, reshape is typically used for getting large data from
ascii files, not for data typed in the way it is shown below.
@

new;
x={1, 2, 3, 4, 5, 6, 7, 8};
y=reshape2(x,2);
y;
y=reshape2(x,3);
y;

proc (1)=reshape2(x,ncol);
@Author: H. D. Vinod,
proc returns the reshaped matrix with correct number of rows
@
local n,n1;
clear n1;
n=rows(x);
"number of rows before reshaping= " n;
n1=floor(n/ncol);
"(number of rows before reshaping)/(no of columns) " n1;
retp(reshape(x,n1,ncol));

endp;
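
For readers who work in R rather than GAUSS, the same idea can be sketched as below. This is an assumed analogue, not a translation supplied by the author: reshape.rows is a hypothetical name, and byrow=TRUE is used because GAUSS reshape fills the result row by row.

```r
# Hypothetical R analogue of the GAUSS proc reshape2 above (an assumption)
reshape.rows = function(x, ncol) {
  n = length(x)            # number of data points before reshaping
  n1 = floor(n/ncol)       # number of complete rows that fit
  # keep only the first n1*ncol points; leftover points are ignored
  matrix(x[seq_len(n1*ncol)], nrow = n1, ncol = ncol, byrow = TRUE)
}

x = c(1, 2, 3, 4, 5, 6, 7, 8)
reshape.rows(x, 2)   # 4 rows, 2 columns, all 8 points used
reshape.rows(x, 3)   # 2 rows, 3 columns, last two points dropped
```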

PART C: Some Great Tricks for Math Typing in MS Word

If you want to type mathematical functions in MS Word without much difficulty, use AutoCorrect in the Tools menu. The following file has many preset entries. For example, \a gives alpha, /app= gives approximately equal, and there are numerous other useful symbols. The attached file, called normal.dot, can be downloaded. Use it to replace your normal.dot file, typically in the location

C:\Program Files\Microsoft Office\Templates

or

C:\Documents and Settings\user\Application Data\Microsoft\Office\Recent

Microsoft keeps changing this, but you can find it! Be careful though: keep a backup copy before replacing. It may not work for your configuration. It has worked for many of my graduate students.