Computational software such as Matlab, R, JMP, SPSS, and SAS
• Please SHOW ALL YOUR WORK ON SEPARATE PAGES FOR EACH PROBLEM. Please include and submit papers, computational modules and all graphs.
• You may use any computational software, such as Matlab, R, JMP, SPSS, or SAS, to answer the following questions.
QUESTIONS:
2. Consider the bankrupt.txt data set, in which the response y takes the values 1 and 0 and there are 5 predictors x1, x2, x3, x4, and x5 for n = 66 companies. Suppose we fit a generalized linear model (GLM) in Matlab using the predictor matrix X, response y, and distribution distr. Acceptable values for distr are 'normal', 'binomial', 'poisson', 'gamma', and 'inverse gaussian'. The model is fit using the canonical link corresponding to distr. The model matrix X has rows corresponding to observations and columns corresponding to predictor variables. The glmfit.m module automatically includes a constant term in the model (do not enter a column of ones directly into X).
Example: Fit a probit regression model for y on x. Each y(i) is the number of successes in n(i) trials.
x = [2100 2300 2500 2700 2900 3100 3300 3500 3700 3900 4100 4300]';
n = [48 42 31 34 31 21 23 23 21 16 17 21]';
y = [1 2 0 3 8 8 14 17 19 15 17 21]';
b = glmfit(x, [y n], 'binomial', 'link', 'probit');
yfit = glmval(b, x, 'probit', 'size', n);
plot(x, y./n, 'o', x, yfit./n, '-')
[B,DEV,STATS] = glmfit(...) also returns the deviance DEV and a structure STATS that contains the following fields:
'dfe' degrees of freedom for error
's' theoretical or estimated dispersion parameter
'sfit' estimated dispersion parameter
'se' standard errors of coefficient estimates B
'coeffcorr' correlation matrix for B
'covb' estimated covariance matrix for B
't' t statistics for B
'p' p-values for B
'resid' residuals
'residp' Pearson residuals
'residd' deviance residuals
'resida' Anscombe residuals
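As a brief illustration of how these fields are read, the sketch below refits the probit example above and pulls a few entries out of the returned structure. The variable names (stats, covb, se, dfe) are arbitrary choices made for this illustration; the field names are those listed above.

```matlab
% Data from the probit example above.
x = [2100 2300 2500 2700 2900 3100 3300 3500 3700 3900 4100 4300]';
n = [48 42 31 34 31 21 23 23 21 16 17 21]';
y = [1 2 0 3 8 8 14 17 19 15 17 21]';

% Request all three outputs so that the statistics structure is returned.
[b, dev, stats] = glmfit(x, [y n], 'binomial', 'link', 'probit');

covb = stats.covb;         % 2-by-2 here: one row/column each for intercept and slope
se   = sqrt(diag(covb));   % square roots of the diagonal; should agree with stats.se
dfe  = stats.dfe;          % number of observations minus number of coefficients

disp(covb)
disp([se stats.se])
```

For the bankrupt data the same calls would apply with X and y in place of x and [y n], in which case covb would be 6-by-6 (a constant term plus five predictors).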
(a) Obtain the 'covb' estimated covariance matrix for the coefficients β under 'normal', 'binomial', 'poisson', 'gamma', and 'inverse gaussian' and score the information criteria
Where
Also score Akaike’s Information Criterion (AIC)
According to the minimum of the criteria, which distributional model fits this data best?
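For reference, AIC is defined as AIC = -2 log L + 2k, where L is the maximized likelihood and k is the number of estimated parameters. glmfit returns the deviance rather than the log-likelihood, so one option is to evaluate the log-likelihood directly from the fitted means. The sketch below does this for the binomial family only, using the probit example data from above. The helper binLogLik is a hypothetical name introduced here, not a toolbox function. The other families would each need their own log-likelihood expression.

```matlab
% Data from the probit example above.
x = [2100 2300 2500 2700 2900 3100 3300 3500 3700 3900 4100 4300]';
n = [48 42 31 34 31 21 23 23 21 16 17 21]';
y = [1 2 0 3 8 8 14 17 19 15 17 21]';

b = glmfit(x, [y n], 'binomial', 'link', 'probit');
p = glmval(b, x, 'probit');   % fitted success probabilities (no 'size' given)

% Binomial log-likelihood, with the binomial coefficient computed via gammaln.
% binLogLik is a hypothetical helper defined only for this illustration.
binLogLik = @(y, n, p) sum(gammaln(n + 1) - gammaln(y + 1) - gammaln(n - y + 1) ...
                           + y .* log(p) + (n - y) .* log(1 - p));

logL = binLogLik(y, n, p);
k    = numel(b);              % intercept plus one slope
aic  = -2 * logL + 2 * k;

fprintf('log-likelihood = %.4f, AIC = %.4f\n', logL, aic);
```

For a 0/1 response such as the one in bankrupt.txt, n would be a vector of ones, and the binomial coefficient terms vanish.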
(b) Construct the QQ plots of the 'residp' Pearson residuals for the best-fitting model.
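A minimal sketch of such a plot follows. It uses the probit example data as a stand-in, since the best-fitting model for bankrupt.txt has to be determined first. With a single input, qqplot from the Statistics Toolbox compares the sample quantiles with those of a normal distribution.

```matlab
% Data from the probit example above (a stand-in for the bankrupt data).
x = [2100 2300 2500 2700 2900 3100 3300 3500 3700 3900 4100 4300]';
n = [48 42 31 34 31 21 23 23 21 16 17 21]';
y = [1 2 0 3 8 8 14 17 19 15 17 21]';

% Only the third output (the statistics structure) is needed here.
[~, ~, stats] = glmfit(x, [y n], 'binomial', 'link', 'probit');

% Normal QQ plot of the Pearson residuals.
figure;
qqplot(stats.residp);
title('QQ plot of Pearson residuals');
```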