Cost accounting: Regression Analysis

How good is my regression model?

This page takes us through a question from an old copy of Charles T Horngren's Cost Accounting textbook. The purpose of the exercise is that we are given a set of data from which we are to derive and comment on a regression model that we can use to predict cost values.
What follows is the table of data, a table that takes us step by step through a comparison of the validity of the variables in the data set; and three graphs that give us an insight into the normality or otherwise of the variables.
By the end of the page, we have a much deeper insight into the model, the variables in the model and the usefulness of the model we might finally decide to use.
The data set

Year	Non academic overhead costs	Number of non academic staff	Number of student applications	Number of enrolled students
1	2200	29	1010	342
2	4120	36	1217	496
3	3310	49	927	256
4	4410	53	1050	467
5	4210	54	1563	387
6	5440	58	1127	492
7	5600	88	1892	513
8	4380	72	1362	387
9	5270	83	1623	346
10	7610	73	1646	487
11	8070	101	1870	564
12	10388	103	1253	764

The results of the regression analysis: including the Durbin Watson statistic: we ar dealing here with time series data; and we use the Durbin Watson technique to assess the possibility of autocorrelation being present in our data.

Regression 1
Overhead costs = f(number of non academic staff)

variable	coefficient	standard error
constant	112.04	1119.4
independent variable 1: number of staff	79.68	15.89
r2 =	0.72
standard error of residuals =	1269.2
Durbin Watson =	1.82

Regression 2
Overhead costs = f(number of student applicants)

variable	coefficient	standard error
constant	1147.4	2710
independent variable 2: number of applicants	3.1	1.91
r2 =	0.21
standard error of residuals =	2118.7
Durbin Watson =	0.89

Regression 3
Overhead costs = f(number of enrolled students)

variable	coefficient	standard error
constant	-1382.2	1350.3
independent variable 3: number of students	14.83	2.84
r2 =	0.73
standard error of residuals =	1233
Durbin Watson =	1.09

The table that takes us through the comparison of the variables in the data set

	Regression A	Regression B	Regression C
Criterion	Non academic staff	Student applications	Enrolled students
1 Economic Plausibility	One would expect that there is a relationship between the number of non academic staff and non academic overheads	The relationship between non academic overheads and student applications should be small unless the most significant non academic job is dealing with student applications	Enrolled students must, by definition, be a significant driver of non academic costs
2 Goodness of fit	r2 = 0.72	r2 = 0.21	r2 = 0.73
3 Significance of independent variables	t statistic of non academic staff of 5.02 is significant	t statistic of student applications of 1.62 is insignificant	t statistic of enrolled students of 5.22 is significant
4 Specification analysis
A linearity	Reasonably linear with a relatively high correlation coefficient of 0.72	This data series returns a correlation coefficient of only 0.21 suggesting a poor relationship between student applications and non academic costs	Students enrolled values are reasonably linear
B constant variance of residuals	We only have 12 data points for this exercise so our results are not that reliable. However, the constant variance assumption does not appear valid here … plot non academic costs against the residuals of non academic staff and see for yourself!	The same as with non academic staff!	There is some hope here: the residuals plot doesn’t seem to contradict the constant variance assumption with enrolled students.
C independence of residuals	DW = 1.92 In general for such a sample size, this means that there is no problem of serial correlation (aka autocorrelation)	DW = 0.89 is outside of the range 1.3 – 2.7 thus there is the possibility that we are dealing with data that suffers from autocorrelation	DW = 1.09 is outside of the range 1.3 – 2.7 thus there is the possibility that we are dealing with data that suffers from autocorrelation
D normality of residuals	Using SKEW from Excel this distribution of residuals is slightly skewed	This distribution of residuals is heavily skewed	This distribution of residuals is not skewed

Graphs to accompany the normality analysis section of the table. The graphs show the skewness coefficient derived from the =SKEW() function in Excel together with the plot of the residuals for each variable.

Conclusions

This page has been prepared to help anyone who is carrying out a regression analysis of a time series data set.
The comparison table has been based on the table found in Horngren and Foster on page 780: the contents of the table found here are my own,
Work through your own example and apply the analysis you see here. Let me know if it helps ... or not!
References
Charles T Horngren & George Foster(1987)
Cost Accounting: a managerial emphasis
6th edition
Prentice Hall

Cost accounting

Sunday, July 17, 2011

Regression Analysis

No comments:

Post a Comment

About Me

Blog Archive