3.2.2 Ordinary Least Squares method (simple regression only)

ORDINARY LEAST SQUARES REGRESSION METHOD

Regression analysis is a technique that uses a statistical model to measure the amount of change in one variable (the dependent variable) that is associated with changes in one or more other variables (the independent variables).

This method is used to determine the equation of the line of best fit by minimizing the sum of the squares of the vertical distances between the observed points and the line.

When it has been established that a causal relationship exists in the data and that a linear function is appropriate, the statistical technique known as least squares is frequently used to establish values for the coefficients a and b (representing fixed cost and variable cost per unit respectively) in the linear cost function.

y = a + bx

where y is total cost – the dependent variable

and x is the agreed measure of activity – the independent variable

The values of a and b are determined by substituting the data into the normal equations below:

Σy = na + bΣx ……………………………………..(i)

Σxy = aΣx + bΣx² ……………………………..(ii)

or into the formulae below:

b = (nΣxy − ΣxΣy) / (nΣx² − (Σx)²)

a = (Σy − bΣx) / n, or a = ȳ − bx̄


Characteristics of linear regression

It is objectively determined.
It makes use of all the data or observations
It minimizes the sum of squares of the error terms
If there is a linear relationship between the dependent and independent variable, this method gives the best predictions within the relevant range.

Illustration

The following table shows the number of units of a good produced and the total costs incurred.

Units produced	Total costs (sh.)
100	40,000
200	45,000
300	50,000
400	65,000
500	70,000
600	70,000
700	80,000

Calculate the regression line for y on x.

Solution

Notes on the calculation

The calculation can be reduced to a series of steps as follows:

Step 1:

Tabulate the data and determine which is the dependent variable, y, and which the independent x.

Step 2:

Calculate Σx, Σy, Σxy and Σx² (leave room for a column for y², since Σy² may well be needed subsequently).

Step 3:

Substitute in the formulae in order to find b and a, in that order.

Step 4:

Substitute a and b in the regression equation.

The calculation is set out as follows, where x is the activity level in units of hundreds and y is the cost in units of sh.1,000.

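x	y	xy	x²
1	40	40	1
2	45	90	4
3	50	150	9
4	65	260	16
5	70	350	25
6	70	420	36
7	80	560	49
Σx = 28	Σy = 420	Σxy = 1,870	Σx² = 140

n = 7

b = (nΣxy − ΣxΣy) / (nΣx² − (Σx)²) = (7 × 1,870 − 28 × 420) / (7 × 140 − 28²) = (13,090 − 11,760) / (980 − 784) = 1,330 / 196 = 6.79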

Try to avoid rounding at this stage since, although nΣxy and ΣxΣy are large, their difference is much smaller.

a = ȳ − bx̄ = 420/7 − 6.79 × 28/7 = 60 − 27.16 = 32.84

Therefore the regression line for y on x is:

y = 32.84 + 6.79x (x in hundreds of units produced, y in sh.1,000).

(Always specify what x and y are very carefully)

This line would be used to estimate the total costs for a given level of output. If, say, 250 units were made, we can predict the expected total cost by using the regression line with x = 2.5.

y = 32.84 + 6.79 × 2.5 = 32.84 + 16.975 = 49.815

i.e. we predict total costs of sh.49,815 for production of 250 units.
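
The steps above can be sketched in a few lines of Python. This is a minimal illustration rather than a prescribed method: the function name `least_squares` is hypothetical, and the seven cost values are assumed to be those implied by the working table (xy column and Σy = 420). Because the code does not round b before computing a, its results differ slightly from the hand calculation.

```python
def least_squares(xs, ys):
    """Return (a, b) for the line y = a + b*x using the coefficient formulae."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    # b = (n*Sxy - Sx*Sy) / (n*Sx2 - Sx^2), then a = (Sy - b*Sx) / n
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    a = (sum_y - b * sum_x) / n
    return a, b


# x in hundreds of units, y in sh.1,000 (assumed data, see lead-in)
units_hundreds = [1, 2, 3, 4, 5, 6, 7]
cost_thousands = [40, 45, 50, 65, 70, 70, 80]

a, b = least_squares(units_hundreds, cost_thousands)
predicted_250 = a + b * 2.5  # 250 units -> x = 2.5

print(round(a, 2), round(b, 2), round(predicted_250, 2))
# prints: 32.86 6.79 49.82
```

The unrounded intercept is 32.86 rather than 32.84, which shows how rounding b to 6.79 before finding a carries a small error into the final line.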

Using the regression line for forecasting

In the previous example, having found the equation of the line of best fit, we used this to forecast the total cost for a given level of activity.

The validity of such forecasts will be dependent upon two main factors.

Whether there is sufficient correlation between the variables to support a linear relationship within the range of the data used.
Whether the forecast represents an interpolation or an extrapolation

Illustration

The following data have been collected on costs and output:

Output (000s) Costs (sh.000s)

Required:

Calculate the coefficients in the linear cost function y = a + bx using (i) the normal equations and (ii) the coefficient formulae.

Solution

Output (x)	Costs (y)	xy	x²
Σx = 28	Σy = 140	Σxy = 624	Σx² = 140

Where n = 7 (i.e. number of pairs of readings)

i) Using the normal equations

140 = 7a + 28b ………..I

624 = 28a + 140b ……….. II

And eliminating one coefficient thus:

624 = 28a + 140b ……….. II

560 = 28a + 112b ……….. I × 4

64 = 28b (subtracting)

∴ b = 2.286 and, substituting this value in one of the equations, the value of a is found to be 10.86

∴ Regression line is y = 10.86 + 2.286x

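ii) Using the coefficient formulae

b = (nΣxy − ΣxΣy) / (nΣx² − (Σx)²) = (7 × 624 − 28 × 140) / (7 × 140 − 28²) = 448 / 196 = 2.286

a = (Σy − bΣx) / n = (140 − 2.286 × 28) / 7 = 10.86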
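
As a check that the two routes agree, the sketch below solves the normal equations by elimination and then applies the coefficient formulae to the same totals (n = 7, Σx = 28, Σy = 140, Σxy = 624, Σx² = 140). The function names are illustrative only.

```python
def solve_normal_equations(n, sum_x, sum_y, sum_xy, sum_x2):
    """Solve  sum_y = n*a + b*sum_x  and  sum_xy = a*sum_x + b*sum_x2  by elimination."""
    # Scale equation (i) so its 'a' coefficient matches equation (ii), then subtract.
    factor = sum_x / n
    b = (sum_xy - factor * sum_y) / (sum_x2 - factor * sum_x)
    a = (sum_y - b * sum_x) / n
    return a, b


def coefficient_formulae(n, sum_x, sum_y, sum_xy, sum_x2):
    """Apply the closed-form formulae for b and a directly."""
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    a = (sum_y - b * sum_x) / n
    return a, b


totals = dict(n=7, sum_x=28, sum_y=140, sum_xy=624, sum_x2=140)
a1, b1 = solve_normal_equations(**totals)
a2, b2 = coefficient_formulae(**totals)

print(round(a1, 2), round(b1, 3))  # prints: 10.86 2.286
print(round(a2, 2), round(b2, 3))  # prints: 10.86 2.286
```

Both functions are algebraically the same calculation; the coefficient formulae are simply the normal equations solved once in general form.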

When the coefficients have been calculated the cost function can be used for forecasting simply by inserting the appropriate level of activity i.e. a value for x, and calculating the resulting total cost.

For example, what are the predicted costs at output levels of:

4,500 units (i.e. 4.5 in ‘000s), and
8,000 units (i.e. 8 in ‘000s)

y = 10.86 + 2.286 (4.5) = sh. 21,147

Note: A prediction within the range of the original observations (1 to 7 in Example 1) is known as an interpolation.

y = 10.86 + 2.286 (8) = sh.29,148

Note: A prediction outside the range of original observations is known as an extrapolation.
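
The two forecasts can be reproduced with a short helper that also labels each prediction. This is only a sketch: the function name `forecast` is hypothetical, and the observed range of 1 to 7 is taken from the note above.

```python
def forecast(a, b, x, x_min, x_max):
    """Return the predicted y and whether x lies inside the observed range."""
    y = a + b * x
    kind = "interpolation" if x_min <= x <= x_max else "extrapolation"
    return y, kind


a, b = 10.86, 2.286  # rounded coefficients from the worked solution

for output_thousands in (4.5, 8):
    cost_thousands, kind = forecast(a, b, output_thousands, x_min=1, x_max=7)
    print(output_thousands, round(cost_thousands, 3), kind)
# prints:
# 4.5 21.147 interpolation
# 8 29.148 extrapolation
```

Labelling the forecast makes it harder to overlook that an extrapolated figure rests on the assumption that the linear relationship continues beyond the data.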

REVISION QUESTIONS

QUESTION ONE

The management of Limuru Processing Company Limited wishes to obtain better cost estimates to evaluate the company’s operations more effectively.

The following information is provided to you for analysis:

Year 2004	Equivalent production	Overheads
Month	Units (‘000’)	Sh.’000’
January	1,425	12,185
February	950	9,875
March	1,130	10,450
April	1,690	15,280
May	1,006	9,915
June	834	9,150
July	982	10,133
August	1,259	11,981
September	1,385	12,045
October	1,420	13,180
November	1,125	13,180
December	980	10,430

Additional information:

In November, the opening work in progress inventory contained 1,000,000 units that were 30% complete with respect to conversion costs.
During the same month of November, the manufacturing department transferred 1,500,000 units.
The closing inventory for the month of November was 1,200,000 units and the units were 30% incomplete with respect to conversion costs.
Using the above information, you have obtained the following variables by applying simple regression analysis.

Sh. ‘000’

Constant 3,709

Slope 6,487

Required:

i) Use the high-low method to estimate the overhead cost function.
ii) Use the regression method to determine the overhead cost function.
iii) Compute the equivalent units of production with respect to conversion costs for the month of November using the FIFO method.
iv) Use the regression function formulated in (ii) above to estimate the overhead cost for the month of November.

Solution:

Use the high-low method to estimate the overhead cost function

Highest activity: 1,690 (‘000 units), overheads 15,280 (Sh.‘000)

Lowest activity: 834 (‘000 units), overheads 9,150 (Sh.‘000)

b = (15,280 − 9,150) / (1,690 − 834) = 6,130 / 856 = 7.16

y = a + bx, where b = 7.16 and, at the high point, y = 15,280 and x = 1,690

Therefore 15,280 = a + 7.16 × 1,690

a = 15,280 − (7.16 × 1,690)

a = 3,180

Therefore y = 3,180,000 + 7,160x (y in shillings, x in ‘000 units)
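
The high-low calculation can be sketched in Python using the twelve monthly observations from the question. The function name `high_low` is hypothetical, and the sketch assumes the high and low points are chosen by activity level (here they coincide with the highest and lowest costs).

```python
def high_low(activity, cost):
    """Estimate (a, b) in y = a + b*x from the highest- and lowest-activity points."""
    hi = max(range(len(activity)), key=lambda i: activity[i])
    lo = min(range(len(activity)), key=lambda i: activity[i])
    b = (cost[hi] - cost[lo]) / (activity[hi] - activity[lo])
    a = cost[hi] - b * activity[hi]
    return a, b


# January to December 2004: units in '000, overheads in Sh.'000
units = [1425, 950, 1130, 1690, 1006, 834, 982, 1259, 1385, 1420, 1125, 980]
overheads = [12185, 9875, 10450, 15280, 9915, 9150, 10133, 11981,
             12045, 13180, 13180, 10430]

a, b = high_low(units, overheads)
print(round(a), round(b, 2))
# prints: 3178 7.16
```

The intercept comes out at about 3,178 rather than 3,180 because the code keeps b unrounded; the hand calculation rounds b to 7.16 first.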

Use the regression method to determine the overhead cost function

y = a + bx, where a = 3,709,000 and b = 6,487

Therefore y = 3,709,000 + 6,487x

Equivalent units of production

Looking at the output side using FIFO method

	Units	Completion %	Conversion (equivalent units)
Opening stock (WIP)	1,000,000	70	700,000
Completely processed during the month	500,000	100	500,000
Closing stock (WIP)	1,200,000	70	840,000
Equivalent units with respect to conversion costs			2,040,000
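
A minimal sketch of the FIFO equivalent-units calculation follows. It assumes the closing inventory is 30% incomplete (that is, 70% complete) with respect to conversion costs, and the function and parameter names are illustrative only. Percentages are held as whole numbers so the arithmetic stays exact.

```python
def fifo_equivalent_units(opening_wip, opening_pct_complete,
                          units_transferred, closing_wip, closing_pct_complete):
    """Equivalent units of work done this period under FIFO."""
    # Work needed this period to finish the opening work in progress
    finish_opening = opening_wip * (100 - opening_pct_complete) // 100
    # Units both started and completed during the period
    started_and_completed = units_transferred - opening_wip
    # Work done so far on units still in process at the end of the period
    closing_work = closing_wip * closing_pct_complete // 100
    return finish_opening + started_and_completed + closing_work


november_units = fifo_equivalent_units(
    opening_wip=1_000_000, opening_pct_complete=30,
    units_transferred=1_500_000,
    closing_wip=1_200_000, closing_pct_complete=70,  # assumed: 30% incomplete
)
print(november_units)
# prints: 2040000
```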

Estimate of the overhead cost for the month of November

y = 3,709,000 + 6,487x, where x = 1,125

Therefore y = 3,709,000 + 6,487 × 1,125

= Sh.11,006,875

QUESTION TWO

(a)Explain the advantages and disadvantages of the high-low method of cost estimation.

(b) Central Machinery Ltd. is preparing its budget for the year ending 30 June 2004. For the fuel oil expense, it has been decided to estimate an equation of the form y = a + bx, where y is the total expense at an activity level x, a is the fixed expense and b is the rate of variable cost.

The following information relates to the year ended 30 June 2003:

Month

Machine hours

Fuel Oil expense

Month

Machine hours

Fuel oil expense

2003

(Sh. ‘000’)

2004

(Sh.‘000’)

(Sh. ‘000’)

July

August

September

October

November

December

640

620

590

500

530

January

February

March

April

May

June

500

530

550

580

680

The annual total and monthly average figures for the year ended 30 June 2003 were as follows:

	Machine hours (‘000’)	Fuel oil expense (Sh. ‘000’)
Annual total	420	6,840
Monthly average	35	570

Required:

(i) Using the high-low method, estimate and interpret the fixed and variable cost elements of the fuel oil expense.
(ii) Using the results in (i) above, predict the fuel oil expense for November 2004 if experience indicates that 41,000 machine hours will be used.
(iii) Briefly explain any two limitations of the high-low method of cost estimation that may be overcome by using simple linear regression analysis.

Solution:

Advantages of high-low method
- Method is easy to use
- Not many data are needed
- Visually it gives the general direction of the trend

Disadvantages of high-low method
- Choice of the high and low points is subjective
- Method does not use all available data
- Cannot be used for more than one independent variable
- Not possible to defend the results statistically
- If the two points are outliers, the predictive equation will be wrong.
- Method may not be reliable

(i) High-low method

	Machine hours (‘000’)	Fuel oil expense (Sh. ‘000’)
High point (June 2004)	48	680
Low point (January 2004)	26	500
Difference	22	180

Variable cost per machine hour = 180,000 / 22,000 = Sh.8.182 per hour

Substituting for January 2004:
Total cost	Sh.500,000
Variable costs (26,000 × 8.182)	Sh.212,730
Fixed cost (difference)	Sh.287,270

Interpretation:

Within the relevant range, Sh.287,270 will be incurred irrespective of the machine hours used, i.e. Sh.287,270 is the fixed cost.

Thereafter, the total fuel oil expense varies at the rate of Sh.8.182 for each machine hour used.

Fuel expense in November, 2004

= 287,270 + 8.182 × 41,000

= Sh.622,732

Limitations of high-low method
- Relies on only two data points, the highest and the lowest, which may be outliers and therefore not representative of the entire data set.
- The method does not use robust statistical techniques to measure the predictive quality of the resulting function.