XL: Regression Analysis and Best Fit Lines (XE0124)

ID: Q103839


The information in this article applies to:


SUMMARY

The Application Note "Regression Analysis and Best Fit Lines" (XE0124), discusses how to use Microsoft Excel functions to perform simple, multiple, and polynomial regression analysis. It contains examples of how to use LINEST(), LOGEST(), TREND(), and GROWTH() to describe a best fit line or curve and to make predictions about your data. It also outlines some of the new statistical functions and tools available with Microsoft Excel versions 4.0 and later.

This Application Note is available for download from the Microsoft Software Library:

~ XE0124.exe

For more information about downloading files from the Microsoft Software Library, please see the following article in the Microsoft Knowledge Base:

Q119591 : How to Obtain Microsoft Support Files from Online Services



If you are unable to access the source(s) listed above, you can have this Application Note mailed or faxed to you by calling Microsoft Product Support Services Monday through Friday, 6:00 A.M. to 6:00 P.M. Pacific time at (425) 635-7070. If you are outside the United States, contact the Microsoft subsidiary for your area. To locate your subsidiary, see the Microsoft World Wide Offices Web site at:

http://www.microsoft.com/worldwide/default.htm

TEXT OF XE0124

The following is the full text of the Application Note. (Note that some graphics and formatting may be missing from this version of the document. For best results, download the Microsoft Word version of this document.)

OVERVIEW ========
This Application Note discusses how to use Microsoft Excel functions to perform simple, multiple, and polynomial regression analysis. It contains examples of how to use LINEST, LOGEST, TREND, and GROWTH to describe a best-fit line or curve and to make predictions about your data. It also outlines some of the built-in statistical functions and tools available with Microsoft Excel.


   General Information

      To Use the Analysis ToolPak
      Choosing the Best Function
         When Your Data Is Linear
         When Your Data Is Exponential
         When Your Data Is Curvilinear

   Simple Regression Analysis

      Describing a Best-fit Line
         Finding the Slope and the Y Intercept
      Calculating a Best-fit Line
         Using TREND
         Using LINEST
      Plotting the Best-fit Line
         Plotting the Trendline Automatically
      Predicting Future Values
         Using FORECAST
         Using TREND
         Using LINEST

   Multiple Regression Analysis
      Predicting Y-Values
         Using TREND
         Using LINEST

   Polynomial Regression Analysis
      Calculating a Polynomial Curve
      Charting a Polynomial Curve

   Using Regression Statistics
         Using LINEST/LOGEST for Regression Statistics
         Using R2 to Test Regression Model Accuracy 


GENERAL INFORMATION

Regression is a statistical method used to predict values based on relationships in existing data. By analyzing how a single dependent variable (y) is affected by the values of one or more independent variables (x), you can predict what y will be given x. You can use this information to fit a line or a curve to your existing data and to forecast future values. The LINEST, TREND, LOGEST, and GROWTH functions are the primary functions you will use to perform regression analysis in Microsoft Excel.

While this Application Note focuses primarily on the functions that can be used in Microsoft Excel versions 3.0 and later, Microsoft Excel versions 4.0 and later offer several new functions and tools that you can use to perform regression analysis and to create best-fit lines. When one of these new functions can be used to perform a task described in this Application Note, the function will be noted in the appropriate section. The following table lists some of these new functions.


                                        Use this
   To do this                           function
   ----------------------------------------------

   Return the correlation coefficient   CORREL
   for two arrays of cells

   Return a single predicted y-value    FORECAST
   based on a linear regression of
   known x and y ranges

   Return the y intercept of the linear INTERCEPT
   regression line

   Calculate R2, the coefficient of     RSQ
   Determination

   Return the slope of the linear       SLOPE
   regression line

   Return the standard error of the     STEYX
   Regression

   Table 1--Regression Analysis Functions in
   Microsoft Excel Versions 4.0 and later. 


In addition, the Analysis ToolPak add-in provides a special set of analysis tools, including tools to accomplish the following tasks.


                                            Use this
   To do this                               analysis tool
   -------------------------------------------------------

   Predict a value based on the forecast    Exponential
   for the prior period, adjusted for the   Smoothing
   error in that prior forecast

   Project values in the forecast period    Moving Average
   based on the average value of the
   variable over a specific number of
   preceding periods

   Perform linear regression analysis and   Regression
   return statistics and plots as specified

               Table 2--Analysis ToolPak Add-in Features 


To Use the Analysis ToolPak

In Microsoft Excel 5.0 and later:

  1. On the Tools menu, click Data Analysis.


  2. If the Data Analysis command is not available, click Add-Ins on the Tools menu. In the Add-Ins dialog box, click to select the Analysis ToolPak check box.


NOTE: If the Analysis ToolPak add-in is not listed, run the Setup program, choose Add/Remove, and select the Add-ins option for Microsoft Excel.

In Microsoft Excel 4.0:

  1. On the Options menu, click Analysis Tools.


  2. If the Analysis Tools command is not available, click Add-In on the Options menu. In the Add-Ins dialog box, click Add. Click Analysis.xla in the Library\Analysis folder.


  3. In the Data Analysis dialog box, choose the tool that you want to use, such as Exponential Smoothing. For help on how to use a particular analysis tool, click Help in the dialog box for the tool.


CHOOSING THE BEST FUNCTION

Whether you are performing simple regression (one x variable), multiple regression (two or more x variables), or polynomial regression (one x variable raised to different powers), you will get the most accurate results if the function that you choose to regress your data is based on the patterns in your existing data.

When Your Data Is Linear

Your data is linear if the rate of change in your data is even to such an extent that when you plot it in a chart, the pattern in your data points resembles a line. If your data is linear, use the linear regression functions, LINEST and TREND. Both functions use the "least squares" method to calculate a straight line that best fits your data. LINEST returns information about the line, such as its slope and y intercept, and TREND returns predicted values along the line.

In Microsoft Excel versions 4.0 and later, the Regression tool (in the Analysis ToolPak add-in) performs linear regression, returns regression statistics, calculates best-fit lines, and creates best-fit line charts.

When Your Data Is Exponential

Your data is exponential if the rate of change in your data, when plotted on a chart, resembles a curve that rises or falls at an increasingly higher rate. If your data is exponential, use the logarithmic regression functions, LOGEST and GROWTH. LOGEST calculates an exponential curve that best-fits your data and, like LINEST, returns information about the curve. Like TREND, GROWTH returns predicted values along the curve.

When Your Data Is Curvilinear

To most accurately predict values when the pattern in your data is neither linear nor exponential, use polynomial regression in conjunction with the TREND function to calculate a best-fit curve. For example, use this method if, when you plot your data in a chart, it resembles a curve for which the rate of change is not dramatic or if your data fluctuates in such a way that no linear or curved pattern can be identified.

SIMPLE REGRESSION ANALYSIS

Your regression analysis is "simple" if you have only one independent x variable for each dependent y variable. For example, assume you are analyzing the sales figures for the first six months of operation for Wingtip Toys, a company that specializes in the design and manufacture of toys.


   NOTE: The following examples primarily use the LINEST and TREND
   functions. Wherever these two functions are discussed, LOGEST and
   GROWTH can be substituted if your data is exponentially curved and
   if a curve fit would be more accurate than a straight line. 


In the following sample data, the values in the Month column are the independent x variables and the values in the Sales column are the dependent y variables. Based on this data, you can describe, calcu late, and plot a best-fit line, and you can then predict future sales figures. Because the data is linear, you will use the LINEST and TREND functions to perform the regression analysis.

The Regression tool in Microsoft Excel version 4.0 and later performs each of these tasks automatically.

For additional information on calculating regression, see the following references.


   Version of
   Microsoft Excel   Reference
   ------------------------------------------------------------

   97, 98            In Help, search for "Regression, Analysis"
   7.0               In Help, search for "Regression"
   5.0               In Help, search for "Regression "
   4.0               User's Guide 2, pages 41-45 


Because this tool performs linear regression, if your data resembles an exponential curve, use LOGEST and GROWTH.

Following are the sales figures for Wingtip Toys and the corresponding months in both table and chart form.


         A       B

   1   Month   Sales
   2        1  $4,200
   3        2  $6,100
   4        3  $7,300
   5        4  $7,300
   6        5  $8,700
   7        6  $10,500

         Table 3--Sample Data (Sales Figures for Wingtip Toys) 


DESCRIBING A BEST-FIT LINE

The equation of a straight line is y=mx+b, where m is the slope and b is the y intercept. LINEST returns the slope (m) and y intercept (b) values that describe the line derived from your existing data.

Microsoft Excel versions 4.0 and later provide specific SLOPE and INTERCEPT functions for calculating the slope and the y intercept when your data is linear.

For additional information, see the following references.


   Version of
   Microsoft Excel   Reference
   ------------------------------------------------------------

   97, 98            In Help, search for "slope" or "intercept"
   7.0               In Help, search for "slope" or "intercept"
   5.0               In Help, search for "slope" or "intercept"
   4.0               Function Reference Guide, pages 405-406

   NOTE: If your data is exponentially curved, use LOGEST to return
   the slope (m) and y intercept (b) values that describe the curve.
   The equation used by LOGEST is y=b*m^x. 


Finding the Slope and the Y Intercept

To calculate the values of the slope (m) and y intercept (b), use the procedure appropriate for your version of Microsoft Excel.

Microsoft Excel 4.0 and later:

  1. Use the data in Table 3--Sample Data (Sales Figures for Wingtip Toys).


  2. To find the slope, select cell E2 and type the following formula:

    =SLOPE(B2:B7,A2:A7)


The slope of the line for this data is 1122.857.

  1. To find the y intercept, select cell F2, and type the following formula:

    =INTERCEPT(B2:B7,A2:A7)


The point at which the line crosses the y axis is 3420.

Microsoft Excel 3.0:

  1. Using the data in "Table 3--Sample Data (Sales Figures for Wingtip Toys)," select cells E2:F2.


  2. Type the following formula:

    =LINEST(B2:B7,A2:A7)

    NOTE: Because the function returns data to more than one cell, you must enter the formula as an array by pressing CTRL+SHIFT+ENTER in Microsoft Excel for Windows or COMMAND+RETURN in Microsoft Excel for the Macintosh.


The first argument in the LINEST function is the array containing the known y-values (which in this example are the Sales numbers). The second argument is the array containing the known x-values (in this case, the Month numbers).


   NOTE: LINEST also takes other optional arguments that are not
   necessary for this example. 


The result 1122.857, in E2, is the slope, and the result 3420, in cell F2, is where the line crosses the y-axis (y intercept).


        E        F

   1  Slope      Y intercept
   2  1122.857    3420

   Table 4--Example of Slope Intercept Values 


CALCULATING A BEST-FIT LINE

If your data is linear, use TREND or LINEST to calculate your best-fit line. In Microsoft Excel versions 4.0 and later, you can also use the FORECAST function (forecast is mainly useful for finding a data point based on existing data, but can it also be used as a substitute for the TREND function). If your data fits an exponential curve, use LOGEST or GROWTH.

Using TREND

The TREND function is the easiest and most efficient function for calculating the points along a best-fit line. To simultaneously calculate all the values on the best-fit line, do the following:

  1. Using the data in Table 3 of this Application Note, select cells C2:C7 and type the following formula:

    =TREND(B2:B7,A2:A7)

    NOTE: Because the function returns data to more than one cell, you must enter the formula as an array by pressing CTRL+SHIFT+ENTER in Microsoft Excel for Windows or COMMAND+RETURN in Microsoft Excel for the Macintosh.


The first argument in the TREND function is the array containing the known y-values and the second argument is the array containing the known x- values.


   NOTE: TREND also takes other optional arguments that aren't
   necessary for this example. 


Using LINEST

You can also use the slope (m) and the y intercept (b) values returned by LINEST to find data points on the best-fit line by substituting the y- values or the x-values into the equation for a line (y=mx+b). By plugging each month number into this formula, you can calculate all the data points for your best-fit line. With the slope(m) value (1122.857) in cell E2 and the intercept (b) value (3420) in cell F2, do the following to generate the points on your best-fit line:

  1. Select cell D2 and enter the following formula (because the formula only references single cells, it is not necessary to enter this formula as an array):

    =($E$2*A2)+$F$2


  2. Select cells D2:D7.


  3. Use the method appropriate for you version of Microsoft Excel.

    - In Microsoft Excel 5.0 and later, point to Fill on the Edit menu, and click Down.

    - In Microsoft Excel 4.0 and earlier, click Fill Down on the Edit menu.


The values returned are the y-values for your best-fit line.

The following table shows the results of the values returned when you use TREND and LINEST.


        A        B          C             D           E           F

   -----------------------------------------------------------------------
   1  Month    Sales   Predicted Y   Predicted Y    Slope    Y Intercept

                          TREND         LINEST

   -----------------------------------------------------------------------
   2       1    $4,200       $4,543        $4,543  1122.857          3420
   3       2    $6,100       $5,666        $5,666
   4       3    $7,300       $6,789        $6,789
   5       4    $7,300       $7,911        $7,911
   6       5    $8,700       $9,031        $9,031
   7       6   $10,500      $10,157       $10,157

  Table 5--Results of the Values Returned When You Use TREND and LINEST.

   NOTE: The returned values for TREND and LINEST are identical. 


PLOTTING THE BEST-FIT LINE

Once you have calculated the values on your best-fit line, you can add that line to your existing chart by copying cells C1:C7 and pasting them into your existing chart.

The resulting chart will have a straight line (best-fit line) running through your original data.


   NOTE: When you display the points of a best-fit line against the
   original data in a chart, in most cases you will get the best results
   by using an xy (scatter) chart. If you use a line chart, the x-values
   will be treated as labels rather than as values, and curved lines may
   result. 


Plotting the Trendline Automatically

In Microsoft Excel versions 5.0 and later, you can insert a trendline directly into your chart without having to first calculate the points or copy and past those points into the chart. Do the following to automatically insert a trendline in a chart:

  1. Double-click the chart to activate it.


  2. Select the series for which you want to plot a trendline.


  3. Use the procedure appropriate for your version of Microsoft Excel:

    If you are using Microsoft Excel 97 or 98, click Add Trendline on the Chart menu. In the Trendline dialog box, click the Type tab, and then select the type of trend or regression you want to plot. Click OK.

    If you are using Microsoft Excel 5.0 or 7.0, click Trendline on the Insert menu.


PREDICTING FUTURE VALUES

In addition to returning values along the line fitted to your existing data, you can use TREND and LINEST to predict future values. In Microsoft Excel 4.0 and later, you can also use the FORECAST function to predict future values.

Using the Wingtip Toys example, suppose you want to calculate sales figures for months 7, 8, and 9. The following examples show how to accomplish this using the FORECAST, TREND, and LINEST functions, respectively.

To predict values for months 7, 8, and 9, first enter the month numbers for which you want predicted sales figures and then use the FORECAST function to calculate the values.

Using FORECAST

  1. In cells A8:A10, type 7, 8, and 9, respectively.


  2. Select cells B8:B10.


  3. Type the following formula:

    =FORECAST(A8:A10,B2:B7,A2:A7)

    NOTE: Because the function returns data to more than one cell, you must enter the formula as an array by pressing CTRL+SHIFT+ENTER in Microsoft Excel for Windows or COMMAND+RETURN in Microsoft Excel for the Macintosh.


The first argument in the FORECAST function is the array containing the new x-values for which you want to derive predicted y-values.

The resulting values in cells B8:B10 are the predicted sales for the next three months.


         A       B

   1   Month   Sales
   2     1     $4,200
   3     2     $6,100
   4     3     $7,300
   5     4     $7,300
   6     5     $8,700
   7     6     $10,500
   8     7     $11,280
   9     8     $12,403
   10    9     $13,526

   Table 6--Sample Data Using TREND to Predict Future Values Using TREND 


  1. In cells A8:A10, type 7, 8, and 9, respectively.


  2. Select cells B8:B10.


  3. Type the following formula:

    =TREND(B2:B7,A2:A7,A8:A10)

    NOTE: Because the function returns data to more than one cell, you must enter the formula as an array by pressing CTRL+SHIFT+ENTER in Microsoft Excel for Windows or COMMAND+RETURN in Microsoft Excel for the Macintosh.


The third argument in the TREND function is the array containing the new x-values for which you want to derive predicted y-values.

The resulting values in cells B8:B10 are the predicted sales for the next three months.


         A       B

   1   Month   Sales
   2     1     $4,200
   3     2     $6,100
   4     3     $7,300
   5     4     $7,300
   6     5     $8,700
   7     6    $10,500
   8     7    $11,280
   9     8    $12,403
   10    9    $13,526

       Table 7--Sample Data Using TREND to Predict Future Values

   NOTE: In Microsoft Excel versions 4.0 and later, you can use the
   AutoFill feature to predict future values. Using the data in the
   previous table, if you wanted to predict sales for months 7, 8, and 9,
   you would select cells B2:B7, select the AutoFill handle in the lower-
   right corner of the selected area, and drag down three additional
   cells. (The AutoFill method is by far the easiest method to use for
   predicting values; however, if you use the formulas, it is easier to
   tell which values are derived and which values are static). The
   functions give you more power and flexibility than the AutoFill feature
   does.

   CAUTION: In addition to returning predicted values for months 7, 8, and
   9, the data in cells B2:B7 will be overwritten with the values that
   represent the best-fit line. If you do not want your original data to
   be overwritten, copy it to a separate area on your worksheet and then
   use AutoFill. 


Using LINEST

To obtain the new y-values, you can also substitute the slope and y intercept values that you derived with the LINEST function (these results are on page 5) and the new x-values (7, 8, and 9) into the formula, y=mx+b. See Using LINEST in the "Calculating a Best-fit Line" section for step-by- step instructions on how to do this.

MULTIPLE REGRESSION ANALYSIS ============================
When you have two or more independent x variables for each y variable, the regression analysis is considered multiple. For example, you could predict a child's weight given his or her age and height. Assume you've collected the following data


        A      B      C

   1   Age   Height  Weight
   2       3     32     35
   3       5     40     40
   4       6     39     43
   5      10     50     70

        Table 8--Sample Data: Age, Height, and Weight of Child 


where the values under Weight (C2:C5) represent the dependent y variables and the values under Age and Height (A2:B5) represent the independent x variables.

PREDICTING Y-VALUES

You can use either the TREND or the LINEST function to analyze the relationship of the age and height to weight, and you can make predictions based on the results of this analysis.

In Microsoft Excel 4.0 and later, the Regression tool can also be used to predict y-values in a multiple regression model.


   NOTE: Do not use the FORECAST function because it only works for
   simple regression. 


For additional information on predicting values, see the following references.


   Version
   Of Microsoft Excel   Reference
   ---------------------------------------------------------------
   97, 98               In Help, search for "Multiple, Regression"
   7.0                  In Help, search for "Multiple Regression"
   5.0                  In Help, search for Multiple Regression"
   4.0                  User's Guide 2, pages 41-45 


Using TREND

Using TREND to Predict a Child's Weight:

This example uses the data in Table 8--Sample Data: Age, Height, and Weight of Child.

To use TREND to predict the weight of a 9-year-old, 45-inch child, do the following:

  1. In cells A6 and B6, type 9 and 45, respectively.


  2. Select cell C6 and type the following formula:

    =TREND(C2:C5,A2:B5,A6:B6)

    Because the function returns data to more than one cell, you must enter the formula as an array by pressing CTRL+SHIFT+ENTER in Microsoft Excel for Windows or COMMAND+RETURN in Microsoft Excel for the Macintosh.


The result of the formula, 63.42, is the predicted weight.

Using LINEST

To predict a y-value with LINEST, you must first calculate the slopes for each x variable and find the y intercept. Because a slope is returned for each x variable, when you use the LINEST function, you must first select a range of cells that consist of a single row and a single column plus an additional column for each x variable in your data table. In this example, because you have two x variables, you will need to select a range of three cells, three columns wide by one row tall.

Using LINEST to Calculate a Child's Weight:

This example uses the data in Table 8--Sample Data: Age, Height, and Weight of Child.

To calculate the slopes and the y intercepts, select cells A7:C7 and type the following formula:


   =LINEST(C2:C5,A2:B5) 


NOTE: Because the function returns data to more than one cell, you must enter the formula as an array by pressing CTRL+SHIFT+ENTER in Microsoft Excel for Windows or COMMAND+RETURN in Microsoft Excel for the Macintosh.

The following values will be returned


        A     B      C
   7  -0.32  5.98     24 


where -0.32 is the slope for the second x variable (height), 5.98 is the slope for the first x variable (age), and 24 is the y intercept.


   NOTE: The slopes are in reverse order: the first slope value
   corresponds to the second x variable and the second slope value
   corresponds to the first x variable. The LINEST function always
   returns the slopes in reverse order when more than one x variable
   is involved. 


You can use the slope values and the y intercept value to make predictions based on your data. Using the formula, y=(m1*x1)+(m2*x2)+(mn*xn)+b, you can predict the weight of a 9-year old, 45-inch child:
=(9*5.98)+(45*-0.32)+24
The result of the formula, 63.42, is the predicted weight.

Similar to LINEST, the Regression tool in Microsoft Excel versions 4.0 and later returns the slope values and the y intercept value. As described previously, you can plug these values into the formula, y=mx+b, to predict y.


   NOTE: Because the x variables are independent, there may not be a
   good graphical representation for a multiple regression model. Each
   x-value can be plotted with its corresponding y-value, but the
   individual lines may be completely unrelated, and, therefore, may
   be meaningless. 


POLYNOMIAL REGRESSION ANALYSIS ==============================
When your data is neither exponentially curved nor consistently linear, use the polynomial method of regression. When you plot a best- fit curve calculated with polynomial regression, the curve will rise and fall with the data.

CALCULATING A POLYNOMIAL CURVE

To calculate a polynomial curve, the dependent y variable is regressed against the independent x variable raised to different powers. To illustrate this process, take the following example. Fitting a straight line to the following data would not accurately predict the sales for any given month.


         A       B

   1   Month   Sales
   2        1  $4,200
   3        2  $1,600
   4        3  $5,120
   5        4  $4,500
   6        5  $5,400
   7        6  $1,460

   Table 9--Six-Month Sales Figures Wingtip Toys 


In this case, you will get the best results by setting up the following polynomial regression model.


       A    B    C      D      E        F

   10  X   X^2  X^3    X^4   Sales    Trend
   11   1     1    1      1  $4,200   $4,089
   12   2     4    8     16  $1,600   $2,154
   13   3     9   27     81  $5,120   $4,011
   14   4    16   64    256  $4,500   $5,609
   15   5    25  125    625  $5,400   $4,846
   16   6    36  216   1296  $1,460   $1,571

   Table 10--Polynomial Regression Model for Wingtip Sales Figures 


The values in cells A11:A16 are the month numbers copied from cells A2:A7 of Table 9--Six-Month Sales Figures Wingtip Toys. The values in cells B11:D16 are the original x variables raised to the second, third, and fourth powers, respectively. To obtain these values, do the following:

  1. Select cell B11 and enter the formula:

    =A11^2


  2. Select cell C11 and enter the formula:

    =A11^3


  3. Select cell D11 and enter the formula:

    =A11^4


  4. Select cells B11:D16.


  5. Use the method appropriate for your version of Microsoft Excel.

    - In Microsoft Excel version 5.0 or later, point to Fill on the Edit menu, and click Down.

    - In Microsoft Excel version 4.0 or earlier, click Fill Down on the Edit menu.


The values in E11:E16 are the sales figures copied from B2:B7. To derive the trend values in column F, select cells F11:F16 and type the following formula:
=TREND(E11:E16,A11:D16)

   NOTE: Because the function returns data to more than one cell, you
   must enter the formula as an array by pressing CTRL+SHIFT+ENTER in
   Microsoft Excel for Windows or COMMAND+RETURN in Microsoft Excel
   for the Macintosh. 


CHARTING A POLYNOMIAL CURVE

To add the TREND results to this chart, select cells F10:F16, and then copy and paste them into your existing chart.

USING REGRESSION STATISTICS ===========================
The LINEST and LOGEST functions can return additional regression statistics that can be helpful in using and evaluating your regression model. If you are using Microsoft Excel 4.0 or later and have linear data, you can use the Regression Tool from the Analysis ToolPak add- in. This tool will automatically return all the regression statistics. If your data resembles an exponential curve, use LOGEST to return accurate regression statistics.

USING LINEST/LOGEST FOR REGRESSION STATISTICS

To return the additional statistics using LINEST or LOGEST, you must select a range that includes five rows and a single column plus an additional column for each x variable in your data. In addition, the stats argument, which is the fourth argument in both of these functions, must be set to TRUE.

The following table lists the ages, weights, and heights of a number of children.


        A      B      C

   1   Age   Height  Weight
   2       3     32     35
   3       5     40     40
   4       6     39     43
   5      10     50     70

        Table 11--Sample Data: Age, Height, and Weight of Child 


To return the additional regression statistics using the data from "Table 11--Sample Data: Age, Height, and Weight of Child" use the following steps:

  1. Select cells D1:F5.


NOTE: This range consists of five rows and a single column plus two additional columns (one for each x variable).

  1. Type the following formula:

    =LINEST(C2:C5,A2:B5,,TRUE)


Because the function returns data to more than one cell, you must enter the formula as an array by pressing CTRL+SHIFT+ENTER in Microsoft Excel for Windows or COMMAND+RETURN in Microsoft Excel for the Macintosh.

The resulting data should resemble the data in the following table.


          D          E         F

   1  -0.32      5.98      24
   2  2.243569   5.647619  57.68449
   3  .950813    6.024948  #N/A
   4  9.665289   1         #N/A
   5  701.7      36.3      #N/A

   Table 12--Regression Statistics 


The first row of the statistics contains the slope for the height, the slope for the age, and the y intercept. The second row contains the standard error of the slopes and of the y intercept. The third row contains R2 and the standard error for the y estimate. The fourth row contains the F statistic and degrees of freedom. And, the fifth row contains the regression sum of squares and the residual sum of squares.

USING R2 TO TEST REGRESSION MODEL ACCURACY

A particularly useful statistic returned is the coefficient of determination called R2. In Microsoft Excel versions 4.0 and later, you can also use the RSQ function to find R2. This R2 indicator ranges in value from 0 to 1 and reveals how closely the estimated y-values correlate to your actual y-values. The closer R2 is to 1, the more perfect the correlation-this correlation indicates that the regression equation is very useful in accurately predicting a y-value. On the other hand, the closer R2 is to 0, the less helpful it will be in predicting a y-value.

In the previous example, the value for R2 returned by LINEST is .95, an excellent correlation. This indicates that, based on the collected data, the LINEST model can be used to make extremely accurate predictions of a child's weight given a specific age and height.

The disk and software contained on it, including any accompanying documentation (the "Software"), are provided to you at no additional charge. Microsoft Corporation owns all rights, title, and interest in and to the Software. The user assumes the entire risk as to the accuracy and the use of the Software.
###
COPYRIGHT NOTICE. Copyright (c) 1992-1997 Microsoft Corporation. Microsoft and/or its suppliers, One Microsoft Way, Redmond, Washington 98052-6399 U.S.A. All rights reserved.

TRADEMARKS. Microsoft, Windows, Windows NT, MSN, The Microsoft Network and/or other Microsoft products referenced herein are either trademarks or registered trademarks of Microsoft. Other product and company names mentioned herein may be the trademarks of their respective owners.

The names of companies, products, people, characters and/or data mentioned herein are fictitious and are in no way intended to represent any real individual, company, product or event, unless otherwise noted.

NO WARRANTY. THE SOFTWARE IS PROVIDED "AS-IS," WITHOUT WARRANTY OF ANY KIND, AND ANY USE OF THIS SOFTWARE PRODUCT IS AT YOUR OWN RISK. TO THE MAXIMUM EXTENT PERMITTED BY APPLICABLE LAW, MICROSOFT AND ITS SUPPLIERS DISCLAIM ALL WARRANTIES AND CONDITIONS, EITHER EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE IMPLIED WARRANTIES AND CONDITIONS OF MERCHANTABILITY AND/OR FITNESS FOR A PARTICULAR PURPOSE, TITLE, AND NON- INFRINGEMENT, WITH REGARD TO THE SOFTWARE.

LIMITATION OF LIABILITY. TO THE MAXIMUM EXTENT PERMITTED BY APPLICABLE LAW, IN NO EVENT SHALL MICROSOFT OR ITS SUPPLIERS BE LIABLE FOR ANY SPECIAL, INCIDENTAL, INDIRECT, OR CONSEQUENTIAL DAMAGES WHATSOEVER (INCLUDING, WITHOUT LIMITATION, DAMAGES FOR LOSS OF BUSINESS PROFITS, BUSINESS INTERRUPTION, LOSS OF BUSINESS INFORMATION, OR ANY OTHER PECUNIARY LOSS) ARISING OUT OF THE USE OF OR INABILITY TO USE THE SOFTWARE, EVEN IF MICROSOFT HAS BEEN ADVISED OF THE POSSIBLITY OF SUCH DAMAGES. BECAUSE SOME STATES AND JURISDICTIONS DO NOT ALLOW THE EXCLUSION OR LIMITATION OF LIABILITY FOR CONSEQUENTIAL OR INCIDENTAL DAMAGES, THE ABOVE LIMITATION MAY NOT APPLY. MICROSOFT'S ENTIRE LIABILITY AND YOUR EXCLUSIVE REMEDY UNDER THIS EULA SHALL NOT EXCEED FIVE DOLLARS (US$5.00).

The following conditions also apply to your use of the Software:

The Software may be copied and distributed internally only, subject to the following conditions: All text must be copied without modification and all pages must be included; If software is included, all files on the disk(s) |must be copied without modification [the MS-DOS(R) utility diskcopy is appropriate for this purpose]; All components of this Software must be distributed together; and This Software may not be distributed to any third party.

If you are not a Microsoft Premier customer, Microsoft shall not provide technical support for this Software.

The Software is provided with RESTRICTED RIGHTS. Use, duplication, or disclosure by the Government is subject to restrictions set forth in subparagraph (c)(1)(ii) of the Rights in Technical Data and Computer Software clause at DFARS 252.227-7013 or subparagraphs (c)(1) and (2) of the Commercial Computer Software-Restricted Rights at 48 CFR 52.227- 19, as applicable. Manufacturer is Microsoft Corporation, One Microsoft Way, Redmond, WA 98052-6399. Any transfer of the Software must be accompanied by this statement and may only be transferred if first approved by Microsoft.

You agree that you will not export or re-export the Software to any country, person, entity or end user subject to U.S.A. export restrictions, and you are responsible for complying with all applicable U.S. and local export laws in connection with the use of this Software. You warrant and represent that neither the U.S.A. Bureau of Export Administration nor any other federal agency has suspended, revoked or denied you export privileges.

This EULA is governed by the laws of the State of Washington, U.S.A.

Additional query words: appnote regress stats tool pack toolpack graph chart charting


Keywords          : kbappnote kbhowto xladdin xlchart 
Version           : MACINTOSH:3.0,4.0,5.0,5.0a; WINDOWS:3.0,4.0,4.0a,5.0,5.0c,7.0,7.0a
Platform          : MACINTOSH WINDOWS 
Issue type        : kbhowto 

Last Reviewed: April 7, 1999