1.10 Simple Linear Regression
1. An investor receives a research report that analyzes two securities through a simple linear regression. The study discloses the coefficient of determination (R²) and the sum of squares error (SSE). To estimate the accuracy of the model using an absolute measure of the errors, the investor will most likely need to know the:

A. sample size.
B. regression coefficients.
C. correlation between the securities.

Explanation

The standard error of the estimate (sₑ) is an absolute (ie, not relative) measure of the accuracy of the regression, as reflected by the average distance between the predicted (Ŷ) and observed (Y) values of the dependent variable. Since sₑ is the square root of the mean square error (MSE), it is presented in the same units as Y. Smaller values of sₑ mean that the regression line is a better fit of the data.

In this scenario, sₑ is the correct statistic to use since the investor wants an absolute measure of the accuracy of the model. Other measures of goodness of fit, such as the coefficient of determination (R²) and the F-statistic, provide a relative measure and are not appropriate for this case.

Since this scenario does not provide the MSE, it must be calculated as the ratio of the sum of squares error (SSE), or residual sum of squares, to the degrees of freedom, which are derived from the sample size (n) and the number of variables. Since there are two variables and SSE is known, the investor will need to know n to calculate sₑ.

(Choices B and C) The regression coefficients (ie, the intercept and the slope) and the correlation between the variables are not necessary to calculate sₑ.

Things to remember:
The standard error of the estimate (sₑ) is an absolute (ie, not relative) measure of the accuracy of the regression, presented in the same data units as the dependent variable. sₑ is the square root of the mean square error (MSE), which is the ratio of the residual sum of squares to the degrees of freedom.
LOS: Describe the use of analysis of variance (ANOVA) in regression analysis, interpret ANOVA results, and calculate and interpret the standard error of estimate in a simple linear regression.

Copyright © UWorld. Copyright CFA Institute. All rights reserved.
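The degrees-of-freedom calculation described above can be sketched in Python. This is a minimal illustration, not from the question itself: the SSE and sample size below are hypothetical values chosen for the example.

```python
import math

def standard_error_of_estimate(sse: float, n: int, k: int = 1) -> float:
    """s_e = sqrt(SSE / (n - k - 1)).

    For a simple linear regression there is one independent variable
    (k = 1), so the degrees of freedom are n - 2.
    """
    return math.sqrt(sse / (n - k - 1))

# Hypothetical inputs: a disclosed SSE of 2.50 and a sample size of 62
print(round(standard_error_of_estimate(2.50, 62), 3))  # 0.204
```

Note that R² alone would not suffice here: without n, the SSE cannot be converted into the MSE, which is why the investor needs the sample size.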
2. An economist runs the following regression models between two macroeconomic datasets with variable transformations:

Regression                            M1                     M2
Model                                 lnY = b₀ + b₁X + ε     lnY = b₀ + b₁lnX + ε
Intercept                             5.723                  5.338
Slope                                 0.233                  1.184
Coefficient of determination (R²)     0.830                  0.573
Standard error of the estimate (sₑ)   0.415                  0.658
F-statistic                           137.001                37.624

Based only on this data, which of the following functional forms is the most appropriate fit for the sample data?

A. Lin-log
B. Log-lin
C. Log-log

Explanation

Linearity is one of the four key assumptions of a simple linear regression. However, even if data sets do not have a linear relationship, a simple linear regression can be used by transforming one or both variables using, for example, the square, the reciprocal, or the log of the variable. This results in distinct functional forms, allowing the regression to better fit a curvature. These transformed models have linear parameters (ie, the intercept b₀ and the slope b₁), although the variables are not linear.

In this scenario, M1 is a log-lin model, with a logarithmic dependent variable (lnY) and a linear independent variable (X), and M2 is a log-log model, with both variables in their logarithmic forms. The study shows that M1 has the largest coefficient of determination and F-statistic, and the smallest standard error of the estimate. Therefore, the log-lin model displays the best fit for the data (Choice C).

It is important to remember that regression statistics can be compared only when each regression's dependent variable has the same form. For instance, linear and lin-log models can be compared since the dependent variable for each is linear, but a lin-log model cannot be directly compared with a log-log model. In this question, M1 and M2 can be compared since both use lnY as the dependent variable.

(Choice A) Neither regression is in the lin-log functional form, which has a linear dependent variable (Y) and a logarithmic independent variable (lnX).
Things to remember:
Simple linear regressions may have different functional forms, including logarithmic transformations of one or both variables (ie, lin-log, log-lin, log-log). These transformed models have linear parameters, although the variables are not linear. The statistics of models with different forms of the dependent variable (eg, Y and lnY) are not comparable.

LOS: Describe different functional forms of simple linear regressions.
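The model comparison above can be sketched in Python. This is a hedged illustration on synthetic data, not the question's datasets: the sample is deliberately generated from a log-lin process, so we expect the log-lin fit to score higher. Because both candidate models use lnY as the dependent variable, their R² values are directly comparable.

```python
import numpy as np

def r_squared(x_t, y_t):
    """Fit y_t = b0 + b1 * x_t by ordinary least squares and return R^2."""
    b1, b0 = np.polyfit(x_t, y_t, 1)
    sse = np.sum((y_t - (b0 + b1 * x_t)) ** 2)   # sum of squares error
    sst = np.sum((y_t - y_t.mean()) ** 2)        # sum of squares total
    return 1.0 - sse / sst

# Hypothetical sample generated from a log-lin process: lnY is linear in X
rng = np.random.default_rng(42)
x = np.linspace(1.0, 10.0, 50)
y = np.exp(5.7 + 0.233 * x + rng.normal(0.0, 0.2, x.size))

r2_log_lin = r_squared(x, np.log(y))          # lnY regressed on X
r2_log_log = r_squared(np.log(x), np.log(y))  # lnY regressed on lnX
print(r2_log_lin > r2_log_log)
```

Comparing r2_log_lin against a hypothetical fit of Y (untransformed) on X would not be valid, since the dependent variables would differ in form.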
3. An economist runs a linear regression and analyzes the regression's residual plot:

[Figure: scatterplot of Y against X (left) and the corresponding residual plot (right)]

The most appropriate conclusion is that the pattern of the residual plot indicates the presence of:

A. outliers.
B. nonlinearity.
C. heteroskedasticity.

Explanation

A residual is the difference between the observed value of a dependent variable Y and the value predicted by a regression. In a simple linear regression (SLR), residuals are the vertical distances between the actual observations and the regression line, which is the straight line that minimizes the sum of the squares of all residuals (ie, the sum of squares error). The residual plot ideally should be scattered randomly; the presence of a pattern often indicates the violation of one or more assumptions underlying the SLR model.

In this residual plot (graph on the right), there is a pattern: a concentration of positive residuals in the lower and upper ranges of the independent variable X and of negative residuals in the middle range. This indicates a violation of linearity, which assumes that X and Y have a linear relationship. This violation is called nonlinearity and is evident in the scatterplot of Y against X (graph on the left).

Things to remember:
The presence of a pattern in a residual plot often indicates a violation of the assumptions of a linear regression model. If the residual plot shows a concentration of positive residuals in one range and of negative residuals in other range(s), then the relationship cannot be displayed as a straight line and the linearity assumption is violated. This is called nonlinearity.

LOS: Explain the assumptions underlying the simple linear regression model, and describe how residuals and residual plots indicate if these assumptions may have been violated.
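The residual pattern described above can be reproduced on synthetic data. This is a minimal sketch, assuming a hypothetical quadratic relationship: fitting a straight line to curved data yields residuals that are positive at the extremes of X and negative in the middle, exactly the nonlinearity signature the explanation describes.

```python
import numpy as np

# Hypothetical sample with a curved (quadratic) relationship between X and Y
x = np.linspace(-3.0, 3.0, 61)
y = x ** 2

# Fit a straight line anyway and inspect the sign pattern of the residuals
b1, b0 = np.polyfit(x, y, 1)
resid = y - (b0 + b1 * x)

outer = resid[np.abs(x) > 1.5]   # lower and upper ranges of X
inner = resid[np.abs(x) <= 1.5]  # middle range of X
print(outer.mean() > 0, inner.mean() < 0)  # True True
```

A random scatter of residual signs across the range of X would instead be consistent with the linearity assumption holding.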
4. A linear regression analysis generates the following ANOVA (analysis of variance) table:

ANOVA Table
Source       Degrees of freedom   Sum of squares   Mean squares   F-statistic
Regression   1                    9.5231           9.5231         163.4274
Residual     78                   4.5452           0.0583
Total        79                   14.0683

Based on this data, the standard error of the estimate (sₑ) is closest to:

A. 0.241
B. 0.323
C. 0.677

Explanation

The standard error of the estimate (sₑ) measures the accuracy of the regression as reflected by the average distance between each predicted Ŷ and observed Y. It is the square root of the mean square error (MSE) and is presented in the same units as Y. In this scenario:

sₑ = √MSE = √0.0583 ≈ 0.241

Smaller sₑ values mean that a regression has smaller errors, so the regression line is a better fit for the data. This important measure of goodness of fit can be derived from the ANOVA table for a simple linear regression since the table includes the MSE.

(Choice B) 0.323 results from dividing the sum of squares error (SSE) by the sum of squares total (SST), which equals 1 − R²; it is the proportion of the change in the dependent variable Y not explained by the model.

(Choice C) 0.677 is the coefficient of determination (R²), which measures the proportion of the change in Y explained by the independent variable X: R² = sum of squares regression (SSR) / SST.

Things to remember:
The standard error of the estimate (sₑ), derived from the ANOVA table, is a key measure to evaluate how the regression fits the data. It quantifies the accuracy of the regression (ie, the average error) and is measured in the same units as the dependent variable. It is the square root of the mean square error (MSE); smaller sₑ values mean that a regression has smaller errors, so the regression line is a better fit for the data.

LOS: Describe the use of analysis of variance (ANOVA) in regression analysis, interpret ANOVA results, and calculate and interpret the standard error of estimate in a simple linear regression.
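The three candidate answers can all be reproduced from the ANOVA figures. A short sketch using the values from the table above:

```python
import math

# Values taken from the ANOVA table above
sse, df_residual = 4.5452, 78
ssr, sst = 9.5231, 14.0683

mse = sse / df_residual   # mean square error ≈ 0.0583
se = math.sqrt(mse)       # standard error of the estimate (Choice A)
r2 = ssr / sst            # coefficient of determination (Choice C)
unexplained = sse / sst   # 1 - R^2 (Choice B)

print(round(se, 3), round(r2, 3), round(unexplained, 3))  # 0.241 0.677 0.323
```

The calculation confirms that only Choice A is the standard error of the estimate; the other two choices are relative, not absolute, measures of fit.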