The slope of a standardized regression line, known as a standardized regression coefficient, is a statistical measure that indicates the magnitude of an explanatory variable’s effect on a dependent variable, expressed in standard deviation units. It is often used in multiple regression analyses when the variables are measured on different scales (for example, income measured in dollars and family size measured in number of individuals). In these cases, standardization adjusts for differences in the sample variances of the variables.
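As an illustration of this rescaling, the sketch below fits a two-predictor regression and converts each unstandardized slope to a standardized one by multiplying it by the ratio of the predictor's sample standard deviation to the outcome's. The data values and the function names (`ols_two_predictors`, `standardize`, `zscores`) are invented for this example and do not come from any library; only the Python standard library is assumed.

```python
from statistics import mean, stdev


def ols_two_predictors(x1, x2, y):
    """Slopes (b1, b2) of y = a + b1*x1 + b2*x2 from the centered normal equations."""
    m1, m2, my = mean(x1), mean(x2), mean(y)
    d1 = [v - m1 for v in x1]
    d2 = [v - m2 for v in x2]
    dy = [v - my for v in y]
    s11 = sum(a * a for a in d1)
    s22 = sum(a * a for a in d2)
    s12 = sum(a * b for a, b in zip(d1, d2))
    s1y = sum(a * b for a, b in zip(d1, dy))
    s2y = sum(a * b for a, b in zip(d2, dy))
    det = s11 * s22 - s12 ** 2  # zero only if x1 and x2 are perfectly collinear
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return b1, b2


def standardize(b, x, y):
    """Rescale an unstandardized slope b into standard deviation units."""
    return b * stdev(x) / stdev(y)


def zscores(v):
    """Center v at zero and scale it to unit sample standard deviation."""
    m, s = mean(v), stdev(v)
    return [(a - m) / s for a in v]


# Hypothetical data: income in dollars, family size in persons, spending in dollars.
income = [32000.0, 45000.0, 51000.0, 60000.0, 75000.0, 88000.0]
family_size = [1.0, 3.0, 2.0, 4.0, 3.0, 5.0]
spending = [410.0, 640.0, 600.0, 820.0, 850.0, 1100.0]

b_income, b_family = ols_two_predictors(income, family_size, spending)
beta_income = standardize(b_income, income, spending)
beta_family = standardize(b_family, family_size, spending)

# Cross-check: regressing z-scores on z-scores should give the same values.
z_income, z_family = ols_two_predictors(
    zscores(income), zscores(family_size), zscores(spending)
)

print("unstandardized:", b_income, b_family)
print("standardized:  ", beta_income, beta_family)
```

The unstandardized slopes are hard to compare because one is per dollar of income and the other is per family member; the standardized values are both in standard deviation units.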
Using standardized coefficients enables comparisons across studies that use the same model and population. However, when a standardized coefficient estimate changes because the sample standard deviations change, the change may reflect either a real difference in the underlying relationship between the two variables or a bias caused by particularities of the data samples involved (e.g., skewed, asymmetric or multimodal distributions).
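A small sketch can make this caveat concrete. Below, two constructed samples share exactly the same unstandardized slope, yet their standardized coefficients differ only because the predictor is spread more widely in one sample than in the other. The data are artificial and the helper names (`slope`, `standardized_slope`) are illustrative assumptions.

```python
from statistics import mean, stdev


def slope(x, y):
    """Unstandardized least-squares slope of y on x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def standardized_slope(x, y):
    """Slope rescaled by the ratio of sample standard deviations."""
    return slope(x, y) * stdev(x) / stdev(y)


# The noise is chosen to be uncorrelated with x in both samples,
# so the fitted slope is 2 in each of them.
noise = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
x_narrow = [1.0, 1.0, 2.0, 2.0, 3.0, 3.0]
x_wide = [1.0, 1.0, 5.0, 5.0, 9.0, 9.0]
y_narrow = [2.0 * x + e for x, e in zip(x_narrow, noise)]
y_wide = [2.0 * x + e for x, e in zip(x_wide, noise)]

print("slopes:", slope(x_narrow, y_narrow), slope(x_wide, y_wide))
print("standardized:", standardized_slope(x_narrow, y_narrow),
      standardized_slope(x_wide, y_wide))
```

With these numbers the standardized coefficient is roughly 0.85 in the narrow sample and roughly 0.99 in the wide one, although the relationship between x and y is the same in both.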
Step-by-Step Tutorial on Calculating the Standardized Regression Line
In general, a standardized regression coefficient is interpreted in much the same way as a correlation coefficient: the closer its value is to 1 or -1, the stronger the relationship between the variables. With a single predictor, the standardized coefficient equals the Pearson correlation between the predictor and the outcome. Some authors report unstandardized regression coefficients in addition to standardized ones. This can lead to confusion because the interpretation of unstandardized regression coefficients depends on the original scales of the outcome and predictor variables, whereas standardized regression coefficients are expressed in units of the sample standard deviations of the original variables.
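The link to the correlation coefficient can be checked directly in the single-predictor case, where the standardized slope and the Pearson correlation are the same quantity. The sketch below uses made-up data and illustrative function names (`pearson_r`, `standardized_simple_slope`); it does not cover multiple regression, where the two quantities generally differ.

```python
from math import sqrt
from statistics import mean


def sums_of_squares(x, y):
    """Return (Sxx, Syy, Sxy), the centered sums of squares and cross-products."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxx, syy, sxy


def pearson_r(x, y):
    """Pearson correlation between x and y."""
    sxx, syy, sxy = sums_of_squares(x, y)
    return sxy / sqrt(sxx * syy)


def standardized_simple_slope(x, y):
    """Unstandardized slope times s_x / s_y for a one-predictor regression."""
    sxx, syy, sxy = sums_of_squares(x, y)
    n = len(x)
    b = sxy / sxx
    s_x = sqrt(sxx / (n - 1))
    s_y = sqrt(syy / (n - 1))
    return b * s_x / s_y


# Hypothetical data: hours studied and test score.
hours = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
score = [52.0, 55.0, 61.0, 60.0, 68.0, 71.0]

r = pearson_r(hours, score)
beta = standardized_simple_slope(hours, score)
print(r, beta)  # the two values agree up to rounding error
```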
This article explains how the standardized regression coefficients computed by the STB option in PROC REG (and other SAS regression procedures) relate to the parameter estimates for the original variables, and it demonstrates how to calculate the same t-test statistic for a standardized regression coefficient as for an unstandardized one. Furthermore, it shares some practical recommendations for meta-analysts who want to combine standardized and unstandardized regression coefficients when pooling results. Finally, it defines four conditions that must be satisfied in complete data for a regression coefficient to be a valid standardized coefficient.
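The claim about the t statistic can be sketched outside SAS for a one-predictor model. Assuming the sample standard deviations are treated as fixed constants, z-scoring both variables rescales the slope and its standard error by the same factor, so their ratio is unchanged. The code below does not reproduce PROC REG or the STB option; the data and the function name `slope_and_t` are illustrative assumptions.

```python
from math import sqrt
from statistics import mean, stdev


def slope_and_t(x, y):
    """Least-squares slope of y on x and its t statistic (slope / standard error)."""
    n = len(x)
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    b = sxy / sxx
    intercept = my - b * mx
    sse = sum((yi - (intercept + b * xi)) ** 2 for xi, yi in zip(x, y))
    se_b = sqrt(sse / (n - 2) / sxx)  # residual variance uses n - 2 degrees of freedom
    return b, b / se_b


def zscores(v):
    """Center v at zero and scale it to unit sample standard deviation."""
    m, s = mean(v), stdev(v)
    return [(a - m) / s for a in v]


# Hypothetical data.
x = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0]
y = [3.0, 6.5, 6.0, 10.0, 9.5, 14.0, 13.5]

b_raw, t_raw = slope_and_t(x, y)
b_std, t_std = slope_and_t(zscores(x), zscores(y))

print("unstandardized slope:", b_raw, "t:", t_raw)
print("standardized slope:  ", b_std, "t:", t_std)
```

The standardized slope equals the unstandardized slope multiplied by the ratio of the sample standard deviations of x and y, and the two t statistics agree up to rounding error.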