Correlation, Regression, and Causation

Correlation and regression: how they are similar, how they differ, and what each is used for

Correlation is a statistical technique used to evaluate the relationship between two sets of numerical values. It asks how an increase or decrease in one variable is associated with a change in the other. Correlation is also used to test a hypothesis about a relationship, to check the assumption of independence between samples, and to explore interactions between variables (Godfrey, 1985). The relationship between two variables can be either direct or indirect (inverse). In a direct relationship, a decrease in one variable goes with a decrease in the other, and an increase with an increase. In an indirect relationship, one variable decreases as the other increases. The correlation coefficient measures the extent to which two variables are linearly related. Its value ranges between -1.00 and +1.00. Perfect correlation occurs when the value is -1.00 or +1.00. At 0, the two variables are uncorrelated, so there is no linear relationship between them. For the Pearson correlation coefficient, the variables should be continuous and approximately normally distributed. A coefficient close to 0.00 indicates a weak relationship, while one close to -1.00 or +1.00 indicates a strong relationship. Correlation thus enables the researcher to establish the strength and direction of the relationship between two variables. Statistical significance is usually tested at the 0.05 level: a p-value below 0.05 is taken as evidence of a relationship, whereas with a p-value above 0.05 the null hypothesis of no correlation is not rejected.
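To make the coefficient concrete, the following is a minimal sketch of how a Pearson correlation coefficient can be computed from the deviations of each value from its sample mean. The function name `pearson_r` and the three short data lists are hypothetical, invented for illustration only; they are not the data analyzed in this paper.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length (at least 2 values)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sums of squared deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a sample is constant")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical values, invented for illustration (not the paper's data)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
rises_with_x = [2.0, 4.0, 6.0, 8.0, 10.0]   # perfectly direct relationship
falls_with_x = [10.0, 8.0, 6.0, 4.0, 2.0]   # perfectly inverse relationship

r_direct = pearson_r(x, rises_with_x)
r_inverse = pearson_r(x, falls_with_x)
print(round(r_direct, 3), round(r_inverse, 3))
```

The two made-up series sit exactly on a straight line, so they reach the two ends of the coefficient's range, +1.00 and -1.00. Real data with scatter around the line would give a value strictly between those limits.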

Regression analysis, like correlation, examines the relationship between variables. However, regression goes further and fits a line through the data. It requires a dependent variable and one or more independent variables (O’Brien & Shampo, 1981). The main aim of the analysis is to predict how the dependent variable will behave given the values of the independent variable or variables.
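As an illustration of fitting a line and using it for prediction, here is a minimal sketch of ordinary least squares for one independent variable. It assumes the usual simple linear model y = intercept + slope * x. The function name `fit_line` and the data are hypothetical and are not taken from this paper.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx                     # change in y per unit change in x
    intercept = mean_y - slope * mean_x   # the line passes through the means
    return intercept, slope

# Hypothetical values, invented for illustration (not the paper's data)
x = [1.0, 2.0, 3.0, 4.0, 5.0]   # independent variable
y = [2.1, 3.9, 6.2, 7.8, 10.0]  # dependent variable

intercept, slope = fit_line(x, y)
predicted_at_6 = intercept + slope * 6.0  # prediction for a new x value
print(round(intercept, 2), round(slope, 2), round(predicted_at_6, 2))
```

The fitted slope is the quantity that carries the prediction: it states how much the dependent variable is expected to change for each unit change in the independent variable.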

Summary of a public health example

This paper uses correlation and simple linear regression to test the relationship between sodium (Na) and blood pressure (BP). Blood pressure is the dependent variable and sodium is the independent variable. This designation matters for the regression analysis, which requires an independent and a dependent variable in order to fit a directional line.

Regression analysis in SPSS

Steps

Go to Analyze on the menu bar, move to Regression, then select Linear.

Move BP to the box for the dependent variable (y) and Na to the box for the independent variable (x), then click OK.

Interpretation of the coefficient r

The multiple correlation coefficient is R = 0.920 (Table 1a, Appendix), which indicates a strong relationship between the two variables, with a coefficient of determination R² = 0.847. This means that the regression on sodium explains 84.7% of the variability in BP. In other words, sodium level accounts for most of the differences in blood pressure observed between persons. The y-intercept is -280.910 and the slope is 64.010 (Table 1b, Appendix), which gives the equation Y = -280.910 + 64.010x, where Y is blood pressure in mm Hg and x is sodium in g. The equation implies that every 1 g increase in sodium is associated with an increase in blood pressure of 64.01 mm Hg. The significance of the regression is p < 0.001 (Table 2b, Appendix), so the relationship between the two variables is statistically significant. The correlation analysis likewise shows a Pearson correlation of 0.920 with Sig. (2-tailed) reported as .000, that is, p < 0.001 (Table 2a, Appendix).
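The figures reported above can be cross-checked against one another. The sketch below copies the sums of squares and coefficients from the SPSS tables in the Appendix and recomputes R squared, R, F, and the t statistic for the slope. The two sodium values passed to `predict_bp` (6.5 g and 7.5 g) are hypothetical inputs chosen only to show what the slope means; the raw data are not reproduced in this paper, so nothing is assumed about their actual range.

```python
import math

# Values copied from the SPSS output in the Appendix
ss_regression = 1675.776   # ANOVA: regression sum of squares
ss_total = 1979.600        # ANOVA: total sum of squares
ms_residual = 37.978       # ANOVA: residual mean square
intercept = -280.910       # Coefficients: (Constant)
slope = 64.010             # Coefficients: Na (g)
slope_se = 9.636           # Coefficients: standard error of the slope

r_squared = ss_regression / ss_total   # share of BP variability explained
r = math.sqrt(r_squared)               # matches |Pearson r| with one predictor
f_stat = ss_regression / ms_residual   # regression df is 1, so MS equals SS
t_stat = slope / slope_se              # t statistic for the slope

def predict_bp(na_grams):
    """Predicted blood pressure (mm Hg) from the fitted equation."""
    return intercept + slope * na_grams

# Hypothetical inputs: a 1 g difference in Na shifts the prediction by the slope
delta = predict_bp(7.5) - predict_bp(6.5)

print(round(r_squared, 3), round(r, 3), round(f_stat, 3), round(t_stat, 3))
print(round(delta, 2))
```

The recomputed values agree with the tables to three decimals (R² = 0.847, R = 0.920, F = 44.125, t = 6.643), and the 1 g difference in sodium changes the predicted blood pressure by 64.01 mm Hg, which is the interpretation of the slope.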

Comparison between correlation and regression using an example from public health

Regression analysis requires the researcher to define dependent and independent variables, unlike correlation, which has no direction. In regression, the coefficient of the independent variable, its significance, and the overall model fit are used to interpret the results, while correlation is interpreted through the correlation coefficient, its significance, and the coefficient of determination (Wassertheil-Smoller, 1990). While correlation only indicates the strength of the relationship between two variables, regression further predicts the behavior of the dependent variable based on the independent ones. In the sodium example, correlation reports only r = 0.920, whereas regression also yields an equation for predicting BP from Na.
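The difference in direction can be seen numerically. In the minimal sketch below, swapping the two variables leaves the correlation coefficient unchanged, while the regression slope changes depending on which variable is treated as dependent. The helper name `deviation_sums` and the data are hypothetical, for illustration only.

```python
import math

def deviation_sums(xs, ys):
    """Return (sxy, sxx, syy) computed from deviations about the sample means."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy, sxx, syy

# Hypothetical values, invented for illustration (not the paper's data)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.0]

sxy, sxx, syy = deviation_sums(x, y)
r_xy = sxy / math.sqrt(sxx * syy)

# Same calculation with the roles of the two variables swapped
syx, syy_swapped, sxx_swapped = deviation_sums(y, x)
r_yx = syx / math.sqrt(syy_swapped * sxx_swapped)

slope_y_on_x = sxy / sxx   # regression with y as the dependent variable
slope_x_on_y = sxy / syy   # regression with x as the dependent variable

print(round(r_xy, 4), round(r_yx, 4))
print(round(slope_y_on_x, 4), round(slope_x_on_y, 4))
```

The two correlation values are identical, but the two slopes are not. Their product equals r squared, which links the two techniques: the slope is the correlation rescaled by the spread of the two variables.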

The statement “Correlation is not causation” implies that sodium should not be taken as the only cause of changes in blood pressure, nor as a proven cause on the strength of the correlation alone. Other factors can also cause changes in blood pressure (Holland, 1986).
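One way to see why a strong correlation does not establish causation is a small simulation with a hidden common factor. The sketch below is purely illustrative: the variables `hidden`, `a`, and `b` are simulated and hypothetical, and they are not a model of sodium or blood pressure. Both `a` and `b` are generated from `hidden` plus independent noise, so neither influences the other.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

random.seed(0)  # fixed seed so the simulation is repeatable
n = 2000

# Simulated, hypothetical variables: `hidden` drives both `a` and `b`,
# but `a` has no effect on `b` and `b` has no effect on `a`.
hidden = [random.gauss(0.0, 1.0) for _ in range(n)]
a = [h + random.gauss(0.0, 0.5) for h in hidden]
b = [h + random.gauss(0.0, 0.5) for h in hidden]

r_ab = pearson_r(a, b)

# Subtract the hidden factor; only the independent noise terms remain
a_rest = [ai - h for ai, h in zip(a, hidden)]
b_rest = [bi - h for bi, h in zip(b, hidden)]
r_rest = pearson_r(a_rest, b_rest)

print(f"r(a, b) = {r_ab:.2f}; after removing the hidden factor: {r_rest:.2f}")
```

By construction the expected correlation between `a` and `b` is 0.8, yet it disappears once the shared factor is removed. An observed correlation of similar size in real data is therefore compatible with a causal link, with a common cause, or with both.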

References

Godfrey, K. (1985). Simple linear regression in medical research. N Engl J Med, 313(26), 1629-1636.

Holland, P. W. (1986). Statistics and causal inference. Journal of the American Statistical Association, 81(396), 945-960.

O’Brien, P. C., & Shampo, M. A. (1981). Statistics for clinicians. 7. Regression. Mayo Clin Proc, 56(7), 452-454.

Wassertheil-Smoller, S. (1990). Biostatistics and epidemiology: a primer for health professionals. New York: Springer-Verlag.

Appendix

Model Summary

Model   R          R Square   Adjusted R Square   Std. Error of the Estimate
1       .920 (a)   .847       .827                6.163

a. Predictors: (Constant), Na (g)

Table 1a. Regression analysis for BP and Na: model summary

Coefficients (a)

Model           B          Std. Error   Beta   t        Sig.   95% CI Lower   95% CI Upper
1 (Constant)    -280.910   68.541              -4.098   .003   -438.965       -122.854
  Na (g)        64.010     9.636        .920   6.643    .000   41.789         86.231

B and Std. Error are the unstandardized coefficients, Beta is the standardized coefficient, and the 95.0% confidence interval is for B.
a. Dependent Variable: Blood pressure (mm Hg)

Table 1b. Regression analysis for BP and Na: coefficients

Correlations

                                               Na (g)    Blood pressure (mm Hg)
Na (g)                   Pearson Correlation   1.000     .920**
                         Sig. (2-tailed)                 .000
                         N                     10        10
Blood pressure (mm Hg)   Pearson Correlation   .920**    1.000
                         Sig. (2-tailed)       .000
                         N                     10        10

Table 2a. Correlation analysis for BP and Na

ANOVA

Model          Sum of Squares   df   Mean Square   F        Sig.
1 Regression   1675.776         1    1675.776      44.125   .000 (a)
  Residual     303.824          8    37.978
  Total        1979.600         9

a. Predictors: (Constant), Na (g)
b. Dependent Variable: Blood pressure (mm Hg)

Table 2b. ANOVA table for the regression analysis for BP and Na

Cite this paper

StudyCorgi. (2022, April 25). Correlation, Regression, and Causation. https://studycorgi.com/correlation-regression-and-causation/
