A correlational study starts from the assumption that two variables, neither of which the researcher can control, are related in some way. Such a study can show that the variables are associated, but it cannot show that one causes the other; in other words, correlation is not causation. For example, a correlational study might suggest a relation between public health and self-esteem, but it cannot prove a causal link, because other factors such as social relationships and individual characteristics might also play a role in the formation of self-esteem.
The criterion for the quantitative assessment of the relation between variables is the correlation coefficient. Interval estimation of the correlation coefficient helps to evaluate the strength of this relationship as well as its form and direction. If the variables are measured on an ordinal scale, the Spearman coefficient is used, while the Pearson correlation coefficient is appropriate for variables measured on an interval scale (Rosner, 2010). The Pearson correlation (typically just a “correlation”) between variables might be positive, negative, or absent. Two variables are positively correlated if there is a direct relation between them: small values of one variable correspond to small values of the other, and large values correspond to large values. Two variables are negatively correlated if there is an inverse relation between them: small values of one variable correspond to large values of the other and vice versa. The values of the correlation coefficient always lie in the range from -1 to +1, where values close to -1 or +1 indicate a strong relation and values close to 0 indicate a weak or absent linear relation. Thus, estimating the correlation coefficient makes it possible to establish links between the variables. The formula for the coefficient is constructed in such a way that, if the relationship between the values is linear, Pearson’s coefficient accurately reflects the closeness of this connection (Rosner, 2010). Therefore, it is also known as the Pearson coefficient of linear correlation. Inference based on Pearson’s correlation coefficient, such as interval estimation, assumes that the variables x and y are normally distributed.
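The difference between the two coefficients can be illustrated with a minimal Python sketch. It is not taken from Rosner (2010): the function names pearson, ranks, and spearman and the data values are hypothetical, chosen only for illustration, and the Spearman coefficient is computed here as the Pearson coefficient of the ranks, with tied values sharing the average of their ranks.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dx = [a - mean_x for a in x]  # deviations from the mean of x
    dy = [b - mean_y for b in y]  # deviations from the mean of y
    cross = sum(a * b for a, b in zip(dx, dy))
    return cross / sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))


def ranks(values):
    """Ranks starting at 1; tied values receive the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman coefficient: Pearson coefficient applied to the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical data: y grows with x, but not along a straight line.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]

print("Pearson: ", pearson(x, y))
print("Spearman:", spearman(x, y))
```

With these made-up values the Spearman coefficient equals 1, because the relation is perfectly monotonic, while the Pearson coefficient is below 1, because the relation is not linear.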
At the same time, intervals might be equal, when the difference between the maximum and minimum values is the same in each interval; unequal, when, for example, the interval width gradually increases and the uppermost interval is not completely closed; open, when only the upper or only the lower boundary is given; and closed, when both the lower and upper boundaries are given.
The reliability of an interval estimate of the correlation coefficient is determined by the probability that the interval built from the sample results contains the unknown parameter of the total population. This probability is called the confidence level, and the interval itself is called the confidence interval. Scholars usually choose a confidence level close to one, since the interval can then be expected to contain the true value of the parameter in the great majority of repeated series of observations. If the confidence level is close to one, the risk of error is small. This risk of error is the level of significance, which equals one minus the confidence level of the given interval. Public health studies most often use a confidence level of 0.95, or 95%, which corresponds to a significance level of 0.05.
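As an illustration of how such an interval might be computed, the sketch below uses the Fisher z-transformation, a common approach that the text above does not name. It assumes that the two variables follow a bivariate normal distribution, and the function name pearson_confidence_interval and the example values r = 0.5 and n = 50 are hypothetical.

```python
from math import atanh, sqrt, tanh
from statistics import NormalDist


def pearson_confidence_interval(r, n, confidence=0.95):
    """Approximate confidence interval for a Pearson correlation.

    Uses the Fisher z-transformation, which assumes bivariate normal data.
    r is the sample correlation and n is the sample size (n > 3).
    """
    if n <= 3:
        raise ValueError("the Fisher z interval needs more than 3 observations")
    alpha = 1 - confidence                            # level of significance
    z_critical = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for 95%
    z = atanh(r)                                      # Fisher transform of r
    standard_error = 1 / sqrt(n - 3)
    lower = tanh(z - z_critical * standard_error)     # back to the r scale
    upper = tanh(z + z_critical * standard_error)
    return lower, upper


# Hypothetical sample: r = 0.5 observed in n = 50 subjects.
print(pearson_confidence_interval(0.5, 50))
print(pearson_confidence_interval(0.5, 50, confidence=0.99))
```

For the hypothetical sample the 95% interval runs from roughly 0.26 to 0.68. Raising the confidence level to 0.99 lowers the risk of error but widens the interval.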
Interval estimation of the correlation coefficient not only addresses the question of whether the variables are dependent but also measures the degree of their relation when the variables follow a two-dimensional normal distribution. Therefore, in the normal case, one can both test hypotheses about the correlation and construct confidence intervals.
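A test of the hypothesis that the population correlation is zero might be sketched as follows. This is only one possible approach and an approximate one: it relies on the Fisher z-transformation and the normal distribution, it assumes bivariate normal data, and the function name correlation_z_test and the example values are hypothetical.

```python
from math import atanh, sqrt
from statistics import NormalDist


def correlation_z_test(r, n):
    """Approximate two-sided test of H0: the population correlation is zero.

    Returns the z statistic and the p-value, both based on the Fisher
    z-transformation, which assumes bivariate normal data and n > 3.
    """
    if n <= 3:
        raise ValueError("the Fisher z test needs more than 3 observations")
    z = atanh(r) * sqrt(n - 3)                 # approximately N(0, 1) under H0
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical samples, compared with a significance level of 0.05.
for r, n in [(0.5, 50), (0.1, 20)]:
    z, p = correlation_z_test(r, n)
    decision = "reject H0" if p < 0.05 else "do not reject H0"
    print(f"r={r}, n={n}: z={z:.2f}, p={p:.4f}, {decision}")
```

In the first hypothetical sample the p-value falls below 0.05, so the hypothesis of no correlation is rejected; in the second sample it is not.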
Reference
Rosner, B. (2010). Fundamentals of Biostatistics (7th ed.). New York, NY: Cengage Learning.