Introduction
This paper contains the solutions to week 8’s application assignment, namely, Independent-Samples t Test with SPSS. The assignment consists of nine tasks. The first task is to state the statistical assumptions that underlie an independent-samples t test. The second task is to select a dataset and from it select the independent and dependent variables. The third task is to formulate a null and an alternative hypothesis. The fourth task is to use SPSS to calculate an independent-samples t test. The fifth task is to report the p value and the confidence interval obtained from the test. The sixth task is to interpret the confidence interval. The seventh task is to decide whether or not to accept the null hypothesis. The eighth task is to generate the SPSS syntax and output files. The final task is to report the results of the SPSS analysis using correct APA format.
An independent-samples t test is a statistical inference procedure that compares, for a given variable, the means of two samples drawn from two different populations. In statistical inference, the sample mean is taken as the best estimator of the true (population) mean.
Assumptions underlying test
A number of assumptions underlie an independent-samples t test. One assumption is that each observation in the datasets being compared is independent of the others. Furthermore, it is assumed that the datasets being compared are themselves independent of each other. Another assumption is that the datasets come from two different populations that each follow a normal distribution (Frankfort-Nachmias and Nachmias, 2008). In the pooled-variance form of the t test it is also assumed that these populations have equal variances.
Dependent and independent variables
In this assignment, the goal of the independent-samples t test is to determine whether the number of people living with HIV/AIDS differs significantly between two distinct African regions. The regions compared in the test are eastern and western Africa. The datasets used for this assignment therefore comprise data on HIV/AIDS in eastern and western Africa. The datasets are obtained from the article “Sub-Saharan Africa HIV/AIDS statistics”, which gives statistics on HIV/AIDS in Sub-Saharan Africa for the year 2009 (Avert, 2011). The dataset for eastern Africa comprises data from six countries in that region and is captured in the table shown in Appendix A. The dataset for western Africa comprises data from fifteen countries in that region and is captured in the table shown in Appendix B. For these two datasets, the independent variable is Region and the dependent variable is People living with HIV/AIDS.
Null and alternative hypotheses
Given that the goal of the independent-samples t test in this case is to determine whether the number of people living with HIV/AIDS in eastern and western Africa is significantly different, we formulate the following hypotheses: H0: μ1 = μ2 versus H1: μ1 ≠ μ2, where μ1 and μ2 are the population mean numbers of people living with HIV/AIDS in eastern and western Africa respectively. H0 is the null hypothesis and states that there is no significant difference between the number of people living with HIV/AIDS in eastern and western Africa. H1 is the alternative hypothesis and states that there is a significant difference between the number of people living with HIV/AIDS in eastern and western Africa. Because H1 does not specify the direction of the difference, the independent-samples t test to be carried out is 2-tailed.
When the null hypothesis is rejected although it is in fact true, a first kind of error occurs; when it is not rejected although it is in fact false, a second kind occurs. The first kind of error is a type I error while the second is a type II error. The probability of committing a type I error is fixed in advance and is known as the level of significance. In hypothesis testing the null hypothesis can be accepted on either of two bases. On the first basis, the value of the parameter stated in the null hypothesis must lie inside an appropriate confidence interval computed from the samples (datasets) being used in the test. On the second basis, the p value, which is the probability of obtaining a result at least as extreme as the one observed when the null hypothesis is true, must be greater than the level of significance (Mason et al., 1999).
The test
To test the above hypotheses using SPSS’s Independent-Samples T Test procedure, an SPSS file is created with two variables. The first variable is of a numeric data type and is called Number_of_people_with_HIVAIDS. The second variable is also of a numeric data type and is called Region. Region is the grouping variable and is of nominal measure, whereas Number_of_people_with_HIVAIDS is the test variable and is of scale measure. The data has been grouped using SPSS’s “Use specified values” option, whereby the eastern Africa region is assigned the value 1 and the western Africa region is assigned the value 2. Having defined these variables appropriately, the SPSS syntax shown in Appendix C is run. The output of this syntax is two tables, which are also shown in Appendix C. By default SPSS uses a 95% level of confidence.
Test results
The first table, which is titled Group Statistics, shows descriptive statistics for the two datasets. The sample mean and sample standard deviation obtained from the dataset for eastern Africa are 718833.33 and 718264.691 respectively. The standard error of the mean for this dataset is 293230.332. The sample mean and sample standard deviation obtained from the dataset for western Africa are 314333.33 and 833757.045 respectively. The standard error of the mean for this dataset is 215275.143.
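The standard errors in this table can be checked by hand, since the standard error of a mean is the sample standard deviation divided by the square root of the sample size. The following is a minimal Python sketch, assuming sample sizes of 6 and 15 (the number of countries in each dataset); the function name is illustrative and is not part of SPSS.

```python
import math


def standard_error_of_mean(sd, n):
    """Standard error of the mean: sample SD divided by sqrt(n)."""
    return sd / math.sqrt(n)


# Sample standard deviations reported in the group statistics table.
se_east = standard_error_of_mean(718264.691, 6)    # eastern Africa, 6 countries
se_west = standard_error_of_mean(833757.045, 15)   # western Africa, 15 countries

print(round(se_east, 3), round(se_west, 3))
```

Both values should agree with the SPSS output to within rounding.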
The second table, titled Independent Samples Test, shows the results of the independent-samples test. The independent-samples test procedure in SPSS reports two tests. In the first test, the variances of the two samples are assumed to be equal. In the second test, the variances of the two samples are not assumed to be equal. Levene’s test for equality of variances is used to decide which of the two sets of results should be adopted. For these datasets the significance value of Levene’s test is 0.445. Since it is greater than 0.10, it is safe to assume that the datasets have equal variances, and thus the results of the first test are taken. Therefore, the observed t statistic is 1.040, the degrees of freedom are 19, the sample statistic (the difference between the two sample means) is 404500, the p value is 0.311, and the 95% confidence interval for the difference lies between -409348.613 and 1218348.613.
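The equal-variances results can be reproduced approximately from the summary statistics alone. The sketch below is a hand calculation in Python, not the SPSS procedure itself. It assumes sample sizes of 6 and 15 and uses a two-tailed 95% critical value of about 2.093 for 19 degrees of freedom, taken from a standard t table; all variable names are illustrative.

```python
import math

# Summary statistics from the group statistics table.
n1, mean1, sd1 = 6, 718833.33, 718264.691    # eastern Africa
n2, mean2, sd2 = 15, 314333.33, 833757.045   # western Africa

# Pooled variance, used when equal variances are assumed.
df = n1 + n2 - 2
pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df

# Standard error of the difference between the two sample means.
se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))

mean_diff = mean1 - mean2
t_stat = mean_diff / se_diff

# Assumption: two-tailed 95% critical value of t for 19 degrees of
# freedom, read from a standard t table (about 2.093).
T_CRIT_95_DF19 = 2.093
ci_lower = mean_diff - T_CRIT_95_DF19 * se_diff
ci_upper = mean_diff + T_CRIT_95_DF19 * se_diff

print(df, round(t_stat, 3), round(ci_lower), round(ci_upper))
```

Because the critical value is rounded to three decimal places, the interval limits differ slightly from those reported by SPSS. The p value is not computed here, since it requires the cumulative t distribution.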
Interpreting confidence interval and conclusion
Confidence intervals are interpreted in terms of how frequently they contain the parameter of interest, which in this case is the true difference between the two population means: if the sampling were repeated many times, about 95% of the intervals constructed in this way would contain the true difference. The region outside a confidence interval is referred to as the rejection region. If the value of the parameter stated in the null hypothesis, which here is a difference of zero, does not fall inside the rejection region, the null hypothesis is accepted.
In the test above the confidence interval runs from -409348.613 to 1218348.613 and therefore contains zero, the difference stated in the null hypothesis. Furthermore, the p value of 0.311 is greater than 0.05, which is the level of significance used in the test. Thus, based on these two facts, the null hypothesis is accepted and the conclusion is made that there is no significant difference between the number of people living with HIV/AIDS in eastern and western Africa.
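The two decision checks can be written out explicitly. The following Python sketch assumes the conventional 0.05 level of significance that corresponds to the 95% confidence level, and a null-hypothesis difference of zero; the function and variable names are illustrative only.

```python
def decision_checks(p_value, alpha, ci_lower, ci_upper, null_value=0.0):
    """Return the two checks used to decide on the null hypothesis.

    The first is True when the p value exceeds the significance level.
    The second is True when the confidence interval contains the value
    stated in the null hypothesis.
    """
    p_check = p_value > alpha
    ci_check = ci_lower <= null_value <= ci_upper
    return p_check, ci_check


# Values reported in the independent samples test table.
p_check, ci_check = decision_checks(0.311, 0.05, -409348.613, 1218348.613)
decision = "accept H0" if (p_check and ci_check) else "reject H0"
print(decision)
```

For a two-tailed test at matching confidence and significance levels, the two checks always agree, so either one is sufficient on its own.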
References
Avert. (2011). Sub-Saharan Africa HIV/AIDS statistics. Web.
Frankfort-Nachmias, C., and Nachmias, D. (2008). Research Methods in the Social Sciences (7th ed.). Worth Publishers; New York. 434-453.
Mason, R. D., Lind, D. A., and Marchal, W. G. (1999). Statistical Techniques in Business and Economics (10th ed.). Irwin/McGraw-Hill; USA. 316.