Introduction
This paper uses the chi-square test to examine whether there was an association between the class in which female passengers traveled on the Titanic and their chances of being saved when the ship sank. After a brief description of the data file, the assumptions of the test are checked; the research question, hypotheses, and alpha level are stated; and the results of the test are reported. The paper concludes with a discussion of the strengths and limitations of the chi-square test.
Data File Description
The chi-square test is used to compare the probability of being saved for first-class female passengers of the Titanic with that for third-class female passengers (Warner, 2013, p. 343). The first variable is Class, which is nominal and dichotomous (1=first class, 2=third class). The second variable is Saved, which is also nominal and dichotomous (1=saved, 2=lost). The sample size is N=309.
Testing Assumptions
The two main assumptions of the test are met: there are two categorical variables, each of which consists of two independent groups (Field, 2013).
Another assumption is that of the expected frequencies: for a 2×2 table, no expected frequency should be lower than 5 (George & Mallery, 2016).
As can be seen from Table 1 below, for the first-class female passengers the expected frequency of survival is 100.7, whereas the expected frequency of dying is 43.3. For the third-class female passengers, the expected frequency of survival is 115.3, whereas the expected frequency of dying is 49.7.
All four expected frequencies are well above 5, so the assumption of expected frequencies is met. Consequently, Pearson’s χ2 statistic is reported in the chi-square test section below (Field, 2013).
Table 1. SPSS crosstabulation output.
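The expected frequencies can also be checked by hand. The following is a minimal sketch, not the SPSS procedure itself: it assumes the observed counts reported in Table 2 of this paper (140 and 4 for the first class, 76 and 89 for the third class), and the function and variable names are illustrative only. Each expected count is the row total multiplied by the column total, divided by N.

```python
# Observed counts as reported in Table 2 of this paper.
# Rows: class (first, third); columns: outcome (saved, lost).
observed = {
    "first": {"saved": 140, "lost": 4},
    "third": {"saved": 76, "lost": 89},
}


def expected_frequencies(table):
    """Return expected counts under independence: row total * column total / N."""
    row_totals = {row: sum(cols.values()) for row, cols in table.items()}
    col_totals = {}
    for cols in table.values():
        for col, count in cols.items():
            col_totals[col] = col_totals.get(col, 0) + count
    n_total = sum(row_totals.values())
    return {
        row: {col: row_totals[row] * col_totals[col] / n_total for col in cols}
        for row, cols in table.items()
    }


expected = expected_frequencies(observed)
for cls, cols in expected.items():
    for outcome, value in cols.items():
        print(f"{cls:>5} / {outcome:<5}: {value:.1f}")

# The assumption holds when no expected count falls below 5.
assumption_met = all(v >= 5 for cols in expected.values() for v in cols.values())
print("Expected-frequency assumption met:", assumption_met)
```

Because the check only needs the marginal totals, it can be run before the test statistic itself is computed.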
Research Question, Hypothesis, and Alpha Level
The research question for the χ2 test is as follows: “Is there an association between the class in which female passengers traveled (first vs. third) and their chances of survival after the Titanic sank?” The null hypothesis is that there was no association between the class in which women traveled and their chances of survival. The alternative hypothesis is that there was an association between the class in which women traveled and their chances of survival. No additional rationale for choosing the α-level is given, so the conventional α=.05 will be used.
Interpretation
Because the expected frequencies assumption was met, Pearson’s χ2 statistic will be reported.
Observed Frequencies
Table 2. Observed frequencies for the given variables (following the example provided by Warner, 2013, p. 322).
Table 2 above gives the observed frequencies. In the first class, 140 female passengers were saved and 4 were not. In the third class, by contrast, only 76 female passengers were saved, whereas 89 were not.
The Phi Coefficient and Effect Size
Table 3. The SPSS symmetric measures output.
Table 3 above supplies the symmetric measures output, which shows that φ=.556. The phi coefficient measures the association between two categorical variables (Field, 2013, sec. 18.5.3); it is a version of Pearson’s r for dichotomous variables, and its magnitude falls between 0 and 1 (Warner, 2013). The obtained value of φ can be interpreted as a positive association of moderate strength (Warner, 2013).
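The value of φ can be reproduced from the observed counts. The sketch below uses the standard 2×2 formula φ = (ad − bc) / √((a+b)(c+d)(a+c)(b+d)); the cell labels a, b, c, d and the function name are assumptions made for illustration and do not come from the SPSS output. The cells are ordered as first class saved, first class lost, third class saved, third class lost.

```python
import math


def phi_coefficient(a, b, c, d):
    """Phi for the 2x2 table [[a, b], [c, d]].

    The sign depends on how the categories are ordered; the magnitude does not.
    """
    numerator = a * d - b * c
    denominator = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return numerator / denominator


# Counts from Table 2: first class (saved, lost), third class (saved, lost).
phi = phi_coefficient(140, 4, 76, 89)
print(f"phi = {phi:.3f}")
```

With this ordering of rows and columns the coefficient is positive; swapping either the two rows or the two columns would flip its sign while leaving its magnitude unchanged.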
Chi-Square Tests
Table 4. The SPSS chi-square tests output.
Table 4 above supplies the SPSS output for the chi-square test. The output shows that χ2(1)=95.666, which is highly significant: p<.001. Therefore, the null hypothesis can be rejected in favor of the alternative hypothesis: there was a statistically significant association between the class in which female passengers traveled on the Titanic and their chances of being saved from the sinking ship.
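As a cross-check on the SPSS output, Pearson’s χ2 can be computed directly from the observed counts in Table 2. This is a minimal sketch using only the Python standard library; the function names are illustrative. The p-value shortcut is valid only for 1 degree of freedom, where a chi-square variate is the square of a standard normal variate, so the upper-tail probability equals erfc(√(χ2/2)).

```python
import math

# Observed counts from Table 2: rows are first and third class,
# columns are saved and lost.
observed = [[140, 4], [76, 89]]


def pearson_chi_square(table):
    """Sum of (observed - expected)^2 / expected over all cells."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / n_total
            chi2 += (obs - exp) ** 2 / exp
    return chi2


def p_value_df1(chi2):
    """Upper-tail probability for a chi-square statistic with 1 df only."""
    return math.erfc(math.sqrt(chi2 / 2))


chi2 = pearson_chi_square(observed)
p_value = p_value_df1(chi2)
print(f"chi-square(1) = {chi2:.3f}")
print(f"p < .001: {p_value < 0.001}")
```

For tables with more than one degree of freedom, the p-value would have to come from a general chi-square distribution function rather than this shortcut.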
Odds
Table 2 above shows that, in the first class, 140 women were saved and 4 died. Thus, the odds of survival for women from the first class were 140/4=35. In the third class, 76 women survived and 89 died. Thus, the odds of survival for women from the third class were 76/89=0.854.
The odds ratio of survival comparing first-class women with third-class women was therefore 35/0.854=40.984. In other words, the odds of survival for female passengers of the first class were much higher than those for women from the third class.
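The odds and the odds ratio can be recomputed from the counts in Table 2. The sketch below is illustrative only, and the helper name `odds` is an assumption. Note that without intermediate rounding the ratio is about 40.987; the value 40.984 arises from dividing by the third-class odds after rounding them to 0.854.

```python
def odds(saved, lost):
    """Odds of survival: number saved divided by number lost."""
    return saved / lost


# Counts from Table 2.
odds_first = odds(140, 4)
odds_third = odds(76, 89)

# Odds ratio comparing first-class women with third-class women.
odds_ratio = odds_first / odds_third

print(f"Odds, first class: {odds_first:.3f}")
print(f"Odds, third class: {odds_third:.3f}")
print(f"Odds ratio:        {odds_ratio:.3f}")
```

Carrying full precision through the division and rounding only the final result avoids the small discrepancy in the third decimal place.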
Conclusion
The chi-square analysis was conducted to find out whether the chances of being saved from the sinking Titanic differed between female passengers of the first and the third classes. There was a significant association between the class in which the passengers traveled and whether or not they were saved: χ2(1)=95.666, p<.001. The odds ratio was 40.984; in other words, the odds of surviving for first-class female passengers were 40.984 times as high as those for third-class female passengers. In addition, there was a positive association of moderate strength between the class and the survival status of the passengers: φ=.556.
As for the weaknesses of the chi-square test, it is rather sensitive to small sample sizes (Field, 2013). In addition, low expected frequencies create a need for a correction. Furthermore, the test requires frequency data, so percentage data cannot be used for this analysis. On the other hand, the chi-square test has a number of advantages. For example, it allows categorical data to be compared; it requires very few assumptions to be met; and it is relatively easy to calculate and interpret (Field, 2013; Warner, 2013).
References
Field, A. (2013). Discovering statistics using IBM SPSS Statistics (4th ed.). Thousand Oaks, CA: SAGE Publications.
George, D., & Mallery, P. (2016). IBM SPSS Statistics 23 step by step: A simple guide and reference (14th ed.). New York, NY: Routledge.
Warner, R. M. (2013). Applied statistics: From bivariate through multivariate techniques (2nd ed.). Thousand Oaks, CA: SAGE Publications.