Repeated Measures ANOVA: Analysis of the Statistics


Table of Contents

Exploratory Data Analysis
Repeated Measures ANOVA
Applying Analytical Strategies to an Area of Research Interest
References
Appendix

Exploratory Data Analysis

A high school teacher is interested in determining whether his students’ test scores increase over the course of a 12-week period. Descriptive statistics for the students’ test scores over the 12-week period are provided in Table 1. For the 12 students who participated in this study, the mean pretest score was 29.58 (SD = 12.22). The mean scores for Week 2, Week 4, Week 6, Week 8, Week 10 and Week 12 were 33.08 (SD = 10.11), 35.42 (SD = 9.89), 35.67 (SD = 10.67), 39.92 (SD = 9.72), 45.67 (SD = 8.69) and 50.00 (SD = 10.19) respectively. The minimum pretest score was 16 and the maximum was 59, whereas the minimum Week 2 score was 22 and the maximum was 60.

During Week 4, the minimum score increased to 27 and the maximum score increased to 63. The minimum and maximum scores for Week 6 were 20 and 60 respectively (indicating a decrease in both the minimum and the maximum score), while the minimum and maximum scores for Week 8 were 28 and 65 respectively. During Week 10, the minimum score increased to 33 and the maximum score increased to 67. Finally, in Week 12, both the minimum score (34) and the maximum score (73) reached their highest values of the entire course period. All the skewness and kurtosis values for the test scores were positive (Table 1), indicating that the score distributions were right-skewed and more peaked than a normal distribution rather than symmetrical.
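One way to judge whether the skewness values in Table 1 are large relative to sampling error is to divide each by its standard error. The sketch below uses the usual small-sample formulas for the standard errors of skewness and kurtosis, which reproduce the .637 and 1.232 reported for n = 12. The 1.96 cutoff is a common rule of thumb and is an assumption here, not something stated in the SPSS output; with only 12 students it should be read as a rough guide.

```python
import math


def skewness_se(n: int) -> float:
    """Standard error of sample skewness for a sample of size n."""
    return math.sqrt(6.0 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))


def kurtosis_se(n: int) -> float:
    """Standard error of sample excess kurtosis for a sample of size n."""
    return 2.0 * skewness_se(n) * math.sqrt((n * n - 1) / ((n - 3) * (n + 5)))


# Skewness statistics copied from Table 1 (n = 12 for every test).
skewness = {
    "Pre-test": 1.254,
    "Week 2": 1.797,
    "Week 4": 2.199,
    "Week 6": 0.771,
    "Week 8": 1.483,
    "Week 10": 1.119,
    "Week 12": 0.769,
}

n = 12
se = skewness_se(n)

# z = statistic / standard error; |z| > 1.96 is the assumed rule of thumb.
z_scores = {label: value / se for label, value in skewness.items()}

for label, z in z_scores.items():
    verdict = "above 1.96" if abs(z) > 1.96 else "below 1.96"
    print(f"{label}: z = {z:.2f} ({verdict})")
```

Every skewness value is positive, so any departure from symmetry is in the direction of a longer right tail.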

Figure 1 gives a visual display of test scores over the 12-week course according to gender. The clustered bar charts with error bars indicate that the mean test scores for females increased throughout the course. However, test scores for males were high in Week 2, declined slightly in Week 4 and Week 6, and then increased consecutively up to the end of the course (Week 12). It is also evident that, in general, males had higher mean scores in all the test periods except Week 6, where the mean score for females was slightly higher. These findings are confirmed by Table 2, which displays mean test scores for males and females over the 12-week course.

Table 2 indicates that the mean pretest score for males (32.25, SD = 19.43) was higher than that of females (28.25, SD = 8.17). The Week 2 mean score was also higher for males (39.75, SD = 13.89) than for females (29.75, SD = 6.32). The same pattern held in Week 4: males (39.00, SD = 16.43) against females (33.62, SD = 5.18). It is notable that the mean for males fell slightly compared to Week 2. During Week 6, the mean score for males dropped further (35.25, SD = 17.80) and females had a slightly higher mean score (35.88, SD = 6.56).

The mean test score for males, however, increased and exceeded that of females during Week 8: males (41.00, SD = 16.63) against females (39.38, SD = 5.37). The mean test score for males in Week 10 (47.25, SD = 13.96) was still higher than that of females (44.88, SD = 5.74), and this trend was maintained in the Week 12 test: males (53.25, SD = 13.79) versus females (48.38, SD = 8.52).
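The group means quoted from Table 2 can be checked against the overall means in the same table, since each overall mean should equal the sample-size-weighted average of the female (n = 8) and male (n = 4) means. The following is a small consistency check under that assumption; the 0.01 tolerance is an arbitrary allowance for the rounding in the printed table.

```python
def weighted_mean(groups):
    """Combine (mean, n) pairs into the mean of the pooled sample."""
    total_n = sum(n for _, n in groups)
    return sum(mean * n for mean, n in groups) / total_n


# (female mean, male mean, reported total mean) copied from Table 2.
table2 = {
    "Pre-test": (28.25, 32.25, 29.58),
    "Week 2": (29.75, 39.75, 33.08),
    "Week 4": (33.62, 39.00, 35.42),
    "Week 6": (35.88, 35.25, 35.67),
    "Week 8": (39.38, 41.00, 39.92),
    "Week 10": (44.88, 47.25, 45.67),
    "Week 12": (48.38, 53.25, 50.00),
}

N_FEMALE, N_MALE = 8, 4

# Difference between the recomputed and the reported total mean per test.
discrepancies = {}
for test, (female, male, reported) in table2.items():
    pooled = weighted_mean([(female, N_FEMALE), (male, N_MALE)])
    discrepancies[test] = abs(pooled - reported)
    print(f"{test}: pooled = {pooled:.2f}, reported = {reported:.2f}")
```

A discrepancy well above rounding error in any row would suggest a transcription slip in one of the group means.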

Repeated Measures ANOVA

A repeated-measures ANOVA was conducted to determine whether students’ scores increased over the 12-week course. Wilks’ lambda for scores is significant, F(6, 5) = 20.44, p = .002 (less than .05), but it is not significant for scores*gender, F(6, 5) = 0.80, p = .607 (Table 3). The other multivariate statistics in Table 3 (Pillai’s trace, Hotelling’s trace and Roy’s largest root) give identical F values; because multivariate tests do not assume sphericity, these results hold even if that assumption is violated. Table 6 shows that there is no linear interaction between scores and gender, F(1, 10) = 0.40, p = .539 (greater than .05). This implies that gender and the time at which a test is taken do not combine to affect the test scores attained by students.

The lower bound for epsilon as per Table 4 is .167, which is the lowest value epsilon can take with seven time points. Sphericity has been violated for these data since Mauchly’s test is significant, W = .001, χ²(20) = 56.88, p < .001 (Table 4). This indicates that the variances of the differences between scores at different time periods are unequal, which calls for an adjustment using the Greenhouse-Geisser epsilon or for a multivariate test. In this case, the Greenhouse-Geisser epsilon (.441) was used.
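The epsilon values in Table 4 act as multipliers on the degrees of freedom. As a minimal sketch, assuming the standard definitions (lower bound = 1/(k - 1) for k repeated measurements, and corrected df = epsilon × uncorrected df), the figures in Tables 4 and 5 can be reproduced as follows. The function names are illustrative only.

```python
def lower_bound_epsilon(levels: int) -> float:
    """Smallest possible epsilon for a within-subjects factor with `levels` levels."""
    return 1.0 / (levels - 1)


def corrected_df(df_effect: float, df_error: float, epsilon: float):
    """Scale both degrees of freedom by the same epsilon."""
    return df_effect * epsilon, df_error * epsilon


levels = 7          # pretest plus Weeks 2, 4, 6, 8, 10 and 12
n_subjects = 12     # students in the study
n_groups = 2        # female, male

df_effect = levels - 1                            # 6
df_error = (n_subjects - n_groups) * df_effect    # 60

gg_epsilon = 0.441  # Greenhouse-Geisser epsilon from Table 4

gg_df_effect, gg_df_error = corrected_df(df_effect, df_error, gg_epsilon)

print(f"lower bound epsilon = {lower_bound_epsilon(levels):.3f}")
print(f"Greenhouse-Geisser df = ({gg_df_effect:.2f}, {gg_df_error:.2f})")
```

Because the epsilon in the table is rounded to three decimals, the corrected error df computed here can differ from the SPSS value in the third decimal place.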

In the sphericity-assumed row of Table 5, the effect of scores is significant, F(6, 60) = 20.61, p < .001, but this test is not valid because sphericity has been violated. After the degrees of freedom are corrected with the Greenhouse-Geisser epsilon, the effect remains significant, F(2.65, 26.46) = 20.61, p < .001, so test scores differ across the time periods. For the scores*gender interaction, the F value is not significant after the Greenhouse-Geisser adjustment, F(2.65, 26.46) = 1.16, p = .341 (greater than .05). The change in scores over time therefore does not differ significantly between males and females.
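The F values and effect sizes in Table 5 follow directly from the sums of squares. The sketch below, using the standard definitions F = (SS_effect / df_effect) / (SS_error / df_error) and partial eta squared = SS_effect / (SS_effect + SS_error), also illustrates why the F value itself does not change under the Greenhouse-Geisser correction: both degrees of freedom are scaled by the same epsilon, so only the p-value changes. Function names are illustrative.

```python
def f_ratio(ss_effect, df_effect, ss_error, df_error):
    """F statistic as the ratio of effect and error mean squares."""
    return (ss_effect / df_effect) / (ss_error / df_error)


def partial_eta_squared(ss_effect, ss_error):
    """Proportion of effect-plus-error variance attributable to the effect."""
    return ss_effect / (ss_effect + ss_error)


# Type III sums of squares copied from Table 5.
SS_SCORES = 3246.536
SS_INTERACTION = 182.155
SS_ERROR = 1575.321

# Uncorrected and Greenhouse-Geisser corrected degrees of freedom (Table 5).
DF_EFFECT, DF_ERROR = 6, 60
GG_EPSILON = 0.441

f_scores = f_ratio(SS_SCORES, DF_EFFECT, SS_ERROR, DF_ERROR)
f_interaction = f_ratio(SS_INTERACTION, DF_EFFECT, SS_ERROR, DF_ERROR)

# Scaling both df by epsilon leaves the ratio of mean squares unchanged.
f_scores_gg = f_ratio(SS_SCORES, DF_EFFECT * GG_EPSILON,
                      SS_ERROR, DF_ERROR * GG_EPSILON)

eta_scores = partial_eta_squared(SS_SCORES, SS_ERROR)
eta_interaction = partial_eta_squared(SS_INTERACTION, SS_ERROR)

print(f"scores: F = {f_scores:.3f}, partial eta squared = {eta_scores:.3f}")
print(f"scores*Gender: F = {f_interaction:.3f}, "
      f"partial eta squared = {eta_interaction:.3f}")
```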

There is no main effect of gender, F(1, 10) = 0.48, p = .504 (greater than .05), as indicated in Table 7. This indicates that the gender of the student does not significantly affect the scores attained, averaged across the tests taken during the 12-week course. There was no need for post hoc tests on gender since it has fewer than three groups, and post hoc tests are performed only if a variable has at least three groups.

There is a main effect of time on test scores: the linear trend is significant, F(1, 10) = 46.91, p < .001 (Table 6). This implies that there is a change (increase) in scores from Week 0 to Week 12. To understand at what times scores increased or decreased during the course period, a post hoc test (LSD) was conducted.

The LSD post hoc test (Table 8) showed no significant increase in test score from Week 0 to Week 2, p = .213 (95% CI for the mean difference from -12.04 to 3.04). However, there was a significant increase in test score from Week 0 to Week 4, p = .031, as was the case from Week 0 to Week 6, p = .031. A comparison between Week 0 and Week 8 shows a significant increase in test scores, p = .004, and this was also observed for Week 10, p < .001. From Week 0 to the end of the course (Week 12), there was a significant increase in test score, p < .001. Comparing Week 2 to Week 4, there was no significant increase in test score (p = .335, which is greater than .05).

In the same way, there was no significant increase in test score from Week 2 to Week 6, p = .734, but there was a significant increase when comparing Week 2 with Week 8, p = .018. Comparing Week 4 with Week 6, there was no significant change in test score, p = .510, but there was a significant increase when comparing Week 4 with Week 8, p = .004. Comparing Week 6 with Week 8, there was a significant increase in test scores, p = .001. From Week 8 to Week 12, the test scores increased significantly, p < .05. The findings in Table 8 therefore indicate that there was an overall increase in test score with time, except in a few cases as explained by the post hoc analysis.
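The confidence intervals in Table 8 can be reconstructed from the mean difference and its standard error. The sketch below assumes the intervals are mean difference ± t × standard error with 10 error degrees of freedom, using the two-tailed 5% critical value of t for df = 10 (2.228, from a standard t table). It also counts the pairwise comparisons among seven time points and shows what a Bonferroni-adjusted alpha would be; the Bonferroni figure is offered only for comparison, since the analysis itself used LSD with no adjustment.

```python
import math

T_CRIT_DF10 = 2.228  # two-tailed 5% critical value of t with 10 df (t table)


def lsd_interval(mean_diff, std_error, t_crit=T_CRIT_DF10):
    """Unadjusted 95% confidence interval for a pairwise mean difference."""
    margin = t_crit * std_error
    return mean_diff - margin, mean_diff + margin


def excludes_zero(interval):
    """True when the interval lies entirely on one side of zero."""
    low, high = interval
    return low > 0 or high < 0


# (mean difference, standard error) copied from Table 8.
week0_vs_week2 = lsd_interval(-4.500, 3.383)
week0_vs_week4 = lsd_interval(-6.062, 2.412)
week4_vs_week6 = lsd_interval(0.750, 1.097)

# Number of distinct pairs among the seven time points.
n_comparisons = math.comb(7, 2)

# Per-comparison alpha if a Bonferroni adjustment were applied instead of LSD.
bonferroni_alpha = 0.05 / n_comparisons

for label, ci in [("Week 0 vs Week 2", week0_vs_week2),
                  ("Week 0 vs Week 4", week0_vs_week4),
                  ("Week 4 vs Week 6", week4_vs_week6)]:
    status = "significant" if excludes_zero(ci) else "not significant"
    print(f"{label}: 95% CI = ({ci[0]:.2f}, {ci[1]:.2f}), {status}")
```

An interval that contains zero corresponds to a non-significant comparison at the .05 level, which is why the Week 0 to Week 2 and Week 4 to Week 6 differences are not significant.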

Applying Analytical Strategies to an Area of Research Interest

I have an interest in working as an International Police Advisor (IPA) in support of Homeland Security. As an International Police Advisor, I will be required to handle several responsibilities. Partnering with U.S. military personnel to provide skilled persons in civilian law enforcement will be one of the main ones. International Police Advisors can serve at borders and points of entry in order to strengthen law enforcement there and, in particular, to prevent the entry of illegal persons, drugs and weapons. A great concern for United States Homeland Security has been the identification of persons who are likely to engage in acts of terrorism or general crime.

Positive identification of likely criminals requires a thorough understanding of human behavior. Specifically, an understanding of past criminal activities can be used to predict the likelihood of engaging in unlawful practices. In relation to my area of interest as an IPA, I would suggest conducting research to identify whether prior misconduct, past experiences and delinquency are related to the likelihood of engaging in serious criminal activities.

This can form a concrete and dependable ground for criminal investigations. In this study, it would be advisable to examine individuals’ prior marriage relationships in order to predict involvement in crime. This is based on Sampson and Laub’s (1993) argument that if persons who are on the verge of entering adulthood get involved in marriage relationships, their likelihood of engaging in crime is significantly reduced. As such, persons with unstable or no marriage relationships in this transitional period are more likely to engage in criminal activities.

Xu (2006) observes that antisocial behavior tends to persist throughout a person’s life. In that case, it is possible to study juvenile delinquency as a predictor of adult delinquency. Similarly, identifying social factors such as social networks and social capital would help predict involvement in criminal behavior. In addition, this study would look into whether gender influences involvement in antisocial activities. Answering these questions would make it possible to prevent criminals from perpetrating their acts, thus proving beneficial to the Department of Homeland Security.

According to Leech, Barrett and Morgan (2005), repeated measures ANOVA is most suitable when there is a single independent variable with at least two levels that are repeated measures and a single dependent variable. Also notable is that repeated measures ANOVA is carried out when “the same measurement is made several times on each subject or the same measurement is made on several related subjects” (Leech, Barrett and Morgan, 2005, p. 163). In this case, gender of the participant would be an independent nominal variable (0 = Female, 1 = Male). On the other hand, the number of criminal records would be the dependent variable, and this would be scale data.

The number of criminal records would be determined for the participants (males and females) at the ages of 15, 20, 25, 30, 35 and 40 years, giving six levels of measurement. Gender would be the between-subjects factor whereas the number of criminal records would be the repeated measure, recorded at the six ages. Using these variables, it would be possible to determine whether there is a main effect of gender on delinquency and a main effect of age on delinquency.

To identify whether the assumption of sphericity is violated in the data, one looks at the Mauchly’s Test of Sphericity output. If Mauchly’s W is significant (p < .05), sphericity is violated; otherwise the assumption of sphericity holds. If the assumption is violated, the variances of the differences between the repeated measures are unequal, so the uncorrected F test is invalid and requires correction. The correction is made by multiplying the degrees of freedom of the F test by the Greenhouse-Geisser or Huynh-Feldt epsilon (Field, 2009).
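The decision procedure described above can be written as a small rule. This is a sketch under stated assumptions: the .05 threshold for Mauchly’s test comes from the text, while the .75 cutoff for preferring Huynh-Feldt over Greenhouse-Geisser is a commonly cited rule of thumb that is assumed here rather than taken from the analysis in this paper. The function name and return labels are illustrative.

```python
def choose_correction(mauchly_p: float, gg_epsilon: float,
                      alpha: float = 0.05, cutoff: float = 0.75) -> str:
    """Pick which row of the within-subjects table to report.

    mauchly_p  : significance of Mauchly's test of sphericity
    gg_epsilon : Greenhouse-Geisser estimate of epsilon
    alpha      : threshold for declaring sphericity violated
    cutoff     : assumed rule-of-thumb boundary between the two corrections
    """
    if mauchly_p >= alpha:
        # Mauchly's test is not significant, so no correction is needed.
        return "sphericity assumed"
    if gg_epsilon > cutoff:
        # Mild violation: Greenhouse-Geisser tends to be too conservative.
        return "Huynh-Feldt"
    return "Greenhouse-Geisser"


# Test-score data from Table 4: Sig. is printed as .000 (p < .001);
# 0.0005 is used here only as a stand-in for that value.
decision = choose_correction(mauchly_p=0.0005, gg_epsilon=0.441)
print(decision)

# For the proposed crime study with six ages, epsilon could be no lower than:
lower_bound_six_ages = 1.0 / (6 - 1)
print(f"lower bound epsilon for six ages = {lower_bound_six_ages:.2f}")
```

Under these assumptions the rule selects the Greenhouse-Geisser row for the test-score data, which matches the correction used in the analysis above.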

References

Field, A. (2009). Discovering statistics using SPSS (3rd ed.). Los Angeles: Sage.

Leech, N. L., Barrett, K. C. and Morgan, G. A. (2005). SPSS for intermediate statistics: use and interpretation (2nd ed.). New Jersey, NJ: Lawrence Erlbaum Associates, Inc., Publishers.

Sampson, R. J. and Laub, J. H. (1993). Crime in the making: pathways and turning points through life. Cambridge, MA: Harvard University Press.

Xu, Q. (2006). From juvenile delinquency to adult criminal behavior: expanding the state dependence perspective on persistent criminal behavior. Not published.

Appendix

Table 1: Descriptive Statistics for Students’ Test Scores over a 12-Week Period.

Descriptive Statistics
	N	Range	Minimum	Maximum	Mean	Std. Error (Mean)	Std. Deviation	Variance	Skewness	Std. Error (Skewness)	Kurtosis	Std. Error (Kurtosis)
Participant	12	11	1	12	6.50	1.041	3.606	13.000	.000	.637	-1.200	1.232
Pre-test score	12	43	16	59	29.58	3.528	12.221	149.356	1.254	.637	1.889	1.232
Week 2 score	12	38	22	60	33.08	2.919	10.113	102.265	1.797	.637	4.305	1.232
Week 4 score	12	36	27	63	35.42	2.854	9.885	97.720	2.199	.637	5.778	1.232
Week 6 score	12	40	20	60	35.67	3.081	10.671	113.879	.771	.637	1.363	1.232
Week 8 score	12	37	28	65	39.92	2.805	9.718	94.447	1.483	.637	3.676	1.232
Week 10 score	12	34	33	67	45.67	2.509	8.690	75.515	1.119	.637	2.747	1.232
Week 12 score	12	39	34	73	50.00	2.941	10.189	103.818	.769	.637	1.359	1.232
Valid N (listwise)	12

Table 2: Descriptive Statistics for Test Scores according to Gender during the 12-Week Course.

Descriptive Statistics
	Gender	Mean	Std. Deviation	N
Pre-test score	Female	28.25	8.172	8
	Male	32.25	19.432	4
	Total	29.58	12.221	12
Week 2 score	Female	29.75	6.319	8
	Male	39.75	13.889	4
	Total	33.08	10.113	12
Week 4 score	Female	33.62	5.181	8
	Male	39.00	16.432	4
	Total	35.42	9.885	12
Week 6 score	Female	35.88	6.556	8
	Male	35.25	17.802	4
	Total	35.67	10.671	12
Week 8 score	Female	39.38	5.370	8
	Male	41.00	16.633	4
	Total	39.92	9.718	12
Week 10 score	Female	44.88	5.743	8
	Male	47.25	13.961	4
	Total	45.67	8.690	12
Week 12 score	Female	48.38	8.518	8
	Male	53.25	13.793	4
	Total	50.00	10.189	12

Table 3: Multivariate Tests for Test Scores and Gender.

Multivariate Tests^b
	Effect	Value	F	Hypothesis df	Error df	Sig.	Partial Eta Squared
scores	Pillai’s Trace	.961	20.439^a	6.000	5.000	.002	.961
	Wilks’ Lambda	.039	20.439^a	6.000	5.000	.002	.961
	Hotelling’s Trace	24.526	20.439^a	6.000	5.000	.002	.961
	Roy’s Largest Root	24.526	20.439^a	6.000	5.000	.002	.961
scores * Gender	Pillai’s Trace	.491	.804^a	6.000	5.000	.607	.491
	Wilks’ Lambda	.509	.804^a	6.000	5.000	.607	.491
	Hotelling’s Trace	.965	.804^a	6.000	5.000	.607	.491
	Roy’s Largest Root	.965	.804^a	6.000	5.000	.607	.491
a. Exact statistic
b. Design: Intercept + Gender Within Subjects Design: scores

Table 4: Mauchly’s Test of Sphericity for Test Score Data.

Mauchly’s Test of Sphericity^b
Measure:MEASURE_1
Within Subjects Effect	Mauchly’s W	Approx. Chi-Square	df	Sig.	Epsilon^a (Greenhouse-Geisser)	Epsilon^a (Huynh-Feldt)	Epsilon^a (Lower-bound)
scores	.001	56.876	20	.000	.441	.674	.167
Tests the null hypothesis that the error covariance matrix of the orthonormalized transformed dependent variables is proportional to an identity matrix.
a. May be used to adjust the degrees of freedom for the averaged tests of significance. Corrected tests are displayed in the Tests of Within-Subjects Effects table.
b. Design: Intercept + Gender Within Subjects Design: scores

Table 5: Tests of Within-Subjects Effects of Scores and Scores*Gender Interaction.

Tests of Within-Subjects Effects
Measure:MEASURE_1
Source		Type III Sum of Squares	df	Mean Square	F	Sig.	Partial Eta Squared
scores	Sphericity Assumed	3246.536	6	541.089	20.609	.000	.673
	Greenhouse-Geisser	3246.536	2.646	1227.164	20.609	.000	.673
	Huynh-Feldt	3246.536	4.045	802.659	20.609	.000	.673
	Lower-bound	3246.536	1.000	3246.536	20.609	.001	.673
scores * Gender	Sphericity Assumed	182.155	6	30.359	1.156	.342	.104
	Greenhouse-Geisser	182.155	2.646	68.853	1.156	.341	.104
	Huynh-Feldt	182.155	4.045	45.035	1.156	.344	.104
	Lower-bound	182.155	1.000	182.155	1.156	.307	.104
Error(scores)	Sphericity Assumed	1575.321	60	26.255
	Greenhouse-Geisser	1575.321	26.456	59.546
	Huynh-Feldt	1575.321	40.447	38.948
	Lower-bound	1575.321	10.000	157.532

Table 6: Tests of Within Subjects Contrasts for Scores and Scores*Gender.

Tests of Within-Subjects Contrasts
Measure:MEASURE_1
Source	scores	Type III Sum of Squares	df	Mean Square	F	Sig.	Partial Eta Squared
scores	Linear	2962.680	1	2962.680	46.905	.000	.824
	Quadratic	143.040	1	143.040	4.305	.065	.301
	Cubic	51.361	1	51.361	2.242	.165	.183
	Order 4	73.724	1	73.724	2.765	.127	.217
	Order 5	3.584	1	3.584	.405	.539	.039
	Order 6	12.147	1	12.147	4.444	.061	.308
scores * Gender	Linear	25.537	1	25.537	.404	.539	.039
	Quadratic	21.254	1	21.254	.640	.442	.060
	Cubic	66.694	1	66.694	2.911	.119	.225
	Order 4	55.767	1	55.767	2.092	.179	.173
	Order 5	5.060	1	5.060	.572	.467	.054
	Order 6	7.841	1	7.841	2.869	.121	.223
Error(scores)	Linear	631.638	10	63.164
	Quadratic	332.272	10	33.227
	Cubic	229.083	10	22.908
	Order 4	266.594	10	26.659
	Order 5	88.403	10	8.840
	Order 6	27.330	10	2.733

Table 7: Tests of Between-Subjects Effects Significance Table.

Tests of Between-Subjects Effects
Measure:MEASURE_1 Transformed Variable:Average
Source	Type III Sum of Squares	df	Mean Square	F	Sig.	Partial Eta Squared
Intercept	114349.339	1	114349.339	188.733	.000	.950
Gender	290.720	1	290.720	.480	.504	.046
Error	6058.804	10	605.880

Table 8: Post-hoc Test to Explain the Main Effect of Time.

Pairwise Comparisons
Measure:MEASURE_1
(I) scores	(J) scores	Mean Difference (I-J)	Std. Error	Sig.^a	95% CI for Difference^a: Lower Bound	95% CI for Difference^a: Upper Bound
1	2	-4.500	3.383	.213	-12.039	3.039
	3	-6.062^*	2.412	.031	-11.437	-.688
	4	-5.312^*	2.117	.031	-10.029	-.596
	5	-9.937^*	2.693	.004	-15.938	-3.937
	6	-15.813^*	2.683	.000	-21.791	-9.834
	7	-20.563^*	3.524	.000	-28.415	-12.710
2	1	4.500	3.383	.213	-3.039	12.039
	3	-1.562	1.542	.335	-4.998	1.873
	4	-.812	2.324	.734	-5.990	4.365
	5	-5.437^*	1.914	.018	-9.701	-1.174
	6	-11.312^*	2.063	.000	-15.909	-6.716
	7	-16.062^*	3.154	.000	-23.089	-9.036
3	1	6.062^*	2.412	.031	.688	11.437
	2	1.562	1.542	.335	-1.873	4.998
	4	.750	1.097	.510	-1.693	3.193
	5	-3.875^*	1.058	.004	-6.233	-1.517
	6	-9.750^*	1.336	.000	-12.726	-6.774
	7	-14.500^*	2.739	.000	-20.603	-8.397
4	1	5.312^*	2.117	.031	.596	10.029
	2	.812	2.324	.734	-4.365	5.990
	3	-.750	1.097	.510	-3.193	1.693
	5	-4.625^*	1.019	.001	-6.895	-2.355
	6	-10.500^*	1.202	.000	-13.177	-7.823
	7	-15.250^*	2.711	.000	-21.291	-9.209
5	1	9.937^*	2.693	.004	3.937	15.938
	2	5.437^*	1.914	.018	1.174	9.701
	3	3.875^*	1.058	.004	1.517	6.233
	4	4.625^*	1.019	.001	2.355	6.895
	6	-5.875^*	.716	.000	-7.471	-4.279
	7	-10.625^*	2.065	.000	-15.226	-6.024
6	1	15.813^*	2.683	.000	9.834	21.791
	2	11.312^*	2.063	.000	6.716	15.909
	3	9.750^*	1.336	.000	6.774	12.726
	4	10.500^*	1.202	.000	7.823	13.177
	5	5.875^*	.716	.000	4.279	7.471
	7	-4.750^*	1.705	.019	-8.548	-.952
7	1	20.563^*	3.524	.000	12.710	28.415
	2	16.062^*	3.154	.000	9.036	23.089
	3	14.500^*	2.739	.000	8.397	20.603
	4	15.250^*	2.711	.000	9.209	21.291
	5	10.625^*	2.065	.000	6.024	15.226
	6	4.750^*	1.705	.019	.952	8.548
Based on estimated marginal means. Scores 1 to 7 denote the pretest (Week 0) and Weeks 2, 4, 6, 8, 10 and 12 respectively.
a. Adjustment for multiple comparisons: Least Significant Difference (equivalent to no adjustments).
*. The mean difference is significant at the .05 level.

Figure 1: Clustered bar chart with error bars of mean test scores by gender over the 12-week course.