* Display syntax commands in the Output Viewer .
SET Printback=On Length=59 Width=80.

* ---------------------------------------------------------------------- 
File:		anova1.sps
Author:   	Bruce Weaver, weaverb@mcmaster.ca
Date:		9-May-2002
Notes:	Example of one-way ANOVA using SPSS .
* ---------------------------------------------------------------------- .

* NOTE:  This syntax has been tested in version 11 of SPSS.  Some
*	    things may be slightly different if you have another
*	    version of SPSS .

* Read in some data on 3 different painkillers vs control .

DATA LIST LIST / group (f2.0) y (f3.1).
BEGIN DATA.
1	3.6
1	4.4
1	6.9
1	4.2
1	5.8
1	6.6
2	6.2
2	6.3
2	7.6
2	5.4
2	4.7
2	5.9
3	7.4
3	7.7
3	8.2
3	3.4
3	5.4
3	6.0
4	8.2
4	8.8
4	8.7
4	6.5
4	7.5
4	7.7
end data.

var lab y 'Score'.
val lab group
	1	'Aspirin'
	2	'Aceto'
	3	'Ibuprof'
	4	'Placebo'.

* 	List cases to show the structure of the data file .

list all.

* 	Note that there are 2 columns in the data file, 1 to code 
*	group membership (must be a numeric code in SPSS), and
*	one to record the DV for that subject (variable Y is the DV) .

* Graphs-->Bar .

GRAPH
  /BAR(SIMPLE)=MEAN(y) BY group
  /MISSING=REPORT
  /TITLE= 'Painkillers vs Placebo'.

*	Use INTERACTIVE GRAPHICS to produce graph with error bars .

* 	Graphs-->Interactive-->Bar .

IGRAPH /VIEWNAME='Bar Chart' 
	/X1 = VAR(group) TYPE = CATEGORICAL 
	/Y = VAR(y) TYPE = SCALE 
	/COORDINATE = VERTICAL  
	/TITLE='Painkillers vs Placebo' 
	/X1LENGTH = 3.0 
	/YLENGTH = 3.0 
	/X2LENGTH = 3.0 
	/CHARTLOOK = 'Grayscale.clo'
	/BAR(MEAN) KEY=ON SHAPE = RECTANGLE BASELINE = AUTO 
	/ERRORBAR CI(95.0)  DIRECTION = BOTH CAPWIDTH (45) CAPSTYLE = T.
EXE.


* 	There are several ways to perform a one-way ANOVA in SPSS.

* -----------  Using MEANS Procedure to perform one-way ANOVA  -------- .

* 	Perhaps the simplest is to use the /statistics=ANOVA subcommand 
*	of the MEANS procedure, as follows.

* 	ANALYZE-->COMPARE MEANS-->MEANS .

means y by group 
 /cells = count mean stddev variance 
 /stat=anova.

* 	The eta-squared you see in the output is computed as SS_Between 
*	divided by SS_Total; it is sometimes called the "correlation ratio",
*	and is similar to r-squared for a regression analysis.  In other 
*	words, it indicates the proportion of total variance that is
*	explained by the independent variable.

* ------  Using One-way ANOVA Procedure to perform one-way ANOVA  -------- .

* 	If you look under Analyze-->Compare Means in the pull-down menus, 
*	you will see that the last option is One-way ANOVA.  Here is some
*	syntax for that procedure (including descriptive statistics and a 
*	plot of the treatment means) .

* 	ANALYZE-->COMPARE MEANS-->ONE-WAY ANOVA .

ONEWAY
  y BY group
  /STATISTICS DESCRIPTIVES
  /PLOT MEANS
  /POSTHOC = DUNNETT ALPHA(.05)
  /MISSING ANALYSIS .


* 	Eta-squared is not available in the output using this method.
* 	However, a number of multiple comparison methods are available.
* 	For example, Dunnett's test compares each of k-1 treatment groups 
* 	to a control.  You will find it in the Post-hoc dialog box.

*	Dunnett's test shows that Aspirin is significantly different
*	from the Placebo control (p = .006); the difference between Aceto
*	and Placebo is marginal (p = .054); and the difference between
*	Ibuprof and Placebo is not statistically significant (p = .128).
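
* 	By default, DUNNETT treats the LAST category as the control, which
*	is why Placebo (group 4) served as the control above.  As a sketch,
*	the control can also be named explicitly in parentheses; this
*	assumes your version accepts DUNNETT(refcat).  Placebo is both
*	category 4 and the last category, so the comparisons should be
*	the same as those shown above .

ONEWAY
  y BY group
  /POSTHOC = DUNNETT(4) ALPHA(.05)
  /MISSING ANALYSIS .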

* 	Finally, note that a line graph showing the means is optional.
* 	Given that the IV is categorical and not ordinal, however, a
*	bar graph (as shown above) would be a better graph for these data.

* ------  Using GLM UNIVARIATE Procedure to perform one-way ANOVA  -------- .

* 	The first option under Analyze-->General Linear Model in the pull-down
*	menus is UNIVARIATE.  Here's how to perform a one-way ANOVA using
*	GLM UNIVARIATE (again with descriptive statistics and a plot of means).

* 	ANALYZE-->GENERAL LINEAR MODEL-->UNIVARIATE .

UNIANOVA
  y  BY group
  /METHOD = SSTYPE(3)
  /INTERCEPT = INCLUDE
  /POSTHOC = group ( DUNNETT)
  /PLOT = PROFILE(group)
  /EMMEANS = TABLES(group)
  /PRINT = DESCRIPTIVE
  /CRITERIA = ALPHA(.05)
  /DESIGN = group .

* 	Note that the ANOVA summary table produced by this procedure is a bit
*	different--it has some extra rows we have not seen before.

* 	The rows you have seen before are the ones labeled GROUP, ERROR
* 	and CORRECTED TOTAL; the SS, MS, and F values on these rows
*	match what you saw on the BETWEEN, WITHIN, and TOTAL rows
*	in the previous analyses .

* 	You will also see an R-squared value of .397 below the ANOVA
*	summary table; this is identical to the eta-squared value we
*	saw earlier (in the output from the MEANS procedure). It 
*	indicates that 39.7% of the total variation in the DV is
*	explained by the IV .
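
* 	As a sketch, adding ETASQ to the /PRINT subcommand asks UNIANOVA
*	for partial eta-squared.  In a one-way design the only other
*	source of variation is error, so partial eta-squared for GROUP
*	should equal SS_Between / SS_Total, i.e., the same value as the
*	eta-squared and R-squared discussed above .

UNIANOVA
  y  BY group
  /METHOD = SSTYPE(3)
  /INTERCEPT = INCLUDE
  /PRINT = DESCRIPTIVE ETASQ
  /CRITERIA = ALPHA(.05)
  /DESIGN = group .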

* 	Finally, note that in all of the preceding output, the good folks
*	at SPSS have used the column heading "Sig." where they should
*	have used "p" .

* =========================================================================== .

* 	APPENDIX:  DOING ONE-WAY ANOVA CALCULATIONS "BY HAND" IN SPSS .

* 	In class, you learned that the total sum of squares can be
*	partitioned into 2 components:  SS(between-groups) and
*	SS(within-groups).

* 	Another way to think of this is to start by observing that
*	not all subjects have the same score.  In other words,
*	there is variability in Y, the dependent variable.

* 	Each person's deviation from the overall mean of Y can be 
*	broken down into 2 components:  
*	[1] The deviation of the raw score from the GROUP MEAN; and
*	[2] The deviation of the GROUP MEAN from the GRAND MEAN .
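
* 	For example, the first Aspirin subject has Y = 3.6; the Aspirin
*	group mean is 5.250 and the grand mean is 6.379 (153.1 / 24):
*	   Y - GRAND MEAN               =  3.600 - 6.379  =  -2.779
*	   [1] Y - GROUP MEAN           =  3.600 - 5.250  =  -1.650
*	   [2] GROUP MEAN - GRAND MEAN  =  5.250 - 6.379  =  -1.129
*	and (-1.650) + (-1.129) = -2.779, as it should be .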

* 	If we generate the GROUP MEANS and the OVERALL MEAN, we can then
*	use COMPUTE statements to carry out the ANOVA "by hand", so
*	to speak, before we do it the easy way (i.e., using the
*	built-in procedures).  There is some benefit in doing it
*	"by hand" first, because you have to understand what you
*	are doing to do it that way.

* 	First, use AGGREGATE to save the group means to file "C:\grpmeans.sav".
*	Data must be sorted by GROUP.

sort cases by group(a).

*	In the pull down menus, when the Data Editor is active:
*	Data-->Aggregate .

AGGREGATE
  /OUTFILE='C:\grpmeans.sav'
  /BREAK=group
  /n_grp = n
  /grpmean = MEAN(y).

*	Now save GRAND mean to file "c:\grand.sav".
*	Need a BREAK variable with the same value for all cases in the file.
*	ALL is a reserved word in SPSS, so use ALL_ for this variable.

compute all_ = 1.		/* ALL_ = 1 on all records in the file .
exe.

AGGREGATE
  /OUTFILE='C:\grand.sav'
  /BREAK=all_
  /N_tot=N
  /grndmean = MEAN(y).

*	Now use MATCH FILES to add Group Means and Grand Mean to
*	the working file.

*	Data-->Merge Files-->Add variables.
*	Use /TABLE subcommand to write Grand Mean to ALL records .

MATCH FILES /FILE=*
 /TABLE='C:\grand.sav'
 /by all_ .
EXECUTE.

*	Now add Group Means (using MATCH FILES with /TABLE subcommand) .

MATCH FILES /FILE=*
 /TABLE='C:\grpmeans.sav'
 /BY group.
EXECUTE.

* 	Now we can calculate the deviation scores we will need .

compute dev1 = y - grndmean.
compute dev2 = grpmean - grndmean.
compute dev3 = y - grpmean.
compute sum23 = dev2 + dev3.
exe.

formats grpmean grndmean dev1 to dev3 sum23 (f8.3).
var lab
	grpmean 	'Group Mean'
	grndmean	'Grand Mean'
	dev1		'(Y - Grand Mean)'
	dev2		'(Grp Mean - Grand Mean)'
	dev3		'(Y - Grp Mean)'
	sum23 	'(Grp Mean - Grand Mean) + (Y - Grp Mean)'.

SUMMARIZE
  /TABLES=group y grndmean grpmean dev1 dev2 dev3 sum23
  /FORMAT=VALIDLIST NOCASENUM TOTAL
  /TITLE='Case Summaries'
  /MISSING=VARIABLE
  /CELLS=COUNT SUM .

* 	There are two things to note about the preceding table:
*	[1] Each of the "deviation score" columns sums to 0;
*	[2] The final column shows the sum of the 2 columns
*		immediately to its left; and that sum is exactly
*		equal to the (Y - Grand Mean) deviation score .
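
* 	As a quick check of point [2], the difference between DEV1 and
*	SUM23 can be computed directly (DIFF is a new, illustrative
*	variable name).  It should be 0 for every case, apart from
*	possible floating-point rounding error far beyond the decimals
*	that are displayed .

compute diff = dev1 - sum23.
formats diff (f8.3).
descriptives variables = diff
  /statistics = min max.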

* 	To get the sums of squares needed for one-way ANOVA, we
*	need to square the deviation scores we calculated above.

* 	~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ .
* 	NOTE:  In SPSS, x**2 = x-squared .
* 	~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ .

compute dev1sq = dev1**2.
compute dev2sq = dev2**2.
compute dev3sq = dev3**2.
exe.
formats dev1sq to dev3sq (f8.3).
var lab
 dev1sq	'(Y - Grand Mean)**2'
 dev2sq	'(Grp Mean - Grand Mean)**2'
 dev3sq	'(Y - Grp Mean)**2'.

SUMMARIZE
  /TABLES=group y grndmean grpmean dev1sq dev2sq dev3sq 
  /FORMAT=VALIDLIST NOCASENUM TOTAL
  /TITLE='Case Summaries'
  /MISSING=VARIABLE
  /CELLS=COUNT SUM .

* 	Now compare the sums for the last 3 columns above with the sums of
*	squares you obtained earlier when using the built-in procedures
*	for one-way ANOVA.

* 	SS(total) = Sum[(Y - Grand Mean)**2].
* 	SS(between-groups) = Sum[(Grp Mean - Grand Mean)**2].
* 	SS(within-groups) = Sum[(Y - Grp Mean)**2].

*	Now use AGGREGATE again to sum up the squared deviation
*	columns to produce sums of squares.  Also, keep N_tot,
*	k (the number of groups), and df_within groups .

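compute grpflag = ($casenum = 1 or group ne lag(group)).	/* Flag first record of each group (N_GRP is non-missing on ALL records) .
compute dfwg = grpflag * (n_grp - 1).	/* Count n_grp - 1 only once per group .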
exe.

AGGREGATE
  /OUTFILE= *
  /BREAK=all_
  /n_tot = first(n_tot)
  /k = sum(grpflag)
  /sstot = sum(dev1sq)
  /ssbg = sum(dev2sq)
  /sswg = sum(dev3sq)
  /dfwg = sum(dfwg) .

compute dftot = n_tot-1.	/*  24 subjects in total, minus 1 .
compute dfbg = k-1.		/*  4 groups minus 1 .
exe.
formats k dftot dfbg dfwg (f5.0).

* 	Compute Mean Squares, F-ratio, and p-value .

compute MSbg = ssbg / dfbg.
compute MSwg = sswg / dfwg.
compute F = MSbg/MSwg.
compute p = 1 - cdf.f(f,dfbg,dfwg).
exe.
formats sstot to sswg MSbg to p (f8.3).

list var ssbg sswg sstot dfbg dfwg dftot .
list var msbg mswg f p.
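
* 	As a sketch, the critical value of F at alpha = .05 can be obtained
*	from the inverse F distribution function (FCRIT is an illustrative
*	variable name).  The F-ratio exceeds FCRIT exactly when p < .05 .

compute fcrit = idf.f(.95, dfbg, dfwg).
formats fcrit (f8.3).
list var f fcrit p.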

* 	The results of these "hand" calculations (using the conceptual 
*	formulae) are identical to the results we obtained earlier using 
*	the built-in ANOVA procedures .
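
* 	As a final check (a sketch; SSCHECK and ETASQ are illustrative
*	variable names), confirm that SS(between-groups) and
*	SS(within-groups) add up to SS(total), and reproduce eta-squared
*	as SS(between-groups) / SS(total).  ETASQ should match the value
*	of .397 reported by the built-in procedures .

compute sscheck = ssbg + sswg.
compute etasq = ssbg / sstot.
formats sscheck etasq (f8.3).
list var sstot sscheck etasq.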

*	Finally, erase the temporary files from C: .

erase file = 'C:\grpmeans.sav'.
erase file = 'C:\grand.sav'.

* ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ .