* Display syntax commands in the Output Viewer .

SET Printback=On Length=59 Width=80.

* ----------------------------------------------------------------------
  File:    anova1.sps
  Author:  Bruce Weaver, weaverb@mcmaster.ca
  Date:    9-May-2002
  Notes:   Example of one-way ANOVA using SPSS .
* ---------------------------------------------------------------------- .

* NOTE:  This syntax has been tested in version 11 of SPSS.  It may
*        be that some things are slightly different if you have another
*        version of SPSS .

* Read in some data on 3 different painkillers vs control .

DATA LIST LIST / group (f2.0) y (f3.1).
BEGIN DATA.
1 3.6
1 4.4
1 6.9
1 4.2
1 5.8
1 6.6
2 6.2
2 6.3
2 7.6
2 5.4
2 4.7
2 5.9
3 7.4
3 7.7
3 8.2
3 3.4
3 5.4
3 6.0
4 8.2
4 8.8
4 8.7
4 6.5
4 7.5
4 7.7
end data.

var lab y 'Score'.
val lab group 1 'Aspirin' 2 'Aceto' 3 'Ibuprof' 4 'Placebo'.

* List cases to show the structure of the data file .

list all.

* Note that there are 2 columns in the data file, 1 to code
* group membership (must be a numeric code in SPSS), and
* one to record the DV for that subject (variable Y is the DV) .

* Graphs-->Bar .

GRAPH
  /BAR(SIMPLE)=MEAN(y) BY group
  /MISSING=REPORT
  /TITLE= 'Painkillers vs Placebo'.

* Use INTERACTIVE GRAPHICS to produce graph with error bars .
* Graphs-->Interactive-->Bar .

IGRAPH /VIEWNAME='Bar Chart'
  /X1 = VAR(group) TYPE = CATEGORICAL
  /Y = VAR(y) TYPE = SCALE
  /COORDINATE = VERTICAL
  /TITLE='Painkillers vs Placebo'
  /X1LENGTH = 3.0 /YLENGTH = 3.0 /X2LENGTH = 3.0
  /CHARTLOOK = 'Grayscale.clo'
  /BAR(MEAN) KEY=ON SHAPE = RECTANGLE BASELINE = AUTO
  /ERRORBAR CI(95.0) DIRECTION = BOTH CAPWIDTH (45) CAPSTYLE = T.
EXE.

* There are several ways to perform a one-way ANOVA in SPSS.

* ----------- Using MEANS Procedure to perform one-way ANOVA -------- .

* Perhaps the simplest is to use the /statistics=ANOVA subcommand
* of the MEANS procedure, as follows.

* ANALYZE-->COMPARE MEANS-->MEANS .

means y by group
  /cells = count mean stddev variance
  /stat=anova.

* The eta-squared you see in the output is computed as SS_Between
* divided by SS_Total; it is sometimes called the "correlation ratio",
* and is similar to r-squared for a regression analysis.  In other
* words, it indicates the proportion of total variance that is
* explained by the independent variable.

* ------ Using One-way ANOVA Procedure to perform one-way ANOVA -------- .

* If you look under Analyze-->Compare Means in the pull-down menus,
* you will see that the last option is One-way ANOVA.  Here is some
* syntax for that procedure (including descriptive statistics and a
* plot of the treatment means) .

* ANALYZE-->COMPARE MEANS-->ONE-WAY ANOVA .

ONEWAY
  y BY group
  /STATISTICS DESCRIPTIVES
  /PLOT MEANS
  /POSTHOC = DUNNETT ALPHA(.05)
  /MISSING ANALYSIS .

* Eta-squared is not available in the output using this method.
* However, a number of multiple comparison methods are available.
* For example, Dunnett's test compares each of k-1 treatment groups
* to a control.  You will find it in the Post-hoc dialog box.

* Dunnett's test shows that Aspirin is significantly different
* from Control (p = .006); the difference between ACETO and
* Control is marginal (p = .054); and the difference between
* IBUPROF and Control is not statistically significant (p = .128).

* Finally, note that a line graph showing the means is optional.
* Given that the IV is categorical and not ordinal, however, a
* bar graph (as shown above) would be a better graph for these data.

* ------ Using GLM UNIVARIATE Procedure to perform one-way ANOVA -------- .

* The first option under Analyze-->General Linear Model in the pull-down
* menus is UNIVARIATE.  Here's how to perform a one-way ANOVA using
* GLM UNIVARIATE (again with descriptive statistics and a plot of means).

* ANALYZE-->GENERAL LINEAR MODEL-->UNIVARIATE .
UNIANOVA
  y BY group
  /METHOD = SSTYPE(3)
  /INTERCEPT = INCLUDE
  /POSTHOC = group ( DUNNETT)
  /PLOT = PROFILE(group)
  /EMMEANS = TABLES(group)
  /PRINT = DESCRIPTIVE
  /CRITERIA = ALPHA(.05)
  /DESIGN = group .

* Note that the ANOVA summary table produced by this procedure is a bit
* different--it has some extra rows we have not seen before.
* The rows you have seen before are the ones labeled GROUP, ERROR
* and CORRECTED TOTAL; the SS, MS, and F values on these rows
* match what you saw on the BETWEEN, WITHIN, and TOTAL rows
* in the previous analyses .

* You will also see an R-squared value of .397 below the ANOVA
* summary table; this is identical to the eta-squared value we
* saw earlier (in the output from the MEANS procedure).  It
* indicates that 39.7% of the total variation in the DV is
* explained by the IV .

* Finally, note that in all of the preceding output, the good folks
* at SPSS have used the column heading "Sig." where they should
* have used "p" .

* =========================================================================== .

* APPENDIX:  DOING ONE-WAY ANOVA CALCULATIONS "BY HAND" IN SPSS .

* In class, you learned that the total sum of squares can be
* partitioned into 2 components:  SS(between-groups) and
* SS(within-groups).

* Another way to think of this is to start by observing that
* not all subjects have the same score.  In other words,
* there is variability in Y, the dependent variable.

* Each person's deviation from the overall mean of Y can be
* broken down into 2 components:
*   [1] The deviation of the raw score from the GROUP MEAN; and
*   [2] The deviation of the GROUP MEAN from the GRAND MEAN .

* If we generate the GROUP MEANS and the OVERALL MEAN, we can then
* use COMPUTE statements to carry out the ANOVA "by hand", so
* to speak, before we do it the easy way (i.e., using the
* built-in procedures).  There is some benefit in doing it
* "by hand" first, because you have to understand what you
* are doing to do it that way.
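
* In symbols (notation introduced here for illustration only:
* Y = raw score, M_grp = that subject's group mean, and
* M_grand = the grand mean), the breakdown described above is:
*
*    (Y - M_grand) = (Y - M_grp) + (M_grp - M_grand)
*
* Squaring both sides and summing over all subjects gives:
*
*    SS(total) = SS(within-groups) + SS(between-groups)
*
* The cross-product term, 2 times Sum[(Y - M_grp) x (M_grp - M_grand)],
* drops out because (M_grp - M_grand) is constant within a group and
* the deviations (Y - M_grp) sum to 0 within each group .
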
* First, use AGGREGATE to save the group means to file "C:\grpmeans.sav".
* Data must be sorted by GROUP.

sort cases by group(a).

* In the pull down menus, when the Data Editor is active:
* Data-->Aggregate .

AGGREGATE
  /OUTFILE='C:\grpmeans.sav'
  /BREAK=group
  /n_grp = n
  /grpmean = MEAN(y).

* Now save GRAND mean to file "c:\grand.sav".
* Need a BREAK variable with the same value for all cases in the file.
* ALL is a reserved word in SPSS, so use ALL_ for this variable.

compute all_ = 1.     /* ALL_ = 1 on all records in the file .
exe.

AGGREGATE
  /OUTFILE='C:\grand.sav'
  /BREAK=all_
  /N_tot=N
  /grndmean = MEAN(y).

* Now use MERGE FILES to add Group Means and Grand Mean to the working file.
* Data-->Merge Files-->Add variables.
* Use /TABLE subcommand to write Grand Mean to ALL records .

MATCH FILES /FILE=*
  /TABLE='C:\grand.sav'
  /by all_ .
EXECUTE.

* Now add Group Means (using MATCH FILES with /TABLE subcommand) .

MATCH FILES /FILE=*
  /TABLE='C:\grpmeans.sav'
  /BY group.
EXECUTE.

* Now we can calculate the deviation scores we will need .

compute dev1 = y - grndmean.
compute dev2 = grpmean - grndmean.
compute dev3 = y - grpmean.
compute sum23 = dev2 + dev3.
exe.

formats grpmean grndmean dev1 to dev3 sum23 (f8.3).

var lab
  grpmean  'Group Mean'
  grndmean 'Grand Mean'
  dev1     '(Y - Grand Mean)'
  dev2     '(Grp Mean - Grand Mean)'
  dev3     '(Y - Grp Mean)'
  sum23    '(Grp Mean - Grand Mean) + (Y - Grp Mean)'.

SUMMARIZE
  /TABLES=group y grndmean grpmean dev1 dev2 dev3 sum23
  /FORMAT=VALIDLIST NOCASENUM TOTAL
  /TITLE='Case Summaries'
  /MISSING=VARIABLE
  /CELLS=COUNT SUM .

* There are two things to note about the preceding table:
*   [1] Each of the "deviation score" columns sums to 0;
*   [2] The final column shows the sum of the 2 columns
*       immediately to its left; and that sum is exactly
*       equal to the (Y - Grand Mean) deviation score .

* To get the sums of squares needed for one-way ANOVA, we
* need to square the deviation scores we calculated above.

* ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ .
* NOTE: In SPSS, x**2 = x-squared .
* ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ .

compute dev1sq = dev1**2.
compute dev2sq = dev2**2.
compute dev3sq = dev3**2.
exe.

formats dev1sq to dev3sq (f8.3).

var lab
  dev1sq '(Y - Grand Mean)**2'
  dev2sq '(Grp Mean - Grand Mean)**2'
  dev3sq '(Y - Grp Mean)**2'.

SUMMARIZE
  /TABLES=group y grndmean grpmean dev1sq dev2sq dev3sq
  /FORMAT=VALIDLIST NOCASENUM TOTAL
  /TITLE='Case Summaries'
  /MISSING=VARIABLE
  /CELLS=COUNT SUM .

* Now compare the sums for the last 3 columns above with the sums of
* squares you obtained earlier when using the built-in procedures
* for one-way ANOVA.

*   SS(total)          = Sum[(Y - Grand Mean)**2].
*   SS(between-groups) = Sum[(Grp Mean - Grand Mean)**2].
*   SS(within-groups)  = Sum[(Y - Grp Mean)**2].

* Now use AGGREGATE again to sum up the squared deviation
* columns to produce sums of squares.  Also, keep N_tot,
* k (the number of groups), and df_within groups .

* Flag the first record of each group (data are still sorted by GROUP).
* N_GRP is present on every record after the MATCH FILES above, so a
* test for missing values cannot pick out one record per group;
* use the /FIRST subcommand of MATCH FILES instead.
* DFWG is then set to (n_grp - 1) on the flagged record and 0 elsewhere,
* so that its sum is the within-groups df .

match files /file=* /by group /first=grpflag.
compute dfwg = grpflag * (n_grp - 1).
exe.

AGGREGATE
  /OUTFILE= *
  /BREAK=all_
  /n_tot = first(n_tot)
  /k = sum(grpflag)
  /sstot = sum(dev1sq)
  /ssbg = sum(dev2sq)
  /sswg = sum(dev3sq)
  /dfwg = sum(dfwg) .

compute dftot = n_tot-1.   /* 24 subjects in total, minus 1 */
compute dfbg = k-1.        /* 4 groups minus 1 */
exe.

formats k dftot dfbg dfwg (f5.0).

* Compute Mean Squares, F-ratio, and p-value .

compute MSbg = ssbg / dfbg.
compute MSwg = sswg / dfwg.
compute F = MSbg/MSwg.
compute p = 1 - cdf.f(f,dfbg,dfwg).
exe.

formats sstot to sswg MSbg to p (f8.3).

list var ssbg sswg sstot dfbg dfwg dftot .
list var msbg mswg f p.

* The results of these "hand" calculations (using the conceptual
* formulae) are identical to the results we obtained earlier using
* the built-in ANOVA procedures .

* Finally, erase the temporary files from C: .

erase file = 'C:\grpmeans.sav'.
erase file = 'C:\grand.sav'.

* ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ .
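
* Optional check, not part of the original file:  a minimal sketch that
* recomputes eta-squared from the "by hand" sums of squares.  It assumes
* the active file still holds SSBG and SSTOT from the final AGGREGATE
* step; ETASQ is a new variable name introduced here for illustration.
* Because eta-squared = SS_Between / SS_Total, the result should agree
* with the eta-squared from the MEANS procedure and the R-squared of
* .397 from GLM UNIVARIATE .

compute etasq = ssbg / sstot.
exe.

formats etasq (f8.3).

list var etasq.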