Frequency Distributions
A frequency distribution is a table that lists the scores on a variable and the number of individuals who obtained each score. Always list the scores from highest to lowest.
- Absolute frequency (frequency): the number of individuals who received each score on the variable. Absolute frequencies always sum to the number of observations, N.
- Relative frequency: the number of scores of a given value divided by the total number of scores, i.e., the proportion of the time that score occurred. Relative frequencies always sum to 1.00.
- Percentage: the relative frequency multiplied by 100, i.e., the percentage of the time that score occurred. Percentages always sum to 100.
- Cumulative frequency: obtained by successively adding the entries in the frequency column, i.e., the frequency associated with a score plus the sum of all frequencies below that score.
- Cumulative relative frequency (crf): the cumulative frequency divided by the total number of scores, i.e., the proportion of individuals who had that score value or lower.
- Cumulative percentage: the crf multiplied by 100, i.e., the percentage of people at that score or lower.
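The columns defined above can be computed in a few lines. The following is a minimal sketch in Python; the function name frequency_table, the column keys, and the sample scores are made up for illustration.

```python
from collections import Counter

def frequency_table(scores):
    """Build one row per distinct score, listed from highest to lowest."""
    n = len(scores)
    counts = Counter(scores)
    rows = []
    cf = 0
    # Accumulate from the lowest score upward so that cf counts
    # everyone at that score or lower.
    for score in sorted(counts):
        f = counts[score]
        cf += f
        rows.append({
            "score": score,
            "f": f,                 # absolute frequency
            "rf": f / n,            # relative frequency
            "pct": 100 * f / n,     # percentage
            "cf": cf,               # cumulative frequency
            "crf": cf / n,          # cumulative relative frequency
            "cpct": 100 * cf / n,   # cumulative percentage
        })
    rows.reverse()  # list from highest to lowest
    return rows

# Hypothetical data: ten scores on a 2-to-5 scale.
scores = [3, 5, 4, 5, 2, 4, 5, 3, 4, 4]
table = frequency_table(scores)
for row in table:
    print(f"{row['score']:>5} {row['f']:>3} {row['rf']:.2f} {row['pct']:6.1f} "
          f"{row['cf']:>3} {row['crf']:.2f} {row['cpct']:6.1f}")
```

The f column sums to N, the rf column sums to 1.00, and the cumulative frequency for the highest score equals N, which gives a quick check on a hand-built table.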
Frequency Distributions for Quantitative Variables: Grouped Scores
- Often there is such a wide variety of scores on the variable that it is impractical and uninformative to construct a frequency distribution without consolidating the information.
- Central questions to grouping data:
- How many groups?
- Want a balance between too many and too few.
- Generally, use 5 to 15 groups
- What should the interval size be? (This is the range of scores within each group.)
- Typically, an interval size of 2, 3, or 5 is used.
- To choose one, subtract the lowest observed score from the highest observed score, divide this difference by the desired number of groups, and round the result to the nearest commonly used interval size.
- What is the lowest value at which the first interval starts?
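The interval-size rule above can be sketched in Python as follows. The function names and the sample scores are hypothetical. The notes leave the starting value of the first interval open, so the sketch assumes integer scores and starts the first interval at the largest multiple of the interval size that does not exceed the lowest score; that starting rule is an assumption, not something stated here.

```python
def choose_interval_size(scores, desired_groups, common_sizes=(2, 3, 5)):
    """Divide the score range by the desired number of groups, then
    round to the nearest commonly used interval size."""
    raw = (max(scores) - min(scores)) / desired_groups
    return min(common_sizes, key=lambda size: abs(size - raw))

def group_scores(scores, interval_size, start=None):
    """Count integer scores falling in each interval (lower, upper)."""
    if start is None:
        # Assumed convention: start at a multiple of the interval size.
        start = (min(scores) // interval_size) * interval_size
    groups = {}
    lower = start
    while lower <= max(scores):
        upper = lower + interval_size - 1
        groups[(lower, upper)] = sum(lower <= s <= upper for s in scores)
        lower += interval_size
    return groups

# Hypothetical exam scores.
scores = [52, 55, 61, 64, 67, 70, 71, 73, 78, 80, 84, 85, 88, 91, 97]
size = choose_interval_size(scores, desired_groups=10)
groups = group_scores(scores, size)
# Print from highest to lowest interval, as in an ungrouped table.
for (lower, upper), f in sorted(groups.items(), reverse=True):
    print(f"{lower}-{upper}: {f}")
```

With these scores the range is 45, so ten desired groups give a raw size of 4.5, which rounds to an interval size of 5.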
Frequency Distributions for Qualitative Variables
- Often we will have information on a nominal-level variable that we want to share with others in a concise manner.
- Begin by listing the variable categories.
- Follow the categories with frequency, relative frequency, and/or percentage columns.
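A table for a nominal variable can be sketched the same way. In this illustrative Python example the function name category_table and the list of majors are invented, and the categories are listed alphabetically only because nominal categories have no inherent order.

```python
from collections import Counter

def category_table(observations):
    """Return (category, frequency, relative frequency, percentage) rows."""
    n = len(observations)
    counts = Counter(observations)
    # Alphabetical order is an arbitrary choice for nominal categories.
    return [
        (category, counts[category], counts[category] / n, 100 * counts[category] / n)
        for category in sorted(counts)
    ]

# Hypothetical nominal data: college major of eight students.
majors = ["psychology", "biology", "psychology", "history",
          "biology", "psychology", "history", "psychology"]
rows = category_table(majors)
for category, f, rf, pct in rows:
    print(f"{category:<12} {f:>3} {rf:.2f} {pct:5.1f}")
```

No cumulative columns are produced, since "that category or lower" has no meaning without an ordering.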
Frequency Graphs
- Frequency Histogram (Bar Graph)
- The X axis (abscissa) lists the score values from low to high, extending from one unit below the lowest score to one unit above the highest score.
- The Y axis (ordinate) represents the frequency with which each score occurred (it should go up to the highest frequency + 1).
- A label that clearly names the variable under study should appear beneath the score values.
- Frequency Polygon (Line Graph)
- Similar to the frequency histogram; uses the same ordinate and abscissa.
- Major difference: bars aren't used; rather, dots corresponding to the appropriate frequencies are placed directly above the score values.
- The dots are connected by solid lines.
- Always "closed" with the abscissa: it always includes a value one unit higher than the highest observed score and a value one unit lower than the lowest observed score, each with a frequency of 0.
- Line Plot
- Constructed exactly like the frequency polygon, except that it is not "closed"
- Frequency Graphs for Qualitative Variables
- Bar Graphs
- Values of the variable are listed on the abscissa; frequencies are listed on the ordinate.
- The major difference from the frequency histogram is that the bars are drawn so that they do not touch one another, because each bar represents a distinct category.
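The difference between a "closed" frequency polygon and a line plot comes down to two extra points. The Python sketch below builds the (score, frequency) points that would be connected by lines; it assumes integer scores with a unit of 1, and the function name polygon_points and the sample data are hypothetical.

```python
from collections import Counter

def polygon_points(scores, closed=True):
    """Return (score, frequency) points for every value from the lowest
    to the highest observed score. With closed=True, add a zero-frequency
    point one unit below and one unit above the observed range."""
    counts = Counter(scores)
    low, high = min(scores), max(scores)
    # Counter returns 0 for score values that never occurred.
    points = [(x, counts[x]) for x in range(low, high + 1)]
    if closed:
        points = [(low - 1, 0)] + points + [(high + 1, 0)]
    return points

# Hypothetical data: ten scores on a 2-to-5 scale.
scores = [3, 5, 4, 5, 2, 4, 5, 3, 4, 4]
polygon = polygon_points(scores)                  # frequency polygon
line_plot = polygon_points(scores, closed=False)  # line plot
print(polygon)
print(line_plot)
```

The resulting point lists can be handed to any plotting tool; only the polygon touches the abscissa at both ends.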
Misleading Graphs
- Presentation of data in graphic form can be highly informative, but it can also be misleading.
- Rules to reduce misleading graphs:
- The ordinate height for the highest frequency should be 2/3 to 3/4 of the length of the abscissa.
- The ordinate should start with a frequency of zero, and any "jumps" should be indicated by a zigzag if the axis is not drawn to scale.