Statistics for Two Categorical Variables - AP Statistics
What does a joint distribution show?
The frequency (or relative frequency) of each combination of categories of two variables. Each cell of the two-way table cross-classifies observations by both variables simultaneously.
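As a rough illustration of a joint distribution, the sketch below tallies paired observations into relative frequencies. The data and the helper name `joint_distribution` are made up for this example.

```python
from collections import Counter

# Hypothetical paired observations: (grade level, preferred study method).
observations = [
    ("junior", "flashcards"), ("junior", "practice tests"),
    ("senior", "flashcards"), ("senior", "flashcards"),
    ("junior", "flashcards"), ("senior", "practice tests"),
    ("senior", "practice tests"), ("senior", "flashcards"),
]

def joint_distribution(pairs):
    """Map each (category_a, category_b) combination to its relative frequency."""
    counts = Counter(pairs)
    total = len(pairs)
    return {combo: count / total for combo, count in counts.items()}

joint = joint_distribution(observations)
for combo, share in sorted(joint.items()):
    print(combo, share)
```

Each key corresponds to one cell of the two-way table, and the relative frequencies add up to 1.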
Identify the expected count formula in a contingency table.
Expected count = $\frac{(\text{row total}) (\text{column total})}{\text{grand total}}$. Assumes independence to calculate what count would be expected in each cell.
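The expected count formula can be sketched for a whole table at once. The function name `expected_counts` and the 2x2 table are hypothetical and chosen only for illustration.

```python
def expected_counts(observed):
    """Expected count per cell: (row total)(column total) / (grand total)."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

# Hypothetical observed counts (rows: groups, columns: responses).
observed = [[30, 10],
            [20, 40]]
expected = expected_counts(observed)
print(expected)
```

Here the row totals are 40 and 60, the column totals are 50 and 50, and the grand total is 100, so the first cell's expected count is (40)(50)/100 = 20.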
What is an association in statistics?
A relationship between two variables; here, two categorical variables. The variables are associated (dependent) when the conditional distribution of one changes across the levels of the other.
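One way to look for an association is to compare conditional distributions row by row. This is a minimal sketch with a hypothetical table and a hypothetical helper `conditional_distributions`.

```python
def conditional_distributions(observed):
    """Divide each row by its row total to get the distribution within that row."""
    return [[count / sum(row) for count in row] for row in observed]

# Hypothetical observed counts (rows: groups, columns: responses).
observed = [[30, 10],
            [20, 40]]
conditional = conditional_distributions(observed)
for row in conditional:
    print([round(share, 3) for share in row])
```

The first row splits 75% / 25% while the second splits roughly 33% / 67%. Differing conditional distributions like these are what an association looks like in a sample; whether the difference is statistically significant is a separate question.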
State the purpose of a chi-square test for independence.
To test if there is an association between two categorical variables. Determines whether observed frequencies differ significantly from expected under independence.
Which condition must be met for chi-square test validity?
All expected cell counts should be at least 5. Ensures the chi-square distribution is a good approximation for the test statistic.
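The expected-count condition can be checked mechanically. In this sketch, `large_counts_ok` and both tables are hypothetical; the threshold of 5 is the one stated on the card.

```python
def expected_counts(observed):
    """Expected count per cell: (row total)(column total) / (grand total)."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

def large_counts_ok(observed, minimum=5):
    """True when every expected cell count is at least `minimum`."""
    return all(e >= minimum for row in expected_counts(observed) for e in row)

balanced = [[30, 10], [20, 40]]   # smallest expected count is 20
sparse = [[2, 18], [28, 152]]     # one expected count is (20)(30)/200 = 3
print(large_counts_ok(balanced), large_counts_ok(sparse))
```

Note that the check applies to expected counts, not observed counts.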
What does a chi-square statistic measure?
The difference between observed and expected frequencies. Larger values indicate greater deviation from what independence would predict.
Calculate degrees of freedom for a 3x4 table.
$(3-1)(4-1) = 6$. Uses formula $df = (r-1)(c-1)$ where $r$ and $c$ are rows and columns.
What is the alternative hypothesis in a chi-square test for independence?
The two categorical variables are not independent. Claims there is some relationship or association between the variables.
What does a significant chi-square statistic indicate?
There is convincing evidence of an association between the variables. The data are inconsistent with independence, though a significant result alone does not show how strong the association is.
What is the formula for calculating chi-square statistic?
$\chi^2 = \sum \frac{(O - E)^2}{E}$, summed over all cells. Each term is a squared deviation scaled by its expected count, so the statistic adds up these standardized discrepancies across the table.
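The formula can be sketched in a few lines of Python. The function name `chi_square_statistic` and the table are hypothetical; expected counts are computed as (row total)(column total)/(grand total).

```python
def chi_square_statistic(observed):
    """Sum of (O - E)^2 / E over all cells of a two-way table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / grand_total  # expected count
            statistic += (o - e) ** 2 / e
    return statistic

# Hypothetical observed counts; expected counts are 20, 20, 30, 30.
observed = [[30, 10],
            [20, 40]]
statistic = chi_square_statistic(observed)
print(round(statistic, 3))
```

For this table the four cell contributions are 5, 5, 10/3, and 10/3, which sum to about 16.667.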
Identify the role of the observed counts in a chi-square test.
Observed counts are the frequencies actually recorded in the study. They are compared against the expected counts to compute the test statistic.
What role do expected counts play in a chi-square test?
Expected counts are the cell counts predicted under the assumption of independence. They represent the frequencies that would occur, on average, if the variables were truly independent.
What is the general shape of a chi-square distribution?
Right-skewed, becoming more symmetric as df increases. Takes only non-negative values, with mean equal to the degrees of freedom.
Determine if this statement is true: High chi-square means high association.
False as stated. A large chi-square statistic is strong evidence against independence, but it does not measure the strength of the association, because the statistic also grows with sample size.
What is a cell in a contingency table?
A cell is an intersection of a row and a column in the table. Contains the count for one specific combination of the two variable categories.
Calculate expected count for a cell with row total 20, column total 30, and grand total 200.
Expected count = $\frac{(20)(30)}{200} = 3$. Multiplies marginal totals and divides by the overall sample size.
Identify the test statistic used in a chi-square test for independence.
Chi-square statistic. Measures how much observed frequencies deviate from expected frequencies.
What is the critical value in hypothesis testing?
The value of the test statistic that marks the boundary of the rejection region. Test statistics beyond it lead to rejecting the null hypothesis at the chosen significance level.
What is a p-value in the context of a chi-square test?
The probability, assuming the variables are independent, of observing a chi-square statistic at least as large as the one calculated. Used to determine statistical significance by comparing to the alpha level.
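To make the p-value concrete, here is a minimal sketch that computes the upper-tail chi-square probability using only the Python standard library. It relies on the finite-series forms of the chi-square tail for whole-number degrees of freedom; the function name `chi_square_p_value` is hypothetical, and in practice a statistics library routine such as SciPy's `scipy.stats.chi2.sf` would usually be used instead.

```python
import math

def chi_square_p_value(statistic, df):
    """Upper-tail probability P(X >= statistic) for chi-square with integer df >= 1."""
    if statistic <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum of (x/2)^r / r! for r = 0 .. df/2 - 1.
        term, series = 1.0, 1.0
        for r in range(1, df // 2):
            term *= statistic / (2 * r)
            series += term
        return math.exp(-statistic / 2) * series
    # Odd df: two-sided normal tail plus a finite correction series.
    root = math.sqrt(statistic)
    normal_pdf = math.exp(-statistic / 2) / math.sqrt(2 * math.pi)
    total = math.erfc(root / math.sqrt(2))
    term = root
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= statistic / (2 * r - 1)
        total += 2 * normal_pdf * term
    return total

# 7.815 is the familiar table value for df = 3, so the tail area is close to 0.05.
p_value = chi_square_p_value(7.815, 3)
print(round(p_value, 3))
```

The p-value is always an upper-tail area for this test, because only large values of the statistic count as evidence against independence.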
State the relationship between p-value and significance level.
If p-value < significance level, reject the null hypothesis. Compare p-value to chosen significance level to make decision.
How do you interpret a p-value of 0.03 in a chi-square test?
If the variables were truly independent, there would be a 3% chance of obtaining a chi-square statistic at least as large as the one observed. Such a result is unlikely from chance alone, so it counts as evidence of an association (significant at $\alpha = 0.05$).
What does a low p-value indicate in hypothesis testing?
Strong evidence against the null hypothesis. A small p-value means data this extreme would be unlikely if the null hypothesis were true.
What is the purpose of a hypothesis test?
To determine if there is enough evidence to reject a null hypothesis. Provides a statistical framework for making decisions about population parameters.
Identify the effect of sample size on chi-square statistic.
For the same pattern of proportions, a larger sample size increases the chi-square statistic. With more data, even small associations can become statistically significant.
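The sample-size effect can be seen by doubling every count in a table, which keeps all proportions the same. This is a sketch with hypothetical data and a hypothetical helper `chi_square_statistic`.

```python
def chi_square_statistic(observed):
    """Sum of (O - E)^2 / E over all cells of a two-way table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return sum(
        (o - row_totals[i] * col_totals[j] / grand_total) ** 2
        / (row_totals[i] * col_totals[j] / grand_total)
        for i, row in enumerate(observed)
        for j, o in enumerate(row)
    )

small = [[30, 10], [20, 40]]                          # n = 100
large = [[2 * count for count in row] for row in small]  # n = 200, same proportions

stat_small = chi_square_statistic(small)
stat_large = chi_square_statistic(large)
print(round(stat_small, 3), round(stat_large, 3))
```

Doubling every count doubles both O and E in each cell, so each term $(O - E)^2 / E$ doubles and the statistic doubles as well, even though the pattern in the table has not changed.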
Find the chi-square statistic when $O = 15$ and $E = 10$.
Chi-square = $\frac{(15 - 10)^2}{10} = 2.5$. Applies the formula for one cell contribution to the overall statistic.
What is a Type I error in hypothesis testing?
Rejecting the null hypothesis when it is true. False positive - finding significance when none actually exists.
Define marginal distribution.
The distribution of values of one variable among all individuals. Found by summing across rows or columns to get totals for each category.
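Marginal distributions can be read off a two-way table by summing rows and columns. The helper `marginal_distributions` and the table below are hypothetical.

```python
def marginal_distributions(observed):
    """Return (row marginal, column marginal) as relative frequencies."""
    grand_total = sum(sum(row) for row in observed)
    row_marginal = [sum(row) / grand_total for row in observed]
    col_marginal = [sum(col) / grand_total for col in zip(*observed)]
    return row_marginal, col_marginal

# Hypothetical observed counts (rows: groups, columns: responses).
observed = [[30, 10],
            [20, 40]]
row_marginal, col_marginal = marginal_distributions(observed)
print(row_marginal, col_marginal)
```

The row marginal describes the row variable alone (40% and 60% here), ignoring the column variable, and the column marginal does the reverse.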
What is a Type II error in hypothesis testing?
Failing to reject the null hypothesis when it is false. False negative - missing a real association that actually exists.
What assumption is made about data in a chi-square test?
Data are a random sample from the population. Ensures results can be generalized to the broader population.
What is a significant result in the context of a chi-square test?
A result that leads to the rejection of the null hypothesis. Indicates sufficient evidence to conclude variables are associated.