Confidence Intervals: Slope of Regression Models


Questions 1 - 10
1

A real estate analyst selects 30 houses in a region and records $x$ = size (hundreds of square feet) and $y$ = selling price (thousands of dollars). The regression of price on size yields a 95% confidence interval for the population slope $\beta$ of $(8,\ 14)$. Which interpretation is correct?

There is a 95% probability that the selling price of a randomly selected house is between $\$8{,}000$ and $\$14{,}000$.

We are 95% confident that the slope of the sample regression line will be between 8 and 14 for these same 30 houses if we recompute it.

Because the interval does not include 0, the correlation between size and price is between 8 and 14.

We are 95% confident that 95% of houses increase in price by between $\$8{,}000$ and $\$14{,}000$ when size increases by 100 square feet.

We are 95% confident that for each additional 100 square feet, the mean selling price in the population increases by between $\$8{,}000$ and $\$14{,}000$.

Explanation

This question assesses interpreting a 95% confidence interval for the slope β in regressing house price on size. The interval (8, 14) implies we are 95% confident that β, the change in mean price per additional 100 square feet, is between $8,000 and $14,000, which is Choice E. Choice C distracts by equating the interval to correlation, but correlation is unitless and must lie between -1 and 1. Choice D wrongly extends the interval to a percentage of houses rather than the population mean. Mini-lesson: Slope confidence intervals give the range where the true population rate of change plausibly falls, accounting for sampling error; positive endpoints indicate an upward trend, and proper unit interpretation is key, because slopes depend on the scales of the variables and do not imply causality or describe individual predictions.
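The arithmetic behind an interval like $(8,\ 14)$ can be sketched as follows, assuming the usual t-interval for a slope with $n - 2$ degrees of freedom and that the conditions for regression inference hold. The symbols $b$ (sample slope), $SE_b$ (its standard error), and $t^*$ (critical value) do not appear in the question and are introduced here for illustration; $t^* \approx 2.048$ is the standard table value for 95% confidence with 28 degrees of freedom.

$$
b \pm t^* \, SE_b, \qquad df = n - 2 = 28
$$

$$
b = \frac{8 + 14}{2} = 11, \qquad t^* \, SE_b = \frac{14 - 8}{2} = 3, \qquad SE_b \approx \frac{3}{2.048} \approx 1.46
$$

Under these assumptions the sample slope is the midpoint of the interval, 11 (thousand dollars per 100 square feet), and the interval extends one margin of error, 3, on either side of it.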

2

A restaurant manager samples 18 days, recording $x$ = number of online ads purchased that day and $y$ = total sales (dollars). The regression of sales on ads gives a 95% confidence interval for the population slope $\beta$ of $(15,\ 60)$. Which interpretation is correct?

We are 95% confident that for each additional online ad purchased, the mean daily sales in the population increase by between $\$15$ and $\$60$.

Because 0 is not in the interval, 95% of days will have sales between $\$15$ and $\$60$.

We are 95% confident that increasing ads by 1 will cause sales to increase by between $\$15$ and $\$60$.

There is a 95% probability that the true slope equals the midpoint of the interval.

We are 95% confident that the correlation between ads and sales is between 15 and 60.

Explanation

This question tests interpreting a 95% confidence interval for the slope β in regressing sales on online ads. The interval (15, 60) indicates we are 95% confident that β, the change in mean daily sales per additional ad, lies between $15 and $60, which is Choice A. Choice C distracts by implying causation, but confidence intervals do not confirm causal links. Choice E confuses the slope with correlation, which is unitless and between -1 and 1. Mini-lesson: A slope confidence interval captures the uncertainty around the population parameter β, the mean change in the response per unit of the predictor; an entirely positive interval suggests an increasing relationship, and a level such as 95% means that, over repeated sampling, about 95% of intervals built this way would capture β, not that 95% of data points fall within the interval.
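The repeated-sampling meaning of "95% confident" can be checked with a small simulation. The sketch below is illustrative only: the population (true slope 40, intercept 200, normal errors with standard deviation 50) and the fixed set of ad counts are made-up assumptions, not values from the question, and the critical value 2.120 is the standard t-table entry for 95% confidence with 18 - 2 = 16 degrees of freedom.

```python
import random
import statistics

# Assumed t critical value: 95% confidence, df = 18 - 2 = 16 (from a t-table).
T_STAR_DF16 = 2.120


def slope_interval(xs, ys, t_star):
    """Return (lower, upper) for the slope interval b +/- t* * SE_b."""
    n = len(xs)
    x_bar = statistics.fmean(xs)
    y_bar = statistics.fmean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    a = y_bar - b * x_bar
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    s = (sse / (n - 2)) ** 0.5      # residual standard deviation
    se_b = s / sxx ** 0.5           # standard error of the slope
    return b - t_star * se_b, b + t_star * se_b


def coverage(true_slope=40.0, n=18, reps=2000, seed=1):
    """Fraction of simulated samples whose interval captures true_slope."""
    rng = random.Random(seed)
    xs = [i % 6 for i in range(n)]  # hypothetical ad counts, 0 through 5
    hits = 0
    for _ in range(reps):
        # Hypothetical population: linear mean plus normal noise.
        ys = [200.0 + true_slope * x + rng.gauss(0, 50) for x in xs]
        lower, upper = slope_interval(xs, ys, T_STAR_DF16)
        if lower <= true_slope <= upper:
            hits += 1
    return hits / reps


if __name__ == "__main__":
    print(f"Share of intervals capturing the true slope: {coverage():.3f}")
```

Each simulated sample produces a different interval; the confidence level describes how often the procedure succeeds across samples, so the printed share should land near 0.95 rather than describing any single interval or any share of data points.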

3

A biologist uses linear regression to predict plant height (cm) from hours of sunlight per day for a random sample of 22 plants. A 98% confidence interval for the slope is $(-0.3,\ 1.9)$ cm per hour. Which interpretation is correct?

We are 98% confident that for each additional hour of sunlight, the population mean plant height changes by between -0.3 and 1.9 cm, on average.

We are 98% confident that the correlation is between -0.3 and 1.9.

Because 0 is in the interval, the correlation between sunlight and height is 0.

There is a 98% chance that the true slope is between -0.3 and 1.9 for this sample.

About 98% of plants will grow between -0.3 and 1.9 cm for each extra hour of sunlight.

Explanation

This question examines interpretation when a confidence interval includes both positive and negative values. The interval (-0.3, 1.9) contains 0, meaning we cannot determine whether the relationship is positive or negative at the 98% confidence level. Option A correctly interprets this: we are 98% confident that for each additional hour of sunlight, the population mean plant height changes by between -0.3 and 1.9 cm. Option B confuses slope with correlation values. Option C incorrectly concludes the correlation is exactly 0. Option D wrongly treats the confidence level as a probability statement about the slope. Option E misapplies the interval to individual plants. Key insight: when an interval contains 0, the relationship could be positive, negative, or zero in the population.
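The key insight above amounts to a simple check on the endpoints. The following is a minimal sketch of that check; the function name `describe_slope_interval` and its return labels are hypothetical, chosen only for illustration.

```python
def describe_slope_interval(lower, upper):
    """Say what a slope confidence interval implies about direction."""
    if lower > upper:
        raise ValueError("lower endpoint must not exceed upper endpoint")
    if lower > 0:
        return "positive"      # every plausible slope is above 0
    if upper < 0:
        return "negative"      # every plausible slope is below 0
    return "undetermined"      # 0 is among the plausible slopes


if __name__ == "__main__":
    # Intervals taken from questions on this page.
    for interval in [(8, 14), (-0.3, 1.9), (-1.8, -0.4)]:
        print(interval, describe_slope_interval(*interval))
```

A result of "undetermined" means only that the data are consistent with a slope of 0 at that confidence level; it does not show that the slope is 0.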

4

A marketing team samples 22 weeks, recording $x$ = number of promotional emails sent (in thousands) and $y$ = weekly revenue (in thousands of dollars). The regression of revenue on emails gives a 95% confidence interval for the population slope $\beta$ of $(0.0,\ 2.5)$ (with the lower endpoint rounded to 0.0). Which interpretation is correct?

We are 95% confident that for each additional 1,000 emails sent, the mean weekly revenue in the population increases by between 0.0 and 2.5 thousand dollars.

We are 95% confident that the correlation between emails and revenue is between 0.0 and 2.5.

Since the interval's lower endpoint is 0.0, it proves sending more emails cannot decrease revenue.

There is a 95% probability that weekly revenue will increase by between $\$0$ and $\$2{,}500$ when 1,000 more emails are sent.

Because the interval includes 0.0, the slope is exactly 0, so revenue and emails are uncorrelated.

Explanation

This question examines interpreting a 95% confidence interval for the slope β of revenue on emails sent. The interval (0.0, 2.5) means we are 95% confident that β, the change in mean weekly revenue per 1,000 additional emails, is between $0 and $2,500, which is Choice A. Choice E is a distractor, incorrectly asserting that including zero means the slope is exactly zero and the variables are uncorrelated, but zero is just one plausible value. Choice D misapplies probability to individual revenue changes rather than the mean. Mini-lesson: Slope confidence intervals provide a range of plausible values for the population's mean change per unit; a lower endpoint that rounds to 0.0 means a slope at or near zero remains plausible, so the interval cannot prove a direction. Such intervals do not apply to correlations, and the confidence level describes the method's reliability over many samples, not a single instance.

5

An environmental scientist models ozone level (ppb) as a function of daily high temperature ($^\circ$F) using data from 25 randomly selected days. A 90% confidence interval for the regression slope is $(-1.8,\ -0.4)$ ppb per $^\circ$F. Which interpretation is correct?

Because the interval is negative, the correlation must be between -1.8 and -0.4.

There is a 90% probability that the true slope is negative.

If we repeated the study many times, 90% of the time the sample slope would equal a value between -1.8 and -0.4 exactly.

We are 90% confident that for each 1$^\circ$F increase in temperature, the population mean ozone level decreases by between 0.4 and 1.8 ppb, on average.

About 90% of individual days will have ozone levels that drop by between 0.4 and 1.8 ppb for each 1$^\circ$F increase.

Explanation

This question involves interpreting a confidence interval for slope when predicting ozone from temperature. The interval (-1.8, -0.4) is entirely negative, indicating an inverse relationship. Option D correctly states that we are 90% confident the population mean ozone level decreases by between 0.4 and 1.8 ppb for each 1°F increase in temperature. Option A confuses slope values with correlation values (correlation must be between -1 and 1). Option B incorrectly assigns probability to the parameter. Option C misunderstands what repeated sampling would show: the confidence level describes how often intervals built this way capture the true slope, not where sample slopes land. Option E wrongly applies the interval to individual days rather than the population mean. Key insight: negative slopes indicate inverse relationships, and confidence intervals describe population parameters, not individual observations.

6

A biologist measures 20 plants of the same species, recording $x$ = hours of sunlight per day and $y$ = weekly growth (cm). The regression of growth on sunlight yields a 99% confidence interval for the population slope $\beta$ of $(-0.4,\ 1.6)$. Which interpretation is correct?

There is a 99% chance that plants will grow between $-0.4$ and $1.6$ cm more each week for every extra hour of sunlight.

Since the interval includes both negative and positive values, sunlight has no effect on growth for any plant.

We are 99% confident that the correlation between sunlight and growth is between $-0.4$ and $1.6$.

Because 0 is in the interval, the slope is 0, so there is no relationship between sunlight and growth.

We are 99% confident that for each additional hour of sunlight, the mean weekly growth in the population changes by between $-0.4$ and $1.6$ cm.

Explanation

This question tests understanding a 99% confidence interval for the slope β in regressing plant growth on sunlight hours. The interval (-0.4, 1.6) suggests we are 99% confident that β, the change in mean weekly growth per extra hour of sunlight, ranges from -0.4 to 1.6 cm, which is Choice E. A frequent distractor is Choice D, which erroneously concludes that including zero means the slope is exactly zero and no relationship exists, when it only means zero is plausible. Choice B overgeneralizes the interval's inclusion of negatives and positives to claim no effect for any plant, ignoring variability. Mini-lesson: Confidence intervals for slopes estimate the plausible range for the population's mean response change per unit increase in the predictor; a higher confidence level produces a wider interval, trading precision for confidence, and an interval that contains zero indicates the data are consistent with no linear association without proving it.
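To see how the confidence level affects width, the sketch below backs out the sample slope and standard error implied by the 99% interval $(-0.4,\ 1.6)$ and rebuilds the interval at other levels. It assumes a symmetric t-interval with 20 - 2 = 18 degrees of freedom; the critical values are standard t-table entries supplied here as assumptions, and the names `T_STAR_DF18` and `interval_at` are hypothetical.

```python
# Endpoints from the question: a 99% interval for the slope, n = 20 plants.
LOWER_99, UPPER_99 = -0.4, 1.6

# Assumed t critical values for df = 18, read from a standard t-table.
T_STAR_DF18 = {90: 1.734, 95: 2.101, 99: 2.878}

# Implied sample slope (midpoint) and standard error (margin / t*).
b = (LOWER_99 + UPPER_99) / 2
se_b = (UPPER_99 - LOWER_99) / 2 / T_STAR_DF18[99]


def interval_at(level):
    """Rebuild the slope interval b +/- t* * SE_b at the given level."""
    margin = T_STAR_DF18[level] * se_b
    return b - margin, b + margin


if __name__ == "__main__":
    for level in sorted(T_STAR_DF18):
        lower, upper = interval_at(level)
        print(f"{level}%: ({lower:.2f}, {upper:.2f}), width {upper - lower:.2f}")
```

With the same data, raising the confidence level only increases the critical value, so the interval widens around the same sample slope.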

7

A meteorologist uses data from a random sample of 20 days to relate humidity ($x$, percent) to the maximum temperature ($y$, degrees F). A least-squares regression line predicts $y$ from $x$. A 90% confidence interval for the true slope is $(-0.30,\ 0.05)$ degrees F per percent humidity. Which interpretation is correct?

We are 90% confident that for each 1% increase in humidity, the mean maximum temperature in the population changes by between $-0.30$ and $0.05$ degrees F, on average.

We are 90% confident that the correlation between humidity and maximum temperature is between $-0.30$ and $0.05$.

Because 0 is in the interval, the slope in the population must be 0.

There is a 90% chance that the true slope is between $-0.30$ and $0.05$ after seeing the data.

90% of individual days will have maximum temperature changes between $-0.30$ and $0.05$ degrees F for each 1% increase in humidity.

Explanation

This question involves a confidence interval (-0.30, 0.05) that contains zero. Option A correctly interprets this as being 90% confident that for each 1% increase in humidity, the mean maximum temperature changes by between -0.30 and 0.05 degrees F. Option B confuses slope with correlation, Option C incorrectly concludes the slope must be 0, Option D misinterprets confidence as posterior probability, and Option E applies the interval to individual days rather than the population mean. Key insight: when 0 is in the confidence interval, we cannot determine the direction of the relationship at that confidence level; the true slope could be positive, negative, or zero.

8

An economist uses data from a random sample of 25 cities to study the relationship between median rent ($y$, dollars) and distance from the city center ($x$, miles). A least-squares regression line predicts rent from distance. A 95% confidence interval for the true slope is $(-85,\ -20)$ dollars per mile. Which interpretation is correct?

We are 95% confident that the mean rent in the population decreases by between $\$20$ and $\$85$ for each additional mile from the city center, on average.

95% of individual city rents will decrease by between $\$20$ and $\$85$ when distance increases by 1 mile.

There is a 95% chance that the true slope is between $-85$ and $-20$ because this interval was computed from the sample.

Because the interval does not include 0, the rent must decrease by exactly $\$85$ per mile in the population.

We are 95% confident that $r$ is between $-85$ and $-20$.

Explanation

This question tests interpretation of a negative confidence interval for slope in an economics context. The interval (-85, -20) means we're 95% confident the true slope is between -85 and -20 dollars per mile. Option A correctly interprets this as the mean rent decreasing by between $20 and $85 for each additional mile from the city center (note that a negative slope is phrased as a decrease, so the amounts are stated as positive). Option B misapplies the interval to individual cities, Option C incorrectly assigns probability after seeing the data, Option D makes an unfounded claim about the exact value, and Option E confuses slope with correlation. Remember: confidence intervals describe our uncertainty about population parameters, not variability in individual observations.

9

A nutrition scientist samples 22 adults and measures daily fiber intake ($x$, grams) and LDL cholesterol ($y$, mg/dL). A least-squares regression line predicts LDL from fiber intake. A 95% confidence interval for the true slope is $(-1.9,\ -0.2)$ mg/dL per gram. Which interpretation is correct?

There is a 95% chance that the slope is between $-1.9$ and $-0.2$ mg/dL per gram.

If fiber intake increases by 1 gram, then 95% of individuals will reduce LDL by between 0.2 and 1.9 mg/dL.

Because the interval is negative, fiber intake causes LDL to decrease for every individual.

We are 95% confident that the correlation between fiber and LDL is between $-1.9$ and $-0.2$.

We are 95% confident that for each additional gram of fiber, the mean LDL cholesterol in the population decreases by between 0.2 and 1.9 mg/dL, on average.

Explanation

This question presents a negative confidence interval (-1.9, -0.2) in a health context. Option E correctly states we're 95% confident that for each additional gram of fiber, the mean LDL cholesterol decreases by between 0.2 and 1.9 mg/dL in the population. Option A misinterprets confidence as probability, Option B incorrectly applies the interval to individual people, Option C makes an unfounded causal claim about every individual, and Option D confuses slope with correlation (correlation has no units). Important distinction: regression describes associations on average, not deterministic relationships for every individual, and confidence intervals quantify our uncertainty about population parameters.

10

A city planner records data from 15 neighborhoods on $x$ = distance (miles) from downtown and $y$ = average monthly rent (dollars). A least-squares regression of rent on distance gives a 90% confidence interval for the slope $\beta$ of $(-220,\ -40)$. Which interpretation is correct?

There is a 90% probability that the slope for this fitted line is between $-220$ and $-40$.

Since 0 is not in the interval, exactly 90% of neighborhoods farther from downtown have lower rent.

We are 90% confident that each additional mile causes rent to drop by between $\$40$ and $\$220$ per month.

We are 90% confident that for each additional mile from downtown, the mean rent in the population decreases by between $\$40$ and $\$220$ per month.

Because the interval contains negative values, the correlation must be between $-220$ and $-40$.

Explanation

This question evaluates the interpretation of a 90% confidence interval for the slope β in a regression of rent on distance from downtown. The interval (-220, -40) means we are 90% confident that the true β, the change in mean monthly rent per additional mile, is between -220 and -40 dollars, that is, a decrease of 40 to 220 dollars, which is Choice D. Choice C is a distractor because it assumes the interval implies causation, but confidence intervals do not establish cause-and-effect relationships. Choice E mistakenly equates the slope interval with the correlation coefficient, which is bounded between -1 and 1. Mini-lesson: A confidence interval for the regression slope provides a range where the true population rate of change plausibly falls, accounting for sampling error; negative endpoints here indicate a plausible negative association, and the confidence level reflects the long-run success rate of the interval method in capturing β.
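The distinction between slope and correlation can be made concrete with the standard identity linking them for a least-squares line. The symbols $b$ (sample slope), $r$ (sample correlation), and $s_x$, $s_y$ (sample standard deviations of distance and rent) are introduced here for illustration and do not appear in the question.

$$
b = r \cdot \frac{s_y}{s_x}, \qquad -1 \le r \le 1
$$

The slope carries units (dollars per mile) and scales with $s_y / s_x$, so values such as $-220$ are possible for $b$ but never for $r$. The two always share the same sign, which is why an entirely negative slope interval points to a negative association without saying anything about the size of $r$.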
