Estimating Population Proportions with Random Samples
🎒 with Lumi
Lumi stands at the front of a statistics classroom, reaching into a large bowl of numbered tiles and pulling out handfuls at a time while a bar chart fills in on the whiteboard behind her; next to the bar chart is a thick shaded band drawn around a percentage — a visual bracket with arrows on both ends labeled '±4%' — showing students that the true value could land anywhere inside that highlighted zone, not at a single point.
Explain why a random sample allows defensible estimates about a larger population and how to frame a conclusion with a confidence-level-qualified margin of error.
Identify conditions that make a sample biased and explain how bias distorts conclusions regardless of sample size.
Calculate a sample proportion, approximate a margin of error using the 1/√n rule, and state a conclusion with a 95% confidence interval.
Compare two sampling methods and predict which will yield more reliable inferences by naming the source of any bias.
Describe how simulation models for random sampling can be used to develop and verify margin-of-error estimates.
Key terms
Population
The entire group of people or objects you want to draw conclusions about.
Simple random sample
A subset chosen so that every population member has an equal, independent chance of selection.
Sample proportion
The fraction of sampled members that have the trait: the count with the trait divided by the sample size.
Margin of error
The plus-or-minus span around an estimate that reflects sampling variability at a stated confidence level.
Voluntary response bias
Distortion arising when only self-selected, motivated members choose to respond.
Bias Is About Selection, Not Size
A sample is biased when the method of choosing members systematically over- or under-represents part of the population, and no increase in sample size can repair that flaw. Voluntary online polls, convenience samples from one classroom, and surveys that exclude unreachable members all introduce predictable distortions. A small simple random sample is more trustworthy than a huge biased one, because randomness is what guarantees the sample resembles the population on average. The first question for any inference is therefore whether every member had an equal chance of selection, not how many were collected.
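A short simulation can make the "selection, not size" point concrete. The sketch below is illustrative only: it assumes a hypothetical population of 100,000 members in which 30% have a trait, and a hypothetical voluntary-response mechanism in which members with the trait are three times as likely to respond. The function name and every number in it are assumptions, not part of the lesson.

```python
import random

def simulate_bias_vs_size(seed=0):
    """Compare a small simple random sample with a large self-selected one."""
    rng = random.Random(seed)
    # Hypothetical population: 100,000 members, 30% have the trait (True).
    population = [i < 30_000 for i in range(100_000)]
    true_p = sum(population) / len(population)

    # Small simple random sample: every member is equally likely to be chosen.
    srs = rng.sample(population, 200)
    srs_p = sum(srs) / len(srs)

    # Large voluntary-response sample (assumed mechanism): members with the
    # trait respond with probability 0.30, members without it with 0.10.
    responders = [has_trait for has_trait in population
                  if rng.random() < (0.30 if has_trait else 0.10)]
    biased_p = sum(responders) / len(responders)
    return true_p, srs_p, len(responders), biased_p

true_p, srs_p, n_responders, biased_p = simulate_bias_vs_size()
print(f"true proportion:        {true_p:.1%}")
print(f"random sample (n=200):  {srs_p:.1%}")
print(f"self-selected (n={n_responders}): {biased_p:.1%}")
```

Under these assumptions the 200-member random sample typically lands within a few percentage points of 30%, while the self-selected sample of roughly 16,000 responses settles near 56%, far from the truth despite being about eighty times larger.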
Estimating With a Confidence Interval
From an unbiased sample you compute the sample proportion and surround it with a margin of error to form an interval estimate. The classroom rule of thumb, margin of error ≈ 1/√n, captures how the spread shrinks as the sample grows. A 95% confidence level means the procedure produces intervals that capture the true population proportion in about 95 of every 100 repetitions; it is a statement about the long-run reliability of the method, not a probability about any single computed interval. Always report the interval, never a bare point estimate, so the uncertainty stays visible.
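The long-run meaning of "95% confidence" can be checked by repetition. The following is a minimal sketch, assuming a hypothetical true proportion of 0.40 and samples of size 400; it draws many samples, builds the interval p̂ ± 1/√n for each, and counts how often the interval contains the true value. The function name and parameters are illustrative.

```python
import math
import random

def coverage_rate(true_p=0.40, n=400, repetitions=2000, seed=1):
    """Fraction of repeated samples whose interval p_hat ± 1/sqrt(n) contains true_p."""
    rng = random.Random(seed)
    margin = 1 / math.sqrt(n)
    hits = 0
    for _ in range(repetitions):
        # Draw n independent members; each has the trait with probability true_p.
        count = sum(rng.random() < true_p for _ in range(n))
        p_hat = count / n
        if p_hat - margin <= true_p <= p_hat + margin:
            hits += 1
    return hits / repetitions

print(f"intervals containing the true proportion: {coverage_rate():.1%}")
```

Any single interval either contains the true proportion or it does not; the roughly 95% figure describes the procedure across repetitions. Because 1/√n is a slightly generous margin, the simulated rate tends to come out a little above 95%.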
Diminishing Returns of Larger Samples
Because the margin of error scales like 1/√n, quadrupling the sample size only halves the margin of error. Moving from n = 100 to n = 400 cuts ±10% to ±5%, and reaching ±2.5% requires n = 1600. The interval narrows toward zero only as n approaches infinity, so it never literally vanishes for a finite sample. Recognizing this square-root relationship explains why pollsters rarely sample more than a couple thousand people: beyond that, the precision gained per additional respondent becomes too small to justify the cost.
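The square-root relationship can be tabulated directly. This sketch uses two hypothetical helper names: one applies the 1/√n rule, the other inverts it to ask how large a sample a target margin would require.

```python
import math

def rule_of_thumb_margin(n):
    """Approximate 95% margin of error from the 1/sqrt(n) rule."""
    return 1 / math.sqrt(n)

def sample_size_for(margin):
    """Smallest n whose rule-of-thumb margin is at most `margin`."""
    # round() guards against floating-point noise before taking the ceiling.
    return math.ceil(round(1 / margin ** 2, 9))

for n in (100, 400, 1600, 6400):
    print(f"n = {n:>5}: ±{rule_of_thumb_margin(n):.2%}")

for target in (0.05, 0.025, 0.01):
    print(f"±{target:.1%} needs n ≈ {sample_size_for(target)}")
```

Each halving of the margin multiplies the required sample size by four, which is why a ±1% margin calls for about 10,000 respondents under this rule.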
Worked examples
From 225 randomly chosen households, 45 have solar panels; estimate the population proportion with a margin of error.
Compute the sample proportion: 45 ÷ 225 = 0.20, or 20%.
Approximate the margin of error: 1/√225 = 1/15 ≈ 0.067, about ±6.7%.
Form the 95% interval: 20% ± 6.7%, roughly 13.3% to 26.7%.
Answer: We can be about 95% confident that the true proportion of households with solar panels is between roughly 13% and 27%.
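The three steps of this example can be wrapped in one small function. This is a sketch with a hypothetical function name; clipping the interval to the range 0 to 1 is an added assumption that does not affect this example.

```python
import math

def estimate_proportion(count, n):
    """Return (sample proportion, margin, lower bound, upper bound) using 1/sqrt(n)."""
    p_hat = count / n
    margin = 1 / math.sqrt(n)
    # A proportion cannot fall outside [0, 1], so clip the interval ends.
    lower = max(0.0, p_hat - margin)
    upper = min(1.0, p_hat + margin)
    return p_hat, margin, lower, upper

p_hat, margin, lower, upper = estimate_proportion(45, 225)
print(f"{p_hat:.1%} ± {margin:.1%} -> {lower:.1%} to {upper:.1%}")
# prints: 20.0% ± 6.7% -> 13.3% to 26.7%
```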
How much does the margin of error change from n = 100 to n = 400 voters?
At n = 100: margin ≈ 1/√100 = 1/10 = 0.10, about ±10%.
At n = 400: margin ≈ 1/√400 = 1/20 = 0.05, about ±5%.
Quadrupling n from 100 to 400 halved the margin from ±10% to ±5%, matching the 1/√n rule.
Answer: The margin halves from ±10% to ±5%, because quadrupling n halves 1/√n.
Lumi sets down the bowl and turns to face you. "Here's the situation," she says. "You almost never get to measure every single person or object in a group you care about — that group is called the **population**. So instead, you measure a smaller **sample** and use what you find to make an educated estimate about the whole population."
She taps the whiteboard. "The critical question is: *which* sample members did you pick? If your selection method systematically favors some members over others, your sample is **biased**, and any conclusion you draw is not trustworthy — even if the sample is large. Bias is about *how* you selected, not *how many* you selected."
"A **simple random sample (SRS)** fixes this. When every member of the population has an equal, independent chance of being chosen — exactly like drawing numbered tiles from this bowl without looking — no systematic preference sneaks in. That fairness is what makes your conclusion defensible."
"Once you have an unbiased sample, you compute a **sample proportion**: divide the count of the trait you're tracking by the total sample size. For example, if 34 out of 80 randomly chosen students prefer morning classes, the sample proportion is 34 ÷ 80 = 0.425, or about 42.5%. You then argue — carefully — that the population proportion is *likely near* 42.5%."
"But how near? That is where the **margin of error** comes in — and it must always be paired with a **confidence level**. A 95% confidence interval is the standard: it means that if you repeated this sampling process many times, about 95 of every 100 resulting intervals would contain the true population proportion. To *approximate* the margin of error, use this rule of thumb: **margin of error ≈ 1/√n**, where *n* is your sample size. With n = 80: 1/√80 ≈ 1/8.9 ≈ 0.112, or about ±11%. A more precise calculation uses the sample proportion and a z-score, but 1/√n gives you the right order of magnitude quickly."
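To see how close the rule of thumb comes to the more precise calculation mentioned above, the two can be computed side by side. The lesson does not spell out the precise formula; this sketch assumes the common normal-approximation form z·√(p̂(1−p̂)/n) with z ≈ 1.96 for 95% confidence, and the function names are illustrative.

```python
import math

def rule_of_thumb_margin(n):
    """Approximate 95% margin of error from the 1/sqrt(n) rule."""
    return 1 / math.sqrt(n)

def normal_approx_margin(p_hat, n, z=1.96):
    """Assumed 'precise' form: z * sqrt(p_hat * (1 - p_hat) / n)."""
    return z * math.sqrt(p_hat * (1 - p_hat) / n)

p_hat, n = 34 / 80, 80
print(f"rule of thumb:  ±{rule_of_thumb_margin(n):.1%}")
print(f"normal approx.: ±{normal_approx_margin(p_hat, n):.1%}")
```

For the morning-classes example the two margins are about ±11.2% and ±10.8%. Under this assumed formula the rule of thumb is never the smaller of the two, because √(p̂(1−p̂)) is at most 0.5 and 1.96 × 0.5 is just under 1.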
"A larger random sample produces a smaller margin of error. With n = 400: 1/√400 = 1/20 = 0.05, so ±5%. With n = 1600: 1/√1600 = 1/40 = 0.025, so ±2.5%. You can also use **simulation**: draw thousands of virtual random samples from a model population, compute each sample proportion, and observe the spread of results — that spread directly shows you what a margin of error means in practice."
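The simulation described above might look like the following sketch. It assumes a hypothetical model population in which the true proportion is 0.425, draws 5,000 virtual samples of size 80, and measures the half-width of the band that holds the middle 95% of the sample proportions. All names and parameter values are illustrative.

```python
import random

def simulate_sample_proportions(true_p=0.425, n=80, trials=5000, seed=2):
    """Sample proportions from many virtual random samples of size n."""
    rng = random.Random(seed)
    proportions = []
    for _ in range(trials):
        # Each sampled member has the trait with probability true_p.
        count = sum(rng.random() < true_p for _ in range(n))
        proportions.append(count / n)
    return proportions

def middle_95_half_width(proportions):
    """Half the distance between the 2.5th and 97.5th percentiles."""
    ordered = sorted(proportions)
    low = ordered[int(0.025 * len(ordered))]
    high = ordered[int(0.975 * len(ordered)) - 1]
    return (high - low) / 2

props = simulate_sample_proportions()
print(f"simulated margin: ±{middle_95_half_width(props):.1%}")
print(f"1/sqrt(n) rule:   ±{1 / 80 ** 0.5:.1%}")
```

The simulated half-width should land close to the 1/√n value of about ±11%, which is one way to verify the rule of thumb without any formula.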
"When you write a conclusion, always state (1) what population you are describing, (2) how the sample was drawn, (3) whether the method was genuinely random, and (4) the confidence level for your interval. Those four elements are what make statistical inference defensible rather than just a guess."
**Non-stranding hint (if you get stuck):** Start with one question: did every member of the population have an equal chance of being selected? If not, name the bias. If so, compute the sample proportion, then approximate the margin of error with 1/√n. That sequence will get you through any inference problem in this lesson.
Activity
For each sampling scenario below, write a one-sentence defensible conclusion or explain the specific flaw that makes the conclusion indefensible — then calculate the approximate margin of error using 1/√n where applicable.
Practice
A random sample of 64 shoppers shows 24 used coupons; find the sample proportion and approximate margin of error.
A radio host tallies call-in responses and reports a citywide opinion; name the bias and explain why more calls do not fix it.
Common mistakes to avoid
A large sample fixes bias
Bias comes from how members are selected, so a larger biased sample only repeats the same systematic error more loudly.
Bigger samples drive margin of error to zero
The margin shrinks like 1/√n and only approaches zero as n approaches infinity, so any finite sample keeps a nonzero margin.
Check your understanding
A quality-control engineer randomly selects 50 light bulbs from a production run of 4,000 and finds that 3 are defective. Which statement is the most defensible inference?
A student wants to estimate what fraction of the 800 students at her school walk to school. She surveys every student in her first-period class and finds that 8 out of 25 walk. Why is this sample likely biased?
Two students each conduct a survey about after-school study habits. Student A randomly selects 40 students from the full school roster. Student B posts a survey link on a school social-media page and gets 120 responses. Whose results support a more defensible inference about the full student body, and why?
A survey of 100 randomly selected voters finds that 54% plan to vote yes on a ballot measure. Using the 1/√n approximation, the margin of error is about ±10%. What is the correct 95% confidence interval interpretation?
Recap
Valid inference rests on a simple random sample in which every member has an equal chance of selection. From it you compute a sample proportion and an interval whose margin of error shrinks like 1/√n, and you report the confidence level alongside that interval. Randomness, not size, is what makes the estimate defensible.
Reflect
How might the way a question's respondents are chosen quietly shape the answer you get?