
| | A | B | C | D |
|---|---|---|---|---|
| 1 | Inputs | |||
| 2 | Parameter | Value | Unit | Notes |
| 3 | Corpus size (N) | | docs | Total documents in the population you're sampling from. |
| 4 | Expected prevalence / richness (p) | | decimal | Best guess at the responsive rate. Use 0.5 for maximum conservatism. |
| 5 | Confidence level (C) | | decimal | Typically 0.95 (courts have accepted 0.90 in some matters). |
| 6 | Margin of error (E) | | decimal | Half-width of confidence interval. 0.03 = ±3%. |
| 7 | Outputs | |||
| 8 | Result | Value | Unit | Interpretation |
| 9 | Required sample size (n) | — | docs | Randomly draw exactly this many documents for your sample. |
| 10 | Expected responsive in sample | — | docs | n × p. Your reviewers should see roughly this many responsive. |
| 11 | Sample as % of corpus | — | % | Rule of thumb — anything under 5% is generally proportionate. |
| 12 | z-score (from C) | — | — | 1.96 at 95%, 1.645 at 90%, 2.576 at 99%. |
| 13 | Estimated review cost (@ $2.50 / doc) | — | USD | Adjust in-house rate; contract-review typical range $1.50–$5/doc. |
| 14 | Reference · Common Sample Sizes at 95% Confidence | |||
| 15 | Corpus size | ±5% MoE, p=0.5 | ±3% MoE, p=0.5 | ±3% MoE, p=0.1 |
| 16 | 10,000 | 370 | 964 | 370 |
| 17 | 50,000 | 381 | 1,045 | 381 |
| 18 | 100,000 | 383 | 1,056 | 383 |
| 19 | 500,000 | 384 | 1,065 | 384 |
| 20 | 1,000,000 | 384 | 1,066 | 384 |
| 21 | 10,000,000 | 384 | 1,067 | 384 |
| 22 | Notes on Use | |||
| 23 | For control sets in TAR 1: set richness (p) close to the responsive rate expected from the seed set. A p below 0.5 lowers the required n, but a low-richness sample contains few responsive documents, so precision estimates computed from it have higher variance. | |||
| 24 | For elusion sampling in TAR 2: p is your expected elusion rate (usually low — 0.01 to 0.05). E is how tight you need the ceiling on missed responsives. | |||
| 25 | For validation of a GenAI pass: sample the excluded set, not the included set. p = expected false-negative rate. | |||
| 26 | Court-tested defaults: C=0.95, E=0.03 for control sets; C=0.95, E=0.02 for high-stakes elusion sampling. | |||

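
The output rows of the worksheet (rows 9 to 13) can be derived from the four inputs (rows 3 to 6). The sketch below is one way to do that in Python, not the worksheet's actual cell formulas: it assumes the standard finite population correction, rounds the sample size up to stay conservative, and uses the $2.50/doc rate from row 13. The function name `worksheet_outputs` and the dictionary keys are illustrative.

```python
import math
from statistics import NormalDist

COST_PER_DOC = 2.50  # USD, the rate shown in row 13; adjust to your own rate


def worksheet_outputs(N, p, C, E):
    """Compute rows 9-13 of the worksheet from the inputs in rows 3-6."""
    # Two-sided z-score: about 1.96 at C=0.95, 1.645 at 0.90, 2.576 at 0.99.
    z = NormalDist().inv_cdf(1 - (1 - C) / 2)
    # Infinite-population sample size, then the finite population correction.
    n0 = z**2 * p * (1 - p) / E**2
    n = math.ceil(n0 / (1 + (n0 - 1) / N))  # round up (assumption: conservative)
    return {
        "sample_size": n,
        "expected_responsive": round(n * p),
        "sample_pct_of_corpus": round(100 * n / N, 2),
        "z_score": round(z, 3),
        "review_cost_usd": round(n * COST_PER_DOC, 2),
    }


out = worksheet_outputs(N=100_000, p=0.5, C=0.95, E=0.03)
print(out)
```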
Sampling demonstrates, to a stated statistical confidence, that a decision made on a large corpus (which items to review, which to exclude, which to produce) is not producing systematically wrong outcomes. It replaces the alternative of reviewing everything, which is often infeasible and always expensive.
To estimate the responsive rate of a corpus within ±E at confidence C, draw n = (z² × p × (1−p)) / E² documents. When n is a meaningful fraction of N, apply the finite population correction: n_adj = n / (1 + (n − 1) / N). z is the two-sided standard-normal critical value for C (1.96 at 95%). p is your prior estimate of the responsive rate; use 0.5 when you have no prior. The formula applies to any binary classification: responsive/not, privileged/not, correctly-coded/not.
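
As a worked illustration of the formula, the following Python sketch evaluates it with z fixed at 1.96 and, when a corpus size is supplied, applies the standard finite population correction n0 / (1 + (n0 − 1) / N). The function name `sample_size` is illustrative. Results are rounded to the nearest integer here; rounding up with `math.ceil` would be the more conservative choice.

```python
def sample_size(p, E, z=1.96, N=None):
    """n = z^2 * p * (1 - p) / E^2, optionally corrected for a finite corpus N."""
    n0 = z**2 * p * (1 - p) / E**2
    if N is not None:
        # Finite population correction: matters when n0 is not tiny next to N.
        n0 = n0 / (1 + (n0 - 1) / N)
    return n0


print(round(sample_size(0.5, 0.05)))            # 384
print(round(sample_size(0.5, 0.03)))            # 1067
print(round(sample_size(0.5, 0.05, N=10_000)))  # 370
print(round(sample_size(0.5, 0.03, N=10_000)))  # 964
```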
| Protocol | When | Parameters and typical n |
|---|---|---|
| TAR 1 Control Set | Predictive coding, static training | C=0.95, E=0.03, p=your best richness estimate. Typical n = 400–1,100. |
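| TAR 2 Elusion (each round) | Continuous Active Learning | C=0.95, E=0.02, p=0.01–0.05. Typical n = 500–2,400; the formula itself needs only about 100–460 at these p values, and 2,401 is the p=0.5 worst case. |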
| Production QC | Before shipping a production | C=0.95, E=0.05, p=0.5 on responsive-tag accuracy. Typical n = 385. |
| Privilege QC | Before shipping a priv log | C=0.99, E=0.02, p=your priv rate. Typical n = 1,000–4,000. |
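
To see how the parameters in each row translate into a sample size, the sketch below evaluates n = z² × p × (1−p) / E² for each protocol, rounded up and without any finite population correction. These are formula minimums, and teams may draw more when they prefer a more conservative p. The p values used for the control set and privilege rows (0.10 and 0.50) are illustrative assumptions, since the table leaves them to the reader; `required_n` and `protocols` are hypothetical names.

```python
import math
from statistics import NormalDist


def required_n(C, E, p):
    """Uncorrected sample size n = z^2 * p * (1 - p) / E^2, rounded up."""
    z = NormalDist().inv_cdf(1 - (1 - C) / 2)  # two-sided critical value
    return math.ceil(z**2 * p * (1 - p) / E**2)


# (C, E, low p, high p). The p values for the control set and privilege
# rows are assumptions chosen for illustration only.
protocols = {
    "TAR 1 control set": (0.95, 0.03, 0.10, 0.50),
    "TAR 2 elusion": (0.95, 0.02, 0.01, 0.05),
    "Production QC": (0.95, 0.05, 0.50, 0.50),
    "Privilege QC": (0.99, 0.02, 0.10, 0.50),
}

results = {
    name: (required_n(C, E, p_lo), required_n(C, E, p_hi))
    for name, (C, E, p_lo, p_hi) in protocols.items()
}
for name, (n_lo, n_hi) in results.items():
    print(f"{name:18s} n = {n_lo:,} to {n_hi:,}")
```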
"We sampled [n] documents randomly drawn from a population of [N]. At [C]% confidence, the observed [responsive / elusion / privilege] rate of [x]% is within ±[E]% of the true population rate. The sample was reviewed by [role] on [dates] under the same protocol as the main population."
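
Once the sample has been reviewed, the bracketed fields in the statement can be filled in from the results. The sketch below is one possible approach, not a prescribed method: it assumes the achieved margin is computed with the normal approximation plus a finite population correction, and the function name `sampling_statement` and its parameters are hypothetical. For very low observed rates, as in elusion samples, the normal approximation is unreliable and an exact binomial interval is usually the better choice.

```python
import math
from statistics import NormalDist


def sampling_statement(n, N, hits, C, kind, role, dates):
    """Fill the reporting template from sample results (normal approximation)."""
    rate = hits / n
    z = NormalDist().inv_cdf(1 - (1 - C) / 2)  # two-sided critical value
    # Achieved margin of error, including the finite population correction.
    margin = z * math.sqrt(rate * (1 - rate) / n) * math.sqrt((N - n) / (N - 1))
    return (
        f"We sampled {n:,} documents randomly drawn from a population of {N:,}. "
        f"At {100 * C:.0f}% confidence, the observed {kind} rate of "
        f"{100 * rate:.1f}% is within ±{100 * margin:.1f}% of the true population rate. "
        f"The sample was reviewed by {role} on {dates} under the same protocol "
        f"as the main population."
    )


# Example values are made up for illustration; role and dates stay as placeholders.
statement = sampling_statement(
    n=1056, N=100_000, hits=528, C=0.95,
    kind="responsive", role="[role]", dates="[dates]",
)
print(statement)
```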