Sources & References

Every empirical claim on this site traces to one of the sources below. Where the original data is publicly available, the link goes to it. Where the source is a peer-reviewed paper, we cite the canonical reference.

Each finding page carries numbered citation markers — [1], [2], etc. — that map directly to the entries below.

The masthead acronyms, in plain words. The site header credits four data sources:

Census CPS — the U.S. Census Bureau's Current Population Survey, November Voting and Registration Supplement (our primary turnout microdata). [1]
McDonald VEP — Michael McDonald's Voting-Eligible Population turnout benchmark, from the U.S. Elections Project. [2]
NCSL — the National Conference of State Legislatures election-policy database. [6]
State election offices — the certified vote totals and election-administration data states produce, reaching this site through the compilations noted just below.

A note on "state election offices." The site masthead credits state election offices as a source. This site does not hold raw files directly from those offices; rather, the official certified vote totals and election-administration data they produce reach the analysis through three compilations listed below — the McDonald Voting-Eligible Population (VEP) turnout benchmark [2], the U.S. Election Assistance Commission's Election Administration and Voting Survey (EAVS) of state and local election offices [3], and the MIT Election Data and Science Lab (MEDSL) state returns [4].

Primary microdata

[1] U.S. Census Bureau, Current Population Survey (CPS), November Voting and Registration Supplement, 2000–2024. IPUMS-CPS extract cps_00002. 1,641,940 raw rows; 1,195,957 citizen-adult observations with valid voter-supplement weights. cps.ipums.org
Used in: every finding (1 · 2 · 3 · 4 · 5 · 6 · 7), the home-page hero, the methodology page, The Solution, and the worked example How do we know this?

Turnout benchmarks

[2] Michael P. McDonald, United States Elections Project — Voting-Eligible Population (VEP) Turnout Rates, 1980–2022 v1.2. October 2024. election.lab.ufl.edu
Used in: turnout calibration on the methodology page (every state-level rate is checked against this benchmark), plus the CPS over-report cross-check on Finding 1 and Finding 2, and the policy-lever chart on Finding 6.

[3] U.S. Election Assistance Commission, Election Administration and Voting Survey (EAVS), Sections A / C / D / F + Policy Survey, May 2025 time-series release. 2004–2022 jurisdiction-level data. Scope: 61,262 county/jurisdiction rows × 125 variables per section (DATA_INVENTORY.md §EAVS). eac.gov/research-and-data/eavs-retrospective
Used in: the methodology page — registration-rate cross-validation and election-administration context.

[4] MIT Election Data and Science Lab (MEDSL), state-level presidential / senate / house returns, 1976–2024. github.com/MEDSL
Used in: Finding 7 (Institutional structure) and the methodology page, for state returns and presidential-margin competitiveness.

[5] U.S. Census Bureau, American Community Survey Citizen Voting-Age Population (CVAP) Special Tabulation 2019–2023. State × race/ethnicity. Scope: 677 state-level rows × 13 race/ethnicity categories (DATA_INVENTORY.md §ACS CVAP). census.gov/data/datasets/2023/dec/rdo/2019-2023-CVAP.html
Used in: the methodology page — 2024 population denominators and demographic reference.

State policy layer

[6] National Conference of State Legislatures (NCSL), state election policy taxonomy — automatic voter registration, same-day registration, online voter registration, pre-registration at 16/17, no-excuse absentee, universal vote-by-mail. Current as of April 2026. Scope: 155 state-policy rows across 6 policy dimensions (DATA_INVENTORY.md §NCSL). ncsl.org/elections-and-campaigns
Used in: Finding 3, Finding 4, Finding 5, Finding 6 (State policy as the lever), The Solution, and the methodology page.

Institutional structure

[7] State congressional redistricting method taxonomy, derived from NCSL Redistricting Commissions page and Ballotpedia. Independent commission / advisory commission / backup commission / legislative-nonpartisan-staff / legislature classifications across all 50 states + DC. Scope: 51 state × redistricting-method classifications (DATA_INVENTORY.md §Redistricting).
Used in: Finding 7 (Institutional structure) and the methodology page, for the redistricting-method classification.

[8] Commission-adoption event dataset, 1968–2021. Compiled from Ballotpedia ballot initiatives, NCSL records, and state constitutional histories. 13 commission-adoption events spanning the analytical window.
Used in: Finding 7 (Institutional structure) and the methodology page, for the within-state pre/post natural experiment.

Attitudinal complements

[9] American National Election Studies (ANES), Time Series Study — external efficacy, political trust, political interest series, 1960–2024. electionstudies.org
Used in: the efficacy supplement, Finding 2, Finding 7, and the methodology page — national-level attitudinal trends.

[10] Harvard Institute of Politics (IOP), Youth Poll — pre-election waves, 2016–2024. 18–29 cohort intent, trust, and country-direction measures. iop.harvard.edu/youth-poll
Used in: Finding 7 (Institutional structure) and the methodology page, for youth intent and attitudinal reference.

[11] Pew Research Center, political engagement cross-tabs, 2020–2023. Voter-file-validated turnout by age and race; always-vote habit measures. pewresearch.org/politics
Used in: Finding 1, Finding 2, and the methodology page — CPS over-report benchmarking.

Standard-error methodology

[12] U.S. Census Bureau, CPS Generalized Variance Parameters, November 2022 Supplement, Technical Documentation Tables 8–11. Used in place of replicate weights (not published for the November supplement). Scope: 51 state parameter rows (50 states + DC) (DATA_INVENTORY.md §GVF). census.gov/programs-surveys/cps/techdocs
Used in: Finding 1, Finding 2, and the methodology page — standard-error computation for published rates.
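
For readers unfamiliar with generalized variance parameters, the following is a minimal sketch of the standard CPS approximation for the standard error of an estimated percentage: SE = sqrt((b / x) · p · (100 − p)), where p is the percentage, x is the weighted base of the percentage, and b is the published parameter. The function name and the values of b and the base are illustrative assumptions, not figures from Tables 8–11, and the site's pipeline may apply state-level parameters differently.

```python
import math


def gvf_se_percent(p: float, base: float, b: float) -> float:
    """Approximate standard error (in percentage points) of a percentage.

    p    : estimated percentage, on a 0-100 scale
    base : weighted population total the percentage is computed on
    b    : generalized variance parameter (hypothetical value in the example)
    """
    if not 0.0 <= p <= 100.0:
        raise ValueError("p must be between 0 and 100")
    if base <= 0:
        raise ValueError("base must be positive")
    return math.sqrt((b / base) * p * (100.0 - p))


# Placeholder inputs chosen for round numbers, not taken from the Census tables.
se = gvf_se_percent(p=50.0, base=1_000_000, b=2_500)
print(f"SE = {se:.2f} percentage points")
```

Because the base sits in the denominator, the standard error shrinks as the weighted base grows, which is why small state-by-year cells carry wide intervals.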

Intervention effect-size catalog

[13] Donald P. Green & Alan S. Gerber, Get Out the Vote: How to Increase Voter Turnout, 4th ed. Brookings Institution Press, 2019. Foundational meta-analysis of GOTV interventions across canvassing, phone banking, mail, digital, and peer-to-peer modalities.
Used in: Findings 2 · 3 · 4 · 5 · 6, the methodology page, and How do we know this? — intervention effect sizes and costs.

[14] Brennan Center for Justice, cost-benefit analyses of election reforms — automatic voter registration, same-day registration, mail-ballot expansion, polling-place access. brennancenter.org
Used in: Findings 2 · 3 · 4 · 5 · 6 · 7, the methodology page, and How do we know this? — reform cost-benefit analyses.

[15] Results for America, evidence-based policy infrastructure for voter participation. results4america.org
Used in: Finding 6, Finding 7, and the methodology page — evidence-based policy infrastructure.

Methodology adjustments

[16] Aram Hur & Christopher H. Achen (2013). "Coding Voter Turnout Responses in the Current Population Survey." Public Opinion Quarterly, 77(4), 985–993. CPS over-reporting adjustment baseline.
Used in: the methodology page — the CPS over-report adjustment.

[17] Stephen Ansolabehere, Bernard L. Fraga, & Brian F. Schaffner (2022). "The Current Population Survey Voting and Registration Supplement Overstates Minority Turnout." The Journal of Politics, 84(3). Caveat on Hur-Achen weighting for groups disaggregated by race.
Used in: the methodology page — the race-specific robustness caveat on the over-report adjustment.
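
The over-report adjustment is usually described as a reweighting: within each state, respondent weights are scaled so that weighted CPS turnout matches an external benchmark such as the VEP rate [2]. The sketch below shows that idea for a single state. It is an illustration under stated assumptions, not this site's implementation; the function names and the four-respondent example are hypothetical.

```python
def adjustment_factors(cps_rate: float, benchmark_rate: float) -> tuple[float, float]:
    """Return (voter_factor, nonvoter_factor) for one state.

    Both rates are proportions strictly between 0 and 1.
    """
    if not (0.0 < cps_rate < 1.0 and 0.0 < benchmark_rate < 1.0):
        raise ValueError("rates must be strictly between 0 and 1")
    voter_factor = benchmark_rate / cps_rate
    nonvoter_factor = (1.0 - benchmark_rate) / (1.0 - cps_rate)
    return voter_factor, nonvoter_factor


def adjust_weights(records, cps_rate, benchmark_rate):
    """Scale each (weight, voted) record by the factor for its turnout status."""
    voter_factor, nonvoter_factor = adjustment_factors(cps_rate, benchmark_rate)
    return [w * (voter_factor if voted else nonvoter_factor) for w, voted in records]


# Hypothetical state: the survey implies 75% turnout, the benchmark says 60%.
records = [(100.0, True), (100.0, True), (100.0, True), (100.0, False)]
new_weights = adjust_weights(records, cps_rate=0.75, benchmark_rate=0.60)

voter_weight = sum(w for w, (_, voted) in zip(new_weights, records) if voted)
adjusted_rate = voter_weight / sum(new_weights)
print(f"adjusted turnout = {adjusted_rate:.2f}")
```

Down-weighting reported voters and up-weighting non-voters this way leaves the state's total weight unchanged while pulling the weighted turnout rate to the benchmark.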

Mobile- and internet-voting research (background)

Background research, not cited in the v1 report. These references support the project's internal review of online and mobile voting; they do not back any finding on this site and are listed for completeness. They would become active citations only if mobile- or internet-voting interventions enter the analysis scope.

[18] Mihkel Solvak & Kristjan Vassil (2018). E-voting in Estonia: Technological Diffusion and Other Developments Over Ten Years (2005–2015). Republic of Estonia. Foundational empirical evidence on Estonia's i-voting system.

[19] Nicole Goodman & Leah C. Stokes (2020). "Reducing the Cost of Voting: An Evaluation of Internet Voting's Effect on Turnout." British Journal of Political Science, 50(3), 1155–1167.

[20] Daniel Stockemer (2024). Reanalysis of Goodman & Stokes (2020). Policy & Internet. Methodological reanalysis finding the Goodman-Stokes lift effect was largely a novelty effect plus spatial non-randomness.

[21] Michael A. Specter, James Koppel, & Daniel Weitzner (2020). "The Ballot is Busted Before the Blockchain: A Security Analysis of Voatz, the First Internet Voting Application Used in U.S. Federal Elections." USENIX Security Symposium.

[22] Andrew W. Appel et al., Princeton Center for Information Technology Policy (CITP), 2025–2026. Critique of Tusk Philanthropies' VoteSecure SDK, including the January 2026 statement: "Internet voting is insecure and should not be used in public elections."

Analytical samples used per finding

Each finding's central numbers come from the pooled CPS Voter Supplement microdata [1]. The specific analytical samples that drive the visible numbers on each finding page:

Finding	Analytical sample size (n)	Source file in this repo
1. The persistent gap	Per-cycle: youth (18–29) n = 12,303–18,524; seniors (65+) n = 14,759–21,642 across 13 federal cycles	`src/data/findings/finding-01/gap_by_year.json`
2. The midterm amplifier	Same per-cycle samples as Finding 1, split by presidential vs midterm cycle	`src/data/findings/finding-02/by_cycle.json`
3. Who didn't register, and why	n = 47,011 pooled youth non-registrants — access barrier 20,556; engagement 10,300; other 9,889; personal 6,266	`src/data/findings/finding-03/reason_national.json`
4. Registered, didn't vote, and why	n = 34,031 pooled registered youth non-voters — logistical 17,216; engagement 7,769; other 4,795; access 2,992; personal 1,259	`src/data/findings/finding-04/reason_national.json`
5. How young voters actually vote	Youth voters' voting-method shares by state regime + cell; cell-level n in source JSON (range ~1,208–30,622 by race × gender pooled)	`src/data/findings/finding-05/method_cell.json`, `…/state_regime.json`
6. State policy as the lever	State × year × youth panel; per-state-year n ranges from 93 (ME 2020) to 1,326 (CA 2020); cells with n < 75 are suppressed	`src/data/findings/finding-06/policy_cross.json`
7. Institutional structure	51 jurisdictions (50 states + DC) × 13 federal cycles aggregated by redistricting method (state-level analysis; no per-respondent n)	`src/data/findings/finding-07/method_youth.json`
Supplement — efficacy (Does my vote matter?)	ANES [9] national-level efficacy / trust series. Race-disaggregated youth cells too small for stable per-cell rates (n = 56 Black youth 2020; n = 79 Hispanic youth 2020 — see methodology page)	`src/finding-efficacy.md`
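
The category counts for Findings 3 and 4 can be checked against their pooled totals and turned into unweighted shares with a few lines of Python. This is an arithmetic sketch only: the counts are copied from the table above, the function name is hypothetical, and the shares shown on the finding pages are presumably survey-weighted, so they may differ from these raw proportions.

```python
finding_3 = {"access barrier": 20_556, "engagement": 10_300,
             "other": 9_889, "personal": 6_266}
finding_4 = {"logistical": 17_216, "engagement": 7_769, "other": 4_795,
             "access": 2_992, "personal": 1_259}


def unweighted_shares(counts: dict, expected_total: int) -> dict:
    """Check that the categories sum to the pooled n, then return percent shares."""
    total = sum(counts.values())
    if total != expected_total:
        raise ValueError(f"categories sum to {total}, expected {expected_total}")
    return {reason: round(100 * n / total, 1) for reason, n in counts.items()}


shares_3 = unweighted_shares(finding_3, expected_total=47_011)
shares_4 = unweighted_shares(finding_4, expected_total=34_031)

for label, shares in (("Finding 3", shares_3), ("Finding 4", shares_4)):
    print(label, shares)
```

Both sets of category counts sum exactly to the pooled n reported in the table, so no respondents are dropped or double-counted across categories.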

Cell-level samples (race × gender × age) are exposed inline on each finding page via the provenance chip ⓘ — hover or focus reveals n, generalized-variance-bounded SE, year range, and suppression status. Representative examples from the precomputed cell JSONs:

Finding 4 — Black female registered non-voters: n = 1,924 · White male registered non-voters: n = 11,958 · Hispanic male registered non-voters: n = 1,987 (pooled 2000–2024)
Finding 5 — Black male youth voters: n = 3,739 · White female youth voters: n = 30,622 · Asian female youth voters: n = 1,342 (pooled across all methods + years)
Finding 6 — CA 2020 youth respondents: n = 1,326 · MA 2024: n = 267 · ME 2020: n = 93 (cells below n = 75 are suppressed)

Sources for cell-level n: src/data/findings/finding-04/reason_cell.json, finding-05/method_cell.json, finding-06/policy_cross.json. The SQL that produces these is in scripts/precompute/manifest.py.
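
The suppression rule stated above (cells below n = 75 are suppressed) amounts to a simple threshold check. A minimal sketch, using the three Finding 6 cells listed above plus one hypothetical small cell; the function and variable names are assumptions, not the names used in scripts/precompute/manifest.py.

```python
SUPPRESSION_THRESHOLD = 75  # cells with n below this value are not published


def is_suppressed(n: int, threshold: int = SUPPRESSION_THRESHOLD) -> bool:
    """Return True when a cell's unweighted n falls below the threshold."""
    return n < threshold


cells = {
    ("CA", 2020): 1_326,
    ("MA", 2024): 267,
    ("ME", 2020): 93,
    ("XX", 2020): 74,  # hypothetical cell, included only to show suppression
}

published = {cell: n for cell, n in cells.items() if not is_suppressed(n)}

for (state, year), n in cells.items():
    status = "suppressed" if is_suppressed(n) else "published"
    print(f"{state} {year}: n = {n} ({status})")
```

The comparison is strict, so a cell with exactly n = 75 is published, consistent with the "below n = 75" wording.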

A note on external sample sizes

References [2], [4], and [9]–[22] point to external data products and peer-reviewed papers (McDonald VEP, MIT MEDSL, ANES, IOP, Pew, Green & Gerber, Brennan, etc.). Their sample sizes are documented in each source's own technical materials: codebooks, methodology statements, or the papers themselves. We don't restate those sample sizes here because this repo doesn't carry per-respondent microdata for those sources; we use them as aggregates or as cited findings. To verify an n, follow the link or citation for the source.

The one exception we do document: ANES race-disaggregated youth cells (n = 56–79 for 2020 Black/Hispanic youth) are noted on the methodology page because we explicitly tested ANES as a per-cell cross-validator of CPS and disclosed why it doesn't work at that disaggregation.

Data provenance and reproducibility

The full data inventory — every acquired asset, its vintage, scope, known limitations, and analytical role — is maintained internally in DATA_INVENTORY.md and is available on request. Pipeline and analysis code are similarly maintained internally; methodology decisions are fully documented on the methodology page.

For methodological choices — the CPS coding decision, the Hur-Achen adjustment, the generalized-variance approach, the group-level cross-validation work that did and didn't pan out — see the methodology page.