Imperial China Elites

Empirical assessment of claims about elite structure, keju, and state capacity. Tested against CBDB (657k persons), Wang's replication data, and prefectural census figures.

Elite dispersal

Rise & Fall — Wang

Rise & Fall — Huang

Where the elite came from

Jiguan (registered hometown) of top officials and their patrilineal kinsmen, from CBDB. Each dot is one county-level address; size = persons at that location.

Map panels: Tang, N. Song, S. Song, Ming.

The question: To produce 80% of top officials and their families, how much of the empire's population do you need? A low number means a narrow geographic enclave dominates. A high number means officials are drawn broadly from where people live.

How broad was the elite's geographic base?

What share of the empire's population lived in the prefectures that produced 80% of top officials' families? Higher = officials drawn from a wider slice of the population. Lower = a narrow enclave dominates.

Steady de-imperialization from Tang through Ming. The share of the population needed to produce 80% of the elite rises from 12.5% (Late Tang, Tackett's figure) through 22.7% (N. Song) and 29.3% (S. Song) to 43.9% (Ming). By Ming, top officials are drawn from over two-fifths of the empire's population, spread across 39 prefectures. No single prefecture accounts for more than 6.8% of elite kinsmen. The overrepresentation ratio falls from 3.5x (N. Song) to 1.8x (Ming): the elite increasingly looks like the population.

Southeastern exam-culture regions are still overrepresented. All top 10 Ming prefectures are in Zhejiang, Jiangxi, Fujian, and the Lower Yangzi, although the gap is narrowing, not widening. This persistent overrepresentation helps explain the Ming 55% southern quota (Huang p.65): even with progressive dispersal, keju amplified regional advantages enough that the court had to impose external constraints to maintain geographic balance.

Top 10 prefectures: elite share vs. population share

Sorted by elite share. Overrepresentation = elite share / population share. 1.0x = proportional.

Table panels: N. Song, S. Song, Ming.
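The overrepresentation ratio defined above can be sketched in a few lines. This is a minimal illustration, not the code behind the tables; the function name and the example counts are hypothetical.

```python
def overrepresentation(elite_count, elite_total, pop, pop_total):
    """Elite share divided by population share; 1.0 means proportional."""
    elite_share = elite_count / elite_total
    pop_share = pop / pop_total
    return elite_share / pop_share

# Hypothetical prefecture (not real data): 6% of elite kinsmen, 3% of the population.
ratio = overrepresentation(elite_count=60, elite_total=1000,
                           pop=30_000, pop_total=1_000_000)
print(f"{ratio:.1f}x")
```

A ratio above 1.0x means the prefecture supplies more of the elite than its population share would predict.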

Comparison with Tackett and why the numbers diverge

Period	Our method	Tackett	Officials	Kinsmen
Late Tang	—	12.5%	Tackett's data only
N. Song	22.7%	12.9%	452	1,890
S. Song	29.3%	56.7%	184	506
Ming	43.9%	—	1,821	4,761

The two methods agree that de-imperialization occurred from Tang through Song, but produce different absolute numbers. This reflects real methodological differences, not errors.

Tackett's method

Location data: Burial sites of kinsmen, identified from epitaphs. Measures where elite families physically lived and died.

Metropole definition: Hand-drawn contiguous regional polygons around visible clusters. Includes all prefectures within the polygon boundary, even if a given prefecture produced no officials.

Denominator: 742 Tianbao census (Tang), 1080 census (Song). Real population counts.

Strengths: Burial sites capture actual physical presence. Deep source expertise for Tang (epitaphs are primary material). Contiguous polygons reflect real geographic regions.

Weaknesses: Small samples. Burial sites lag residence by generations (family graveyards). Polygons are subjective. Tang methodology (epitaph corpus) fundamentally differs from Song methodology, making cross-period comparison less clean. Not extensible to Ming/Qing.

Our method

Location data: Jiguan (registered ancestral hometown) from CBDB. Measures where elite families were administratively registered.

Metropole definition: Rank prefectures by elite kinsmen count, accumulate from most to least concentrated until 80% captured. Only counts prefectures that actually produced kinsmen.

Denominator: Song Shi census (Song), Cao Shuji 1393 census (Ming). Real population counts.

Strengths: Consistent methodology across all periods. Same data source (CBDB) throughout. Large samples, especially Ming (4,761 kinsmen). Fully reproducible. Extensible to any CBDB-covered period.

Weaknesses: Jiguan ≠ actual residence. CBDB coverage varies by period. Nearest-prefecture assignment is crude. Tang jiguan is often geocoded to ancestral choronyms rather than actual hometowns (Tackett, JCH 2024).
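The metropole definition above (rank, accumulate to 80%, then sum population) can be sketched as follows. This is an illustrative reading of the procedure, assuming ranking by raw kinsmen count and a denominator that includes every census prefecture; the function name and the toy figures are hypothetical and not drawn from CBDB.

```python
def population_share_for_elite(kinsmen, population, threshold=0.80):
    """Share of total population living in the prefectures that,
    ranked by elite kinsmen count, together reach `threshold` of all kinsmen.

    kinsmen:    dict prefecture -> elite kinsmen count
    population: dict prefecture -> census population (all prefectures)
    """
    total_kin = sum(kinsmen.values())
    total_pop = sum(population.values())
    # Only prefectures that actually produced kinsmen are eligible.
    ranked = sorted((p for p, n in kinsmen.items() if n > 0),
                    key=lambda p: kinsmen[p], reverse=True)
    core, captured = [], 0
    for p in ranked:
        if captured / total_kin >= threshold:
            break
        core.append(p)
        captured += kinsmen[p]
    core_pop = sum(population[p] for p in core)
    return core_pop / total_pop, core

# Toy data (hypothetical): prefecture E produced no kinsmen but still counts
# toward the empire-wide population denominator.
toy_kinsmen = {"A": 50, "B": 35, "C": 10, "D": 5}
toy_population = {"A": 100, "B": 200, "C": 300, "D": 400, "E": 1000}
share, core = population_share_for_elite(toy_kinsmen, toy_population)
print(core, f"{share:.1%}")
```

In the toy case, A and B together hold 85% of kinsmen, so the core set stops there and the reported share is their combined population over the total.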

Why the N. Song numbers diverge

Our figure (22.7%) is higher than Tackett's (12.9%). This is because jiguan is more geographically dispersed than burial location. Many N. Song officials were registered to hometowns across the empire but buried near Kaifeng where they served. Tackett's burial-site measure captures the physical concentration of the ruling class around the capital; ours captures the wider geographic origins from which they were drawn. Both are valid measures of different things.

Why the S. Song numbers diverge

Our figure (29.3%) is much lower than Tackett's (56.7%). This reflects the polygon-vs-prefecture difference. Tackett draws a contiguous polygon covering most of Jiangxi, Zhejiang, and Fujian. That polygon encompasses many prefectures, including those that produced few or no officials, and their combined population reaches 57% of the Song total. Our method only counts the specific prefectures that actually produced elite kinsmen. Additionally, our S. Song sample is small (506 kinsmen from 184 officials), reducing geographic coverage.
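The polygon-vs-prefecture effect can be shown with a toy calculation. The sketch below is a simplification under stated assumptions: it holds the region fixed and only varies whether non-producing prefectures inside it are counted. All names and numbers are hypothetical.

```python
def share_polygon(region, population):
    """Polygon-style: every prefecture inside the region counts."""
    total = sum(population.values())
    return sum(population[p] for p in region) / total

def share_producers_only(region, kinsmen, population):
    """Prefecture-style: only prefectures with at least one elite kinsman count."""
    total = sum(population.values())
    return sum(population[p] for p in region if kinsmen.get(p, 0) > 0) / total

# Toy data (hypothetical): B lies inside the region but produced no kinsmen.
toy_pop = {"A": 100, "B": 200, "C": 300, "D": 400}
toy_kin = {"A": 40, "C": 10}
toy_region = ["A", "B", "C"]

polygon_share = share_polygon(toy_region, toy_pop)
producer_share = share_producers_only(toy_region, toy_kin, toy_pop)
print(f"polygon: {polygon_share:.0%}, producers only: {producer_share:.0%}")
```

The polygon figure is never smaller than the producers-only figure for the same region, which is the direction of the S. Song gap.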

Which is more reliable for cross-period comparison?

Our method is better suited for comparing across periods because it applies the same procedure to each one. Tackett's Tang analysis relies on a specific epitaph corpus processed through deep Sinological expertise; his Song analysis uses different sources and a different geographic frame. The two are not strictly comparable, as he acknowledges. For any single period, Tackett's granular knowledge of the sources may produce more accurate results. For tracking a trend across four centuries, consistency of method matters more than accuracy at any single point.

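The key finding is robust across both methods where they overlap: the geographic base of the elite expanded from Tang through Song. Our method shows the expansion continuing into Ming, a period Tackett's data does not cover. The two methods disagree on levels but agree on direction.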

Data sources

Period	Population data	Elite data
Late Tang	742 Tianbao census (Tackett)	Tackett epitaph corpus
N. Song	Song Shi census (DGSD v1.1, Mostern & Meeks)	CBDB top officials + kin
S. Song	Song Shi census, southern prefectures	CBDB top officials + kin
Ming	Cao Shuji 1393 census (Brill, 2024)	CBDB top officials + kin

Officials: Grand Secretaries, Grand Councilors, Six Ministry Ministers, Privy Council heads, Censors-in-chief. From CBDB office records (602k).

Kinsmen: Patrilineal relatives with geocoded jiguan, each assigned to the nearest census prefecture within a 300 km cutoff. From CBDB's 472k patrilineal ties.
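The nearest-prefecture assignment with a 300 km cutoff might look like the sketch below. It assumes great-circle (haversine) distance to a single point per prefecture, which the text does not specify; the function names and coordinates are hypothetical placeholders, not real prefecture centroids.

```python
from math import radians, sin, cos, asin, sqrt

EARTH_RADIUS_KM = 6371.0

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))

def assign_prefecture(lat, lon, prefectures, max_km=300.0):
    """Return the nearest prefecture name, or None if none lies within max_km.

    prefectures: dict name -> (lat, lon)
    """
    best_name, best_dist = None, float("inf")
    for name, (plat, plon) in prefectures.items():
        d = haversine_km(lat, lon, plat, plon)
        if d < best_dist:
            best_name, best_dist = name, d
    return best_name if best_dist <= max_km else None

# Placeholder coordinates for illustration only.
toy_prefectures = {"P1": (30.0, 120.0), "P2": (27.0, 115.0)}
near = assign_prefecture(30.2, 120.1, toy_prefectures)
far = assign_prefecture(40.0, 100.0, toy_prefectures)
print(near, far)
```

Addresses that fall outside the cutoff return None and would drop out of the kinsmen count, which is one way coverage can vary by period.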

References: Tackett, "Imperial Elites, Bureaucracy, and the Transformation of the Geography of Power" (De Gruyter). Mostern & Meeks, Digital Gazetteer of Song Dynasty China v1.1 (2022). Cao Shuji, The Population History of China 1368-1953 (Brill, 2024), Appendix 1.

Caveat: jiguan lag and Hongwu localization

Jiguan (registered ancestral hometown) can persist for generations after a family relocates. A family registered to Ji'an might have been living near the capital for two generations by the time a grandson enters office. If common, this overstates dispersal: the elite looks geographically broad because we're mapping ancestral registrations, not where officials were actually recruited from.

This bias likely varies across periods. For Tang, Tackett (JCH 2024) showed it is severe: the Li imperial clan is geocoded to Longxi (Gansu) when they lived in Chang'an. For Song, jiguan should be somewhat more accurate as families increasingly identified with actual residence. For Ming, the Hongwu Yellow Register system imposed strict household registration with decennial updates and legal restrictions on geographic mobility. Jiguan should be most accurate here.

This creates a second confound. Hongwu's lijia system actively prevented geographic mobility: people stayed where they were registered because they were legally required to. So the Ming result (43.9%) may partly reflect enforced population immobility rather than genuine broadening of elite recruitment. The exam system reached into localized communities because communities were frozen in place. Once those controls weakened in late Ming and Qing, exam-culture families may have naturally agglomerated in centers like Jiangnan, re-concentrating the elite. Part of the Tang-to-Ming dispersal trend could also reflect improving registration fidelity rather than real structural change.

Wang, The Rise and Fall of Imperial China (Princeton, 2022). Replication data available. Tested against CBDB.

Claims tested, grouped as Supported or Weak.

CBDB verification

Wang's Song sample (1,789 officials) is 99.7% from CBDB. His Tang data (134 patrilines) is from Tackett. Marriage tie density varies roughly 38x across periods (Song 214 vs Ming 5.6 per 1k persons), which undermines any direct cross-dynasty comparison of network structure. Source types also shift: Tang/Song ties come from epitaphs, Ming/Qing ties from genealogies.
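The size of the density gap can be checked directly from the two figures quoted above; it comes out a little under 40. The helper name is hypothetical and only shows how a per-1k density would be derived from raw counts.

```python
def ties_per_thousand(ties, persons):
    """Marriage ties per 1,000 persons."""
    return 1000 * ties / persons

# Densities quoted in the text (marriage ties per 1k persons).
song_density = 214.0
ming_density = 5.6

density_ratio = song_density / ming_density
print(f"Song/Ming tie density ratio: {density_ratio:.1f}x")
```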

Verdict

The network topology argument (star → bowtie → ring) requires comparable cross-dynasty data that does not exist. Wang's strong evidence (exam recruitment shift, fiscal decline) was established by Tackett, Hymes, and Hartwell before his book. Credit to Wang for making his data public.

Huang, The Rise and Fall of the EAST (Yale, 2023). No replication data published.

Replication crisis

None of Huang's four core empirical claims has publicly available replication data. His premier database (2,225), exam candidate dataset (~9,500), Chinese Development Index (3,988 achievements), and ideological diversity index are all proprietary. Potential partial remedy: Wang, Erik H. & Yang, Clair (2025), Political Economy of China's Imperial Examination System, Cambridge Elements (open access).

Claims tested, grouped as No data or Partial.

What CBDB can test indirectly

Test	Finding	Implication
Law of avoidance	Officials served far from home (0.94-0.97)	State maintained separation regardless of keju
Intergenerational	Sons of exam fathers: 75-83% take exams	Keju reproduced elites, didn't disrupt them
Geographic mobility	Exam and yin entrants: identical distances	Keju didn't create geographic dispersion
Exam share	Song 96%, Yuan 42%, Qing 54%	Keju was never the singular channel

Verdict

Huang's arguments about keju as state-capacity instrument may be correct. We cannot know, because the empirical foundations are sealed behind proprietary datasets. Wang (Princeton) published his replication data. Huang (Yale) did not.