The World Wildlife Fund's terrestrial ecoregion dataset classifies the Earth's land surface into 867 distinct ecoregions grouped by biome type. The Eurasian steppe falls under the "Temperate Grasslands, Savannas, and Shrublands" biome (biome code 8), which encompasses ecoregions such as the Pontic Steppe, Kazakh Steppe, and Mongolian-Manchurian Grassland, along with several intermediate Central Asian ecoregions. These polygons provide a standardized, peer-reviewed ecological boundary for the grassland corridor stretching from the Hungarian Plain to Manchuria. The dataset is a common source in quantitative historical research because its boundaries are georeferenced, reproducible, and available as open-access shapefiles through the WWF/RESOLVE dataset (Olson et al. 2001; Dinerstein et al. 2017). However, biome 8 defines the steppe too narrowly for this application: it excludes the semi-arid margins of the Kazakh steppe, the Gobi margins, and the Karakum/Kyzylkum zones, all of which were integral to the nomadic corridor.
Peter Turchin's 2009 "A Theory for Formation of Large Empires" operationalizes steppe proximity as a key variable driving imperial formation along "meta-ethnic frontiers" where agrarian and nomadic civilizations clashed. Turchin constructs distance-to-steppe measures from a combination of ecological data and historical atlases of nomadic polity extents, calibrated against the record of steppe confederacies (Xiongnu, Türk, Mongol). His approach is more historically dynamic than the WWF classification: the effective "steppe frontier" shifts over time as pastoral groups expand or contract. This framework is well suited to time-varying analyses of state formation, but it requires subjective judgment about which historical periods define the frontier. Ko, Koyama, and Sng (2018) cite Turchin's work and his finding that large empires cluster along these frontiers.
Our polygon combines two sources. For the western and Central Asian sections (27–82°E), we derive the steppe boundary from the replication dataset published by Currie, Turchin, Turner, and Gavrilets (2020, Harvard Dataverse doi:10.7910/DVN/8TP2S7). Their gridded dataset records each cell's distance from the steppe under both "maximum" and "minimum" extent definitions; we take cells with maxSteppe_distance=0 as marking the steppe boundary. Because the "maximum" extent in this dataset includes the full Sahara-to-Gobi arid belt (dipping to 30°N in the Levant), we filter to the pastoral steppe corridor relevant to Ko, Koyama, and Sng's argument, excluding Mesopotamian and Arabian desert cells. For the eastern section (82–126°E), the Currie et al. grid contains no cells because the Mongolian steppe interior has no agricultural land. Here we hand-trace the boundary from Ko, Koyama, and Sng (2018) Figure 2, incorporating the Ordos Loop dip toward Xi'an and the Manchurian grassland terminus. We smooth the boundary points across two regions where the data are sparse: the Caucasus/Caspian corridor (33–49°E) and the Kazakh interior (50–64°E).
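The two-source construction described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used to build the polygon: only the field name maxSteppe_distance comes from the dataset description, while the "lon" and "lat" keys, the latitude floor standing in for the desert exclusion, and all coordinates in the example are hypothetical placeholders.

```python
# Sketch of the two-source boundary assembly. Assumptions (not from the source):
# cells are dicts with "lon"/"lat" keys, and the desert exclusion is
# approximated by a simple latitude floor. Sample coordinates are invented.

WEST_LON = (27.0, 82.0)    # western and Central Asian section
EAST_LON = (82.0, 126.0)   # hand-traced eastern section
MIN_LAT = 35.0             # hypothetical stand-in for the desert-cell exclusion


def western_boundary(cells, min_lat=MIN_LAT):
    """Keep zero-distance cells inside 27-82E and north of the latitude floor."""
    keep = [
        (c["lon"], c["lat"])
        for c in cells
        if c["maxSteppe_distance"] == 0
        and WEST_LON[0] <= c["lon"] <= WEST_LON[1]
        and c["lat"] >= min_lat
    ]
    return sorted(keep)  # order west to east


def combine(west_points, east_traced):
    """Append hand-traced eastern points (82-126E) to the western section."""
    east = sorted(p for p in east_traced if EAST_LON[0] < p[0] <= EAST_LON[1])
    return west_points + east


# Invented sample data, for illustration only.
cells = [
    {"lon": 30.0, "lat": 46.0, "maxSteppe_distance": 0},    # kept
    {"lon": 44.0, "lat": 32.0, "maxSteppe_distance": 0},    # dropped: below floor
    {"lon": 60.0, "lat": 48.0, "maxSteppe_distance": 150},  # dropped: off steppe
    {"lon": 70.0, "lat": 45.0, "maxSteppe_distance": 0},    # kept
    {"lon": 20.0, "lat": 47.0, "maxSteppe_distance": 0},    # dropped: west of 27E
]
east_traced = [(110.0, 38.0), (95.0, 44.0), (124.0, 45.0)]

boundary = combine(western_boundary(cells), east_traced)
print(boundary)
```

A single latitude floor is cruder than the cell-level exclusion described in the text; in practice the desert cells would more likely be removed with an explicit mask or polygon, and the smoothing of the sparse regions would be a separate step applied to the combined points.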