Causal trees

Sociologists routinely partition their samples into subgroups to explore how the effects of particular events or interventions (treatments) vary by covariates such as race and gender. Scholars interested in causal inference also explore how effects vary by selection into treatment. In both cases, the researcher determines the key subpopulations in advance, based on theoretical priors. Machine-learning techniques, by contrast, allow researchers to explore sources of variation they may not have previously considered or envisaged: recursive partitioning uncovers treatment effect heterogeneity in a data-driven way. In this paper, Brand, Geraldo, Jiahui Xu (Pennsylvania State University), and Bernard Koch (UCLA) analyze an important topic in the stratification literature, the effects of higher education on unemployment and low-wage work, for which theory offers well-defined guidance about the effect heterogeneity of interest. They compare what they learn from conventional interaction and propensity methods with what they learn from machine-learning methods. They encourage researchers to follow similar practices in their own work on variation in sociological effects, and they offer simple yet powerful tools for doing so.
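To make the idea of data-driven partitioning concrete, the following is a minimal sketch of a single split of the kind that recursive partitioning repeats within each resulting subgroup. It is not the authors' implementation. It assumes a randomized binary treatment, one numeric covariate, and a plain difference in means as the effect estimate, and it omits the sample splitting and regularization that causal tree methods use in practice. The function names `effect` and `best_split` and the toy data are hypothetical.

```python
from statistics import mean


def effect(rows):
    """Difference in mean outcome between treated and control rows.

    Returns None when either group is empty (effect not estimable).
    """
    treated = [y for _, t, y in rows if t == 1]
    control = [y for _, t, y in rows if t == 0]
    if not treated or not control:
        return None
    return mean(treated) - mean(control)


def best_split(rows):
    """Find the covariate threshold whose two subgroups differ most in effect.

    Each row is (x, t, y): covariate, treatment indicator, outcome.
    Returns (threshold, left_effect, right_effect), or None if no split
    leaves both subgroups with treated and control observations.
    """
    best = None
    for threshold in sorted({x for x, _, _ in rows}):
        left = [r for r in rows if r[0] <= threshold]
        right = [r for r in rows if r[0] > threshold]
        tau_left, tau_right = effect(left), effect(right)
        if tau_left is None or tau_right is None:
            continue
        gap = abs(tau_left - tau_right)
        if best is None or gap > best[0]:
            best = (gap, threshold, tau_left, tau_right)
    return None if best is None else best[1:]


# Hypothetical toy data: treatment lowers y by 4 when x <= 2, by 1 otherwise.
rows = [
    (1, 0, 10.0), (1, 1, 6.0),
    (2, 0, 10.0), (2, 1, 6.0),
    (3, 0, 10.0), (3, 1, 9.0),
    (4, 0, 10.0), (4, 1, 9.0),
]
threshold, tau_left, tau_right = best_split(rows)
print(threshold, tau_left, tau_right)  # prints: 2 -4.0 -1.0
```

On this toy data the search recovers the threshold at which the effect changes, without the subgroup being specified in advance. A full tree would apply `best_split` again to each subgroup until a stopping rule is met.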