Joint work with Aslan Bakirov and Francesco Del Prato. Under review.

Abstract. How much do worker skills, firm pay policies, and their interaction contribute to wage inequality? Standard approaches rely on latent fixed effects identified through worker mobility, but sparse networks inflate variance estimates, additivity assumptions rule out complementarities, and the resulting decompositions lack interpretability. We propose TWICE (Tree-based Wage Inference with Clustering and Estimation), a framework that models the conditional wage function directly from observables using gradient-boosted trees, replacing latent effects with interpretable, observable-anchored partitions. This approach gives up the ability to capture idiosyncratic unobservables in exchange for robustness to sampling noise and out-of-sample portability. Applied to Portuguese administrative data, TWICE outperforms linear benchmarks out of sample and reveals that sorting and non-additive interactions explain substantially more wage dispersion than standard AKM estimates imply.
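For readers unfamiliar with the additive benchmark the abstract refers to, the following is a minimal sketch of the variance decomposition implied by an additive log-wage model (worker component plus firm component plus noise). It is not the TWICE procedure and uses no real data: the simulated components, their parameters, and all variable names are assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Hypothetical simulated components (illustration only, not the paper's data).
worker = rng.normal(0.0, 1.0, n)                # worker component
firm = 0.5 * worker + rng.normal(0.0, 0.5, n)   # firm component with positive sorting
noise = rng.normal(0.0, 0.3, n)                 # residual
log_wage = worker + firm + noise                # additive wage equation


def cov(a, b):
    """Population-style covariance (divides by n)."""
    return float(np.mean((a - a.mean()) * (b - b.mean())))


# Var(w) = Var(worker) + Var(firm) + 2 Cov(worker, firm) + residual terms.
terms = {
    "var_worker": cov(worker, worker),
    "var_firm": cov(firm, firm),
    "sorting": 2.0 * cov(worker, firm),
    "var_noise": cov(noise, noise),
    "noise_cov": 2.0 * cov(worker, noise) + 2.0 * cov(firm, noise),
}
total = cov(log_wage, log_wage)
shares = {name: value / total for name, value in terms.items()}

for name, share in shares.items():
    print(f"{name:>10s}: {share: .3f}")
```

The terms add up to the total variance by construction. In this additive setup there is no room for a non-additive interaction term, which is the restriction the abstract says TWICE relaxes.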

Joint work with Vít Illichmann. Previously circulated as Convolutional Peer Effects. Python package available here.

Note: this paper is a prototype, and work is in progress on a more general framework. It is mostly intended to showcase the computational approach.

Abstract. We study structural estimation on networks in the empirically common case of a single large observed graph. We propose an adversarial estimator that minimizes statistical distance between observed and simulated node-specific distributions of local network neighborhoods. The paper provides two theoretical results: population identification via a divergence characterization of the estimation objective, and consistency under growing-graph asymptotics with cross-observation dependence. A key contribution is computational. We provide a reproducible estimation workflow that integrates fixed-point simulation, efficient focal-neighborhood data construction, and alternating minimax training with stabilization tools suitable for large-scale runs. The workflow is model-agnostic in a broad class of network structural models and is straightforward to implement with modern software. In benchmark simulations, the procedure scales to large graphs and recovers structural parameters with high precision.
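The idea of choosing parameters so that a discriminator cannot tell observed from simulated data can be sketched on a toy problem. The example below is a heavily simplified illustration under stated assumptions, not the paper's estimator or its Python package: it uses a scalar location model instead of network neighborhoods, a logistic discriminator on hand-picked features, and a nested grid search instead of alternating minimax training. All names (`theta_true`, `features`, `discriminator_value`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000
theta_true = 1.0                          # hypothetical parameter to recover
x_obs = theta_true + rng.normal(size=n)   # stands in for the observed data
z = rng.normal(size=n)                    # simulation draws, held fixed across theta


def features(x):
    """Discriminator inputs: intercept, x, and x squared (an assumption)."""
    return np.column_stack([np.ones_like(x), x, x**2])


def discriminator_value(x_real, x_fake, steps=25):
    """Inner maximization: fit a logistic discriminator by Newton steps and
    return mean log D(real) + mean log(1 - D(fake))."""
    X = features(np.concatenate([x_real, x_fake]))
    y = np.concatenate([np.ones(len(x_real)), np.zeros(len(x_fake))])
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-X @ w))
        grad = X.T @ (y - p)
        hess = (X * (p * (1.0 - p))[:, None]).T @ X + 1e-8 * np.eye(X.shape[1])
        w += np.linalg.solve(hess, grad)
    p = np.clip(1.0 / (1.0 + np.exp(-X @ w)), 1e-12, 1.0 - 1e-12)
    n_real = len(x_real)
    return np.mean(np.log(p[:n_real])) + np.mean(np.log(1.0 - p[n_real:]))


# Outer minimization: pick the theta whose simulated data the discriminator
# separates least well from the observed data.
grid = np.linspace(0.0, 2.0, 41)
values = np.array([discriminator_value(x_obs, theta + z) for theta in grid])
theta_hat = float(grid[np.argmin(values)])
print("estimated theta:", theta_hat)
```

Holding the simulation draws `z` fixed across candidate values keeps the outer objective smooth in `theta`. A full implementation along the lines described in the abstract would replace the grid search with alternating gradient updates and the scalar simulator with a fixed-point network simulation.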