What a Wage Hides When Insurance Comes With the Job
Three Series, One Object
We have arrived at the synthesis. The labor-economics series ended in Search and Matching 5 by re-reading the AKM firm effect \(\psi_j\) as a composite of three forces: productivity rent, posted-wage rent, and amenity offset. The health-insurance series ended in Health Insurance 5 by identifying the central institutional fact of the U.S. system: coverage is bundled with employment, and the bundle creates a tax-subsidized, partially portable, employer-mediated risk pool. The job-lock bridge has shown how that institutional fact distorts the outside option, the acceptance set, and the mobility margin.
This last note connects the three. The claim is that the institution we developed in the health series, ESI and its associated tax and portability structure, is not merely one more amenity in the compensating-differentials decomposition of \(\psi_j\). It is a quantitatively important amenity that operates through a mechanism the standard sorting models do not contain: a retention rent generated by the non-portability of the firm pool.
Before we develop the argument, fix the conclusion as a single picture.

The figure decomposes the firm wage premium \(\psi_j\) into four components for three illustrative firm archetypes. The first three components, productivity rent (navy), posted-wage rent (slate), and amenity offset (green), are the standard decomposition from Search and Matching 5. The fourth, in orange, is the insurance retention rent, the piece of the firm premium that exists because the firm controls a non-portable risk pool. The retention rent is zero at the low-coverage firm, a modest +0.04 log points at the mid-coverage firm, and +0.09 log points at the high-coverage firm. The net \(\psi_j\), marked by the diamond at the right of each bar, is the same observed number an AKM regression would deliver: +0.11, +0.15, and +0.18 respectively. Two firms with similar net \(\psi_j\) can have completely different compositions; that observation will do most of the work when we split the wage into its job and coverage pieces.
The remainder of the note develops the analytics behind the figure, and asks three concrete questions that the empirical literature is poorly equipped to answer with its current tools.
What We Knew Before This Note
In Search and Matching 5 we wrote the AKM firm effect, in any structural model with productivity, rent, and amenities, as \[
\psi_j \;=\; \underbrace{\delta\, y_j}_{\text{productivity rent}} \;+\; \underbrace{r_j}_{\text{posted-wage rent}} \;-\; \underbrace{\beta\, v(a_j)}_{\text{amenity offset}} \;+\; \varepsilon_j .
\] The amenity offset term collected, into a single composite \(\beta\,v(a_j)\), every non-wage characteristic of the firm that workers value. Health insurance was an amenity. So was schedule flexibility. So was prestige. The model did not distinguish among them.
For most amenities, that pooling is harmless. Schedule flexibility and prestige are properties of the firm-job pair that the worker either gets or does not, and the only thing the worker can do in response is take or refuse the offer. Whether one calls these amenities valuable for consumption reasons or production-of-utility reasons, the comparative statics of the standard hedonic model apply directly.
ESI is different in two ways the standard amenity treatment cannot capture.
First, ESI is a risk-pooling institution, not just a consumption good. Its value to the worker depends on the size and composition of the firm pool, on the worker’s own health status, and on the alternative coverage options available outside employment. A worker’s \(v(a_j)\) from ESI varies by the health-risk type \(h\) in a way that schedule flexibility does not. This is what we built into the model of Job Lock 1, by letting the wedge \(w_{\text{lock}}(h)\) depend on \(h\).
Second, ESI is not fully portable. The value the worker derives from the amenity at firm \(j\) is partially destroyed by leaving firm \(j\). Schedule flexibility is portable in the sense that another firm can offer the same flexibility. ESI is not, because the risk pool itself does not move with the worker. The wedge between \(v(a_j)\) at firm \(j\) and the worker’s best alternative value of coverage outside firm \(j\) is a structural feature of ESI that other amenities do not share.
A simple way to see why this matters: imagine two firms, both of which let employees take Fridays off. A worker who quits one to go to the other still has Fridays off. Now imagine two firms, both of which offer a generous health plan. A worker who quits one to go to the other has to switch plans, possibly with a coverage gap, possibly with a new deductible reset, and possibly losing access to her current doctors if the new plan’s network is different. The portable amenity transfers cleanly across firms. The non-portable amenity does not.
The consequence is that \(\beta\, v(a_j)\), with a single offset rate \(\beta\) for all amenities, is misspecified. ESI’s offset operates through a different channel than other amenities, and conflating the two obscures both effects. We must decompose \(v(a_j)\) further.
Insurance Retention Rent
Let \(a_j\) have two components: \(a_j^{\text{other}}\) for portable amenities, and \(a_j^{\text{ins}}\) for the ESI component. Write the worker’s valuation of \(a_j^{\text{ins}}\) at firm \(j\), for a worker of health type \(h\), as \[ v_{\text{ins}}(a_j, h) \;=\; v^{\text{port}}(a_j) \,+\, w_{\text{lock}}(h), \] where \(v^{\text{port}}(a_j)\) is the part of ESI value that would be available even if the worker quit (subsidized exchanges, COBRA continuation, spousal coverage transitions), and \(w_{\text{lock}}(h)\) is the non-portable component, the value the worker would lose by separating, scaled by the worker’s health type.
This decomposition turns out to matter for \(\psi_j\). The Rosen full-offset prediction \(\psi_j = -v(a_j) + \text{const}\) from Search and Matching 5 was derived under frictionless mobility. Under frictions plus non-portability, the offset arithmetic is more complicated: a firm that provides generous ESI faces two competing pressures on its wage-setting.
- Compensating-differential pressure drives \(\psi_j\) down by the portable amenity value \(v^{\text{port}}(a_j)\), by the standard hedonic logic.
- Retention pressure keeps \(\psi_j\) above what the pure CD logic would predict. Because the firm’s high-\(h\) workers face a non-portable wedge \(w_{\text{lock}}(h)\), the firm has latitude to pay them less than their marginal product without losing them. In any model in which firms set wages and workers face mobility frictions, some part of this latitude is kept by the firm and some is passed back to workers. The share that is passed back appears as an upward shift in the firm’s wage policy relative to firms offering only portable amenities, and so constitutes a positive component in \(\psi_j\).
The augmented decomposition becomes \[ \psi_j \;=\; \delta\, y_j \,+\, r_j \,-\, \beta\, v^{\text{port}}(a_j) \,+\, \rho\, w^{\text{lock}}_j \,+\, \varepsilon_j , \] where \(w^{\text{lock}}_j\) is the firm-level pool-weighted lock wedge and \(\rho \in [0, 1]\) is the share of retention rent that the firm passes back to workers.
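A minimal numerical sketch may help fix the bookkeeping of the augmented decomposition. It assumes, hypothetically, that the pool-weighted wedge \(w^{\text{lock}}_j\) is the share-weighted average of \(w_{\text{lock}}(h)\) over the health types in the firm's pool; the note does not pin down the weighting. The function names and every parameter value below are illustrative assumptions, not estimates.

```python
def pool_weighted_wedge(pool_shares, wedge_by_type):
    """Firm-level lock wedge as a share-weighted average of w_lock(h).

    ASSUMPTION: this weighting scheme is hypothetical; the text only says
    the firm-level wedge is "pool-weighted".
    """
    total = sum(pool_shares.values())
    if total <= 0:
        raise ValueError("pool shares must sum to a positive number")
    weighted = sum(share * wedge_by_type[h] for h, share in pool_shares.items())
    return weighted / total


def firm_premium(y_j, r_j, v_port, w_lock_j, delta, beta, rho):
    """psi_j = delta*y_j + r_j - beta*v_port + rho*w_lock_j (residual set to zero)."""
    if not 0.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [0, 1]")
    return delta * y_j + r_j - beta * v_port + rho * w_lock_j


# Made-up inputs for illustration only.
wedge_by_type = {"low": 0.00, "high": 0.30}   # w_lock(h) by health type
pool = {"low": 0.6, "high": 0.4}              # composition of firm j's pool

w_lock_j = pool_weighted_wedge(pool, wedge_by_type)
psi_j = firm_premium(y_j=1.0, r_j=0.05, v_port=0.10,
                     w_lock_j=w_lock_j, delta=0.10, beta=0.5, rho=0.5)
psi_no_passback = firm_premium(y_j=1.0, r_j=0.05, v_port=0.10,
                               w_lock_j=w_lock_j, delta=0.10, beta=0.5, rho=0.0)

print(f"w_lock_j = {w_lock_j:.3f}")
print(f"psi_j with rho=0.5: {psi_j:.3f}; with rho=0: {psi_no_passback:.3f}")
```

Comparing the two calls isolates the term \(\rho\, w^{\text{lock}}_j\): with \(\rho = 0\) the firm keeps the whole retention rent and the wage premium reverts to the three-component decomposition.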
The new term \(\rho\, w^{\text{lock}}_j\) has the same sign as the productivity-rent and posted-wage-rent terms, but a different microfoundation. It is a wage premium caused by the firm’s ability to retain workers through a non-portable amenity, not by its productivity or its position on the wage-posting distribution. It is the part of \(\psi_j\) that is created by ESI itself.
The intuition is simple once stated. A firm that offers a great health plan retains its sick employees more cheaply than a firm without it. The firm therefore does not have to pay those workers their full marginal product to keep them. Some of that surplus the firm keeps for itself (as profit). Some of it flows back into wages, because in a frictional labor market firms compete on the whole package, and a firm with a generous plan can still attract sicker workers at a slightly higher wage than a firm without one. The net effect on \(\psi_j\) is positive but partial. Part of the wedge becomes profit, part becomes a wage premium that is not productivity, not posted-wage rent, and not an offset.
This is the third reading of \(\psi_j\) we had begun to develop in Search and Matching 5 but could not give the right name to. The third reading is insurance retention rent, and it exists only because the institution we built up in Health Insurance 5 has the specific non-portability feature it has.
Splitting the Wage: How Much Is the Job, How Much Is the Coverage?
The retention-rent term has a clean implication for how individual wages should be decomposed. Consider a worker \(i\) employed at firm \(j\) at time \(t\). Their log wage \(y_{it}\) (not to be confused with firm productivity \(y_j\)) has the AKM decomposition \[ y_{it} \;=\; \theta_i \,+\, \psi_{J(i,t)} \,+\, \varepsilon_{it} . \] Setting the firm-level residual \(\varepsilon_j\) aside, the retention-rent decomposition above lets us split \(\psi_{J(i,t)}\), for \(j = J(i,t)\), into a match-quality piece and a coverage premium: \[ y_{it} \;=\; \theta_i \,+\, \underbrace{\Big[\delta\, y_j + r_j - \beta\, v^{\text{port}}(a_j)\Big]}_{\psi_j^{\text{match}}} \,+\, \underbrace{\rho\, w^{\text{lock}}_j}_{\psi_j^{\text{coverage}}} \,+\, \varepsilon_{it} . \] Two corollaries follow.
Variance decompositions are biased. Standard inequality decompositions that read \(\mathrm{Var}(\psi_j)\) as a “firm-driven” component of wage inequality conflate \(\mathrm{Var}(\psi_j^{\text{match}})\), the part driven by productivity and rents in the no-ESI world, with \(\mathrm{Var}(\psi_j^{\text{coverage}})\), the part driven by the cross-firm distribution of ESI generosity and pool composition. The two have different welfare interpretations: the first is consistent with productivity-based heterogeneity, while the second is in part a transfer from workers (in the form of retained labor) to firms (in the form of below-marginal-product wages) that happens because of the institutional structure of U.S. health insurance.
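To make the conflation explicit: since \(\psi_j = \psi_j^{\text{match}} + \psi_j^{\text{coverage}}\) (ignoring the firm-level residual), the usual variance identity applies. This is an accounting step, not an estimate: \[ \mathrm{Var}(\psi_j) \;=\; \mathrm{Var}\big(\psi_j^{\text{match}}\big) \,+\, \mathrm{Var}\big(\psi_j^{\text{coverage}}\big) \,+\, 2\,\mathrm{Cov}\big(\psi_j^{\text{match}},\, \psi_j^{\text{coverage}}\big) . \] Reading all of \(\mathrm{Var}(\psi_j)\) as productivity-driven attributes the second and third terms to the first. The sign of the covariance term is not pinned down by the framework: it is positive if productive firms also run generous, sticky pools, and negative if generous coverage substitutes for productivity in attracting workers.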
Sorting estimates are biased. Estimates of \(\mathrm{Cov}(\theta_i, \psi_j)\), taken as evidence of assortative matching, conflate sorting on productivity with sorting on coverage. High-ability workers may sort into high-\(\psi\) firms not only because those firms are productive, but because those firms offer better ESI, and because high-ability workers are also more likely to have the resources to negotiate or shop on coverage. The two channels are observationally similar in standard data, and the magnitude of the contamination depends on how much of the cross-firm variation in \(\psi_j\) is driven by \(\psi_j^{\text{coverage}}\).
For a concrete reading, consider two firms that both have estimated \(\psi_j \approx 0.18\) in a standard AKM regression. Firm A is a tech firm with high productivity and a barebones health plan. Firm B is a hospital system with average productivity but an extremely generous health plan. AKM does not distinguish them. The decomposition above does: firm A’s premium is dominated by \(\delta\, y_j\), firm B’s by \(\rho\, w^{\text{lock}}_j\). A policy reform that makes coverage portable would leave A’s premium roughly unchanged but compress B’s. Treating the two firms as interchangeable, as the unadorned AKM coefficient implicitly does, will misread the effect of any portability reform.
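The two-firm comparison can be sketched numerically. Only the common total of 0.18 comes from the text; the split of each firm's premium into match and coverage pieces, and the rule that a reform scales the coverage premium down proportionally while leaving the match piece untouched, are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Firm:
    name: str
    psi_match: float     # delta*y_j + r_j - beta*v_port(a_j)
    psi_coverage: float  # rho * w_lock_j

    @property
    def psi(self) -> float:
        """The single number an AKM regression would report."""
        return self.psi_match + self.psi_coverage


def portability_shock(firm: Firm, portable_share: float) -> Firm:
    """Firm after a reform making `portable_share` of the lock wedge portable.

    ASSUMPTION: the coverage premium shrinks proportionally and the match
    component is unaffected; a fuller model would also let v_port respond.
    """
    if not 0.0 <= portable_share <= 1.0:
        raise ValueError("portable_share must lie in [0, 1]")
    return Firm(firm.name, firm.psi_match,
                firm.psi_coverage * (1.0 - portable_share))


# Hypothetical splits; both firms have psi = 0.18 before the reform.
firm_a = Firm("A (tech, barebones plan)", psi_match=0.17, psi_coverage=0.01)
firm_b = Firm("B (hospital, generous plan)", psi_match=0.06, psi_coverage=0.12)

for firm in (firm_a, firm_b):
    after = portability_shock(firm, portable_share=0.75)
    print(f"{firm.name}: psi {firm.psi:.3f} -> {after.psi:.3f}")
```

Under these made-up splits the two firms are indistinguishable before the reform and clearly different after it, which is the sense in which the unadorned coefficient misreads the policy.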

The figure plots a simulated cross-section of firms in the \((\psi_j^{\text{match}}, \psi_j^{\text{coverage}})\) plane. Each light-blue dot is a firm. The dashed diagonal lines are iso-\(\psi_j\) contours: any two firms lying on the same line are observationally identical to an AKM regression. The thick orange line is the iso-\(\psi_j = 0.18\) contour. Firms A and B both sit on it. AKM cannot see the distance between them along the line, only the fact that both have the same total premium. The arrows trace what a portability shock does to each: A barely moves (its premium is productivity, which the shock does not touch), while B’s \(\psi_j^{\text{coverage}}\) collapses toward zero, so the firm drops off the 0.18 iso-line onto a lower-\(\psi\) contour. Any empirical strategy that reads \(\psi_j\) alone, without partitioning it, will average across firms that the policy is going to treat very differently.
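A cross-section of this kind can be simulated in a few lines. The sketch below is not the code behind the figure: the distributions, the seed, and the proportional form of the portability shock are all assumptions chosen for illustration. It reports the pieces of \(\mathrm{Var}(\psi_j)\) before and after a shock that makes the whole lock wedge portable.

```python
import random
from statistics import fmean, pvariance


def simulate_firms(n, seed=0):
    """Draw (psi_match, psi_coverage) pairs from ASSUMED distributions."""
    rng = random.Random(seed)
    firms = []
    for _ in range(n):
        match = rng.gauss(0.10, 0.05)
        coverage = max(0.0, rng.gauss(0.04, 0.03))  # floor the rent at zero
        firms.append((match, coverage))
    return firms


def pcov(xs, ys):
    """Population covariance, written out to avoid version-specific helpers."""
    mx, my = fmean(xs), fmean(ys)
    return fmean([(x - mx) * (y - my) for x, y in zip(xs, ys)])


def variance_pieces(firms, portable_share=0.0):
    """Pieces of Var(psi) after scaling coverage premia by (1 - portable_share)."""
    match = [m for m, _ in firms]
    cover = [c * (1.0 - portable_share) for _, c in firms]
    psi = [m + c for m, c in zip(match, cover)]
    return {
        "var_psi": pvariance(psi),
        "var_match": pvariance(match),
        "var_coverage": pvariance(cover),
        "two_cov": 2.0 * pcov(match, cover),
    }


firms = simulate_firms(500)
before = variance_pieces(firms)
after = variance_pieces(firms, portable_share=1.0)

print("before:", {k: round(v, 5) for k, v in before.items()})
print("after: ", {k: round(v, 5) for k, v in after.items()})
```

With full portability the coverage variance and the covariance term vanish, so whatever firm-effect dispersion remains is the match component alone. How much compression that implies depends entirely on the assumed distributions.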
Watching Where Workers Move, Once More
In Search and Matching 5 we took from Sorkin (2018) a method for separating \(\phi_j\) (total worker value of firm \(j\)) from \(\psi_j\) (cash wage premium) using poaching flows. The residual \(\phi_j - \psi_j\) was the amenity value \(v(a_j)\). With the augmented decomposition above, we can take that separation a step further.
If the labor market is partitioned by worker health type \(h\), the revealed-preference flow \(\phi_j(h)\) for high-\(h\) workers will over-weight firms that offer ESI relative to flows for low-\(h\) workers. A firm with high \(w^{\text{lock}}_j\) will see disproportionate inflow of high-\(h\) workers and disproportionate retention of them once hired, both because their alternative-coverage options are worse and because the firm captures more of the bargaining surplus from them.
The difference between \(\phi_j(h)\) for a high-\(h\) subpopulation and \(\phi_j(h)\) for a low-\(h\) subpopulation therefore identifies the health-risk-specific component of the coverage premium. This is one direction in which the labor and health series literally combine, by partitioning Sorkin’s analysis on a health-risk dimension that the canonical revealed-preference setup did not contain.
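One way to write the identification argument, as a sketch under assumptions the note does not itself state: suppose the type-specific residual is \(\phi_j(h) - \psi_j = v(a_j^{\text{other}}) + v_{\text{ins}}(a_j, h)\), with the cash premium \(\psi_j\) and the valuation of portable amenities common across health types. Using \(v_{\text{ins}}(a_j, h) = v^{\text{port}}(a_j) + w_{\text{lock}}(h)\), the difference between a high type \(h_H\) and a low type \(h_L\) is \[ \phi_j(h_H) \,-\, \phi_j(h_L) \;=\; v_{\text{ins}}(a_j, h_H) \,-\, v_{\text{ins}}(a_j, h_L) \;=\; w_{\text{lock}}(h_H) \,-\, w_{\text{lock}}(h_L) . \] The portable component \(v^{\text{port}}(a_j)\) differences out, leaving only the gap in the lock wedge. If wages or the valuation of other amenities also differ by health type, the difference picks those up as well, so the common-valuation assumption is doing real work.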
The point of this synthesis is not to claim that anyone has yet implemented this decomposition cleanly. It is to identify what the right decomposition is, and to flag that several of the standard objects in the firm-effects literature (variance shares, sorting covariances, revealed-preference rankings) change their welfare interpretation once the coverage premium is recognized as a distinct force.
Three Open Questions
The synthesis suggests three questions that the existing literature is poorly equipped to answer with the tools it has. The first two are positive, the third is normative.
Question 1. How large is \(\rho\, w^{\text{lock}}_j\) relative to \(\delta\, y_j\) and \(r_j\) in the U.S. data? The cross-firm variance of \(\psi_j\) is typically dominated, in standard AKM estimates, by what is interpreted as productivity differences. The framework above predicts that some non-trivial share of that variance is, in fact, insurance retention rent. Quantifying this share requires variation that separately identifies \(w^{\text{lock}}_j\) from \(y_j\) at the firm level, for instance, by exploiting policy variation in portability (ACA exchanges, Medicaid expansions) that acts asymmetrically on firms with different pool compositions.
Question 2. Does the introduction of portable coverage (ACA exchanges) reduce \(\psi_j\) at high-ESI firms? A clean prediction of the retention-rent framework is that a portability shock should compress the firm-effect distribution. The retention rent that ESI-providing firms had been capturing should partly dissipate. An event-study design at the firm level, around the 2014 ACA rollout, with treatment intensity given by pre-ACA pool composition, is the natural test. The post-2014 literature on labor-market effects of the ACA has tended to look at worker-level mobility outcomes, not at firm-level wage-premium changes, leaving this prediction relatively under-tested.
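A sketch of the specification such a test might take, with all notation beyond \(\psi_j\) hypothetical: let \(\psi_{jt}\) be a firm effect estimated in a window around year \(t\), let \(D_j\) be a pre-ACA measure of firm \(j\)'s pool composition, and let \(\alpha_j\) and \(\lambda_t\) be firm and year fixed effects. Then \[ \psi_{jt} \;=\; \alpha_j \,+\, \lambda_t \,+\, \sum_{k \neq -1} \gamma_k \,\mathbf{1}[\,t - 2014 = k\,]\; D_j \,+\, u_{jt} . \] The retention-rent framework predicts \(\gamma_k \approx 0\) for \(k < 0\) (no differential pre-trends) and \(\gamma_k < 0\) for \(k \geq 0\), with larger declines where the pre-ACA pool implied a larger lock wedge. Estimating time-varying firm effects requires enough worker mobility within each window, which is a practical constraint on any such design.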
Question 3. What is the welfare cost of the mismatch the institution generates? The combination of Search and Matching 5 and Health Insurance 3 gives us all the ingredients for a unified welfare accounting: a hedonic decomposition of \(\psi_j\) that prices each amenity correctly, and a selection-market framework that prices the inefficiency of the insurance contract itself. Putting the two together, that is, asking not just how much insurance is mispriced under selection, but how much labor is misallocated under the institution that bundles insurance with employment, is a research program that neither sub-literature has yet undertaken at scale.
Closing the Loop
We began the labor series with the simplest possible question: who earns what, and why? The AKM framework gave us a clean decomposition into worker and firm effects, and the structural search literature complicated that decomposition by re-interpreting both objects as outcomes of an underlying matching equilibrium. We then opened the health-insurance series with an even simpler question: why is a rational person willing to pay more than their expected loss? The framework that answered that question delivered a theory of optimal contracts that immediately broke down once private information about risk was introduced. The institutional fix, namely bundling coverage with employment, produced a labor-market wedge that the bridge series then translated back into the structural objects of the labor models.
The synthesis is that each part of the picture rewrites the others. The firm effect \(\psi_j\) in the labor series is not the same object once we recognize that part of it is an insurance retention rent generated by the institutional setup of Health Insurance 5. The welfare cost of selection in the health-insurance series is not the same object once we recognize that the institution adopted to contain the selection problem also creates a labor-market distortion of its own. And the job-lock literature, which sometimes presents itself as a narrow question about mobility, is in fact a window into the question of how a country’s choice of insurance institution shapes the structure of its labor market at every level, from individual reservation wages, to the variance of firm wage premiums, to the welfare arithmetic of inequality.
The notes that follow this synthesis, on the personal research side of this site, are about taking these questions to data. The framework is now in place.
References
Card, D., Cardoso, A. R., Heining, J., & Kline, P. (2018). Firms and Labor Market Inequality: Evidence and Some Theory. Journal of Labor Economics, 36(S1), S13–S70.
Dey, M. S., & Flinn, C. J. (2005). An Equilibrium Model of Health Insurance Provision and Wage Determination. Econometrica, 73(2), 571–627.
Lamadon, T. (2016). Productivity Shocks, Long-Term Contracts, and Earnings Dynamics. Working Paper, University of Chicago.
Lamadon, T., Mogstad, M., & Setzler, B. (2022). Imperfect Competition, Compensating Differentials, and Rent Sharing in the U.S. Labor Market. American Economic Review, 112(1), 169–212.
Sorkin, I. (2018). Ranking Firms Using Revealed Preference. The Quarterly Journal of Economics, 133(3), 1331–1393.