Tiebout sorting with progressive income taxation and a fiscal equalization scheme

This paper develops a model of Tiebout sorting with decentrally determined progressive income taxation and a built-in fiscal equalization scheme that redistributes money from richer to poorer regions. Both aspects are central to policy makers: progressivity for equity reasons, and fiscal equalization to prevent an underprovision of the publicly provided good and to limit the degree of segregation of households according to income. The model is calibrated to the metropolitan area of Zurich (Switzerland), and policy evaluations reveal that a progressive tax scheme as the basis for decentrally determined tax rates creates strong segregating forces that the fiscal equalization scheme can offset only to some extent.


Introduction
In countries with a federal structure, local governments have at least some autonomy on the spending (or expenditure) side of their budget. Whether such decentralization is beneficial, and what degree of decentralization would be efficient, is the subject of a long-standing debate. In this paper, I contribute to the discussion by offering two extensions that are central in the context of decentrally determined income taxation: First, I allow for a progressive income tax scheme, for which the residents of each local jurisdiction vote on a tax rate multiplier to determine the size of the municipal budget; and, second, I allow for a fiscal equalization scheme that redistributes money from the rich to the poor local jurisdictions.
This setup reflects the implementation of fiscal decentralization in many federal countries. For example, it corresponds to the situation in Switzerland, where the high degree of decentralized government autonomy down to the municipal level is widely believed to be one of the cornerstones of the country's well-functioning. 1 The empirical literature shows that the municipalities engage in income tax competition, which induces rich households to sort into the municipalities with lower tax rates (see, e.g., Feld and Kirchgaessner 2001; Schmidheiny 2006a). Roller and Schmidheiny (2016) look at the effective average and marginal tax rates of Swiss households. They find that the redistributive character of progressive taxes is weakened, if not reversed, when tax rates are set at the municipal level, simply because the rich can avoid taxes by residing in municipalities with low tax rates. 2 These observations are in line with theoretical models with decentrally determined residence-based taxation, which also predict a segregation of the population according to income across municipalities. The claim is that this Tiebout sorting may cause significant disparities in municipality characteristics, which include tax rates, public good provision, and housing prices. 3 For the case of decentrally determined property taxation, see, e.g., Epple and Platt (1998), Epple et al. (2001), and Calabrese et al. (2006). For the case of decentrally determined income taxation, see, e.g., Calabrese (2001), who investigates the (limited) ability of linear income taxation for within-jurisdictional redistribution in the presence of tax competition, and Schmidheiny (2006b), who calibrates a model to the metropolitan area around the city of Zurich. Schmidheiny assumes that the publicly provided good does not create inter-jurisdictional spillovers and is perfectly rival in consumption, considers a linear tax rate scheme, and ignores the existence of transfers between jurisdictions.
The present paper can be interpreted as an extension to Schmidheiny's model in that it relaxes all of these restrictions.
On normative grounds and for the case of a linear income tax rate with a spillover-generating publicly provided good with imperfect rivalry in consumption, Kuhlmey and Hintermann (2019) identify two inefficiencies in the presence of Tiebout-like sorting: (i) some households free-ride on other households' tax payments within a given municipality, which leads to an inefficient allocation of households (i.e., intra-municipal free-riding), and (ii) municipalities free-ride on the provision of the publicly provided good in the other municipalities (i.e., inter-municipal free-riding). The former has also been labeled the "Jurisdictional Choice Externality" (JCE) by Calabrese et al. (2012); the latter is classical free-riding. For a good with intermediate levels of spillovers and rivalry, they quantify each inefficiency to account for about one-third of the total welfare loss from Tiebout sorting with decentrally determined income taxation when compared to the decisions of a utilitarian social planner with access to individualized lump-sum taxes. 4 (The last third of the welfare loss is due to imperfect redistribution.) Without further restrictions such as asymmetric information, therefore, decentrally determined income taxation is clearly welfare-diminishing. 5 In this setting, central governments aim for policies that limit these negative consequences of decentralization. They have different options at their disposal to restrict the degree of strategic behavior among households and local governments: command and control strategies (regarding, e.g., tax rates or the definition of the tax scheme), subsidies for publicly provided goods and services, or matching grants from a higher-level government. A combination of these instruments can be used to design a fiscal equalization scheme (FES).
In such a scheme, the central government forces rich municipalities to pay, while offering subsidies to the poor municipalities (such that the rich municipalities become less rich and the poor, less poor). As a consequence, employing FESs should align the distinctive characteristics of the included municipalities, in the sense that the heterogeneity of municipality characteristics will be reduced in their presence. Previous approaches and methods for assessing FESs were mostly limited to the presence of capital tax schemes. 6 In the context of decentrally determined income taxation, the previous approaches inherently ignored adjustments in prices and quantities and, most importantly, migration. 7 This is why I use a calibrated general equilibrium model to assess the effect of progressive local income taxation and the effect of a local FES on the size of the two inefficiencies. The JCE translates into an inefficient segregation of households, and inter-municipal free-riding implies too low production and consumption levels of the publicly provided good whenever this good exhibits spillovers. I will first gradually remove the fiscal equalization scheme to see to what extent the FES effectively reduces the JCE and contributes towards increasing public good levels. As a second policy evaluation, I will change the tax scheme, which is exogenous to the municipalities (which only set a tax rate multiplier), to quantify the effect of progression on the two inefficiencies. Both instruments are set by the central government and therefore taken as being exogenous to the local governments.

2 Put differently, changes in the tax scheme lead to migration responses. Kleven et al. (2020) review the empirical literature on such elasticities. They report estimates of elasticities that range from 0.02 up to infinity, very much depending on the context. Relevant factors are, for example, the regional granularity (e.g., mobility within Switzerland is larger than within the USA) and the composition of the population (e.g., foreigners have higher elasticities than the domestic population). Bazzi (2017) analyzes international migration responses to income shocks and identifies two counteracting effects of a positive income shock: first, it raises the opportunity costs of emigration, but, second, by easing liquidity constraints, it also opens up new options for emigration. He finds that in developed (rural) areas the first effect outweighs the second, such that emigration is reduced if income rises (persistently).

3 For most of the models, however, the existence of a (segregating) equilibrium cannot be ensured. An exception is a series of papers that follow the seminal contribution by Gravel and Thoron (2007), who present a model in which income segregation occurs if, and only if, the publicly provided good is either a gross substitute or a gross complement to private consumption for every household. Gravel and Oddou (2014) generalize this result to the case with a land market. Oddou (2016) extends this approach to the case of decentrally determined income taxation and a publicly provided good that exhibits spillovers, and finds that the conditions identified in Gravel and Thoron (2007) remain sufficient. However, this work is purely theoretical and largely lacks calibration, policy evaluations, or other empirically relevant analysis.

4 In reality, we typically find progressive income tax schemes (rather than linear ones), where the degree of progressivity is set to enable a welfare-enhancing redistribution of income. But this exacerbates the JCE, as rich households have a (too) strong incentive to segregate according to income, such that the corresponding distribution of the population is even more inefficient.

5 The size and composition of the inefficiency in the current setting remain unknown. In Kuhlmey and Hintermann (2019), we focus on the relative importance of these inefficiencies for differing degrees of spillovers and rivalry in consumption of the publicly provided good. In the case at hand, I have assumed fixed moderate levels for spillovers and for rivalry in consumption. Allowing for a progressive tax scheme should decrease the redistribution externality but increase the inefficiency due to intra-municipal free-riding. The fiscal equalization scheme should decrease inter-municipal free-riding, as the degree of underprovision of the publicly provided good should decrease particularly for poorer municipalities (the strongest free-riders). Without further investigation, though, it is not possible to draw conclusions on the net effect or the composition of the total inefficiency in this new setup. Take the nonlinearity of the tax scheme, which has two effects that are of particular interest for normative analysis. First, it considerably complicates the analysis of the conditions for income segregation (see Additional file 1: Online Appendix), and, second, it requires a more thorough discussion of welfare weights (since redistributive statements are now 'woven' into the shape of the tax scheme). Comprehensive normative analysis in the setting of this paper, therefore, is a topic of its own and beyond the scope of the present paper.
To perform the policy evaluations sketched above, I build on the model of Kuhlmey and Hintermann (2019). I retain its spillover-generating and imperfectly rival publicly provided good and extend the model in three dimensions: I add taste heterogeneity with respect to the publicly provided good, I model the local fiscal equalization scheme (as it is implemented in the canton of Zurich), and I allow for a progressive (cantonal) tax code. I then calibrate this model to the metropolitan area of Zurich. Municipalities in the canton of Zurich (1) are restricted to setting a linear multiplier on the cantonal progressive income tax scheme, and (2), depending on their relative fiscal capacity, receive money from or pay money into an FES, which aims at aligning fiscal capacities. 8 I show that, compared to a revenue-neutral linear tax scheme, the progressive tax scheme implemented in Zurich indeed increases the segregation of rich and poor households: Whereas the average income in the 'rich' municipality is only 30% higher than in the 'poor' municipality under the linear scheme, this difference increases to 60% under the progressive tax scheme. The second policy that I evaluate is the FES. Here I show that it effectively limits the degree of segregation: As I decrease its redistributional effects, I predict a considerable increase in the segregation of rich and poor households between municipalities. With regard to the underprovision of the publicly provided good, neither of the two policy instruments effectively curbs it.
The structure of the paper is as follows: In the next section, I present and describe the model, which is calibrated to the metropolitan area of Zurich in Sect. 3. In Sect. 4, I gradually remove the fiscal equalization scheme to assess its impact and also discuss changes in the cantonal progressive tax code. Sect. 5 concludes.

Model
In this section, I first describe the general setup of the model; then, I specify the production technology, the preferences, and the budget balance conditions before showing some equilibrium properties.

Basic setup and structure
The model economy consists of $j = 1, \dots, J$ municipalities. Each is defined by three characteristics: a housing price $p_j$, a tax rate (multiplier) $t_j$, and the level of public consumption $g_j$. The tax rate is subject to majority voting and determines (together with the tax base of the municipality) the level of public consumption. The housing price depends on the aggregate demand for and the aggregate supply of housing, such that the characteristics of the municipalities depend on the endogenous residential choices of households.
Households gain utility from consuming the publicly provided good $g_j$, housing $h_j$, and a numeraire consumption good $x_j$. 9 They differ with respect to an exogenous income level $y \in [\underline{y}, \overline{y}]$ and a preference parameter $\alpha \in [0, 1]$, which describes the preference for the publicly provided good. Both are continuously distributed according to the probability density functions $f(y)$ and $f(\alpha)$, respectively. As a consequence, a continuum of households exists in a two-dimensional space such that a household is characterized by the pair $(y, \alpha)$. (For a graphical illustration, see Fig. 1 in the next section.) Migration is costless, which implies that a household of type $(y, \alpha)$ resides in municipality $j$ if it prefers the triplet $(p_j, t_j, g_j)$ to any other triplet $(p_i, t_i, g_i)$, $\forall\, i \neq j$. If a household is indifferent between any two municipalities, it chooses its residence by chance. For more detailed explanations concerning the heterogeneity of households, see Schmidheiny (2006b) and Epple and Platt (1998).

6 The list of contributions includes Bucovetsky and Smart (2006), who show that a tax base equalization scheme helps the central government to establish equity and efficiency, even with an endogenous capital supply. For the case of German business taxation at the local level, Buettner (2006) unravels the incentive structure implied by the complex interplay of vertical and horizontal equalization instruments implemented at the municipal level. And Egger et al. (2010) examine the German municipalities' ability to effectively change their fiscal capacity in order to choose one of two alternative transfer schemes.

7 For example, in Switzerland, the canton of Zurich applies a tax base equalization scheme. To evaluate its efficacy, the statistical office computes counterfactual tax rate multipliers, defined as the multiplier on the progressive cantonal tax code that is required in one municipality in the absence of the FES to maintain the given level of expenditure, if both the distribution of households and the level of public provision remain unchanged. This approach, however, is incomplete, as it ignores all second-round effects such as migration responses and adjustments to the level of public provision and housing prices, i.e., the general equilibrium effects of the FES. See "Handbuch Zürcher Finanzausgleich", available at http://www.finanzausgleich.zh.ch/internet/microsites/finanzausgleich/de/grundlagen/unterlagen.html, last accessed July 2021.
To further illustrate the decision making of households, I introduce an indirect utility function. It is the result of maximizing a household's utility function $U(\cdot)$ subject to its budget constraint with respect to its private consumption bundle. Mathematically,

$$V_j(y, \alpha) := V(p_j, t_j, g_j; y, \alpha) = \max_{h_j,\, x_j} U(x_j, h_j, g_j; \alpha) \quad \text{s.t.} \quad p_j h_j + x_j \le y - t_j\, b(y) \tag{1}$$

describes the utility that household $(y, \alpha)$ achieves if it resides in municipality $j$ for a given set of municipality characteristics. In the budget constraint, the tax rate multiplier $t_j$ is multiplied by the tax base $b(y)$, which allows for a progressive tax regime (see Sect. 3.1). Linear taxation is covered as the special case $b(y) = a \cdot y$ with a constant $a$.
The model is in equilibrium if the following three conditions are satisfied, which are conceptually the same as in Kuhlmey and Hintermann (2019).
• Migration equilibrium: No household has an incentive to move; every household (at least weakly) prefers the municipality it currently resides in to any other municipality.
• Majority voting equilibrium: The tax rate multiplier $t_j$ in every municipality constitutes a majority voting equilibrium. Without further restrictions on household preferences (see below), I cannot easily predict which tax rate multiplier can win a majority.
• Housing market equilibrium: Housing demand equals housing supply in every municipality.
For each, I now discuss the implications and assumptions in the context of the present paper. Concerning the housing market equilibrium, the optimal housing demand $h_j(y, \alpha)$ of every household depends on its locational choice, its income, and its preference parameter, and follows from the maximization problem in (1). For the supply of housing, which I label $HS_j(p_j)$, I follow the previous literature and assume that it is supplied by absentee landlords according to a constant returns to scale technology. More specifically, I assume

$$HS_j(p_j) = L_j\, p_j^{\theta} \tag{2}$$

where $L_j$ is the available land in $j$ and $\theta$ the price elasticity of the housing supply. Market clearing then requires that in every $j$

$$HS_j(p_j) = N \int_{\underline{y}_j}^{\overline{y}_j} \int_{\underline{\alpha}_j(y)}^{\overline{\alpha}_j(y)} h_j(y, \alpha)\, f(y)\, f(\alpha)\, d\alpha\, dy \tag{3}$$

where the double integral is aggregate housing demand, which needs to be scaled by multiplication with $N$, the measure of total population. For the definition of the integral borders, see below.
Concerning the migration equilibrium, the existence of an equilibrium per se cannot generally be guaranteed for this class of models. I focus on segregating equilibria in the numerical application. Segregation implies that households self-select into municipalities such that every municipality is inhabited by households from a single interval of the income and preference distribution. In terms of the indirect utility function (1), this implies for any municipality $j$ that, $\forall\, y \in [\underline{y}_j, \overline{y}_j]$ and $\forall\, \alpha \in [\underline{\alpha}_j(y), \overline{\alpha}_j(y)]$,

$$V_j(y, \alpha) \ge V_i(y, \alpha) \quad \forall\, i \neq j \tag{4}$$

where $\underline{y}_j, \overline{y}_j$ and $\underline{\alpha}_j(y), \overline{\alpha}_j(y)$ describe the lower and upper limits of the income and taste intervals within which households reside in municipality $j$. Households that are precisely at these limits are indifferent to the neighboring municipality. They define the municipality borders in the $y$-$\alpha$-space by forming what is called the locus of indifferent households between any two 'adjacent' municipalities. All households in between these limits strictly prefer municipality $j$, while all households beyond these limits strictly prefer another municipality.
With linear taxes and without a fiscal equalization scheme, Schmidheiny (2002) shows that any equilibrium is characterized by perfect segregation if utility is described by a Stone-Geary utility function with at least one strictly positive level of subsistence consumption. Kuhlmey and Hintermann (2019) show that allowing for spillovers and imperfect rivalry in the consumption of the publicly provided good does not require stricter assumptions about preferences. In the Appendix, I discuss how this set of conditions can be refined to remain compatible with segregation in the presence of a progressive tax scheme. I cannot, however, establish a formal set of conditions that guarantees income segregation (if an equilibrium is found). This implies that I need to check whether the implicitly assumed segregation in the resulting equilibrium is indeed incentive-compatible. Incentive compatibility (IC) has two components in this context: In the case of the migration equilibrium condition, IC means that only the actual border households are indifferent between any two municipalities, and that those who are not indifferent prefer the municipality that they 'belong' to over any other municipality; in the case of the majority voting equilibrium condition (see below), IC implies that the households on one side of the locus of median voters all prefer a higher tax rate, while the households on the other side of the locus prefer a lower tax rate. In Additional file 1: Online Appendix B.3, I show that the baseline calibration to the Zurich metropolitan area, which I present in the next section, is incentive-compatible.
Concerning the majority voting equilibrium, for each triplet of municipality characteristics $(p_j, t_j, g_j)$ that satisfies the municipality's budget constraint, the following holds: If the marginal rate of substitution between any pair of municipality characteristics from this triplet changes monotonically in both income $y$ and the preference for the publicly provided good $\alpha$, there exists a locus of households in the $y$-$\alpha$-space for which this pair is optimal. Take the pair $(t_j, g_j)$ as an example. If 50% of voters prefer a higher $t_j$ and 50% a lower $t_j$, then this locus is called the median voter locus (quite similar to the approach used above for the locus of indifferent households). If it exists, there is no other $t_j$-$g_j$ pair that would win a majority vote against the median voters' optimal $t_j$-$g_j$ pair, which therefore constitutes a majority voting equilibrium for a given population in the municipality. In the Appendix, I discuss the necessary and sufficient conditions on households' preferences for the existence of a median voter locus in the presence of progressive taxes.
I assume that, when voting, households take the distribution of the households (as well as the households' level of housing demand) as given, i.e., they are myopic with respect to the migrational consequences induced by changing the tax rate (for a further discussion of voter myopia, see Epple et al. 2001; Kuhlmey and Hintermann 2019). Moreover, in the presence of inter-municipal spillovers, I assume that households correctly anticipate the supply of the publicly provided good in the other municipalities. As a consequence, the optimal tax rate multiplier of household $(y, \alpha)$ follows from

$$t_j^{*}(y, \alpha) = \arg\max_{t_j}\; V\big(p_j, t_j, g_j(t_j); y, \alpha\big) \tag{5}$$

where $g_j(t_j)$ indicates that the public consumption level is determined by the level of $t_j$ via the budget balance constraint of the municipality and the production technology, which I specify in Sect. 2.2. Note that I restrict the voting process to determine a tax rate multiplier (and therefore not to determine the progressivity of the tax scheme per se). This one-dimensionality of the voting decision allows me to keep track of households' preferences and to identify potential segregation patterns. The equilibrium tax rate is then implicitly defined by

$$N \int_{\underline{y}_j}^{\overline{y}_j} \int_{\underline{\alpha}_j(y)}^{\alpha^m_j(y)} f(y)\, f(\alpha)\, d\alpha\, dy = N \int_{\underline{y}_j}^{\overline{y}_j} \int_{\alpha^m_j(y)}^{\overline{\alpha}_j(y)} f(y)\, f(\alpha)\, d\alpha\, dy \tag{6}$$

where $\alpha^m_j(y)$ defines the locus of median voters in $j$ and $N$ is total population. The locus is the solution of (5) solved for $\alpha$, with the tax rate chosen such that (6) holds.

Revenue and expenditure of the local governments
After having sketched the basic setup and the general structure of the model, I now specify the production technology of the publicly provided good g j and the budget balance of the local governments, which includes the fiscal equalization scheme.
Kuhlmey, Swiss Journal of Economics and Statistics (2022) 158:11

The amount of the publicly provided good available for consumption in municipality $j$ is given by where $G_j$ denotes the level of production of the good in $j$. The level of production, $G_j$, is determined by the budget balance constraint of the municipality, which is derived below in (11). Each municipality spends its entire revenue on $G_j$, such that we can think of it as being the bundle of goods and services that are actually (and on average) provided by municipalities. To capture the characteristics of this bundle, I allow for inter-jurisdictional spillovers and imperfect rivalry in consumption, where $\sigma$ describes the degree to which the public provision 'spills out' of the other municipalities into $j$, $\nu$ describes the degree to which the citizens of the other municipalities 'spill into' $j$ to consume there, and $\rho$ describes the degree of rivalry in consumption. All parameters are defined on $[0, 1]$, although not all combinations make sense economically. For further details, see Kuhlmey and Hintermann (2019), who introduced this specification.
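To illustrate how the three parameters interact, the following minimal Python sketch uses one functional form that matches the verbal description above. The expression itself is an assumption made for illustration and is not taken from the paper: own production plus a share $\sigma$ of the other municipalities' production, divided by a congestion term built from own residents plus a share $\nu$ of the other municipalities' residents, raised to the power $\rho$. The function name and the numbers are hypothetical.

```python
def public_consumption(G, n, j, sigma, nu, rho):
    """Consumption level g_j under an ASSUMED spillover/rivalry form.

    G: production levels per municipality, n: resident measures,
    sigma: spill-out of provision, nu: spill-in of users, rho: rivalry.
    """
    others = [i for i in range(len(G)) if i != j]
    # Provision available in j: own production plus spillovers from others.
    effective_supply = G[j] + sigma * sum(G[i] for i in others)
    # Users competing for it: own residents plus visiting non-residents.
    effective_users = n[j] + nu * sum(n[i] for i in others)
    # rho = 0: non-rival good; rho = 1: fully rival (per-user) good.
    return effective_supply / effective_users ** rho


# Hypothetical two-municipality example.
G = [100.0, 50.0]
n = [10.0, 5.0]
g_private_like = public_consumption(G, n, 0, sigma=0.0, nu=0.0, rho=1.0)
g_pure_public = public_consumption(G, n, 0, sigma=0.5, nu=0.0, rho=0.0)
```

Under this assumed form, setting $\sigma = \nu = 0$ and $\rho = 1$ reduces the good to per-capita local spending, while $\rho = 0$ with $\sigma > 0$ moves it toward a pure public good with spillovers.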
To determine the (net) revenue of each municipality, I consider two elements: tax revenues and payments from or into a fiscal equalization scheme that is imposed by some higher level of government. The particular form of both follows the actual situation in the canton of Zurich, to which I calibrate the model in Sect. 3.1. Tax revenue stems from taxing the income of the residents of each municipality. Each municipality sets one multiplier, $t_j$, on the municipal tax base $b(y)$, which is a function of actual income $y$. The tax base determines the relative tax liabilities of households differing in income, whereas the level of taxes is not yet defined. Under this specification, the only meaningful restriction on the tax rate multiplier is $t_j > 0$. A value of $t_j = 1$ means that a household with income $y$ residing in $j$ has to pay municipal taxes that exactly correspond to $b(y)$. 10 Note that the interpretation of $t_j$ has therefore changed compared to the linear-tax case previously considered in the literature, where $b(y) = y$ implied that the tax rate describes the share of income that every household has to pay. 11 For the case at hand, the aggregate tax base of a municipality is given by

$$TB_j = N \int_{\underline{y}_j}^{\overline{y}_j} \int_{\underline{\alpha}_j(y)}^{\overline{\alpha}_j(y)} b(y)\, f(y)\, f(\alpha)\, d\alpha\, dy \tag{8}$$

Multiplied by the tax rate multiplier $t_j$, this determines the tax revenue of a municipality. The municipality-specific per capita level of the tax base is labeled the fiscal capacity ($FC_j$) of a municipality such that

$$FC_j = \frac{TB_j}{N \int_{\underline{y}_j}^{\overline{y}_j} \int_{\underline{\alpha}_j(y)}^{\overline{\alpha}_j(y)} f(y)\, f(\alpha)\, d\alpha\, dy} \tag{9}$$

This measure determines how much municipality $j$ pays into or receives from the fiscal equalization scheme (FES).
The second element that I consider to determine a municipality's (net) income is a tax base equalization scheme at the municipal level. Such a scheme has two effects: On the one hand, it lifts the revenue of poor municipalities to a certain lower bound; on the other hand, it takes a certain percentage of the fiscal capacity of rich municipalities that exceeds some upper bound. More precisely, the net subsidy of municipality $j$, labeled $FES_j$, can be defined as follows:

$$FES_j = \begin{cases} Z_j & \text{if } FC_j < \ell \cdot FC_{avg} \\ -A_j & \text{if } FC_j > \upsilon \cdot FC_{avg} \\ 0 & \text{otherwise} \end{cases} \tag{10}$$

where $FC_{avg}$ is the average fiscal capacity of all municipalities in the canton. A municipality receives the subsidy $Z_j$ if $FC_j < \ell \cdot FC_{avg}$. The parameter $\ell$ determines the lower bound of the fiscal capacity to which the municipalities' revenue is topped up and therefore marks a lower bound of revenue for municipalities. The municipality has to pay $A_j$ if $FC_j > \upsilon \cdot FC_{avg}$, with $\upsilon > \ell$. This inequality states that if the municipality's fiscal capacity exceeds $\upsilon \cdot 100\%$ of the average fiscal capacity, the municipality has to pay a fraction $\tau \in [0, 1]$ of its fiscal capacity in excess of this upper limit (haircut). Municipalities with a fiscal capacity between the lower and upper bound neither receive payments from nor owe payments to the FES. This setup leaves the scheme not necessarily balanced. The reason is that the sum of payments into the scheme ($\sum_j A_j$) is not directly linked to the sum of subsidies ($\sum_j Z_j$): the parameters $(\ell, \upsilon, \tau)$, together with the distribution of households, determine the payments, so budget balance cannot be guaranteed. The central government covers any deficit (and receives any excess payments).
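The piecewise structure of the scheme can be sketched in a few lines of Python. The expressions used for $Z_j$ and $A_j$ are assumptions made for illustration only: the subsidy is taken to top up fiscal capacity exactly to $\ell \cdot FC_{avg}$, the payment is taken to be the fraction $\tau$ of the excess over $\upsilon \cdot FC_{avg}$, and both are expressed per capita. The function name and all parameter values are hypothetical and are not the calibrated values for Zurich.

```python
def fes_net_transfer(fc, fc_avg, ell, upsilon, tau):
    """Per-capita net transfer: positive = subsidy received, negative = payment.

    ASSUMED forms: Z_j = ell*FC_avg - FC_j and A_j = tau*(FC_j - upsilon*FC_avg).
    """
    if not 0.0 <= tau <= 1.0:
        raise ValueError("tau must lie in [0, 1]")
    if not upsilon > ell:
        raise ValueError("upsilon must exceed ell")
    lower = ell * fc_avg       # floor to which poor municipalities are topped up
    upper = upsilon * fc_avg   # threshold above which the haircut applies
    if fc < lower:
        return lower - fc             # subsidy Z_j
    if fc > upper:
        return -tau * (fc - upper)    # payment A_j, entered with negative sign
    return 0.0                        # neutral zone between the two bounds


# Hypothetical parameters: floor at 90%, ceiling at 120%, haircut of 50%.
poor = fes_net_transfer(80.0, 100.0, ell=0.9, upsilon=1.2, tau=0.5)
middle = fes_net_transfer(100.0, 100.0, ell=0.9, upsilon=1.2, tau=0.5)
rich = fes_net_transfer(160.0, 100.0, ell=0.9, upsilon=1.2, tau=0.5)
```

Summing these transfers over municipalities (weighted by population) generally does not give zero, which is the sense in which the scheme is not necessarily balanced.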
I am now able to define the net revenue of each municipality as the sum of tax revenue and the subsidy from (or payment into) the FES. Since each municipality spends its entire revenue on $G_j$, the budget balance constraint of municipality $j$ implies

$$G_j = t_j\, TB_j + FES_j \tag{11}$$

Functional forms and solving the model
For the calibration, I rely on a Stone-Geary utility function, which I specify below. I am not able to solve this model analytically for its equilibrium values. Instead, I am left with a set of $3J$ equations in $3J$ unknowns, which forms the basis for the numerical solutions in the next sections. The equilibrium conditions are $J$ housing market clearing conditions (3), $J$ majority voting equilibrium conditions (6), and $J$ equations for the consumption levels of the publicly provided good (7). The unknowns, which I cannot solve for analytically, are the municipality characteristics $p_j$, $t_j$, and $g_j$. Note that the $J - 1$ loci of indifferent households at the municipality 'borders' in the $y$-$\alpha$-space, the loci of median voters, as well as all the other variables (such as $G_j$, $TB_j$, $FES_j$) are implicitly defined for a given set of municipality characteristics.
As in much of the previous literature on decentrally determined income taxation, the preference structure of households is assumed to be characterized by a Stone-Geary utility function (see Schmidheiny 2002, 2006b; Kuhlmey and Hintermann 2019). More precisely, the utility of a household with preference $\alpha$ for the publicly provided good is given by where $\beta_g$, $\beta_h$, and $\beta_x$ are subsistence levels for $g_j$, $h_j$, and $x_j$, respectively. Beyond this subsistence consumption, (12) supports a linear expenditure system: $\alpha \in [0, 1]$ determines what share of the remaining income (after having paid for the subsistence levels) a household wants to spend on the publicly provided good. The remainder of that amount is then spent on the private consumption bundle: A share of $\gamma \in [0, 1]$ is spent on housing and a share of $(1 - \gamma)$ on the numeraire.
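As a worked illustration of the linear expenditure system, suppose the Stone-Geary utility takes the common logarithmic form below. This specific form, the symbol $\tilde{y}_j$ for disposable income, and the resulting expression for the indirect utility are assumptions of this sketch; the paper's own equations (12) and (13) may differ by a monotone transformation or in notation.

```latex
% Assumed log form of the Stone-Geary utility
U = \alpha \ln(g_j - \beta_g)
  + (1-\alpha)\big[\gamma \ln(h_j - \beta_h) + (1-\gamma)\ln(x_j - \beta_x)\big]

% Household budget constraint (numeraire price normalized to one)
p_j h_j + x_j = y - t_j\, b(y)

% Disposable income after taxes and subsistence spending
\tilde{y}_j = y - t_j\, b(y) - p_j \beta_h - \beta_x

% Optimal private demands: fixed shares of disposable income
h_j = \beta_h + \gamma\, \frac{\tilde{y}_j}{p_j}, \qquad
x_j = \beta_x + (1-\gamma)\, \tilde{y}_j

% Substituting back gives the indirect utility
V_j(y,\alpha) = \alpha \ln(g_j - \beta_g)
  + (1-\alpha)\big[\ln \tilde{y}_j - \gamma \ln p_j
  + \gamma \ln \gamma + (1-\gamma)\ln(1-\gamma)\big]
```

The demands satisfy the budget constraint by construction, since $p_j h_j + x_j = p_j \beta_h + \beta_x + \tilde{y}_j$, and housing demand does not depend on $\alpha$ because the level of $g_j$ is not chosen individually.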
The indirect utility function (1) follows as is the net income after paying taxes and providing the subsistence consumption levels of the private consumption bundle and is therefore a measure of disposable income. The remaining expressions can be derived from the indirect utility function (13). These include the aggregate housing demand $HD_j$, the locus of indifferent households $\alpha_{j-1,j}(y)$, and the locus of median voters $\alpha^m_j(y)$. They are derived below.
To get a better feeling for what this system of equations looks like, I now present the system that defines the model, using the functional forms from above. The $3J$ equations that define the model are, for every $j$, the housing market clearing condition (3), the median voting condition (6), and the equation that determines the consumption level of the publicly provided good (7).
First, I derive the housing market clearing condition. For the Stone-Geary utility function (12), the housing demand of household $(y, \alpha)$ in $j$ is given by

$$h_j(y, \alpha) = \beta_h + \gamma\, \frac{y - t_j\, b(y) - p_j \beta_h - \beta_x}{p_j} \tag{14}$$

Note that it is independent of $\alpha$, which reflects the fact that households cannot freely choose their preferred level of public provision, but have to consume the uniform consumption level, which is determined by (and therefore only optimal for) the households at the locus of median voters. Aggregate housing demand follows as the double integral of (14) over all households residing in $j$, $N \int_{\underline{y}_j}^{\overline{y}_j} \int_{\underline{\alpha}_j(y)}^{\overline{\alpha}_j(y)} h_j(y, \alpha)\, f(y)\, f(\alpha)\, d\alpha\, dy$. Considering the aggregate housing supply from (2), the housing market clearing condition in $j$ therefore reads as

$$L_j\, p_j^{\theta} = N \int_{\underline{y}_j}^{\overline{y}_j} \int_{\underline{\alpha}_j(y)}^{\overline{\alpha}_j(y)} \left[(1-\gamma)\beta_h + \gamma\, \frac{y - \beta_x}{p_j}\right] f(y)\, f(\alpha)\, d\alpha\, dy \;-\; \frac{\gamma\, t_j\, TB_j}{p_j} \tag{15}$$

and $TB_j$ is given by (8). The locus of indifferent households between two municipalities is given by the lower bound, $\underline{\alpha}_j(y)$, and the upper bound, $\overline{\alpha}_j(y)$. Assume (without loss of generality) that the municipalities are numbered in ascending order, such that municipality $j - 1$ contains the households with lower levels of $\alpha$ for any given level of $y$, and municipality $j + 1$ the households with higher levels of $\alpha$. Then, the locus of indifferent households between any two adjacent municipalities, say $j$ and $j + 1$, follows from $V_{j+1}(y, \alpha) - V_j(y, \alpha) = 0$. For our functional forms, this can be solved for which gives the locus $\alpha_{j,j+1}(y) = \frac{nom_j}{nom_j + denom_j}$ as a function of the municipality characteristics $(p_j, g_j, t_j)$ and $(p_{j+1}, g_{j+1}, t_{j+1})$. This defines the two integral borders $\overline{\alpha}_j(y) = \underline{\alpha}_{j+1}(y) = \alpha_{j,j+1}(y)$. Note that I could also solve for the locus of indifferent households in terms of income, $y$. This would imply solving for $y_{j,j+1}(\alpha)$ and require that I change the order of integration ($\alpha$ as the outer and $y$ as the inner integral). The results would be identical. I chose to solve for $\alpha$-loci, since this is simpler for the given functional forms. I now turn to the median voting condition (6).
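The housing market clearing condition can be illustrated numerically with a minimal Python sketch for a single municipality. It assumes the standard Stone-Geary housing demand $h = \beta_h + \gamma\,(y - t\,b(y) - p\,\beta_h - \beta_x)/p$ and a constant-elasticity supply $L\,p^{\theta}$, replaces the double integral with an average over a sample of resident incomes (which is admissible here only because demand does not depend on $\alpha$), and finds the price by bisection. All function names, the sample, and the parameter values are hypothetical.

```python
def housing_demand(y, p, t, beta_h, beta_x, gamma, tax_base):
    """Assumed Stone-Geary housing demand of a household with income y."""
    disposable = y - t * tax_base(y) - p * beta_h - beta_x
    return beta_h + gamma * disposable / p


def clearing_price(incomes, n_residents, land, theta, t, beta_h, beta_x,
                   gamma, tax_base, lo=1e-6, hi=1e6, iters=200):
    """Price at which aggregate demand equals the supply land * p**theta."""
    def excess(p):
        mean_h = sum(housing_demand(y, p, t, beta_h, beta_x, gamma, tax_base)
                     for y in incomes) / len(incomes)
        return n_residents * mean_h - land * p ** theta

    if excess(lo) <= 0 or excess(hi) >= 0:
        raise ValueError("no sign change of excess demand on [lo, hi]")
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if excess(mid) > 0:   # demand exceeds supply: price must rise
            lo = mid
        else:                 # supply exceeds demand: price must fall
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical example with a linear tax base b(y) = 0.1 * y.
price = clearing_price([50.0, 100.0, 150.0], n_residents=1.0, land=1.0,
                       theta=1.0, t=1.0, beta_h=0.0, beta_x=0.0, gamma=0.25,
                       tax_base=lambda y: 0.1 * y)
```

With zero subsistence levels, this example has the closed-form solution $p = \sqrt{22.5}$, which gives a simple check on the bisection. In the full model, the price, the tax rate, and the resident population are determined jointly, so this step would be one component of a larger fixed-point problem.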
This requires that we find the locus of median voters, $\alpha^m_j(y)$, that cuts, for any $j$, the population in half. Recall that the households on one side of this locus prefer higher tax rates, and those on the other side lower tax rates. Again, the locus of the median voters in $j$ can be expressed in terms of the municipality characteristics. As mentioned above, the preferred tax rate of household $(y, \alpha)$ follows from maximizing $V_j(y, \alpha)$ with respect to $t_j$ and subject to (7). The corresponding first-order condition can be solved for $\alpha$, see (17), which, as above, gives the locus $\alpha^m_j(y)$. The tax rate in every $j$ is then determined such that the thus-defined locus of median voters exactly splits the population in half, as formulated in (6).

Note that the last set of equilibrium conditions, (7), which determines $g_j$, has been used here. It depends on $G_j$, which is given according to (11), which in turn depends on $FES_j$ according to (10). The latter is determined by the distribution of households, which is implicitly defined by (16). The point I want to make here is that the model, though rather complex, boils down to a search for 3J values of municipality characteristics such that Eqs. (15), (6), and (7) are satisfied.
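The double integral over the y-α-space can be approximated numerically. The following is a minimal sketch, not the paper's implementation: it computes the population share below a given locus of indifferent households, using the independence of y and α and the closed-form CDF of a Beta(1, b) distribution. The log-normal parameters `MU` and `SIGMA` and the increasing example locus are hypothetical placeholders; only the income bounds and b = 49 are taken from the calibration.

```python
import math

# Hypothetical log-normal parameters (NOT the paper's calibrated shape parameters).
MU, SIGMA = 4.2, 0.6
Y_MIN, Y_MAX = 15.0, 1000.0  # income bounds in thousand CHF, as in the calibration
B = 49.0                     # second shape parameter of the Beta(1, b) distribution


def lognormal_cdf(y):
    """CDF of the untruncated log-normal distribution."""
    return 0.5 * (1.0 + math.erf((math.log(y) - MU) / (SIGMA * math.sqrt(2.0))))


def income_pdf(y):
    """Density of the log-normal distribution truncated to [Y_MIN, Y_MAX]."""
    mass = lognormal_cdf(Y_MAX) - lognormal_cdf(Y_MIN)
    raw = math.exp(-((math.log(y) - MU) ** 2) / (2.0 * SIGMA ** 2))
    raw /= y * SIGMA * math.sqrt(2.0 * math.pi)
    return raw / mass


def alpha_cdf(a):
    """CDF of Beta(1, B): F(a) = 1 - (1 - a)^B on [0, 1]."""
    a = min(max(a, 0.0), 1.0)
    return 1.0 - (1.0 - a) ** B


def share_below_locus(locus, steps=20000):
    """Share of households with alpha below locus(y), midpoint rule in log-income."""
    lo, hi = math.log(Y_MIN), math.log(Y_MAX)
    width = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        y = math.exp(lo + (i + 0.5) * width)
        # dy = y * d(log y); the inner alpha-integral is available in closed form
        total += income_pdf(y) * alpha_cdf(locus(y)) * y * width
    return total


median_alpha = 1.0 - 0.5 ** (1.0 / B)
print(share_below_locus(lambda y: 1.0))                   # whole population
print(share_below_locus(lambda y: median_alpha))          # horizontal locus at the median of alpha
print(share_below_locus(lambda y: 0.004 * math.log(y)))   # hypothetical increasing locus
```

Multiplying such a share by N gives the population of the municipality below the locus. Weighting the integrand by housing demand or by the tax base b(y) instead would give the corresponding aggregates in the same way.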

Calibration
In this section, I first specify the fiscal instruments relevant at the municipal level. Then, I specify the model presented in Sect. 2 for two groups of municipalities ( J = 2 ) that form the metropolitan area around the city of Zurich and discuss the choices of the unobserved parameters. Finally, I present the equilibrium properties for this baseline calibration and assess its performance.

Fiscal instruments at the cantonal level
I am interested in how households self-select into municipalities. Each municipality is characterized by its specific combination of the housing price p j , the linear tax rate multiplier t j , and the level of public consumption g j . Abstracting from a 'home-bias' or other frictions concerning relocation decisions, households choose the combination that suits them best. The municipalities, however, are not completely free to choose their tax regime. Two fiscal instruments that are determined at the cantonal level are crucial for this analysis: the municipal system of income taxation and the fiscal equalization scheme for the municipalities (FES).

In Switzerland, every household is subject to income taxation at the federal, cantonal, and municipal level. In my analysis, I am interested in the taxation at the local level. The municipal tax base b(y) of a household with income y is the cantonal tax liability of this household, and is therefore determined by the cantonal tax scheme. The municipal tax liability is then given as t j b(y) for a household with income y. Note that t j is the same for all households within one municipality and b(y) is the same for all municipalities in the canton. How a household with a given income y evaluates t j therefore crucially depends on b(y). Zurich uses a progressive scheme with stepwise increases in the marginal tax rate. The taxation scheme differentiates between a 'basic' rate ("Grundtarif") and a 'married' rate ("Verheiratetentarif"), where the latter is also applicable to single households with children. Both schemes are specified in Table 1. 12 The basic rate was applied to approximately 60% of the cases in 2013, and the married rate to the remaining 40% of cases. Since a married household typically consists of at least two people, it is plausible to assume that the married rate affects more individuals than the basic rate, which only applies to one-person households.
For the calibration, I assume that every household is taxed according to the married rate. This biases the calibration, since I, effectively, apply tax rates that are too low for parts of the population and therefore underestimate the segregating consequences caused by decentrally determined income taxation. 13 The second fiscal instrument that the municipalities cannot (directly) influence is the fiscal equalization scheme for the municipalities (FES). I analyze the 'new' FES of the canton of Zurich that was introduced in 2012. 14 In 2015, the FES paid out 1,134 million CHF, which corresponds to roughly 10% of the total expenditures at the municipal level. 15 The payments of the rich municipalities into that scheme amounted to 667 million CHF, the rest being covered by the canton.
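The arithmetic behind the municipal tax liability t j b(y) can be retraced with the married-rate example from the footnotes. The sketch below is only an illustration: it covers the single bracket quoted there (a liability of 1263 CHF at 47,400 CHF and a marginal rate of 6% above), not the full schedule of Table 1. The multipliers 0.828 and 1.076 are the observed values for the rich and the poor group reported later in the text, and the function names are my own.

```python
# Figures from the married-rate example in the text: a liability of 1263 CHF at
# the bracket threshold of 47,400 CHF and a marginal rate of 6% above it.
THRESHOLD = 47_400
TAX_AT_THRESHOLD = 1_263
MARGINAL_RATE = 0.06


def cantonal_tax_base(y):
    """b(y) inside this one bracket only; the bracket's upper limit is not modelled."""
    if y < THRESHOLD:
        raise ValueError("income below the bracket covered by this sketch")
    return TAX_AT_THRESHOLD + MARGINAL_RATE * (y - THRESHOLD)


def municipal_tax(y, multiplier):
    """Municipal liability t_j * b(y) for a household with income y."""
    return multiplier * cantonal_tax_base(y)


b = cantonal_tax_base(56_100)
print(round(b))                                   # 1785
print(round(municipal_tax(56_100, 0.828), 2))     # liability at the rich group's multiplier
print(round(municipal_tax(56_100, 1.076), 2))     # liability at the poor group's multiplier
```

Because b(y) is common to all municipalities, the difference in liabilities between two municipalities is proportional to b(y), so it grows more than proportionally with income under a progressive scheme.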
12 See "Steuertarife" on https://www.steueramt.zh.ch/internet/finanzdirektion/ksta/de/steuerberechnung/steuertarife.html, last accessed July 2021.
13 For example, a household with a taxable income of 56,100 CHF has to pay 2513 CHF and faces a marginal tax rate of 8% if taxed according to the basic rate. Applying the married rate yields a tax liability of 1263 + 0.06 · (56,100 − 47,400) = 1785 CHF and a marginal tax rate of 6%.
14 For more information on the FES (in German), see "Handbuch Zürcher Finanzausgleich", available at http://www.finanzausgleich.zh.ch/internet/microsites/finanzausgleich/de/grundlagen/unterlagen.html, last accessed July 2021. The data presented here for the FES are publicly available at https://www.zh.ch/de/steuern-finanzen/gemeindefinanzen/zuercher-finanzausgleich.html, last accessed July 2021 (look for "Finanzausgleich ab 2012", which contains the 2015 data used in this paper).
15 Total expenditure of the municipalities in the canton of Zurich amounted to 11,994 million CHF in 2014. See "Finanzstatistik" of the federal financial administration, available at https://www.efv.admin.ch/efv/de/home/themen/finanzstatistik/daten.html, last accessed July 2021.
Kuhlmey Swiss Journal of Economics and Statistics (2022) 158:11

Zurich's municipal FES has three instruments: (1) transfers based on resource disparities ("Ressourcenausgleich"), (2) compensation for specific extra-burdens ("Sonderlastenausgleich"), and (3) a payment to the city centers ("Zentrumslastenausgleich"). The latter is specifically designed to provide the cities of Zurich and Winterthur, the two biggest cities of the canton of Zurich, with sufficient means to supply their inhabitants with infrastructure and other goods and services that are to a large extent also used by inhabitants of the surrounding municipalities.
Payments in this branch amount to 43% of the FES budget and thereby correspond quite closely to the share paid by the canton (41.3%). As I discuss below, in the baseline calibration to the metropolitan area of Zurich, I exclude the city of Zurich. Therefore, I do not consider this instrument of the FES. The second instrument redistributes money to municipalities with a high share of pupils, as well as to municipalities that face disadvantages in terms of geography or other burdens which the municipality cannot influence and which the canton authorizes. The economic importance of this instrument, however, is limited, as it accounts for only 3% of total expenditures. This is why I do not model this instrument either.
Instead, I focus on the first instrument, the transfers based on the resource disparities of the municipalities. This instrument collects all the payments of rich municipalities into the scheme, and the paid-out subsidies in this branch of the FES approximately amount to the remaining half of the budget. The basic structure of this instrument is described in Eq. (10). The values of ℓ, υ and τ are the result of a political process and were set to 0.95, 1.1, and 0.7, respectively, when the new FES was introduced in 2012. For the interpretation of these levels, recall the concept of a municipality's fiscal capacity. As laid out in (9), it is equal to the per capita tax revenue if the tax rate multiplier is 1 and therefore corresponds to the per capita cantonal tax liability in that municipality. The average (per capita) cantonal tax liability over all municipalities is labeled the average fiscal capacity.
If a municipality's fiscal capacity is below ℓ = 95% of the average, it receives the difference between its actual fiscal capacity and this lower bound as a subsidy. As a consequence, after the transfer payments, every municipality is (when it selects a multiplier of at least 1) eligible to spend at least 95% of the average fiscal capacity, which makes ℓ an effective lower bound of the revenue capacity of the municipalities. A municipality whose fiscal capacity is more than 10% higher than the canton's average has to pay 70% of its fiscal capacity in excess of this level. Therefore, τ constitutes a 70% marginal tax on a rich municipality's fiscal wealth. In 2014, 127 of the municipalities received payments, 27 paid, and the remaining 13 received nothing and paid nothing.
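The transfer rule described above can be written as a small function. This is a sketch based only on the verbal description; Eq. (10) is not reproduced here and may contain further terms (for instance a scaling by tax rate multipliers), so the function should be read as the per-capita rule at a multiplier of 1. The function name and the three example fiscal capacities are hypothetical.

```python
LOWER = 0.95   # l: guaranteed share of the average fiscal capacity
UPPER = 1.10   # upsilon: threshold above which a municipality contributes
RATE = 0.70    # tau: rate applied to fiscal capacity above the threshold


def fes_transfer(fc, fc_avg):
    """Net per-capita transfer: positive = subsidy received, negative = payment made."""
    floor = LOWER * fc_avg
    ceiling = UPPER * fc_avg
    if fc < floor:
        return floor - fc            # topped up to the lower bound
    if fc > ceiling:
        return -RATE * (fc - ceiling)  # 70% of the excess is paid into the scheme
    return 0.0                       # neither receives nor pays


FC_AVG = 3000.0  # calibration value of the average fiscal capacity (CHF per capita)
for fc in (2000.0, 3000.0, 5000.0):  # hypothetical municipal fiscal capacities
    print(fc, fes_transfer(fc, FC_AVG))
```

The rule is piecewise linear with a neutral band between 95% and 110% of the average, which is why some municipalities neither receive nor pay.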

Baseline calibration
In this section, I first show how to arrive at the two municipality groups used for the calibration. Then I specify the remaining modeling parameters, which allows me to solve the model and discuss its baseline calibration.

Construction of two municipality groups
I follow Schmidheiny (2006b) and select a set of 39 municipalities around the city of Zurich, whose inhabitants predominantly work in Zurich's city center. I leave out the city center as this 'municipality' entails many special factors and characteristics that are not captured in the present setup. This concerns, e.g., its special role within the FES or the fact that city centers provide goods and services that are to a larger degree consumed by households residing elsewhere. Descriptive statistics for the metropolitan area around the city of Zurich are given in Section B.1 in Additional file 1: Online Appendix. The municipalities are sorted according to their per-capita income and divided into two subgroups of equal building areas, such that one group contains the rich and the other the poor municipalities. On the aggregated level, the characteristics of the poor and rich municipalities are summarized in Table 2. The average income of the households in the group of rich municipalities is 80% higher than the average income of the households in the poor group. The average land price in the rich subgroup is almost 60% higher, although it is inhabited by 40% fewer households than the poor group and their building areas are equal. The tax rate multiplier is about one quarter lower in the (group of) rich municipalities, while public expenditure levels are comparable.
The average amount paid to the fiscal equalization scheme (FES) by rich municipalities was almost 2600 CHF per capita in 2015, whereas the poor municipalities received approximately 400 CHF. 16 For the remainder of the paper, I use the terms 'municipality' and 'municipality group' interchangeably.

Model specification
In the following, I present the parameters used for the baseline calibration, which are summarized in Table 3. For an analysis of the sensitivity of the model outcome with respect to many of these parameters, see Table B2 in Additional file 1: Online Appendix. Among the observed parameters is the population size, which I set to N = 360, in accordance with the value of 362 thousand inhabitants given in Table 2. 17 I assume that income is distributed log-normally between $\underline{y} = 15$k CHF and $\overline{y} = 1000$k CHF. The shape parameters correspond to the distribution of taxable income at the household level in the canton of Zurich, as described in Section B.2 in Additional file 1: Online Appendix. Accordingly, the aggregate income of the households in my model amounts to 27.23bn CHF, close to its 'true' value of 28.49bn CHF.
The average fiscal capacity follows from the mass of households in my model. It does not depend on any equilibrium outcome variables, since it simply adds up the cantonal tax liability of every household, divided by N.
The observed average of the fiscal capacity in the canton of Zurich is about 3500 CHF. 18 Note that for the calibration I use a lower value and set FC avg = 3000 CHF per capita. This is done to put a cap on the payments to the group of poor municipalities in the calibrated version.
The level of the publicly provided good consumed, g j , is given according to (7). It depends on the expenditure on the publicly provided good, G j , in both municipalities, as well as on the degree of spillovers ( σ , ν ) and rivalry, ρ . Concerning the former, I assume that σ = ν , which implies that the degree to which public provision spills out to the other municipality is the same as the degree to which households from one municipality consume the good in the other. To estimate the degree of spillovers and rivalry in consumption of the publicly provided good, I broadly categorize the municipalities' expenditure: The set of spillover-generating expenditure categories consists of expenditures on health, culture and leisure, security, environment, and traffic. Expenditure categories associated with a rather high degree of rivalry are health, education and welfare. Table B1 in Additional file 1: Online Appendix shows the detailed numbers for the selected municipalities; Table 2 gives the aggregated numbers. It shows that the rich municipalities tend to spend more on goods that appear more likely to spill over to neighboring municipalities, whereas congested goods (with an arguably relatively high degree of rivalry in consumption) are supplied equally. The chosen values of σ = ν = 0.2 and ρ = 0.75 seem reasonable, though I do not want to claim these levels are the 'true' values.
Housing supply is given by (2). The available building area of each of the two municipality groups is set to L 1 = L 2 = 25 (·100 ha), which corresponds to the observed building areas. The price elasticity of the housing supply, θ , is set to 1. This value is not easily observable, and the previous literature has typically used values of around 3 (see, e.g., Schmidheiny 2006b; Calabrese et al. 2012; Kuhlmey and Hintermann 2019). In a more elaborate study on this topic, Saiz (2010) argues that for metropolitan areas a smaller value of around 1 (or even lower) seems more appropriate.
The remaining parameters of the model, the preference parameters, are less accurately observed. They are used to fit the model outcome as well as possible to the observed outcome (while remaining in a plausible range). These parameters include the housing preference γ , which is set to 0.3. This implies that once a household has paid for the subsistence consumption levels, it wants to spend (1 − α) · 30% of the remaining income on housing-recall that α is the share optimally allocated to the publicly provided good.
The preference for the publicly provided good, α , is assumed to be beta-distributed and therefore limited to the interval [0, 1]. The shape parameters a = 1 and b = 49 imply that the mean of this distribution is a/(a + b) = 0.02 and the mode is (a − 1)/(a + b − 2) = 0. The subsistence level of the publicly provided good is β g = 10.75. The proposed combination of the distribution of α and the level of β g offers reasonable tax rate multipliers and consumption levels, as will become apparent in the next section.
The subsistence levels of the private consumption bundle (h, x) are set to β h = 0.5 and β x = 5 . These levels imply that in the baseline calibration (see Sect. 3.2), the poorest household with an income of 15k CHF has to pay approximately 80% of its income for its private subsistence consumption.
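The moments quoted for the distribution of α are easy to verify. The sketch below recomputes the mean and the mode for a = 1 and b = 49 and, as an additional illustration that is not taken from the paper, uses the closed-form CDF of a Beta(1, b) distribution, F(x) = 1 − (1 − x)^b, to obtain the median and the share of households with α above 5%. The 5% cut-off is an arbitrary choice of mine.

```python
A, B = 1.0, 49.0  # shape parameters of the beta distribution of alpha

mean = A / (A + B)
mode = (A - 1.0) / (A + B - 2.0)


def beta_1b_cdf(x, b=B):
    """CDF of Beta(1, b): F(x) = 1 - (1 - x)^b for x in [0, 1]."""
    return 1.0 - (1.0 - x) ** b


# Median solves F(x) = 0.5; the 5% threshold below is purely illustrative.
median = 1.0 - 0.5 ** (1.0 / B)
share_above_5pct = 1.0 - beta_1b_cdf(0.05)

print(mean, mode)                  # 0.02 0.0
print(round(median, 4))
print(round(share_above_5pct, 3))
```

With a = 1 the density is strictly decreasing on [0, 1], so most households have a very small α and only a thin tail places a high value on the publicly provided good.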

Outcome and evaluation
Using the data from Table 2 on the two groups of municipalities, I have to solve a system of six equations and six unknowns as indicated in Sect. 2.3. In this section, I present the equilibrium outcome for this baseline calibration and assess its performance. Recall that I was not able in Sect. 2.3 to identify necessary conditions for an equilibrium in the presence of progressive taxes. Therefore, I also show that the identified segregating equilibrium is indeed incentive-compatible.
Equilibrium values of the endogenous variables are presented in Table 4. Section B.4 in Additional file 1: Online Appendix offers a sensitivity analysis, in which I vary parameters that are less easily observed, and also shows results of a simplified version of the model to investigate the sources of equal housing prices in my baseline calibration. Figure 1 shows the distribution of households as well as the loci of median voters in the y-α-space for the baseline calibration. All of those are implicitly defined by the set of equations presented above.
The solid line in Fig. 1 is the locus of indifferent households. For the Stone-Geary specification, it is defined by (16). All households with a value of α below this curve reside in municipality 1 (by definition of municipality 1), those above this curve reside in municipality 2. Therefore, the inhabitants of municipality 1 have lower levels of α . To see this, fix a level of income y and 'cut vertically' through the y-α-space. This reveals that residents of municipality 1 have a lower preference for the publicly provided good such that they wish to spend less for it and accordingly prefer lower tax shifters than the households in municipality 2. Equally, you could interpret the locus of indifferent households by fixing a value for α and thus 'cut horizontally' through the y-α-space of heterogeneous households. Here the interpretation is not so clear, however: Whether poorer households, i.e., those to the left of the locus, prefer to reside in municipality 1 or 2 depends on the shape of the locus. If it is an increasing function (as it is here), poorer households (to the left of the locus) prefer higher taxes than richer households. 19 The dashed lines are the loci of median voters. They are defined by α m j (y) , which is introduced in Eq. (6) and for the Stone-Geary specification is defined according to (17). For each municipality, it describes the set of households (i.e., the combinations of y and α ) that prefer the equilibrium tax rate to all other potential values of t j . It exactly splits the population of the municipality in two equally large groups. Those below the locus prefer lower tax rates, since they have lower levels of α , and those above the locus prefer higher tax rates.
The distributions of y and α are independent, such that, if the locus of indifferent households were a horizontal line, both municipalities would have exactly the same average income. Segregation would in that case be limited to the preference for the publicly provided good. For my baseline calibration, more poor households reside in municipality 2, the home of the 'public good lovers'. This implies that the households residing in municipality 1 are richer (on average) than those residing in municipality 2.
Consider Table 4. Overall, the model is capable of generating a realistic distribution of households: The 143k (from a total of 360k) households that reside in the rich municipality (where the label 'rich' and 'poor' is endogenous) have an average annual income of 98k CHF, whereas the remaining 217k households in the poor municipality make on average 61k CHF per annum. This implies that for my baseline calibration, the rich municipality is slightly overcrowded and slightly poorer than observed, as approximately 137k households that earn an average annual income of 110k CHF actually reside in the rich group of municipalities. My baseline calibration also predicts the observed tax rate multipliers quite well. I adjusted the subsistence level of the publicly provided good, β g , such that the predicted tax rate multiplier in the rich municipality matches the observed 82.8% of the cantonal tax liability. The corresponding multiplier in the poor municipality is predicted 6 percentage points above the observed 107.6%. This divergence can at least partly be explained by the imprecise fit of the household distribution.
Concerning the remaining outcome variables, the fit of my baseline calibration is less accurate. The relative size of the payments to or from the FES, the (relative) level of public expenditures, and the relative housing prices in both municipalities require further inquiry into the sources of the divergences. First, consider the FES. Recall its mechanics from (10) and note that the amount any municipality j has to pay or may receive depends on FC j = TB j /N j , its average tax base. With a progressive cantonal tax scheme, the aggregate tax base in that municipality, TB j , is not equal to the cantonal tax liability of the mean household income times the population: Rather it depends on the population composition of this specific municipality, whether this amount is larger or smaller. With many rich households in a municipality, TB j > b(ŷ j )N j ; and with many poor households, the opposite holds. This directly relates to the concept of fiscal capacity: For municipalities with relatively many rich [poor] households, FC j > [ < ]b(ŷ j ) . This is relevant, since-following the same logic-the 'average' amount that is credited to the FES for a group of rich municipalities and the 'average' amount that is debited from the FES for a group of poor municipalities are not equal to what one rich and one poor municipality would pay or receive. Intuitively, the amount that the rich group had to pay would be lower and the amount the poor group would receive would be higher than the respective (population-weighted) sum of the actual payments from or to the single municipalities in each subgroup. This is what I observe here: Payments of the rich group amount to 1800 CHF in my calibration (vs. 2580 in Table 2), and the subsidies to the poor group are 1500 CHF (vs. 400). Note that a mediating factor is that the average fiscal capacity is set below its 'true' value, as discussed in Sect. 3.2.2.
This leaves the discussion of the housing prices and of the public consumption levels, where the calibration does not fit reality well. In my calibration, housing prices are equal in both municipalities, whereas the observed average price of building area is almost 60% higher in the rich municipality group. While public spending levels in both municipalities are roughly equal according to the data, in my calibrated version, the rich spend one third less than the poor on the publicly provided good. These mispredictions have a common source: Rather than splitting the population into a segment of (on average) rich households who love to spend money on housing and another segment of (on average) poor households who do not, I split the population into 'public good lovers' and 'public good haters'. Note that the poorer households reside to a larger extent with the public good lovers. This is intuitive in the presence of progressive taxation, since poorer households are only obliged to contribute underproportionally to public revenue and therefore care less about the level of t j . One approach to overcome this imprecise prediction of the relative housing prices is to assume preference heterogeneity with respect to housing instead of the publicly provided good. 20 I stick with a fixed γ for the scope of this paper, for two reasons: (1) the model with α-heterogeneity is able to explain the other features of the metropolitan area around the city of Zurich quite well and (2) the second heterogeneity (i.e., the preference heterogeneity with respect to α or γ ) is necessary for a realistic 'imperfect sorting', but the inefficiencies that we are after (the JCE and the inter-municipal free-riding) are not directly affected by the source of this second heterogeneity, nor by the relative housing prices. This strongly suggests that the results of the policy evaluations in the next section are not driven by this inaccuracy.

Policy evaluation
In this section, I discuss two sets of policy changes: First, I gradually remove the fiscal equalization scheme (FES); then, I change the underlying tax scheme-to a more progressive one and to a linear one.

Removal of the FES
The FES redistributes a significant amount of money from richer to poorer municipalities. All else held constant, it is easily possible to quantify its importance, e.g., in terms of counter-factual tax rate multipliers necessary to maintain consumption levels if the FES did not exist. Such ceteris paribus analyses, however, are incomplete, as they ignore the general equilibrium effects. These reveal the FES' mitigating effect on segregation, taking into account the adjustments in the housing prices, tax rates, and public expenditure.
To show these adjustments, I gradually remove the FES from the baseline calibration used in the previous section. To do so, I introduce the weighting parameter κ ∈ [0, 1] and assume that the payment from or to the FES is given by κ · FES j , where FES j is determined by (10). Starting from the baseline calibration ( κ = 1 ), this amount is gradually reduced to 0, at which point no payments are enforced and the FES is effectively switched off. Thus, the general setup of the FES and therefore the incentive structure remain unchanged, but become increasingly weak. For values of κ below 40%, I found equilibria in which one municipality is 'empty', i.e., left without households. This can be interpreted as the most extreme form of the 'poor chasing the rich'. This rather peculiar outcome seems unlikely, which is why I left out these cases in parts of the analysis. It illustrates, however, the important function that the FES has in achieving a socially more desirable, i.e., less segregated distribution of households in the presence of local tax competition. Table 5 summarizes the municipality characteristics when the FES is gradually reduced, and Fig. 2 visualizes the relative strength of these changes. The levels of public consumption g, public expenditure G/N, and of the housing price p do not change much. And if they do, it is in the expected way: Public consumption and expenditure are higher in the rich and lower in the poor municipality for lower values of κ.
The tax rate multiplier t heavily decreases in both the rich and the poor municipality as κ decreases. The decrease of the tax rate of the poor municipality is, at first glance, surprising, since the FES effectively subsidizes the poor municipality. If subsidies are faded out, this municipality becomes less attractive-even more so in comparison with the rich one that becomes more attractive as it has to pay less. This is why one could expect the public expenditure levels in the poor municipality to decrease, and/or the tax rate multiplier to increase and housing prices to fall. But this kind of reasoning neglects the general equilibrium effects: What happens, in addition, is that the households allocate differently.
The household distribution for decreasing κ is displayed in Fig. 3, which illustrates how the locus of indifferent households changes. When phasing out the FES, more poor households reside in the poor municipality, and more rich households reside in the rich municipality. The general shape of the locus of indifferent households remains stable (in the sense that it is an under-proportionally increasing line). Interestingly, with the FES at 40% of its original strength, the rich municipality (below the locus) 'got rid' of the very poor households. Table 5 shows how this translates to changes in the population and mean income and therefore reveals the scope of this change in the distribution of households: For κ = 0.4, the rich municipality has 7% fewer inhabitants and its average income is approximately 8% higher compared to the full implementation of the FES. This supports the intuitive claim that the FES mitigates segregation induced by decentrally determined income taxation at the local level. Moreover, for smaller levels of κ , the poor municipality offers higher public consumption levels than the rich municipality at the expense of higher tax rates. Keep in mind that in the presence of progressive taxes poor households are hurt less by the higher tax rate multiplier than the rich, which explains why more poor households reside in the poor municipality.

Table 5 Phasing out the FES: effect on municipality characteristics. "FES effect" ( κ ) describes to what percentage the fiscal equalization scheme is implemented, relative to the full implementation described in (10) and used in the baseline. The 100%-column is a copy of …
The change in the distribution of households explains an unexpected pattern: The amount that the poor municipality receives through the FES is higher for smaller levels of κ than in the baseline case. The contribution of (very) poor households to the fiscal capacity of a municipality is (very) small, such that, when the FES is faded out, the 'migration' of poor households into the already poor municipality causes that municipality's fiscal capacity to decrease. This decrease is so pronounced that κ · FES is actually increasing as κ decreases. Note from Table 5 that the tax rate multipliers in both municipalities are below baseline levels for small values of κ , and Fig. 2 reveals that the decrease of the multiplier is more pronounced in the rich municipality. This is why not only fewer poor households but also more rich households reside in the rich municipality when κ is small compared to the baseline.
Payments from the rich municipality to the FES, however, are lower for lower values of κ . This indicates that the fiscal capacity of the rich municipality does not increase so strongly that it would overcompensate the decrease in payments caused by lower levels of κ.
For even lower levels of κ , i.e., if κ < 0.4 , the payments to the poor municipality start decreasing (not displayed). One could say that all poor households, which caused the over-proportional decrease in the fiscal capacity (which in turn led to increasing transfer payments, although the scheme was faded out), are already living in the poor municipality. This implies that the poor municipality can no longer attract additional households; instead, a rather peculiar form of segregation, that leaves one municipality empty, occurs. Though I do not consider this household distribution to be a realistic description of what would happen if the canton of Zurich removed its FES, the results support the claim that the existence of the FES is a crucial measure to counter the segregating forces created by local tax competition-especially in the presence of an underlying progressive tax scheme.
To sum up, public consumption and public expenditure are surprisingly stable as the FES is faded out, even though the poor municipality has to cope with a modest decrease of both. For smaller levels of κ , the households allocate differently. This consequence at first dominates the direct effect of the phase-out on the subsidy that the poor municipality receives through the FES. This causes the higher level of subsidies for smaller levels of κ . Since the payments from the rich municipality to the FES are decreasing as the FES is faded out, lower levels of κ allow both municipalities to set a lower tax rate multiplier whilst still providing relatively high levels of public expenditure and consumption. Concerning the ability to mitigate segregation, the existence of the fully implemented FES (as in the baseline calibration) proved quite powerful.

Change of the underlying tax code
In this section, I investigate the role of the progressivity of the tax scheme, leaving the FES fully implemented as in the baseline calibration. I analyze two policy changes, illustrated in Fig. 4, which plots the tax liabilities under the three tax schemes for different levels of income at a tax rate multiplier of 1 (for tax liabilities at income levels beyond 300k CHF, see Table B5 in Additional file 1: Online Appendix). First, I change the tax code to a linear tax scheme, for which I set the tax rate equal to the average tax rate of all households residing in my municipalities.
In mathematical terms, the marginal rate of the linear tax scheme is set to $\sum_j TB_j / \sum_j Y_j$. This ensures that tax rate multipliers of this policy scenario are comparable to the baseline scenario in the sense that the fiscal capacity is equal. Poorer households up to a taxable income of around 100k CHF face higher tax bills under the linear scheme compared to the progressive ones, while richer households pay less if taxed with the linear scheme.
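The revenue-neutrality condition can be made concrete with a minimal Python sketch that computes the flat rate as aggregate tax bills divided by aggregate income. It assumes that TB_j denotes the aggregate tax bill of municipality j at a tax rate multiplier of 1 and Y_j its aggregate income. The bracket schedule, the household incomes, and the function names are hypothetical; they are not taken from the cantonal tax code or from the calibration.

```python
# Hypothetical progressive schedule: (lower bracket bound in CHF, marginal rate).
# Illustrative numbers only, NOT the tax code of the canton of Zurich.
BRACKETS = [(0, 0.00), (10_000, 0.02), (30_000, 0.04), (60_000, 0.07), (120_000, 0.10)]


def progressive_tax(income):
    """Tax liability at a tax rate multiplier of 1 under the hypothetical schedule."""
    tax = 0.0
    for i, (lower, rate) in enumerate(BRACKETS):
        upper = BRACKETS[i + 1][0] if i + 1 < len(BRACKETS) else float("inf")
        if income > lower:
            # Tax only the slice of income that falls into this bracket.
            tax += (min(income, upper) - lower) * rate
    return tax


def revenue_neutral_flat_rate(incomes_by_municipality):
    """Flat rate = sum_j TB_j / sum_j Y_j, pooled over all municipalities."""
    all_incomes = [y for ys in incomes_by_municipality.values() for y in ys]
    total_tax = sum(progressive_tax(y) for y in all_incomes)  # sum_j TB_j
    total_income = sum(all_incomes)                           # sum_j Y_j
    return total_tax / total_income


# Hypothetical household incomes in two municipalities (CHF).
incomes = {1: [40_000, 80_000, 250_000], 2: [30_000, 60_000, 150_000]}
flat_rate = revenue_neutral_flat_rate(incomes)

for y in (30_000, 250_000):
    print(y, round(progressive_tax(y)), round(flat_rate * y))
```

With these illustrative numbers, the low-income household pays more and the high-income household pays less under the flat rate than under the bracket schedule, while aggregate revenue at a multiplier of 1 is unchanged by construction.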
The second policy change is an increase in the progression of the cantonal tax code. The average fiscal capacity under this code is about 10% higher than in the baseline case, which has two implications. First, the absolute levels of the tax rate multipliers are not perfectly comparable to the baseline, since (on average) the same multiplier translates into 10% more revenue. Second, since $FC_{avg}$ has not been increased, the poor municipality receives less and the rich municipality pays more than they would if $FC_{avg}$ were adjusted; the equilibrium payments from [to] the scheme are therefore too low [too high]. Table 6 summarizes the equilibrium values of municipality characteristics for the three tax schemes, and Fig. 5 plots the loci of indifferent households and of the median voters. Increasing the progression of the tax code increases the degree of segregation of rich and poor households and thus leads to more redistribution through the FES. In total numbers, the household distribution does not change much, but the magnitude of the FES payments does: The rich municipality has to pay roughly 1,000 CHF more per capita, and the poor municipality receives an additional amount in excess of 500 CHF per capita, which is partly because $FC_{avg}$ was not increased.
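The effect of an unadjusted benchmark can be illustrated with a stylized calculation. The sketch below assumes, purely for illustration, that the per-capita net transfer is linear in the gap between the benchmark fiscal capacity and a municipality's own fiscal capacity, scaled by κ. The functional form, the function name, and all numbers are hypothetical; the scheme used in the model may differ.

```python
def fes_transfer(fc_j, fc_avg, kappa=1.0):
    """Hypothetical per-capita net transfer to municipality j.

    Positive values are receipts, negative values are payments.
    The linear form kappa * (fc_avg - fc_j) is an assumption for illustration.
    """
    return kappa * (fc_avg - fc_j)


# Hypothetical baseline fiscal capacities per capita (CHF).
fc_poor, fc_rich, fc_avg = 4_000.0, 7_000.0, 5_000.0

# A more progressive code raises every fiscal capacity by about 10%.
fc_poor_new, fc_rich_new = 1.1 * fc_poor, 1.1 * fc_rich

for label, benchmark in (("benchmark fixed", fc_avg), ("benchmark adjusted", 1.1 * fc_avg)):
    print(
        label,
        round(fes_transfer(fc_poor_new, benchmark)),
        round(fes_transfer(fc_rich_new, benchmark)),
    )
```

Under these assumptions, holding the benchmark fixed lowers the receipts of the poor municipality and raises the payments of the rich municipality relative to the case with an adjusted benchmark, which is the direction of the bias described in the text.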
Next, consider the switch to a linear tax scheme, where households pay a flat rate of 5.26% of their income. The pattern of the household distribution changes as expected: The degree of segregation is lower, and the households are distributed more evenly among both municipalities. 21 The municipality characteristics (p, t, g) remain relatively unchanged, except for slightly lower public provision levels in both municipalities. This can be explained by the fact that now both municipalities have to pay to the FES. 22 It also implies that the progressivity of the tax scheme is not suitable to curb under-provision of the publicly provided good (but does not worsen it either).
21 Figure 5 helps to understand the underlying process: Comparing the locus of indifferent households for the linear tax scheme with the baseline case (i.e., the beige and the black solid lines) reveals that the beige line is less sensitive ('flatter') for higher levels of income. This indicates that the richer households choose their residence mostly according to their preference for the publicly provided good ('public good lovers' vs. 'public good haters', see discussion in the next paragraph) rather than according to their income. For poorer households, however, the situation is different since their preferred municipality strongly depends on their income. The slopes of the loci of indifferent households (that split the population of each municipality in half and whose preferences determine the municipal tax multiplier) change their curvature accordingly. To determine whether such a change finally translates into more or less segregation, we additionally need to consider the pdf of the population distribution in the y-α-space, which is not visible, but of course considered in the municipality characteristics in Table 6.
22 This is not implausible, for two reasons: (1) The selected municipalities are richer than the canton-wide average and thereby are on average net payers to the municipal FES in the canton of Zurich. (2) [...]

Recall that by definition, conditional on the level of income y, municipality 1 contains the households with the low values of α, and municipality 2 those with high levels. Figure 5 reveals that in the cases of a progressive tax scheme, the poorer households congregate to a larger extent in municipality 2 ('public good lovers'); in the case of a linear tax scheme, however, the poorer households prefer, on average, to live in municipality 1 (where the 'public good haters' reside).
This causes the attribution of 'rich' and 'poor' to swap: With linear taxes, municipality 2 is inhabited by (on average) richer households and municipality 1 by the poorer ones. The intuition is that linear taxes increase the tax burden of the poor households, which consequently makes them more sensitive to the tax rate multiplier in their municipality: The incentive to 'sneak' into the municipality where the rich pay a disproportionate share of the higher tax levels decreases. The results from this section indicate that a progressive tax scheme entails strong segregating forces in terms of an increasing disparity of average income levels and in terms of a more uneven distribution of households: Compared with the revenue-neutral linear tax scheme, the group of rich municipalities is inhabited by 11% fewer households and is 12% richer when the progressive scheme from the baseline calibration is implemented.

Conclusion
I presented a model of Tiebout sorting with decentrally determined income taxes and spillover-generating public goods that combines a progressive tax scheme and a fiscal equalization scheme (FES). Households that differ with respect to income and their preference for a publicly provided good choose in which municipality they want to reside. The aggregate distribution of households, in turn, determines the triplet of housing prices, tax rate multipliers, and public consumption levels, where the multipliers are determined by majority voting. The trade-off between these characteristics defines for each household which municipality is its preferred choice of residence.
With this model, I can predict the migrational consequences of changes in the FES or the tax system. For a given household distribution, a progressive income tax scheme is preferable to a linear tax scheme in terms of equity. If households choose their location freely, the equity implications of a progressive tax scheme are less clear: Roller and Schmidheiny (2016) show that household mobility weakens the degree of progression in the effective average and marginal tax rates (measured as the observed actual tax payments of households) and can even imply lower average tax rates for higher-income households, i.e., a regressive actual taxation. Their work, however, is purely descriptive in the sense that the focus is on the interaction between the locational choice of heterogeneous households and their effective tax liabilities. By changing the underlying tax scheme of my baseline calibration, I was able to show that an increase in the degree of progression leads to a stronger segregation of rich and poor households: In the baseline calibration, i.e., with progressive taxes, the average income in the 'rich' group of municipalities is 60% higher than in the 'poor' group. With linear taxes, my model predicts that this gap narrows significantly, so that the rich would be only 30% richer on average.

Table 6 Changing the progressive tax scheme: effect on municipality characteristics
"1" and "2" label the municipalities. Municipality 1 is defined as the municipality that is home to the households with low preferences for the publicly provided good (α), and municipality 2 as the one that is home to the households with high levels of α. Instead of using the municipality number, I often label the two municipalities 'rich' and 'poor', according to their respective average income.

[...] < 0 then describes the case where rich households accept a lower increase in t_j in exchange for a marginal rise in g_j. With case 2, the rich accept a higher increase in t_j than the poor.
Lastly, I turn to the MRS between the tax rate and the housing price, which is negative for all y and α . This means that a household demands a decrease in the housing price as compensation for an increase of the tax rate. This trade-off changes in income according to (24): To ensure monotonicity, the inequality in (26) must have the same sign for all values of y and α . Whether it is positive or negative depends then on the full specification of the model and the equilibrium characteristics of the municipalities. For now, it is sufficient that an expression to determine the sign can be identified.
To sum up, under some additional assumptions, the first of Schmidheiny's two sufficient conditions can be adapted to progressive taxation.
\[
\frac{\partial M_{t_j,g_j}(y,\alpha)}{\partial y}
\begin{cases}
< 0 & \text{if } \varepsilon_{b,y} > \dfrac{y}{y - p_j\beta_h - \beta_x} \;(\geq 1), \\[4pt]
> 0 & \text{if } \dfrac{y}{y - p_j\beta_h - \beta_x} > \varepsilon_{b,y} \;(\geq 1).
\end{cases}
\tag{25}
\]

25 I assume that the subsistence levels are feasible for every household, which means that $y / (y - p_j\beta_h - \beta_x) \geq 1$, where this holds with equality if and only if there are no subsistence levels. Therefore, the two conditions that $\beta_h = \beta_x = 0$ and that b(y) is a progressive tax scheme would be sufficient to establish the monotonicity of the $t_j$-$g_j$ trade-off.
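The case distinction in (25) can be checked numerically for individual households. The following Python sketch is only an illustration: it reads ε_{b,y} as the income elasticity of the tax scheme b(y), which is an interpretation rather than a definition given in this section, and the function name, the parameter values, and the treatment of the boundary case (returning 0 when both sides are equal) are assumptions.

```python
def mrs_tg_income_sign(eps_by, y, p_j, beta_h, beta_x):
    """Sign (-1, 0, or +1) of dM_{t_j,g_j}/dy according to condition (25).

    eps_by : assumed income elasticity of the tax scheme b(y)
    y      : household income
    p_j    : housing price in municipality j
    beta_h : subsistence level of housing
    beta_x : subsistence level of the private good
    """
    disposable = y - p_j * beta_h - beta_x
    if disposable <= 0:
        raise ValueError("subsistence levels must be feasible: y > p_j*beta_h + beta_x")
    # Threshold is >= 1 and equals 1 only without subsistence levels.
    threshold = y / disposable
    if eps_by > threshold:
        return -1
    if eps_by < threshold:
        return 1
    return 0  # boundary case, not covered by (25); returned as 0 by assumption


# Hypothetical parameter values, chosen only to show both cases.
for income in (50_000, 200_000):
    print(income, mrs_tg_income_sign(1.3, income, 20.0, 1_000.0, 10_000.0))
print(mrs_tg_income_sign(1.3, 100_000, 20.0, 0.0, 0.0))
```

In this hypothetical parameterization, the sign differs between the low-income and the high-income household when subsistence levels are positive, whereas it is negative without subsistence levels as long as the assumed elasticity exceeds 1. This is in line with the sufficiency argument in footnote 25.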
The proportional shift in relative preferences
The second of Schmidheiny's conditions, the proportional shift in relative preferences, is (for the functional forms used in this paper) only satisfied if $\beta_h = 0$, see (22). For positive levels of the subsistence level for housing, it is not satisfied in the presence of a progressive tax scheme. This general incompatibility has already been mentioned in Schmidheiny (2002). In this section, I discuss the restrictions on preferences that are necessary to comply with income segregation if $\beta_h > 0$.

Figure 6 illustrates my argument. The figure consists of three panels, each depicting indifference curves in the g-t-space. These belong to three households that differ with respect to their income ($y_{I} < y_{II} < y_{III}$) but have the same α. Let the housing price, which is not depicted, be either $p_1$ or $p_2$, with $p_1 \neq p_2$. Assume that there are two municipalities, 1 and 2, characterized by the triplets ($p_1, t_1, g_1$) and ($p_2, t_2, g_2$), respectively. Denote the level of utility that each of the three households realizes when residing in municipality 1 by $V^{y_{I}}_{p_1}$, $V^{y_{II}}_{p_1}$, and $V^{y_{III}}_{p_1}$, respectively. This allows me to plot the first three indifference curves, the dashed lines. The solid lines show indifference curves that provide each household with the same utility as it receives in municipality 1, given the housing price from the second municipality, $p_2$. That is, $V^{y}_{p_1} = V^{y}_{p_2} \;\forall\, y \in \{y_{I}, y_{II}, y_{III}\}$. Assume further that household $y_{II}$ is indifferent between both municipalities such that $V^{y_{II}}_{p_2}$ goes through ($g_2, t_2$). For the poorer and richer households, $V^{y_{I}}_{p_2}$ and $V^{y_{III}}_{p_2}$ describe the respective combinations of g and t that make them just indifferent to municipality 1 if the housing price is $p_2$. 26

Panel 1 corresponds to Figure 2 in Schmidheiny (2002) and depicts a situation where the condition of the proportional shift in relative preferences is met: The indifference curves of the three households intersect in one point for each housing price. In my illustration, the poor household prefers to live in municipality 1, while the richer household prefers municipality 2.
Panel 3 depicts a situation where the assumption of a proportional shift is violated and income segregation is not incentive-compatible. This corresponds to Figure 3 in Schmidheiny (2002). However, a violation of the assumption does not necessarily rule out incentive-compatible income segregation, as panel 2 shows. The three indifference curves for $p_2$ do not cross in one point; they are shifted non-proportionally. Still, in this situation, income segregation is incentive-compatible: The poor household prefers municipality 1, the rich