Examining the vintage effect in hedonic pricing using spatially varying coefficients models: a case study of single-family houses in the Canton of Zurich

This article examines the spatially varying effect of age on single-family house (SFH) prices. Age has been shown to be a key driver for house depreciation and is usually associated with a negative price effect. In practice, however, there exist deviations from this behavior which are referred to as vintage effects. We estimate a spatially varying coefficients (SVC) model to investigate the spatial structures of vintage effects on SFH pricing. For SFHs in the Canton of Zurich, Switzerland, we find substantial spatial variation in the age effect. In particular, we find a local, strong vintage effect primarily in urban areas compared to pure depreciative age effects in rural locations. Using cross validation, we assess the potential improvement in predictive performance by incorporating spatially varying vintage effects in hedonic models. We find a substantial improvement in out-of-sample predictive performance of SVC models over classical spatial hedonic models.


Introduction
Hedonic real estate models contain several predictor variables, and age is a key explanatory variable. The marginal effect of building age on house prices has been well studied. It has been found that the age effect is nonlinear (Clapp & Giaccotto, 1998; Goodman & Thibodeau, 1995). In particular, Case et al. (2004) report a "plausible quadratic form" for the building age. This behavior is a result of two main features of age as an independent variable: (1) In general, older buildings depreciate due to deterioration; (2) "however, beyond some point, only those houses with the best locations and the highest construction quality survive." (Case et al., 2004, p. 171). The quadratic appearance of the age effect has also been observed by Fahrländer (2006) and linked to the building material and architectural style. Studies investigating this particular type of behavior, i.e., a deviation from a purely depreciative effect once a particular age has been reached, refer to it as a vintage effect (Clapp & Giaccotto, 1998; Goodman & Thibodeau, 1995; Rubin, 1993).
Over the last two decades, a special focus on location-specific effects has emerged due to newly available modeling methodologies. Numerous publications show clear indications of spatially varying covariate effects within hedonic pricing models. For instance, when applying additive mixed regression models on rents in Vienna (Austria), Brunauer et al.
The goal of this paper is to unify both frameworks, i.e., vintage effects and SVC modeling, to investigate a possible spatially varying vintage effect. In particular, we want to examine whether such a non-stationary vintage effect exists and, in a second step, whether it can be used to improve the quality of hedonic models in price prediction. Our study is motivated by previous work on spatially varying relationships between house prices and age. For instance, one of the first observations of spatial differences in age effects can be found in Malpezzi et al. (1987). They compared individual hedonic models for 59 metropolitan areas in the United States and concluded that "[s]everal metropolitan areas exhibited significant deviations from the average depreciation patterns." (Malpezzi et al., 1987, p. 382). More recent evidence for such behavior is presented in Brunauer et al. (2010) as well as Dambon et al. (2021a), who found pronounced spatially varying effects on the rents and the prices of apartments, respectively.
In this paper, we model spatially varying vintage effects for single-family houses (SFHs) in the Canton of Zurich (ZH, Switzerland). We select the Canton of Zurich as our area of analysis for several reasons. Firstly, the Canton of Zurich is sufficiently large and contains urban as well as rural areas. This is relevant because our working hypothesis is that, on average, age has a negative effect on SFH price, but that spatial deviations in the form of vintage effects might occur in metropolitan and urban areas. One hypothesis is that such spatial deviations are driven by unobserved attributes such as architectural style and build quality of the SFH. Recent research suggests that redevelopment options might also have an impact (Clapp & Salavei, 2010; Munneke & Womack, 2016). Hence, the Canton of Zurich with its above-average rate of SFHs in urban areas is of particular interest. Further reasons for choosing the Canton of Zurich as the area of analysis are that its age structure is very similar to that of Switzerland as a whole and that some information on the full census of SFH transactions in the Canton of Zurich in the chosen study period is available. The latter is valuable in that it allows us to check the representativeness of our data.
Our analysis is economically relevant as a sizeable portion of the SFHs in the Canton of Zurich and in Switzerland in general are old. More specifically, a quarter of all SFHs in Switzerland were built before 1945 and a third were built before 1960. Very similar proportions are found for the Canton of Zurich (see Fig. 1 below and Table 6 in the "Appendix"). Given the existence of a vintage effect, accounting for such effects could therefore yield more accurate predictions for a sizeable portion of SFH transactions.
To verify our hypothesis on spatially varying vintage effects, we use a new methodology introduced by Dambon et al. (2021a) to model spatially varying coefficients using Gaussian processes (GP). In the next section, we first introduce and then extend the definition of SVC models and GP-based SVC models. In Sect. 3, we present the real estate data and justify the model. The model results are presented in Sect. 4. In Sect. 5 we assess the predictive performance of the SVC model and compare it to a standard hedonic model. We conclude with a discussion of our results in Sect. 6.
Dambon et al. Swiss Journal of Economics and Statistics (2022) 158:2

Spatially varying coefficient models
Spatially varying coefficient models are a generalization of classical linear regression models, where we allow the regression coefficients to vary over space. That is, the effect of a covariate $x^{(j)}$, denoted by the coefficient $\beta_j$, can depend on a geographic location $s$, which we assume to be two-dimensional. SVC models can be applied to spatial point data sets, where each of the $n$ observations of the response variable $y := (y_1, \ldots, y_n)^T \in \mathbb{R}^n$ and of the $p$ covariates $x^{(j)}$, $j = 1, \ldots, p$, has an associated location $s_i$. In summary, SVC models are defined as
$$y_i = \sum_{j=1}^{p} \beta_j(s_i)\, x_i^{(j)} + \epsilon_i, \tag{1}$$
where $i = 1, \ldots, n$ indexes the observations with their corresponding locations $s_i$ and $\epsilon_i$ is a classical $N(0, \tau^2)$ iid error term with $\tau^2 > 0$. If one assumes that not all coefficients should contain spatial structures, one can define mixed SVC models. Let $q$ with $1 \le q \le p$ be the number of covariates for which we want to model SVCs. Without loss of generality, we define the mixed SVC model as
$$y_i = \sum_{j=1}^{q} \beta_j(s_i)\, x_i^{(j)} + \sum_{j=q+1}^{p} \beta_j\, x_i^{(j)} + \epsilon_i. \tag{2}$$
From now on, we assume that the first coefficient $j = 1$ always models an intercept. In the special case $q = 1$, we have the classical geostatistical model that is also used in most hedonic models. The exact assumptions for the coefficients $\beta_j(\cdot)$, $j = 1, \ldots, q$, and how they are estimated, have yet to be defined. The literature on how to do so for both the classical geostatistical and SVC models is extensive. For geostatistical models, see Cressie (2011) and Heaton et al. (2019) for an overview. For SVC models, see Dambon et al. (2021a), Wheeler and Calder (2007), and Wheeler and Waller (2009) for comparisons.

Gaussian process-based SVC models
We specify the SVC model such that each coefficient is defined by a Gaussian process (Rasmussen & Williams, 2006). Gaussian processes are well-studied and widely used tools to model dependency structures, with applications including, but not limited to, spatial statistics (Banerjee et al., 2008; Datta et al., 2016; Gelfand & Schliep, 2016), econometrics (Wu et al., 2014), and time series modeling (Roberts et al., 2013). They are infinite-dimensional stochastic processes that are defined similarly to a finite-dimensional normal distribution. We assume the GPs to be mutually independent as well as independent of the error term $\epsilon := (\epsilon_1, \ldots, \epsilon_n)^T \sim N_n(0_n, \tau^2 I_n)$. For $n$ observation locations $s := (s_1, \ldots, s_n)^T$, they are given by
$$\beta_j(s) \sim N_n\big(\mu_j 1_n, \Sigma^{(j)}\big) \tag{3}$$
for $j = 1, \ldots, q$. We assume a constant mean $\mu_j$ and a covariance matrix $\Sigma^{(j)}$, which is defined by a covariance function $c^{(j)}$ and the corresponding observation locations $s$. The observation locations are used to model the dependency between observations by computing their distances. In spatial statistics, one usually assumes that closer observations share higher dependency than observations which are far apart. We use the Euclidean distance, denoted by $\|\cdot\|$, which yields pairwise distances $d_{kl} := \|s_k - s_l\|$ between all observations, $1 \le k, l \le n$. Here, we assume exponential covariance functions
$$c^{(j)}(d) = \sigma_j^2 \cdot \exp\left(-d/\rho_j\right), \quad d \ge 0,$$
parametrized by variances $\sigma_j^2 \ge 0$ and ranges $\rho_j > 0$. The former parameter defines the extent of variation within an SVC $\beta_j(s)$ and the latter defines the decay of spatial dependency with distance. The covariance function is then applied to the distances, which yields the corresponding covariance matrix with entries
$$\big(\Sigma^{(j)}\big)_{kl} := c^{(j)}(d_{kl}), \quad 1 \le k, l \le n.$$
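The construction of the covariance matrix from pairwise Euclidean distances can be sketched in a few lines. This is a minimal illustration in Python with NumPy, not the implementation used in the paper (which relies on the R package varycoef); the function name `exp_cov_matrix`, the toy coordinates and the parameter values are assumptions chosen for the example.

```python
import numpy as np

def exp_cov_matrix(coords, sigma2, rho):
    """Exponential covariance matrix with entries sigma2 * exp(-d_kl / rho)."""
    coords = np.asarray(coords, dtype=float)
    # Pairwise Euclidean distances d_kl = ||s_k - s_l||
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    return sigma2 * np.exp(-dist / rho)

# Three hypothetical locations (arbitrary units) and hypothetical parameters
coords = np.array([[0.0, 0.0], [3.0, 4.0], [0.0, 1.0]])
Sigma = exp_cov_matrix(coords, sigma2=0.5, rho=2.0)
print(Sigma)
```

The diagonal equals the variance, and off-diagonal entries shrink as the distance between two locations grows relative to the range.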

Example of two sampled Gaussian processes
In this section, we illustrate the interpretation of the parameters of a GP with the help of two samples. Both are defined by their corresponding parameters given in Table 1. Under the assumption of an exponential covariance function, these parameters, more specifically the ranges and variances, define the covariance functions given in Fig. 2. With the given covariance functions as well as the mean parameters, we sample the GPs on a regular 101 × 101 grid on the unit square. The sampled GPs are given in Fig. 3. The influence of each of the three parameters, i.e., the mean $\mu$, the range $\rho$, and the variance $\sigma^2$, can be directly seen from the individual visualized samples in Fig. 3. First, we note that the values of each parametrization are scattered around their individual means. The greater range of parametrization 1 relative to parametrization 2 expresses itself in larger color patches in Fig. 3. The greater variance of parametrization 2 leads to a wider range of values in the simulation, which manifests itself in a wider color range in the visualization.
Fig. 2 Covariance functions. Two exponential covariance functions which depend on a distance d and are parametrized as given in Table 1. One can clearly see that the variance in Parametrization 1 is lower than in Parametrization 2. On the other hand, Parametrization 1 has a greater range, which leads to a slower decay of the covariance function over distance.
Fig. 3 Sampled Gaussian processes, parametrized as given in Table 1 and Fig. 2, respectively. The Gaussian process values are given at the respective coordinates x and y in the unit square by the color scale.
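Sampling a GP on a regular grid can be sketched as follows. The parameter values below are hypothetical stand-ins, since Table 1 is not reproduced here; they only mimic the qualitative setup (parametrization 1 with lower variance and greater range than parametrization 2). The sketch uses a coarse 21 × 21 grid and a Cholesky factorization; a dense 101 × 101 grid would require a covariance matrix with roughly 10^8 entries, so other sampling strategies may be preferable there. The function name `sample_gp_on_grid` is an assumption.

```python
import numpy as np

def sample_gp_on_grid(n_side, mu, sigma2, rho, seed=0):
    """Draw one GP sample with exponential covariance on a regular unit-square grid."""
    axis = np.linspace(0.0, 1.0, n_side)
    xx, yy = np.meshgrid(axis, axis)
    coords = np.column_stack([xx.ravel(), yy.ravel()])
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    cov = sigma2 * np.exp(-dist / rho)
    # Small jitter on the diagonal for numerical stability of the factorization
    chol = np.linalg.cholesky(cov + 1e-10 * np.eye(coords.shape[0]))
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(coords.shape[0])
    # mu + L z has mean mu and covariance L L^T = cov
    return (mu + chol @ z).reshape(n_side, n_side)

# Hypothetical parametrizations (not the values of Table 1)
field_1 = sample_gp_on_grid(21, mu=0.0, sigma2=0.25, rho=0.5, seed=1)
field_2 = sample_gp_on_grid(21, mu=2.0, sigma2=1.0, rho=0.1, seed=1)
print(field_1.shape, field_2.shape)
```

Plotting `field_1` and `field_2` as heat maps should show larger, smoother patches for the first field and a wider value range for the second.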

Maximum likelihood estimation of GP-based SVC models
We give a brief summary of a maximum likelihood estimation (MLE) approach for SVC models as introduced in Dambon et al. (2021a). Additionally, we extend the framework such that not only full GP-based SVC models as given in (1), but also mixed GP-based SVC models as given in (2) can be estimated.
With a data matrix $X$, where the entry $(X)_{ij} := x_i^{(j)}$ is the $i$th observation of the $j$th covariate, a mean vector $\mu := (\mu_1, \ldots, \mu_p)^T \in \mathbb{R}^p$, the element-wise matrix product $\odot$, and using the independence assumptions from above, the distribution of the response is given by
$$y \sim N_n\Big(X\mu,\ \sum_{j=1}^{q} \big(x^{(j)} x^{(j)T}\big) \odot \Sigma^{(j)} + \tau^2 I_n\Big). \tag{4}$$
The differences between the response's distribution as above and as given in Dambon et al. (2021a) are twofold. The first $q$ entries of the mean vector $\mu$ are the means of the GPs as defined in (3), while the further entries are the coefficients $\beta_{q+1}, \ldots, \beta_p$. For simplicity, we identify them with $\mu_{q+1}, \ldots, \mu_p$, respectively. The second difference is the sum building the covariance matrix. Since only covariates $j = 1, \ldots, q$ are defined to have SVCs, only $q$ covariance matrices and the respective covariates enter the sum.
The model is thus fully parametrized by the covariance parameters $\theta := (\rho_1, \sigma_1^2, \ldots, \rho_q, \sigma_q^2, \tau^2)^T$ and the mean parameters $\mu \in \mathbb{R}^p$. We define $\omega := (\theta^T, \mu^T)^T$ as our parameter of interest, which we estimate by maximizing the log-likelihood of (4). Since there exists no analytical solution, we must turn to numeric optimization. Once the estimate $\hat{\omega}$ is found, one can use it to predict the SVCs for (new) locations $s'$ using the conditional distribution, i.e., one obtains $\hat{\beta}_j(s')$, $j = 1, \ldots, p$. The estimator and predictor are implemented in the statistical software R (R Core Team, 2020) and can be used via the package varycoef (Dambon et al., 2021b).
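To make the likelihood concrete, the following sketch builds the covariance matrix of the response and evaluates the Gaussian log-likelihood for given parameters. It is an illustrative Python version under the exponential covariance assumption, not the varycoef implementation; the function names, the toy data and all parameter values are assumptions. The first `len(rhos)` columns of `X` are treated as the covariates with SVCs.

```python
import numpy as np

def marginal_covariance(X, dist, rhos, sigma2s, tau2):
    """Covariance of y: sum_j (x_j x_j^T) * Sigma_j (element-wise) + tau2 * I."""
    n = X.shape[0]
    cov = tau2 * np.eye(n)
    for j, (rho, sigma2) in enumerate(zip(rhos, sigma2s)):
        x_j = X[:, j]
        cov += np.outer(x_j, x_j) * (sigma2 * np.exp(-dist / rho))
    return cov

def log_likelihood(y, X, dist, rhos, sigma2s, tau2, mu):
    """Multivariate normal log-likelihood of y with mean X mu."""
    n = len(y)
    cov = marginal_covariance(X, dist, rhos, sigma2s, tau2)
    chol = np.linalg.cholesky(cov)
    whitened = np.linalg.solve(chol, y - X @ mu)
    log_det = 2.0 * np.sum(np.log(np.diag(chol)))
    return -0.5 * (n * np.log(2.0 * np.pi) + log_det + whitened @ whitened)

# Toy data: 6 hypothetical locations, q = 2 SVCs, p = 3 covariates
rng = np.random.default_rng(1)
coords = rng.uniform(0.0, 10.0, size=(6, 2))
dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
X = np.column_stack([np.ones(6), rng.normal(size=6), rng.normal(size=6)])
mu = np.array([1.0, -0.5, 0.2])
y = X @ mu + rng.normal(scale=0.1, size=6)
ll = log_likelihood(y, X, dist, rhos=[3.0, 2.0], sigma2s=[0.4, 0.1], tau2=0.05, mu=mu)
print(ll)
```

A numeric optimizer would then search over the covariance and mean parameters for the values that maximize this quantity.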

Data
The analysis is based on transaction data for SFHs in the Canton of Zurich. The data is provided by Fahrländer Partner Raumentwicklung (FPRE), Zurich (Switzerland) and was collected by Swiss banks and insurance companies in their day-to-day business. It covers a time span of 6 consecutive quarters ranging from the 3rd quarter of 2018 to the 4th quarter of 2019 and consists of 1578 observations. Comparing the total number of transactions in the full census (approximately 3392 observations) with our data set, we cover approximately 47% of the transactions of SFHs in the Canton of Zurich for the given period (Statistisches Amt des Kantons Zürich, 2021a, 2021b). The median transaction price of an SFH in our dataset is 1,390,000, which is comparable to the median transaction price in the full survey, which was 1,200,000 in 2018 and 1,250,000 in 2019 (Statistisches Amt des Kantons Zürich, 2021a). An overview of the data alongside some summary statistics is given in Table 2(a) and (b). Due to Swiss banking secrecy, the exact geographic locations of the SFHs cannot be disclosed. Instead, FPRE works with a fine grid that divides the Canton of Zurich into a total of 563 cells. In our data, the true SFH locations are represented by the centroids of these cells, cf. Table 2(c) and Fig. 4. Each centroid's location is provided in the LV03 coordinate reference system (Federal Office of Topography swisstopo, 1900). The cells' resolution is higher in densely populated areas, and the cells were defined by real estate experts to account for differences at the sub-ZIP-code level. The high resolution of the cells allows us to differentiate between districts of municipalities, for instance by the proximity of a cell to a lake or city center. The median cell size is 3.576 km$^2$, with areas ranging from 0.246 to 18.809 km$^2$. In total, we observe data at 268 distinct cells.
Additionally, each cell is labeled with a location type, see Table 2(c), which will prove helpful when analyzing our findings in Sect. 4.

Model
The model has the natural logarithm of the transaction price as the response variable. Further, we standardize the age using the following transformation,
$$\text{Z.age} = \frac{\text{age}}{s_{\text{age}}},$$
where $s_{\text{age}}$ is the empirical standard deviation of the age of all observations. The advantage of working with Z.age rather than the actual age is a numerically stable optimization process in the maximum likelihood estimation. As mentioned above, we expect a quadratic effect (Case et al., 2004; Clapp & Giaccotto, 1998; Fahrländer, 2006; Goodman & Thibodeau, 1995), which is why we also include the covariate Z.age$^2$. As we expect spatial variation in these coefficients, we use SVCs for these variables, cf. the first line in (5). The plotsize as well as the volume enter the model under a natural logarithm transformation. The remaining continuous covariates renov, standard and micro are included without further transformation. Thus, all continuous covariates have approximately the same standard deviations, which results in a well-behaved numeric optimization procedure for estimating the model. The categorical variables yearquarter, energy and SFHtype and the error term complete our model, which can be formulated as:
$$\begin{aligned} y_i = \log(\text{price}_i) = {} & \beta_1(s_i) + \beta_2(s_i) \cdot \text{Z.age}_i + \beta_3(s_i) \cdot \text{Z.age}_i^2 \\ & + \beta_4 \cdot \log(\text{volume}_i) + \beta_5 \cdot \log(\text{plotsize}_i) + \beta_6 \cdot \text{renov}_i + \beta_7 \cdot \text{standard}_i + \beta_8 \cdot \text{micro}_i \\ & + \beta_9 \cdot \text{yearquarter}_i + \beta_{10} \cdot \text{SFHtype}_i + \beta_{11} \cdot \text{energy}_i + \epsilon_i. \end{aligned} \tag{5}$$
Comparing the general mixed SVC model (2) and our explicit hedonic model (5), we note that we have $q = 3$ and $p = 16$, including the intercept and all factor levels deviating from the reference levels. The model is therefore fully parametrized by
$$\omega = (\theta^T, \mu^T)^T = (\rho_1, \sigma_1^2, \rho_2, \sigma_2^2, \rho_3, \sigma_3^2, \tau^2, \mu_1, \ldots, \mu_{16})^T \in \mathbb{R}^{23}.$$
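The covariate transformations described above can be illustrated with a small sketch. All numbers are made up for illustration and are not taken from the FPRE data; the array names are assumptions, and the empirical standard deviation is computed with the n - 1 denominator, which is one common convention and also an assumption here.

```python
import numpy as np

# Hypothetical toy observations (not from the FPRE data)
age = np.array([2.0, 15.0, 40.0, 95.0])              # building age in years
volume = np.array([650.0, 800.0, 720.0, 1100.0])     # building volume
plotsize = np.array([300.0, 520.0, 450.0, 900.0])    # plot size

s_age = age.std(ddof=1)   # empirical standard deviation of age
z_age = age / s_age       # Z.age = age / s_age

# Columns with spatially varying coefficients: intercept, Z.age, Z.age^2
X_svc = np.column_stack([np.ones_like(age), z_age, z_age ** 2])
# Two of the fixed-effect columns: log(volume), log(plotsize)
X_fixed = np.column_stack([np.log(volume), np.log(plotsize)])

print(X_svc.round(3))
print(X_fixed.round(3))
```

After the transformation, Z.age has unit standard deviation, which keeps it on a scale comparable to the log-transformed covariates.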
We use a numeric optimization over the profile likelihood. That is, we only optimize numerically over the covariance parameters $\theta$, while the mean parameters $\mu$ are determined implicitly by computing the generalized least squares estimate for the given $\theta$.
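For a fixed covariance matrix $\Sigma$ implied by $\theta$, the generalized least squares estimate is $(X^T \Sigma^{-1} X)^{-1} X^T \Sigma^{-1} y$. A minimal Python sketch of this step is given below; it is an illustration under these assumptions rather than the code used in the paper, and the function name `gls_estimate` and the toy data are hypothetical.

```python
import numpy as np

def gls_estimate(y, X, cov):
    """Generalized least squares estimate (X^T C^-1 X)^-1 X^T C^-1 y via whitening."""
    chol = np.linalg.cholesky(cov)
    # Whitening with the Cholesky factor turns GLS into ordinary least squares
    X_w = np.linalg.solve(chol, X)
    y_w = np.linalg.solve(chol, y)
    mu_hat, *_ = np.linalg.lstsq(X_w, y_w, rcond=None)
    return mu_hat

# Toy check with a hypothetical positive definite covariance matrix
rng = np.random.default_rng(0)
n = 8
X = np.column_stack([np.ones(n), rng.normal(size=n)])
A = rng.normal(size=(n, n))
cov = A @ A.T + n * np.eye(n)
mu_true = np.array([2.0, -1.0])
y = X @ mu_true            # noise-free response, so GLS recovers mu_true
mu_hat = gls_estimate(y, X, cov)
print(mu_hat)
```

Inside a profile likelihood optimization, this estimate would be recomputed for every candidate $\theta$ and plugged into the log-likelihood.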

Observation locations
As the LV03 coordinates of the centroids' locations $s_i = (\text{LV03x}_i, \text{LV03y}_i)^T$ cover a fairly large range, we standardized them to kilometers using the following formula:
$$\begin{pmatrix} \text{Z.LV03x}_i \\ \text{Z.LV03y}_i \end{pmatrix} := 10^{-3} \cdot \left( \begin{pmatrix} \text{LV03x}_i \\ \text{LV03y}_i \end{pmatrix} - \begin{pmatrix} 600000 \\ 200000 \end{pmatrix} \right)$$
Again, this ensures a well-behaved numeric optimization while remaining interpretable, as the ranges $\rho_j$ now act as a scaling factor on the kilometer distances.
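The coordinate standardization amounts to a shift and a rescaling from meters to kilometers, as the following sketch shows. The function name `standardize_lv03` and the two example coordinates are assumptions for illustration and do not correspond to actual cell centroids.

```python
import numpy as np

def standardize_lv03(coords_m):
    """Shift LV03 coordinates (meters) by (600000, 200000) and convert to kilometers."""
    coords_m = np.asarray(coords_m, dtype=float)
    return 1e-3 * (coords_m - np.array([600000.0, 200000.0]))

# Two hypothetical centroids, 1500 m apart along the x-axis
lv03 = np.array([[683000.0, 247000.0],
                 [684500.0, 247000.0]])
z = standardize_lv03(lv03)
dist_km = np.linalg.norm(z[0] - z[1])
print(z)
print(dist_km)
```

Because distances are preserved up to the factor 10^-3, an estimated range can be read directly in kilometers.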

Parameter estimates
We first look at the ML estimates $\hat{\omega}_{MLE}$, which are given in Table 3. Here, we find that the mean estimates match our expectations. In particular, the vintage-related covariates, i.e., Z.age and Z.age$^2$, show the following:
1. The mean effect for Z.age is negative, as one would expect, and statistically significant at the 0.1% level. This can be interpreted in the sense that, on average, the value of a single-family home decreases with age.
2. The quadratic effect for Z.age$^2$ shows a relatively small, positive mean effect, which is statistically significant at the 10% level. A larger quadratic mean age effect would correspond to a more pronounced vintage effect.
All other mean effects have plausible signs, too. Namely, all other coefficients of continuous covariates are positive and statistically significant at reference levels. As for the categorical covariates, we observe some temporal price volatility across transaction years and quarters, a premium for stand-alone, detached SFHs compared to other SFH types, and a premium for houses with an enhanced energy standard.
The estimates for the ranges of the Gaussian processes show that the ranges for the intercept and Z.age are considerably larger than the one for Z.age$^2$. This will be expressed in larger spatial structures for the SVCs modeling the intercept and the linear age effect compared to the SVC modeling the quadratic age effect. The small range of Z.age$^2$, on the other hand, indicates that the SVC corresponding to the quadratic age effect will behave much more selectively in its deviations from the mean. Finally, we analyze the estimated variances of the spatially varying coefficients. The intercept's variance is the largest and highly statistically significant. As for the linear and quadratic age effects, the range of the coefficients' values is smaller, see Table 3 and Fig. 5.

Visualization and interpretation of SVCs
In Fig. 5 we visualize fitted and predicted SVCs. Specifically, the figure shows the estimated SVCs for the observation locations the model has been trained on as well as the spatial predictions for all other cells' centroids, where we did not have any observations. Their qualitative behavior agrees with the interpretation of the parameter estimates in Sect. 4.1 and with real estate experts' knowledge. For the intercept's SVC, cf. Fig. 5a, which can also be interpreted as a mean price level, we can see that the highest values are achieved close to the city of Zurich and Lake Zurich, with a local peak in the city of Winterthur. As expected, the lowest values can be found towards the northern and eastern borders of ZH, which are rural areas.
Table 3 Mean and covariance estimates $\hat{\omega}_{MLE}$ of the SVC model (5). The corresponding standard errors are given in parentheses. In most cases, the standard errors can be approximated using the Hessian from the numeric optimization. For the mean and the variance estimates, we use a two-sided Z-test and a Wald test to test whether $\mu_j \neq 0$ and $\sigma_j^2 > 0$, respectively. This is only possible if the standard error is available.
A pattern similar to that of the intercept's SVC can be observed for the Z.age SVC, cf. Fig. 5b. In absolute values, the linear age effect is smallest in the city of Zurich and in the area surrounding Lake Zurich, while it increases towards the northern and eastern regions of the Canton of Zurich. This indicates that the depreciation of SFH prices with age is higher in rural areas. For the Z.age$^2$ SVC, cf. Fig. 5c, small-scale deviations from the mean effect can be observed around Lake Zurich and the city of Winterthur. This hints at the presence of a vintage effect in the metropolitan areas of Zurich and Winterthur. The resulting coefficient values for the three SVCs are summarized in Table 4. Interpreting panels b and c in Fig. 5 individually is cumbersome and inadequate, as the fitted SVCs originate from the same covariate. As we are simultaneously modeling a linear and a quadratic effect, one could therefore interpret the results as spatially varying paraboloids. Using the SVCs $\hat{\beta}_2(\cdot)$ and $\hat{\beta}_3(\cdot)$ for all observation locations $s_{\text{train}}$ within the training data, we back-transform the estimated effects to obtain the marginal effect $\text{me}(s_{\text{train}}, \text{age})$ for $\text{age} \in [-1, 99]$:
$$\text{me}(s_{\text{train}}, \text{age}) := \hat{\beta}_2(s_{\text{train}}) \cdot \frac{\text{age}}{s_{\text{age}}} + \hat{\beta}_3(s_{\text{train}}) \cdot \left(\frac{\text{age}}{s_{\text{age}}}\right)^2. \tag{6}$$
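The marginal age effect in (6) is a parabola in the standardized age, so for a negative linear and a positive quadratic coefficient it has a turning point at $\text{age}^* = -s_{\text{age}} \hat{\beta}_2 / (2 \hat{\beta}_3)$, beyond which the effect rises again. The sketch below evaluates this for made-up coefficient values; the numbers and the function name `marginal_age_effect` are assumptions and not estimates from the paper.

```python
import numpy as np

def marginal_age_effect(beta2, beta3, age, s_age):
    """Marginal effect beta2 * (age / s_age) + beta3 * (age / s_age)^2."""
    z = np.asarray(age, dtype=float) / s_age
    return beta2 * z + beta3 * z ** 2

# Hypothetical local coefficients and standard deviation of age
beta2, beta3, s_age = -0.2, 0.05, 30.0
ages = np.arange(-1, 100)
me = marginal_age_effect(beta2, beta3, ages, s_age)

# Age at which depreciation stops and the effect starts to increase again
turning_age = -s_age * beta2 / (2.0 * beta3)
print(turning_age, marginal_age_effect(beta2, beta3, turning_age, s_age))
```

A location with a quadratic coefficient close to zero has a turning point far outside the observed age span and therefore shows pure depreciation.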
This is what we visualize in Fig. 6. The grey lines are the marginal effects $\text{me}(s_{\text{train}}, \text{age})$, grouped in panels by FPRE type of location, filtered such that there are at least 5 observations per location $s_{\text{train}}$, and cropped to the span of observed years of construction at the corresponding location. This ensures that we have sufficient data backing up the results and that we do not extrapolate to unobserved building ages. The red line is obtained by aggregating all marginal effects by type of location as defined in (7), where $S_\kappa$, $\kappa \in \{1, 2, 3, 4\}$, are the sets of all observations $s$ in the respective type of location and $\text{age} \in [-1, 99]$. We observe a pure depreciation for a majority of SFHs with age < 25. This holds not only for the aggregated age effects, but also for most individual age effects per cell. It is beyond this point (age > 25) that the marginal age effects start to differ. Aggregated by type of location, we see a strong vintage effect for top locations, while all other types of locations have aggregated marginal age effects of a purely depreciative nature.
Looking at the individual cells' marginal age effects, one can observe some variety within each location type. Overall, this motivates the use of spatially varying random effects, since the location types alone cannot account for the geographical variety. At top locations a vintage effect is present such that some of the oldest SFHs have the same marginal […] Here we see various rates of price depreciation and in some cases also a vintage effect. For instance, Guggenbühl, a district of Winterthur, shows a rather constant age effect, while Feuerthalen and Bonstetten show a steep depreciation with age. Upon further investigation of the latter two, we note that the estimates in Bonstetten are primarily driven by two observations of very old SFHs with low transaction prices. Regarding Feuerthalen, it is worth mentioning that it is located in the very north of the Canton of Zurich, bordering the city of Schaffhausen. Since we do not have data on the city of Schaffhausen in our dataset, boundary effects could come into play here, and therefore the estimated marginal effects for Feuerthalen should also be treated with caution. An in-depth analysis of all of these individual age effects is out of scope for this work. However, these insights underline the broad variety of location-specific effects, which is hard to account for by regular fixed effects such as interactions between age and regionalization factors.
Fig. 6 The grey lines are the marginal effects as defined in (6) and the red lines are the aggregated marginal effects as defined in (7). The most extreme effects are displayed with their respective cell names, i.e., Fluntern (a district of the city of Zurich), Bonstetten (a suburb to the west of the city of Zurich), Guggenbühl (a district of the city of Winterthur) and Feuerthalen (a district bordering the city of Schaffhausen).
We conclude this section by noting that we observe spatially varying age effects which clearly deviate from a pure depreciation. These effects are locally pronounced and mostly appear at top locations, which backs our hypothesis of spatially varying vintage effects and is in line with both initial citations taken from Case et al. (2004) and Malpezzi et al. (1987).

Predictive performance
We now assess the implications of our findings on predictive performance. As suggested in Sect. 4, there appears to be a spatially varying age effect that deviates from a linear depreciation with age. Now, we investigate if one can exploit this to enhance classical hedonic models to increase predictive performance.
We validate and compare our findings to a classical hedonic model in which only the mean price, i.e., the intercept, depends on the spatial location $s$. Thus, we use a geostatistical model defined analogously to the SVC model in (5) but with $q = 1$:
$$\begin{aligned} y_i = \log(\text{price}_i) = {} & \beta_1(s_i) + \beta_2 \cdot \text{Z.age}_i + \beta_3 \cdot \text{Z.age}_i^2 \\ & + \beta_4 \cdot \log(\text{volume}_i) + \beta_5 \cdot \log(\text{plotsize}_i) + \beta_6 \cdot \text{renov}_i + \beta_7 \cdot \text{standard}_i + \beta_8 \cdot \text{micro}_i \\ & + \beta_9 \cdot \text{yearquarter}_i + \beta_{10} \cdot \text{SFHtype}_i + \beta_{11} \cdot \text{energy}_i + \epsilon_i. \end{aligned} \tag{8}$$
To compare the two models (5) and (8), we conducted a tenfold cross validation that accounts for the temporal structure of the data. The first 5 quarters of transaction data were exclusively used as training data. We randomly divided the observations from the last quarter, i.e., the 4th quarter of 2019, into 10 sets $V_f$, $f = 1, \ldots, 10$, of 20 or 21 observations each. In each fold $f$, the observations $V_f$ were withheld from training to provide a validation set. Therefore, for all folds $f$, the training data, denoted $T_f = \{1, \ldots, 1578\} \setminus V_f$, consists of 1557 or 1558 observations, while the validation set $V_f$ consists of 21 or 20 observations from the 4th quarter of 2019.
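The fold construction can be sketched as follows: only observations from the last quarter are shuffled and split into validation sets, and each training set is the complement of one validation set. The number of last-quarter observations (205) is a hypothetical value chosen only so that the folds contain 20 or 21 observations; it is not reported in the text, and the function name `temporal_cv_folds` is an assumption.

```python
import numpy as np

def temporal_cv_folds(in_last_quarter, n_folds=10, seed=0):
    """Return (train_idx, val_idx) pairs; validation sets come from the last quarter only."""
    in_last_quarter = np.asarray(in_last_quarter, dtype=bool)
    all_idx = np.arange(in_last_quarter.size)
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(all_idx[in_last_quarter])
    folds = []
    for val_idx in np.array_split(shuffled, n_folds):
        # Training set: all observations except the withheld validation set
        train_idx = np.setdiff1d(all_idx, val_idx)
        folds.append((train_idx, np.sort(val_idx)))
    return folds

# 1578 observations, of which a hypothetical 205 fall into the last quarter
n_obs, n_last = 1578, 205
in_last_quarter = np.arange(n_obs) >= n_obs - n_last
folds = temporal_cv_folds(in_last_quarter)
print([(len(train), len(val)) for train, val in folds])
```

With this design, every validation observation is predicted by a model that has seen all earlier quarters and the remaining last-quarter observations.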
The root mean square error (RMSE) is chosen as the measure of comparison and computed for in-sample estimates and out-of-sample predictions. Let $\hat{y}_i^{(m,f)}$ denote the estimate or prediction of $y_i$ by model $m$ in fold $f$, i.e., if $i \in V_f$ we have an out-of-sample prediction and if $i \in T_f$ we have an in-sample estimate. For each model $m$ and fold $f$, we define the RMSE of in-sample estimates and out-of-sample predictions as:
$$\text{RMSE}_{\text{in}}^{(m,f)} := \sqrt{\frac{1}{|T_f|} \sum_{i \in T_f} \big(y_i - \hat{y}_i^{(m,f)}\big)^2} \quad \text{and} \quad \text{RMSE}_{\text{out}}^{(m,f)} := \sqrt{\frac{1}{|V_f|} \sum_{i \in V_f} \big(y_i - \hat{y}_i^{(m,f)}\big)^2},$$
respectively. We report the respective medians over all folds as well as the percentage improvement in Table 5. The in-sample performance of the geostatistical model is improved by 7.7% by using the SVC model. This is to be expected, as the SVC model (5) offers more flexibility and the geostatistical model (8) is a true sub-model of the SVC model. For the out-of-sample predictions, we find that the absolute error values are higher than their in-sample counterparts, which again is to be expected. In addition, we observe a 13.9% improvement in price prediction. We thus conclude that accounting for spatially varying age effects using SVC models, as discussed in the previous section, translates into more accurate out-of-sample predictions.
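The error measure and the comparison between models can be sketched as below. The per-fold RMSE values are invented for illustration and do not reproduce Table 5; the percentage improvement is computed relative to the baseline model, which is an assumption about how the reported figure is defined, and the function names are hypothetical.

```python
import numpy as np

def rmse(y, y_hat):
    """Root mean square error between observed and estimated or predicted values."""
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    return float(np.sqrt(np.mean((y - y_hat) ** 2)))

def pct_improvement(baseline, candidate):
    """Relative reduction of the candidate's error with respect to the baseline, in percent."""
    return 100.0 * (baseline - candidate) / baseline

# Hypothetical per-fold out-of-sample RMSEs for two models
rmse_geo = np.array([0.21, 0.19, 0.22, 0.20, 0.18])
rmse_svc = np.array([0.18, 0.17, 0.19, 0.18, 0.16])

median_geo = float(np.median(rmse_geo))
median_svc = float(np.median(rmse_svc))
print(median_geo, median_svc, pct_improvement(median_geo, median_svc))
```

Taking the median over folds makes the summary less sensitive to a single fold with unusually large errors.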

Conclusion
To the best of our knowledge, the presented work is the first of its kind to investigate a spatially varying age effect for SFHs. While we find a purely depreciative age effect for some locations in the Canton of Zurich, there appears to be a substantial price premium for older SFHs, primarily at top locations. The existence of an age effect that is not purely depreciative is in line with the scientific literature and the assumptions of real estate experts. Further, even where no vintage effect is present, we observe various degrees of age depreciation by location. In this context, we consider it likely that age acts as a proxy for unmeasured covariates that directly have an impact on prices, such as build quality or architectural style (e.g., room height, architectural details) of the object, as has been suggested by the existing literature (Case et al., 2004; Goodman & Thibodeau, 1995). Our findings are also in line with the relatively new concept of redevelopment options as suggested by Clapp and Salavei (2010), where further analyses by Munneke and Womack (2016) also showed substantial spatial variation of such redevelopment options. However, both referenced works analyze data from the United States. It would be interesting to see if these concepts transfer to hedonic models based on Swiss SFHs, but this is out of scope for this work. Finally, our assessment of the predictive performance in a cross validation yields more accurate predictions from an SVC model with spatially varying age coefficients than from a classical geostatistical model.
Overall, our analysis suggests a spatially varying vintage effect or at least a location-specific age effect. Further research on the topic based on data from different regions or with higher spatial resolution would be desirable.