Sample selection bias with multiple dependent selection rules: an application to survey data analysis with multilevel nonresponse

The microdata of surveys are valuable resources for analyzing and modeling relationships between variables of interest. These microdata are often incomplete because of nonresponses in surveys and, if not considered, may lead to model misspecification and biased results. Nonresponse variable is usually assumed as a binary variable, and it is used to construct a sample selection model in many researches. However, this variable is a multilevel variable related to its reasons of occurring. Missing mechanism may differ among the levels of nonresponse, and merging the levels of nonresponse may cause bias in the results of the analysis. In this paper, a method is proposed for analyzing survey data with respect to reasons for the nonresponse based on sample selection model. Each nonresponse level is considered as a selection rule, and classical Heckman model is extended. Simulation studies and an analysis of a real data set from an establishment survey are presented to demonstrate the performance and practical usefulness of the proposed method.


Introduction
Modeling relationships between variables based on survey microdata is an essential part of many researches and analyses. For example, in survey methodology, determining a proper approach for some problems, such as appropriate strategies for following up nonresponse units in the phase of data collection, assessment of measurement errors of main variables among individuals that responded and imputation of nonresponse, is usually based on the modeling. Another example in economics is to model productivity or efficiency in a sector in the form of secondary analysis of survey data or to examine the relationship between turnover and value added of an establishment with some factors related to production such as the number of employees based on microdata of an establishment survey.
Most surveys suffer from nonresponse and their microdata are often incomplete. Nonresponse can increase errors of estimates or lead to model misspecification and biased results, especially in the case of nonignorable nonresponse. Heckman (1976Heckman ( , 1979 presented a method to adjust bias due to the nonresponse in modeling of a dependent variable. He considered a model for nonresponse called sample selection model and presented the estimates of the parameters and the variances of the estimators by assuming nonresponse variable as binary and normal distribution for the errors components of two models (nonresponse and variable of interest models). Hanoch (1976) extended the Heckman approach for multivariate dependent variables with one equation for nonresponse mechanism and investigated main factors on labor force. Robinson (1978, 1982) developed the Heckman model by two and then multiequations for nonresponse mechanisms and obtained estimators for model parameters with independent assumptions between the random effects in the equations of selection mechanism. Since then, in recent years, some developments have been performed on Heckman model. Jolani (2014) worked on longitudinal data in the presence of nonresponse by presenting an extension of Heckman model. He modeled dependent variable with some explanatory variables at time t, and for each time before t, considered a model as selection model. He obtained the estimates of the parameters by assuming nonresponse variable as binary and multivariate normal distribution for error components in the models. Kim and Kim (2016) presented a method to analyze data with multivariate sample selection model. They assumed elliptically contoured (EC) distribution for the errors in the models to obtain robustness against departures from normality.
However, nonresponse can be caused by different reasons, and therefore it is in fact a multilevel variable. Merging of the levels may lead to model misspecifications and biased results, especially in cases where the mechanism of nonresponse is not the same at different levels of nonresponse. In other words, different covariates may be related to different reasons of nonresponse or the effects of covariates are of different strengths, or go in opposite directions. So, it makes sense to consider a different selection model for each level of nonresponse in such cases.
Most of the researches about analysis of survey data are based on using only one binary variable for nonresponse. Also, there are a few works on nonresponse in establishment surveys in recent years. Earp et al. (2014Earp et al. ( , 2018, Kirchner and Signorino (2018), Phipps and Toth (2012), Seiler (2010) and Rezaee et al. (2021) used logistic regression, classification tree and support vector machine methods to investigate nonresponse in establishment surveys. Paiva and Reiter (2017) provided a way to follow nonresponse samples in an establishment survey using a mixture pattern model and the assumption of a nonrandom nonresponse mechanism. Refusal and noncontact are two levels of nonresponse variable and were studied in some of the researches about household surveys. Heerwegh et al. (2007) examined the effect of nonresponse error due to refusal and noncontact in a household survey and concluded that the error due to noncontact nonresponse is 2.56 times greater than the error due to refusal. Durrant and Steele (2009) examined the factors influencing the nonresponse by distinguishing refusal from noncontact for a set of UK household surveys using a multivariate logistic regression model. Steele and Durrant (2011) examined alternative approaches to multilevel modeling of survey noncontact and refusal. They reviewed multinomial and sequential models and compare them with a sample selection model that allows for residual correlation between a sample unit's noncontact and refusal propensities. Vassallo et al. (2015) also examined interviewer's experience effects on nonresponse in a panel survey in the case of multilevel nonresponse.
In this paper, we provided a method for analyzing incomplete survey data with considering nonresponse as a dependent multilevel variable. We extended the classical Heckman model via increasing the number of selection models, caused by the number of nonresponse reasons, considering the dependency between nonresponse levels, then we evaluated the performance of the proposed method using a simulation study and implemented it on an establishment survey with two reasons, refusal and noncontact for nonresponse. We compared the results of the proposed method with those of the univariate selection model and investigated the influence of nonrandom nonresponse by a sensitivity study.
This paper is organized as follows. In Sect. 2, Heckman model is reviewed, then in Sect. 3, sample selection model with multiple selection rules is presented and discussed. In Sect. 4, simulation studies are given, and in Sect. 5, the proposed method is implemented on an establishment survey microdata and the results are compared with those of using univariate selection model. Also, the influence of nonrandom nonresponse is investigated using likelihood displacement. In Sect. 6, conclusion and discussion are given. Heckman (1976Heckman ( , 1979 proposed a method for bias correction due to nonresponse samples in an ordinary regression model. He wanted to estimate the parameters in the model in the presence of nonresponse on some y i s. He considered the model for nonresponse as sample selection model, where y * i is a latent variable such that if y * i < 0 then nonresponse occurs for y i and if y * i ≥ 0 , then y i is observed. He assumed bivariate normal distribution for the errors (e i , u i ) with parameters (0, 0, σ 2 1 , σ 2 2 , ρ) and found estimators of parameters in models (1) and (2). In order to calculate sample selection bias, Heckman first obtained

Univariate selection model
is known as inverse Mills ratio, z i = −w i α/σ 2 , σ 12 = ρσ 1 σ 2 and f and F are density and distribution functions of the standard normal distribution. Therefore, sample selection (1) Page 3 of 15 Rezaee et al. Swiss Journal of Economics and Statistics (2022) 158:8 bias is equal to σ 12 σ 2 i . Then, he rewrote the regression model as where v i has mean 0 and variance σ 11 It is possible to construct likelihood function and estimate the unknown parameters but since this task involved complex calculations, especially at that time, he estimated the unknown parameters in two steps. Firstly, he estimated α by maximum likelihood estimator using likelihood function: 1−F (ẑ i ) and secondly, using ˆ i instead of i in model (3), β , σ 12 and σ 2 1 were estimated by ordinary least squares regression (OLS). He adjusted the estimator of where S 0 is the set of individuals who responded y i and v i is the estimation of residuals that can be obtained from OLS. For identification of probit model in (4), one has to assume σ 2 2 = 1 (Long 1997, p. 47). Heckman (1979) presented a method for estimation of variance of estimators of parameters based on asymptotic distribution of them. Heckman twostep method is a convenient method for bias correction but it has some weakness including assumption of bivariate normal distribution for errors and not using the exact likelihood of the observations.

Sample selection with multiple selection rules
Nonresponse occurs for a variety of reasons (here, levels) in many surveys. Combining these levels into a single category and using a selection model to show their relationship with the main variable of interest may lead to an increasing error or model misspecification. Kim and Kim (2016) presented a method for multivariate selection regression model assuming the errors come from a family of elliptical distributions. They used exact likelihood to drive estimates using an extended version of the EM algorithm and a hierarchical model. In their method, it is not possible to generalize the Heckman's two-step procedure because of non-normality of the errors and finite boundary values of the latent variable for determining nonresponse.

Model structure
In this section, we increase the number of sample selection models to be equal to the number of nonresponse reasons. We consider the problem to be the study of the relationship between Y and X based on a survey data, in which a percent of samples are nonresponse for some recorded reasons. Let the set of observations in survey be S = {(y 1 , M 1 , x 1 , w 11 , . . . , w 1K ), . . . , (y n , M n , x n , w n1 , . . . , w nK )} where y i is the variable of interest, M i is the nonresponse indicator of y i , i.e., 0 for response and j in the case that y i is nonresponse due to reason j, j = 1, . . . , K and for i = 1, 2, . . . , n and j = 1, 2, . . . , K , x i = (x i1 , x i2 , . . . , x ip ) , w ij = (w ij 1 , w ij 2 , . . . , w ij q j ) are the vectors of known explanatory variables. We use following models to show the relationship between variable of interest (main model), explanatory variables and levels of nonresponses (selection models): n, j = 1, 2, . . . , K and e i and u i = (u i1 , u i2 , . . . , u iK ) have multivariate normal distribution with mean zero and covariance matrix = σ 00 eu ue uu where eu = [σ 01 , σ 02 , . . . , σ 0K ] , ue = ′ eu and uu is the K × K covariance matrix of u i with diagonal elements of 1 due to identifiability and off-diagonal elements of σ kl = ρ kl . Our main goal is to find estimates of parameters in model (5) and (6) such that bias of sample selections be corrected.
Usually in surveys, reasons of nonresponse have priority of observing over each other, i.e., it is not possible to observe nonresponse reasons all at the same time and only one reason is observable and the others may or may not. For example, if the nonresponse in a survey is due to noncontact and refusal, then for some individuals, only noncontact is observable, i.e., refusal will be observable if contact with respondent can be done. Therefore, by prioritizing the nonresponse reasons, observing M i = k means that nonresponse reasons were not related to 1 to k − 1 first items of reasons and also we cannot have any judgment for reasons k + 1 to K. In this paper, we assume that the reasons for the nonresponse are prioritized, therefore M i = 0 if we have M ij = 0 for all of j = 1, . . . , K and M i = j if M ij = 1 and M it = 0 for all of t = 1, . . . , j − 1 . In this case, the value of M it are unobservable for all t = j + 1, . . . , K.
Reviewing literature, there are other models such as sequential, nested, Tobit or double-hurdle models which are commonly used. However, these models cannot cover the issue of priority well. Sequential or nested model can Page 4 of 15 Rezaee et al. Swiss Journal of Economics and Statistics (2022) 158:8 be used to analyze multilevel nonresponse with priorities, but using this model, it is not possible to consider correlation between the reasons of the nonresponse. Failure to account for this correlation may lead to biased parameter estimates. For example, if the reasons of nonresponse are noncontact and refusal, the sequential model cannot take into account the dependency between the noncontact and refusal processes and this dependency will be unexplained by the covariates in the model (Steele & Durrant, 2011).
In the extended Tobit model, there are one selection model for each of dependent variable of interest. The doublehurdle model can be considered as a special case of the method presented in this paper when the number of selection models is 2, the correlation coefficient between the nonresponse reasons is zero and the nonresponse reasons have priority over each other. See, Bruno (2013) and Engel and Moffatt (2014) for more details. We set S j = {i|M i = j}, j = 0, 1, 2, . . . K and use respondent samples to estimate the parameters in models (5) and (6). Since samples are respondent if the corresponding latent variables are nonnegative, (1992) investigated the moment generating function of truncated normal distribution. Based on his work and Jolani (2014), we have: where φ 1 (w ij α j ) is the density function of the univariate standard normal distribution evaluated at w ij α j , � K (w i1 α 1 , . . . , w iK α K ) is the cumulative distribution function (cdf ) of a K-variates normal distribution with mean zero and covariance matrix uu in the form of: given u ij and In equation (8), and φ 2 (w ik α k , w il α l ; kl uu ) is the density function of the standard bivariate normal distribution evaluated at w ik α k and w il α l , kl uu is the covariance matrix of u k and u l , is the conditional density function of a K − 2 variates normal distribution evaluated at the u i(kl) ( all u i s without the k-th and the l-th variable ) given u ik and u il and Also in equation 8, σ 00 + eu H i ue should be positive. The model 5 can be rewritten as follows: where v i is the random error with E(v i ) = 0 and Var(v i ) = Var(y i ) for all i = 1, 2, . . . , n.

Two-step estimation
Now, it is possible to apply Heckman's two-step method. The likelihood function in the firs step with priority of nonresponse reasons is in the form of: i1 , i2 , . . . , iK and uu can be estimated by maximum likelihood estimation method. With substitution of the value of ˆ ij in equation (9), β and σ 0j can be estimated by OLS.
has heteroscedasticity and the estimate of σ 00 should be adjusted. Since to σ 00 + eu H i ue and therefore we can adjust the estimator of σ 00 in form of:

Estimation of standard errors of the estimators
We can consider model (9) as follows to find the standard errors of the estimators of parameters: Based on the work done by Lee et al. (1980), the appropriate forms of the standard errors of the parameters in the multivariate sample selection can be expressed by eu is a n × nK dimension matrix with diagonal elements eu and off-diagonal elements 0, is a nK × nK dimension matrix with diagonal elements and * is the asymptotic covariance matrix for the parameters of the first step. The standard errors of the parameters in vector β * are given by the squared root of the diagonal elements of Cov(β * ).

One-step estimation
With the development of methods to compute the multiple integrals and to optimize multivariate functions in major statistical software, it is possible to obtain estimators by maximizing the exact likelihood. The exact likelihood, with considering priority of nonresponse reasons, is in the form of: Page 6 of 15 Rezaee et al. Swiss Journal of Economics and Statistics (2022) 158:8 where Y obs is the vector containing the observed y i s. Likelihood function (11) can effectively evaluated by many statistical software such as R.

Estimation of standard errors of the estimators
In this method, since the estimators are obtained from the maximum likelihood estimation method, the variance of the estimators can be obtained approximately from the inverse of the diagonal components of Fisher information. Also, the bootstrap and Jacknife methods may be used.

Test of significancy of the model parameters
In the one-step method, we have exact likelihood, and we can test significancy of the model parameters using likelihood ratio test as follows: With the exact likelihood in (11), it is possible to obtain the estimators with K sample selection models, and where β 0 , α 0 1 , . . . , α 0 K , � 0 are the values of the parameters under the null hypothesis ( H 0 ), β , α 1 , . . . ,α K , ˆ are maximum likelihood estimators that are obtained from (11) and df is the number of parameters which are not assumed to be known in H 0 .

Sensitivity analysis
By the specification of the exact likelihood in (11), it is possible to use likelihood displacement approach to study the influence of sample selection on estimates of the parameters. The method of local influence was introduced by Cook (1986) and developed by others as a general tool for assessing the influence of local departures from the assumptions underlying the models. These assumptions, since we desire to study the departure of random nonresponse to nonrandom nonresponse, are about the elements of , for example ρ 01 = 0 , ρ 02 = 0 or ρ 01 = · · · = ρ 0K = 0 may be considered to see the influence of nonresponse on the results. The likelihood displacement LD (w) is defined as: where θ = (α 1 ,α 2 ,β,ˆ ) and w is the q × 1 perturbation vector which shows the departures from the assumptions. In the cases where we desire to study the influence of each reason of nonresponse, e.g., k − th reason, k = 1, . . . , K , ρ 0k = 0 , and w is a scalar around 0 and for influence study of a subset of reasons, simultaneously, w is a multi-dimensional vector. l(θ) ( l(θ |w) ) is the (12) maximum log-likelihood with no perturbation (perturbation). When w is univariate, influence graph LD (w) around zero is a convenient tool for studying the local behavior of w. If the graph is strongly curved at zero, it means that sample selection is nonrandom and the parameters are estimated with high precision, and otherwise, sample selection is random.
In the cases when w is multi-dimensional, there are several curvatures. Cook (1986) suggests investigating the direction in which this influence measure changes most rapidly locally. The maximum curvature C max of the LD (w) surface is given by: where is the P × q matrix with elements of ∂ 2 l(θ|w) ∂θ i ∂w j | θ =θ,w=0 , i = 1, . . . , P; j = 1, . . . , q , P is the dimension of the θ with respect to not having perturbation, θ is the estimation of θ under no perturbation, w = 0 denotes no perturbation, Q is the P × P matrix with the elements of ∂ 2 l(θ|w) ∂θ i ∂θ j | θ =θ,w=0 , i = 1, . . . , P; j = 1, . . . , P and l is the eigen vector corresponding to the maximum absolute eigen value of the matrix ′ Q −1 . It is straightforward to apply this approach to multivariate selection model. For more details see Billor and Loynes (1993), Cook (1986), Ganjali and Rezaei (2005) and Razie et al. (2013).

Simulation studies
The multivariate sample selection model was examined in this section by comparing its performance with those of the univariate selection model (USM) and the regression model with removing nonresponse observations (complete cases, CC). It is possible to run simulations for any number of selection models, but due to the number of different combinations of the nonresponse mechanisms at the nonresponse levels, there will be many cases and reporting them is out of the aim of this paper. For this reason, bivariate selection model (BSM) was considered. To compare the results with the USM and its relationship with the variable of interest, the same explanatory variable was chosen for the selection models and the main model. This variable extracted from a uniform distribution between 1 and 10. In order to investigate the importance of normality assumption for errors, we run simulation in two parts, at first assuming normality and secondly considering non-normality assumption. Moreover, to study the behavior of the proposed method (14) C = 2 l ′ ′ Q −1 l Page 7 of 15 Rezaee et al. Swiss Journal of Economics and Statistics (2022) 158:8 in different states of the nonresponse mechanisms at the different levels of nonresponse, we consider the following three cases.
We set β 0 = −1 , β 1 = 1.5 , σ 00 = 1 , the number of iterations of 500 and a sample number of 1000 were used to generate the data. Other sample sizes such as 200 and 500 were used, and the results were consistent with the results of sample size of 1000. All analysis were done in Table 1 Estimates of the parameters with mean squared errors (in parentheses) with the normality assumption for errors Estimates are based on the bivariate selection model using two-step and one-step methods, BSM (two-step) and BSM (one-step), univariate selection model in onestep method, USM (one-step), and complete case analysis (CC), i.e., data after deleting nonresponse cases.  Rezaee et al. Swiss Journal of Economics and Statistics (2022) 158:8 R. We applied some packages in R such as "maxLik" and "mvtnorm" to do calculations. The corresponding source code is available on request.

Normality assumption
We assume the stochastic errors come from a three-variate normal distribution with mean zero and covariance structure as: The main model and the selection models are as follows, respectively: The average response rate is about 60% in case 1 and about 63% in cases 2 and 3. The average nonresponse rates at level 1 in cases 1 to 3 are about 29, 21 and 15 percent, respectively, and at level 2 are around 11, 16 and 23 percent, respectively. Table 1 shows the results of the simulation. Figure 1 shows a boxplot of the main model's coefficients estimates to evaluate the performance of the methods. It is observed that in all three cases, the CC � =   σ 00 σ 01 σ 02 σ 01 1 ρ 12 σ 02 ρ 12 1   . Fig. 1 Boxplots of the estimated regression coefficients of β 0 and β 1 with normality assumption from 500 simulation runs. Methods are the bivariate selection model using two-step and one-step methods, BSM (two-step) and BSM (one-step), respectively, univariate selection model using one-step method, USM (one-step), and complete case analysis (CC), i.e., data after deleting nonresponse cases Table 2 Root of the mean squared error (RMSE) for three cases of simulation with the normality assumption The same as that of Table 1 Method BSM (two-step) BSM (one-step) USM (one-step) CC  Rezaee et al. Swiss Journal of Economics and Statistics (2022) 158:8 method has biases in estimating β 0 and β 1 . The method of USM in case 1 and 3 has biases in estimating the intercept, but in case 2, there is almost no bias. It can also be seen in the estimating of β 1 that, this method, in case 3, gives biased estimate. The BSM has no bias in estimating the intercept in cases 2 and 3, and in case 1, the bias of this method is much less than that of the biases of other methods. To estimate β 1 , this method gives no bias in cases 1 and 3, and in case 2, its bias is slightly higher than that of the USM. The twostep BSM gives almost bias in all three cases. Considering the above, it can be stated that the efficiency of the one-step BSM is more than that of the univariate, and according to the advances it made in maximization algorithms and computer programs, the method is acceptable. Comparison of the performance of this method with that of the USM in cases 1 and 3, especially 3, has a significant advantage, but in case 2, the USM has more advantages. Table 2 compares the performances of the methods based on the root of the mean square error (RMSE) criterion using the regression model . Given that all the y i values of this criterion are observed, it seems to be a suitable criterion for comparing methods. It is observed that the value of this criterion for the one-step BSM method in cases 1 and 3 is less than those of the other methods. The method of USM in case 2 has less RMSE than those of the other methods.

Table 3 Estimates of the parameters with mean squared errors (in parentheses) using the t distribution (3 degrees of freedom)
The same as that of

Non-normality assumption
In considering selection models, errors are mostly assumed to have a multivariate normal distribution due to flexibility in computation and mathematical formulation. However, sensitivity to such an assumption should be considered. For this purpose, the multivariate t distribution can be used, because of its heavier tail than that of the multivariate normal distribution. A simulation study is done in this section to investigate the change of results due to using the distribution of t with 3 or more degrees of freedom. The simulation results are given in Table 3, which show almost close results to those of the normal distribution model, except the use of BSM (two-step) method in the cases 2 and 3. Figure 2 shows a boxplot of the main model coefficient estimates to evaluate the performance of the methods. Although it is observed that all three cases have a bias in estimating β 0 and β 1 , the bias of one-step BSM is less than those of others. Because of low efficiency of twostep BSM, boxplot of using this method was removed from Fig. 2. Table 4 compares the methods by the RMSE criterion. As in the normality assumption, the efficiency of the BSM (one-step) method is higher than those of other methods. The two-step method is not as efficient as USM (one-step) and CC methods. Fig. 2 Boxplots of the estimated regression coefficients of β 0 and β 1 with t distribution assumption from 500 simulation runs. Methods are the bivariate selection model using two-step and one-step methods, BSM (two-step), BSM (one-step), respectively, univariate selection model using one-step method, USM (one-step), and complete case analysis (CC), i.e., data after deleting nonresponse cases Table 4 Root of mean squared error (RMSE) in three cases of simulation, t distribution with 3 degrees of freedom The same as Table 1 Method BSM (two-step) BSM (one-step) USM (one-step) CC     Rezaee et al. Swiss Journal of Economics and Statistics (2022) 158:8

Application: analysis of an establishment survey
In this section, we apply the presented method on the data of manufacturing with ten employees or more which is one of the most important surveys implemented in the statistical center of Iran. Its results are used for calculation of value added in the manufacturing sector in whole country and provinces. We use this survey on industry of "manufacture of rubber and plastics products" to investigate the effect of covariates on output variable, i.e., the value of all sales of goods and services for each of establishment. Noncontact and refusal are two reasons of nonresponse in this survey, and so we consider two selection models. In order to obtain the parameters estimates in models (5) and (6), we applied BSM and USM using both two-step and one-step approaches and also CC, i.e., only using those with observed values. In finding explanatory variables for the USM, variables that did not have significant coefficient were excluded from the model. Therefore, the set of explanatory variables for the USM and the BSM are not the same. Table 6 shows the estimates in using the selection models provided with standard errors and P values. We use likelihood ratio test to obtain P values. The correlation coefficient between refusal and noncontact is estimated to be negative, and it is equal to − 0.440 and − 0.657 with respect to using two-step and one-step methods, but it insignificant. Table 7 shows estimates for the correlations between output value and the reasons of nonresponse. It is seen that correlation between output value and refusal is − 0.576 and − 0.411 in BSM using two-step and one-step methods, respectively, which is significant, i.e., the higher is the value of the output, the higher is the probability of refusal. In addition, correlation between output value and noncontact is 0.069 and 0.052 in using two-and one-step methods, respectively, which is not significant, that is, nonresponse due to noncontact has no association with the value of output. This shows that the nonresponse mechanism is different among the levels of nonresponse, so with considering just one level as USM, the estimates will be biased as it is shown in the case 1 of the simulation study. Moreover, in USM, it is seen that the correlation between output value and nonresponse is − 0.393 and − 0.287 in using two-and one-step methods, respectively, which is significant, but by this model, it is not known which levels of nonresponse causes this noningnorable nonresponse.

Estimation
Based on the results of Table 8, the main model coefficient estimates in the BSM method have differences with those of the USM and CC methods, but their standard errors are almost the same. These differences are due to the distinction between reasons of nonresponse, consideration of the BSM, the unbiased property of the estimates in the BSM method (as shown in Sect. 3) and consideration of different nonresponse mechanisms among different nonresponse levels. The causes of having biases of the parameters estimates of using the USM and CC methods are the lack of the above-mentioned reasons. Moreover, BSM method in two forms (two-step and one-step) has lower MSE than other methods. Also, the MSEs using USM method have less value than that of the complete case method.

Sensitivity analysis
In order to assess the influence of the sample selections on the estimates, we consider three cases of deviation ρ 02 = 0 , ρ 01 = 0 and ρ 01 = ρ 02 = 0 . Figure 3 shows the graph of likelihood-displacement for the first two cases. These graphs are obtained by the equation given in (13).
It can be seen from the left panel of Fig. 3 that the value of LD(ρ) is not large and l(.|ρ = 0) is not curved around zero, and it can be concluded that the estimates will not be affected by noncontact nonresponse. But for refusal nonresponse, it can be seen in Fig. 3, the right panel, that the value of LD(ρ) is large and so the refusal nonresponse has large effect on the estimates of the parameters. In order to assess the influence of the third case, we apply the equation given in (14). The maximum curvature ( C max ) in this case is larger than 3, and it can be concluded that estimates are sensitive to the kind of missing mechanism. These results are also consistent with the assessment of the model by using the Akaike information criterion (AIC). The values of AIC are 4075.748, 4089.000 and 4087.746, respectively, for case 1 to case 3. AIC for the model with three correlation coefficient is 4077.728 which is slightly more than that of the model with no ρ 02 .

Conclusion and discussion
In this paper, we presented a method for analysis of survey data with modeling of dependent multilevel nonresponse. In this method, the number of selection models is equal to the number of reasons of nonresponse. We assumed a multivariate normal distribution for the error terms of these models and the response model. The parameters can be estimated using a two-step method or the one-step (full likelihood) method. In this approach, we assumed that there is only one variable of interest as response for modeling. However, this approach can be extended to cases where there are more than one variable of interest.
In a set of simulation studies, performance of the proposed method in the case of BSM, and that of Heckman model were compared. It turns out that the proposed method (in two forms of one-step and two-step) has better performance than that of USM in the cases with different signs of correlation for dependent variable of interest and the nonresponse levels. Moreover, it performs at least as well as the USM when the sign of correlation is the same and one-step method is used. In other words, well-performance of the BSM using twostep method is less than that of the USM using one-step method.
The normality assumption for errors is mostly assumed due to flexibility in computation and mathematical formulation. However, sensitivity to such an assumption was considered by using multivariate t distribution with degrees of freedom of 3 or more, because of its heavier tail than that of the multivariate normal distribution. The results of simulation show that the estimates are biased in both bivariate and univariate selection models and CC analysis, but the bias of BSM