Response-Based Sampling for Binary Choice Models With Sample Selection / Arezzo, Maria Felice; Guagnano, Giuseppina. - In: ECONOMETRICS. - ISSN 2225-1146. - ELECTRONIC. - 6:(2018). [10.3390/econometrics6010012]
Response-Based Sampling for Binary Choice Models With Sample Selection
Maria Felice Arezzo
Giuseppina Guagnano
2018
Abstract
Sample selection models attempt to correct for non-randomly selected data in a two-level hierarchy: at the first level, a binary selection equation determines whether a particular observation is available for the second level (the outcome equation). If the non-random selection mechanism induced by the selection equation is ignored, the coefficient estimates in the outcome equation may be severely biased. When the selection mechanism leads to many censored observations, few data are available for estimating the outcome equation parameters, which gives rise to computational difficulties. In this context, the main reference is Greene (2008), who extends the results of Manski and Lerman (1977) and develops an estimator that requires knowledge of the true proportion of occurrences in the outcome equation. We develop a method that exploits the advantages of response-based sampling schemes for binary response models with sample selection, while relaxing this assumption. Estimation is based on a weighted version of Heckman’s likelihood, where the weights take the sampling design into account. In a simulation study, we found that, for the outcome equation, the results obtained with our estimator are comparable to Greene’s in terms of mean square error. Moreover, in a real data application, our estimator is preferable in terms of the percentage of correct predictions.

File | Size | Format
---|---|---
Arezzo_Response-Based-Sampling_2018.pdf (open access; type: publisher's version, i.e. the published version with the publisher's layout; license: Creative Commons) | 1.13 MB | Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.