Regression modeling of censored data based on compound scale mixtures of normal distributions

Luis Benites, Camila B. Zeller, Heleno Bolfarine, Víctor H. Lachos

Research output: Contribution to journal › Article › peer-review

1 Citation (Scopus)

Abstract

In the framework of censored regression models, the distribution of the error term can depart significantly from normality, for instance, due to the presence of multimodality, skewness and/or atypical observations. In this paper we propose a novel censored linear regression model where the random errors follow a finite mixture of scale mixtures of normal (SMN) distributions. The SMN is an attractive class of symmetrical heavy-tailed densities that includes the normal, Student-t, slash and contaminated normal distributions as special cases. This approach allows us to model data with great flexibility, accommodating simultaneously multimodality, heavy tails and skewness depending on the structure of the mixture components. We develop an analytically tractable and efficient EM-type algorithm for iteratively computing the maximum likelihood estimates of the parameters, with standard errors and prediction of the censored values as by-products. The proposed algorithm has closed-form expressions at the E-step that rely on formulas for the mean and variance of the truncated SMN distributions. The efficacy of the method is verified through the analysis of simulated and real datasets. The methodology addressed in this paper is implemented in the R package CensMixReg.
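As an illustration of the kind of closed-form quantity the E-step relies on, the sketch below computes the mean and variance of a truncated normal distribution, the simplest member of the SMN class. It is a minimal sketch under stated assumptions, not the paper's algorithm or the CensMixReg interface: the function name `truncated_normal_moments` and its parameters are hypothetical, and the Student-t, slash and contaminated normal cases require further formulas that are not shown here.

```python
import math


def _phi(z):
    """Standard normal density; defined as 0.0 at +/- infinity."""
    if math.isinf(z):
        return 0.0
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def _Phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def _z_phi(z):
    """z * phi(z), using the limit 0.0 at +/- infinity to avoid inf * 0."""
    return 0.0 if math.isinf(z) else z * _phi(z)


def truncated_normal_moments(mu, sigma, lower=-math.inf, upper=math.inf):
    """Mean and variance of N(mu, sigma^2) restricted to (lower, upper).

    Hypothetical helper for illustration only (normal special case).
    """
    if sigma <= 0.0 or lower >= upper:
        raise ValueError("need sigma > 0 and lower < upper")
    alpha = (lower - mu) / sigma  # standardized lower bound
    beta = (upper - mu) / sigma   # standardized upper bound
    mass = _Phi(beta) - _Phi(alpha)  # probability of the truncation interval
    if mass <= 0.0:
        raise ValueError("truncation interval has numerically zero mass")
    ratio = (_phi(alpha) - _phi(beta)) / mass
    mean = mu + sigma * ratio
    var = sigma ** 2 * (1.0 + (_z_phi(alpha) - _z_phi(beta)) / mass - ratio ** 2)
    return mean, var
```

For a response left-censored at a detection limit `c`, the conditional moments given the current parameter values would be obtained with `truncated_normal_moments(mu, sigma, upper=c)`, where `mu` stands for the fitted linear predictor of that observation.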

Original language: English
Pages (from-to): 282-312
Number of pages: 31
Journal: Brazilian Journal of Probability and Statistics
Volume: 37
Issue number: 2
DOI
State: Published - Jun 2023

Bibliographical note

Funding Information:
The authors were supported by FAPEMIG, CAPES and CNPq from Brazil.

Publisher Copyright:
© Brazilian Statistical Association, 2023.

Keywords

  • Censored regression model
  • EM-type algorithms
  • Finite mixture models
  • Heavy-tailed distributions
  • Detection limit
  • Tobit model
