BMC Bioinformatics
Latent class distributional regression for the estimation of non-linear reference limits from contaminated data sources
Tobias Hepp [1], Andreas Mayr [2], Jakob Zierk [3], Manfred Rauh [3], Markus Metzler [3]
[1] Institut für Medizininformatik, Biometrie und Epidemiologie, Friedrich-Alexander-Universität Erlangen-Nürnberg, Waldstraße 6, 91054, Erlangen, Germany
[2] Institut für Medizinische Biometrie, Informatik und Epidemiologie, Universitätsklinikum Bonn, Venusberg-Campus 1, 53127, Bonn, Germany
[3] Kinder- und Jugendklinik, Universitätsklinikum Erlangen, Loschgestraße 15, 91054, Erlangen, Germany
Keywords: Latent class regression; Finite mixture models; Distributional regression; Reference limits
DOI: 10.1186/s12859-020-03853-3
Source: Springer
【 Abstract 】
Background: Medical decision making based on quantitative test results depends on reliable reference intervals, which represent the range of physiological test results in a healthy population. Current methods for the estimation of reference limits focus either on modelling the age-dependent dynamics of different analytes directly in a prospective setting or the extraction of independent distributions from contaminated data sources, e.g. data with latent heterogeneity due to unlabeled pathologic cases. In this article, we propose a new method to estimate indirect reference limits with non-linear dependencies on covariates from contaminated datasets by combining the framework of mixture models and distributional regression.

Results: Simulation results based on mixtures of Gaussian and gamma distributions suggest accurate approximation of the true quantiles that improves with increasing sample size and decreasing overlap between the mixture components. Due to the high flexibility of the framework, initialization of the algorithm requires careful considerations regarding appropriate starting weights. Estimated quantiles from the extracted distribution of healthy hemoglobin concentration in boys and girls provide clinically useful pediatric reference limits similar to solutions obtained using different approaches which require more samples and are computationally more expensive.

Conclusions: Latent class distributional regression models represent the first method to estimate indirect non-linear reference limits from a single model fit, but the general scope of applications can be extended to other scenarios with latent heterogeneity.
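
The mixture idea summarized in the abstract can be illustrated with a deliberately simplified sketch. It is not the authors' implementation: it fits a plain two-component Gaussian mixture by EM with constant parameters, whereas the proposed method lets the distribution parameters depend non-linearly on covariates such as age. The simulated values, the function name `fit_two_gaussians`, the starting values, the assumption that the healthy component is the larger one, and the use of the 2.5% and 97.5% quantiles as reference limits are all choices made here for illustration.

```python
import math
import random
from statistics import NormalDist, pstdev

# Simulated contaminated sample (hypothetical values): 80% "healthy" results
# from N(13, 1) and 20% unlabeled "pathological" results from N(8, 1).
rng = random.Random(1)
y = [rng.gauss(13.0, 1.0) if rng.random() < 0.8 else rng.gauss(8.0, 1.0)
     for _ in range(2000)]


def fit_two_gaussians(values, n_iter=100):
    """EM algorithm for a two-component Gaussian mixture without covariates."""
    ordered = sorted(values)
    n = len(ordered)
    # Starting values (an assumption of this sketch): 10th and 90th percentile
    # as means, overall spread as standard deviation, equal weights.
    mu = [ordered[n // 10], ordered[(9 * n) // 10]]
    sd = [pstdev(values)] * 2
    w = [0.5, 0.5]
    for _ in range(n_iter):
        comps = [NormalDist(mu[k], sd[k]) for k in range(2)]
        # E-step: posterior probability of each component for each value.
        resp = []
        for v in values:
            dens = [w[k] * comps[k].pdf(v) for k in range(2)]
            total = max(sum(dens), 1e-300)  # guard against numerical underflow
            resp.append([d / total for d in dens])
        # M-step: weighted updates of weight, mean and standard deviation.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            w[k] = nk / n
            mu[k] = sum(r[k] * v for r, v in zip(resp, values)) / nk
            var = sum(r[k] * (v - mu[k]) ** 2 for r, v in zip(resp, values)) / nk
            sd[k] = max(math.sqrt(var), 1e-6)
    return w, mu, sd


w, mu, sd = fit_two_gaussians(y)
# Assumption: the healthy population forms the larger mixture component.
healthy = 0 if w[0] > w[1] else 1
ref = NormalDist(mu[healthy], sd[healthy])
lower, upper = ref.inv_cdf(0.025), ref.inv_cdf(0.975)
print(f"weight={w[healthy]:.2f} lower={lower:.2f} upper={upper:.2f}")
```

The starting values are chosen so that each initial mean falls near a different component. In line with the abstract's remark on initialization, other starting values or stronger overlap between the components may lead EM to a different solution.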
【 License 】
CC BY