Quantile Function on Scalar Regression Analysis for Distributional Data

2020
Radiomics involves the study of tumor images to identify quantitative markers explaining cancer heterogeneity. The predominant approach is to extract hundreds to thousands of image features, including histogram features comprising summaries of the marginal distribution of pixel intensities, which leads to multiple testing problems and can miss insights not contained in the selected features. In this paper, we present methods to model the entire marginal distribution of pixel intensities via the quantile function as functional data, regressed on a set of demographic, clinical, and genetic predictors to investigate their effects on imaging-based cancer heterogeneity. We call this approach quantile functional regression, regressing subject-specific marginal distributions across repeated measurements on a set of covariates, allowing us to assess which covariates are associated with the distribution in a global sense, as well as to identify distributional features characterizing these differences, including mean, variance, skewness, heavy-tailedness, and various upper and lower quantiles. To account for smoothness in the quantile functions and for intrafunctional correlation, and to gain statistical power, we introduce custom basis functions we call quantlets that are sparse, regularized, near-lossless, and empirically defined, adapting to the features of a given dataset and containing a Gaussian subspace so non-Gaussianness can be assessed. We fit this model using a Bayesian framework that uses nonlinear shrinkage of quantlet coefficients to regularize the functional regression coefficients and provides fully Bayesian inference after fitting by Markov chain Monte Carlo.
We demonstrate the benefit of the basis space modeling through simulation studies, and apply the method to a magnetic resonance imaging (MRI)-based radiomic dataset from glioblastoma multiforme to relate imaging-based quantile functions to various demographic, clinical, and genetic predictors, finding specific differences in tumor pixel intensity distribution between males and females and between tumors with and without DDIT3 mutations. Supplementary materials for this article, including a standardized description of the materials available for reproducing the work, are available as an online supplement.
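The idea of regressing subject-specific quantile functions on covariates can be illustrated with a deliberately simplified sketch. It is not the quantlet-based Bayesian model described in the abstract: it assumes synthetic Gaussian "pixel intensities", a single binary covariate, and ordinary least squares applied separately at each probability level. The function names (`empirical_quantile_functions`, `pointwise_regression`, `gaussian_projection`) are hypothetical. The last function relies on the fact that a Gaussian quantile function has the form mu + sigma * Phi^{-1}(p), so projecting onto the span of {1, Phi^{-1}(p)} gives a rough check of departure from Gaussianity.

```python
import numpy as np
from statistics import NormalDist


def empirical_quantile_functions(samples, probs):
    """Evaluate each subject's empirical quantile function on a common grid.

    samples: list of 1-D arrays (one array of intensities per subject).
    Returns an array of shape (n_subjects, n_probs).
    """
    return np.vstack([np.quantile(s, probs) for s in samples])


def pointwise_regression(Q, X):
    """Least-squares fit of Q(p) on X, separately at every grid point p.

    Returns coefficient functions with shape (n_covariates, n_probs).
    This is a naive stand-in for a basis-space functional regression.
    """
    B, *_ = np.linalg.lstsq(X, Q, rcond=None)
    return B


def gaussian_projection(q, probs):
    """Project one quantile function onto span{1, Phi^{-1}(p)}.

    Returns the fitted (mu, sigma) and the residual, which is near zero
    when the underlying distribution is close to Gaussian.
    """
    z = np.array([NormalDist().inv_cdf(p) for p in probs])
    G = np.column_stack([np.ones_like(z), z])
    coef, *_ = np.linalg.lstsq(G, q, rcond=None)
    return coef, q - G @ coef


# Synthetic data (an assumption for illustration): 40 subjects in two groups,
# with group 1 shifted upward by one unit.
rng = np.random.default_rng(0)
n_subjects = 40
group = np.repeat([0.0, 1.0], n_subjects // 2)
X = np.column_stack([np.ones(n_subjects), group])  # intercept + group indicator
samples = [rng.normal(loc=g, scale=1.0, size=2000) for g in group]

probs = np.linspace(0.02, 0.98, 49)
Q = empirical_quantile_functions(samples, probs)
B = pointwise_regression(Q, X)          # B[1] is the group effect as a function of p
mu_sigma, resid = gaussian_projection(Q[0], probs)
```

Here `B[1]` is roughly flat across `p`, which is what a pure location shift between groups looks like on the quantile scale; a change in variance would instead show up as a slope in `Phi^{-1}(p)`.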
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION
Pages: 90-106 | Volume: 115 | Issue: 529
ISSN:0162-1459
Indexed in
SSCI
Publication date
2020
Subject area
Evidence-based social science: methods
Country
United States
Language
English
DOI
10.1080/01621459.2019.1609969
Other keywords
TUMOR HETEROGENEITY; SHRINKAGE; VARIABLES; MODELS; ROBUST
EISSN
1537-274X
Funding agencies
NCI NIH HHS; United States Department of Health & Human Services; National Institutes of Health (NIH) - USA; NIH National Cancer Institute (NCI) [R01 CA160736, R01 CA194391, R01 CA178744]; Funding Source: Medline
Times cited (WOS)
4
Citation count last updated
2022-01
Source institutions
University of Texas System; UTMD Anderson Cancer Center
Keywords
Basis functions; Bayesian modeling; Functional regression; Imaging genetics; Markov chain Monte Carlo; Probability density function