Frontiers in Applied Mathematics and Statistics, 2023
Siva Sivaganisan, Chen Chen, Bin Huang, Jinzhong Liu
License: Unknown
A Gaussian process (GP) covariance function is proposed as a matching tool for causal inference within a fully Bayesian framework under relatively weak causal assumptions. We demonstrate that matching can be accomplished by using the GP prior covariance function to define a matching distance. The matching properties of GPMatch are presented analytically for the setting of categorical covariates. Provided that either (1) the GP mean function or (2) the GP covariance function is correctly specified, we suggest that GPMatch possesses a doubly robust property asymptotically. Simulation studies were carried out without assuming any a priori knowledge of the functional form of either the outcome model or the treatment assignment mechanism. The results demonstrate that GPMatch enjoys well-calibrated frequentist properties and outperforms many widely used methods, including Bayesian Additive Regression Trees. The case study compares the effectiveness of early aggressive use of biologic medication in treating children with newly diagnosed Juvenile Idiopathic Arthritis, using data extracted from electronic medical records. Discussions and future directions are presented.
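The core idea of the abstract, that a GP prior covariance induces a matching distance between units, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the RBF kernel, the length-scale value, and the nearest-control matching rule are all assumptions chosen for concreteness.

```python
import numpy as np

def rbf_kernel(x, z, length_scale=1.0):
    """Squared-exponential (RBF) covariance between covariate vectors.
    Stand-in for the GP prior covariance function; any valid kernel works."""
    return np.exp(-0.5 * np.sum((x - z) ** 2) / length_scale**2)

def gp_matching_distance(x, z, length_scale=1.0):
    """Distance induced by a covariance function k via
    d(x, z)^2 = k(x, x) + k(z, z) - 2 k(x, z)."""
    d2 = (rbf_kernel(x, x, length_scale)
          + rbf_kernel(z, z, length_scale)
          - 2.0 * rbf_kernel(x, z, length_scale))
    return np.sqrt(max(d2, 0.0))

# Toy data: match each treated unit to its nearest control
# under the kernel-induced distance.
rng = np.random.default_rng(0)
treated = rng.normal(size=(5, 3))
controls = rng.normal(size=(20, 3))
matches = [
    int(np.argmin([gp_matching_distance(t, c) for c in controls]))
    for t in treated
]
```

Units that are identical under the kernel have distance zero, so the same covariance object that drives the Bayesian outcome model also defines who counts as a close match.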
Frontiers in Applied Mathematics and Statistics, 2023
Wenhao Pan, Anil Aswani, Chen Chen
License: Unknown
The problem of tensor completion has applications in healthcare, computer vision, and other domains. However, past approaches to tensor completion have faced a tension in that they either have polynomial-time computation but require exponentially more samples than the information-theoretic rate, or they use fewer samples but require solving NP-hard problems for which there are no known practical algorithms. A recent approach, based on integer programming, resolves this tension for non-negative tensor completion. It achieves the information-theoretic sample complexity rate and deploys the blended conditional gradients algorithm, which requires a linear (in numerical tolerance) number of oracle steps to converge to the global optimum. The tradeoff in this approach is that, in the worst case, the oracle step requires solving an integer linear program. Despite this theoretical limitation, numerical experiments show that this algorithm can, on certain instances, scale up to 100 million entries while running on a personal computer. The goal of this study is to further enhance this algorithm, with the intention to expand both the breadth and scale of instances that can be solved. We explore several variants that can maintain the same theoretical guarantees as the algorithm but offer potentially faster computation. We consider different data structures, acceleration of gradient descent steps, and the use of the blended pairwise conditional gradients algorithm. We describe the original approach and these variants, and conduct numerical experiments in order to explore various tradeoffs in these algorithmic design choices.
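The conditional-gradient structure the abstract relies on can be illustrated with a generic Frank-Wolfe loop: each iteration calls a linear minimization oracle (LMO) over the feasible set and steps toward the returned vertex. This is only a toy sketch; the paper's blended (pairwise) conditional gradients variant and its integer-programming oracle are replaced here by the plain Frank-Wolfe update and a probability-simplex LMO, chosen purely for illustration.

```python
import numpy as np

def frank_wolfe(grad_f, lmo, x0, steps=500):
    """Generic conditional-gradient loop: the LMO returns a feasible
    vertex minimizing the linearized objective, and we move toward it
    with the classic step size 2 / (t + 2)."""
    x = x0.copy()
    for t in range(steps):
        v = lmo(grad_f(x))          # oracle call (an integer program in the paper)
        gamma = 2.0 / (t + 2.0)
        x = (1.0 - gamma) * x + gamma * v
    return x

# Toy instance: minimize ||x - b||^2 over the probability simplex.
b = np.array([0.1, 0.7, 0.2])
grad = lambda x: 2.0 * (x - b)

def simplex_lmo(g):
    """LMO for the simplex: put all mass on the coordinate with the
    smallest gradient entry."""
    v = np.zeros_like(g)
    v[np.argmin(g)] = 1.0
    return v

x_star = frank_wolfe(grad, simplex_lmo, np.ones(3) / 3)
```

The design tradeoffs the abstract mentions (data structures, accelerated gradient steps, pairwise updates) all live inside this loop: the per-iteration cost is dominated by the oracle call, so anything that cheapens or avoids oracle steps changes the practical scaling.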