Bayesian Knockoff Filter for False Discovery Control

Posted: 2025-06-06 15:28

Guanghua Lecture Series: Forum of Distinguished Figures and Entrepreneurs, No. 6756

Topic: Bayesian Knockoff Filter for False Discovery Control

Speaker: Professor Guosheng Yin (尹国圣), Associate Director, School of Computing and Data Science, University of Hong Kong

Host: Professor Huazhen Lin (林华珍), School of Statistics and Data Science

Time: June 9, 10:30-11:30

Venue: Conference Room 408, Hongyuan Building, Liulin Campus

Organizers: School of Statistics and Data Science; Office of Scientific Research

About the speaker:

Guosheng Yin is the Patrick Poon Endowed Chair Professor in the Department of Statistics and Actuarial Science and Associate Director of the School of Computing and Data Science at the University of Hong Kong. After receiving his Ph.D. in Biostatistics from the University of North Carolina at Chapel Hill in 2003, he worked as Assistant/Associate Professor in the Department of Biostatistics at the University of Texas M.D. Anderson Cancer Center and as Chair in Statistics in the Department of Mathematics at Imperial College London. He was Head of the Department of Statistics and Actuarial Science at the University of Hong Kong from 2017 to 2023. He is an elected Fellow of the American Statistical Association and of the Institute of Mathematical Statistics. He has served as associate editor for the Journal of the American Statistical Association, Bayesian Analysis, Contemporary Clinical Trials, and other journals. He has published over 260 peer-reviewed papers in statistical and medical journals and at AI and machine learning conferences, as well as two books on clinical trial designs.

Abstract:

In many scientific fields, researchers want to identify, from a large number of candidate features, those that have a substantial effect on the response, while controlling the proportion of false discoveries. By incorporating the knockoff procedure into a fully Bayesian framework, we develop the Bayesian knockoff filter (BKF) for selecting features that have an important effect on the response. In contrast to the fixed knockoff variables of a frequentist procedure, we allow the knockoff variables to be updated in every iteration of the Markov chain Monte Carlo (MCMC) sampler. Based on the posterior samples and a carefully designed greedy selection procedure, our method can distinguish the truly important features from the unimportant ones, and the Bayesian false discovery rate (FDR) can be controlled at a desired level. Numerical experiments on both synthetic and real data demonstrate the advantages of the BKF over existing knockoff methods and Bayesian variable selection approaches: the BKF attains higher power and a lower false discovery rate, especially for weak signals.
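For readers new to the idea, the sketch below illustrates one generic way to control a Bayesian false discovery rate by greedy selection. It is not the BKF algorithm from the talk. The function names, the rule for turning posterior draws into importance probabilities (counting how often a feature's coefficient exceeds its knockoff's in absolute value), and the toy numbers are all assumptions made for illustration.

```python
def importance_probs(draws):
    """Hypothetical helper: estimate, for each feature, the posterior
    probability that it beats its knockoff copy.

    `draws` is a list of (original_coefs, knockoff_coefs) pairs, one pair per
    MCMC iteration. Comparing absolute coefficients is an assumption of this
    sketch, not a rule stated in the abstract.
    """
    n_features = len(draws[0][0])
    wins = [0] * n_features
    for original, knockoff in draws:
        for j in range(n_features):
            if abs(original[j]) > abs(knockoff[j]):
                wins[j] += 1
    return [w / len(draws) for w in wins]


def select_by_bayesian_fdr(probs, q=0.1):
    """Greedily add features, most probable first, while the estimated
    Bayesian FDR (mean posterior null probability of the selected set)
    stays at or below q."""
    order = sorted(range(len(probs)), key=lambda j: probs[j], reverse=True)
    selected = []
    null_mass = 0.0  # running sum of (1 - probs[j]) over selected features
    for j in order:
        candidate_mass = null_mass + (1.0 - probs[j])
        # Features arrive in decreasing probability, so the running mean
        # never decreases; once it exceeds q we can stop.
        if candidate_mass / (len(selected) + 1) > q:
            break
        selected.append(j)
        null_mass = candidate_mass
    estimated_fdr = null_mass / len(selected) if selected else 0.0
    return sorted(selected), estimated_fdr


if __name__ == "__main__":
    # Toy importance probabilities for five features (made-up numbers).
    toy_probs = [0.99, 0.95, 0.60, 0.97, 0.20]
    chosen, fdr = select_by_bayesian_fdr(toy_probs, q=0.1)
    print("selected features:", chosen)
    print("estimated Bayesian FDR:", round(fdr, 4))
```

The stopping rule relies on the sort order: adding a less probable feature can only raise the average null probability, so the first violation of the target level q ends the search.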
