88858cc永利官网 40th Anniversary Event Series: Academic Lecture No. 51
88858cc永利官网 Department of Statistics Seminar Series, No. 77
Topic: High dimensional clustering: Covariance clustering for mixture data
Speaker: Liu Yiming (刘一鸣)
Host: Zheng Xian (郑贤)
Meeting platform: Tencent Meeting, ID: 772 703 114
Time: December 3, 2020, 19:30-21:00
Abstract
Clustering is one of the most important problems in unsupervised learning. This paper focuses on clusters that are characterized by different covariance matrices. We propose a distribution-free approach, called the covariance clustering method, for clustering in general high-dimensional mixture models. The approach has two stages: step one chooses an appropriate nonlinear transformation of the original data, and step two clusters the data using the leading eigenvectors of the sample covariance matrix of the transformed data. We prove that, under mild conditions, the misclustering error of the new algorithm converges to zero with probability tending to one. Simulation studies also show that the covariance clustering method outperforms competing methods under a variety of settings.
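To make the two-stage idea concrete, here is a minimal sketch in Python with NumPy. It is not the speaker's algorithm: the abstract does not specify the transformation or the clustering rule, so the elementwise squaring transform, the restriction to two clusters, the sign-based split, and the name `covariance_cluster_sketch` are all assumptions made for illustration. The simulated data are two zero-mean Gaussian clusters that differ only in their covariance.

```python
import numpy as np

def covariance_cluster_sketch(X):
    """Illustrative two-cluster split (hypothetical helper, not the paper's method)."""
    # Step one (assumed transform): square each entry, so that a difference
    # in variances becomes a difference in the means of the transformed data.
    Y = X ** 2
    Yc = Y - Y.mean(axis=0)

    # Step two: leading eigenvector of the p x p sample covariance matrix
    # of the transformed data.
    S = Yc.T @ Yc / (len(Y) - 1)
    eigvals, eigvecs = np.linalg.eigh(S)  # eigenvalues in ascending order
    v = eigvecs[:, -1]

    # Project each observation onto the leading eigenvector and split by
    # sign (an assumed rule that suits two clusters of similar size).
    scores = Yc @ v
    return (scores > 0).astype(int)

# Simulated mixture: same mean, covariance I versus 9 * I (assumed setup).
rng = np.random.default_rng(0)
n_per, p = 200, 50
X = np.vstack([
    rng.normal(0.0, 1.0, size=(n_per, p)),
    rng.normal(0.0, 3.0, size=(n_per, p)),
])
truth = np.repeat([0, 1], n_per)

labels = covariance_cluster_sketch(X)
agreement = np.mean(labels == truth)
accuracy = max(agreement, 1 - agreement)  # cluster labels are arbitrary
print(f"clustering accuracy: {accuracy:.3f}")
```

Because both clusters share the same mean, a method that looks only at the raw coordinates' means has nothing to separate. After squaring, the variance gap appears as a rank-one mean shift in the transformed data, which is what the leading eigenvector picks up in this toy setting.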
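About the speaker
Dr. Liu Yiming graduated from Nanyang Technological University, Singapore, in 2020. During the doctoral program, Dr. Liu published six papers in SCI-indexed journals, including Statistica Sinica, Computational Statistics & Data Analysis, and Science China Mathematics. Current research interests center on high-dimensional statistical inference and on applications that combine machine learning with random matrix theory.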