Submission No.: 0521
Section: Invited Talk
Session: Probability / Stochastic Process / Statistics (SS-12)
High-dimensional classification method incorporating interaction among variables
Inchi Hu¹
¹Department of ISOM, HKUST
Abstract: In this talk, we focus on a specific problem, the use of gene expression data to predict clinical outcomes of cancer, although the method can be applied much more broadly. The problem is challenging not only because the number of variables p is much greater than the number of observations n, but also because one must consider the interactive effects among variables in addition to their individual marginal effects. We present a classification method that incorporates interactions among variables, using an influence measure introduced by Lo and Zheng (2002) as a basic tool. The classification rule is a boosting ensemble of logistic-regression classifiers. Each classifier involves a cluster of variables, and interaction among the variables in the cluster is explicitly incorporated. The proposed classification method is intended to have two desirable properties. First, the classification rule derived from the method has low error rates. Second, in the process of constructing the classification rule, influential variables responsible for the response are identified. That is, not only is the classification result accurate, but the classification rule also contains important information for understanding the phenomenon under study. We applied the proposed classification method to three well-known gene expression microarray datasets and obtained impressive results. The talk is based on joint work with Maggie Wang, Shaw-Hwa Lo, and Tian Zheng.
(MSC number(s))
Keyword(s): computational biology, data analysis, machine learning, prediction, statistics
(Language of Session (Talk))