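School of Statistics and Mathematics (Research Center for Statistics and Interdisciplinary Sciences) Academic Seminar Announcement: Mixture Conditional Regression with Ultrahigh Dimensional Text Data for Estimating Extralegal Factor Effects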

Title: Mixture Conditional Regression with Ultrahigh Dimensional Text Data for Estimating Extralegal Factor Effects

Speaker: Hansheng Wang

Host: An Qiguang (安起光)

Venue: Conference Room 1, Tianshun Building (天舜大厦)

Time: Thursday, June 20, 2024, 9:00-10:30

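Organizer: School of Statistics and Mathematics (Research Center for Statistics and Interdisciplinary Sciences), Shandong University of Finance and Economics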

Abstract:

Testing judicial impartiality is a problem of fundamental importance in empirical legal studies, for which standard regression methods have been popularly used to estimate the extralegal factor effects. However, those methods cannot handle control variables with ultrahigh dimensionality, such as those found in judgment documents recorded in text format. To solve this problem, we develop a novel mixture conditional regression (MCR) approach, assuming that the whole sample can be classified into a number of latent classes. Within each latent class, a standard linear regression model can be used to model the relationship between the response and a key feature vector, which is assumed to be of a fixed dimension. Meanwhile, the ultrahigh dimensional control variables are used to determine the latent class membership, where a naive Bayes type model is used to describe the relationship. Hence, the dimension of control variables is allowed to be arbitrarily high. A novel expectation-maximization algorithm is developed for model estimation. Therefore, we are able to estimate the key parameters of interest as efficiently as if the true class membership were known in advance. Simulation studies are presented to demonstrate the proposed MCR method. A real dataset of Chinese burglary offenses is analyzed for illustration purposes.
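For readers unfamiliar with this model structure, the following is a minimal illustrative sketch, not the speaker's algorithm. It assumes Gaussian regression errors within each latent class, binary word-indicator control variables with a Bernoulli naive Bayes model, and Laplace smoothing; the function names `fit_mcr_sketch` and `simulate`, and all parameter choices, are hypothetical and are not taken from the talk.

```python
import numpy as np

def fit_mcr_sketch(X, Z, y, K, n_iter=50, seed=0):
    """Illustrative EM for a K-class mixture (assumed form, not the paper's):
    y | class k ~ N(X @ beta_k, sigma2_k); Z_j | class k ~ Bernoulli(P[k, j])."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    R = rng.dirichlet(np.ones(K), size=n)  # random initial responsibilities (n x K)
    for _ in range(n_iter):
        # M-step: class weights, smoothed naive Bayes probabilities,
        # and one weighted least-squares fit per latent class.
        pi = np.clip(R.mean(axis=0), 1e-12, None)
        P = (R.T @ Z + 1.0) / (R.sum(axis=0)[:, None] + 2.0)
        beta = np.empty((K, d))
        sigma2 = np.empty(K)
        for k in range(K):
            w = R[:, k]
            Xw = X * w[:, None]
            beta[k] = np.linalg.solve(Xw.T @ X + 1e-8 * np.eye(d), Xw.T @ y)
            resid_k = y - X @ beta[k]
            sigma2[k] = max((w * resid_k**2).sum() / (w.sum() + 1e-12), 1e-8)
        # E-step: posterior class probabilities from the naive Bayes part
        # (control variables) and the regression part (response).
        log_nb = Z @ np.log(P).T + (1.0 - Z) @ np.log1p(-P).T
        resid = y[:, None] - X @ beta.T
        log_reg = -0.5 * (np.log(2 * np.pi * sigma2) + resid**2 / sigma2)
        log_post = np.log(pi) + log_nb + log_reg
        log_post -= log_post.max(axis=1, keepdims=True)  # numerical stability
        R = np.exp(log_post)
        R /= R.sum(axis=1, keepdims=True)
    return beta, sigma2, pi, P, R

def simulate(n=400, p=200, seed=1):
    """Toy data with two latent classes (all values here are made up)."""
    rng = np.random.default_rng(seed)
    c = rng.integers(0, 2, size=n)                 # true class labels
    probs = rng.uniform(0.1, 0.9, size=(2, p))     # class-specific word rates
    Z = (rng.random((n, p)) < probs[c]).astype(float)
    X = np.column_stack([np.ones(n), rng.normal(size=n)])
    true_beta = np.array([[0.0, 1.0], [3.0, -1.0]])
    y = np.einsum("ij,ij->i", X, true_beta[c]) + 0.5 * rng.normal(size=n)
    return X, Z, y, c
```

Calling `fit_mcr_sketch(*simulate()[:3], K=2)` returns the per-class coefficients, error variances, class weights, naive Bayes probabilities, and posterior class memberships. The high-dimensional matrix `Z` enters only through per-feature sums, which is why the number of control variables can grow without making the regression step harder.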

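Speaker bio: Hansheng Wang (王汉生) is a professor and doctoral supervisor in the Department of Business Statistics and Econometrics, Guanghua School of Management, Peking University. He is a recipient of the National Science Fund for Distinguished Young Scholars, a Changjiang Scholar Distinguished Professor of the Ministry of Education, founding president of the Young Statisticians Association of the National Industrial Statistics Teaching and Research Association, a Fellow of the Institute of Mathematical Statistics (IMS), a Fellow of the American Statistical Association (ASA), and an Elected Member of the International Statistical Institute (ISI). He has served as Associate Editor or Editor for nine international academic journals. He has published more than 100 articles in professional journals in China and abroad, co-authored one English-language monograph, and authored or co-authored four Chinese-language textbooks. He has been named an Elsevier Most Cited Chinese Researcher (Mathematics, 2014-2019; Applied Economics, 2020; Statistics, 2021-2022).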