江南烟雨

A brief introduction to 江南烟雨's work and life at Harbin Institute of Technology.


Lecture by Professor S.Y. Kung (贡三元), Princeton University: A Robust Regression Approach to Machine Learning

Time: July 11, 2012, 4:00 p.m., Room 618, New Technology Building

Abstract: Regression analysis has been a major theoretical pillar of supervised machine learning, as it applies to a wide range of identification and classification problems. Aiming at robust regressors, two major approaches have been adopted. The first category comprises a variety of regularization techniques, whose principle lies in incorporating both an error term and a penalty term into the cost function. It is represented by the ridge regressor; other prominent examples include the RBF approximation networks of Poggio and Girosi [1990] and the least-squares SVM introduced by Suykens and Vandewalle [1999].
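As a concrete reference point, here is a minimal sketch of the first category's representative, the ridge regressor, using its standard closed-form solution. The data, variable names, and penalty weight are illustrative assumptions, not from the talk.

```python
# Minimal ridge-regressor sketch: the regularization-based category above.
# The cost combines a squared-error term and an L2 penalty on the weights;
# lam (the penalty weight) is an illustrative choice.
import numpy as np

def ridge_fit(X, y, lam=1.0):
    """Closed-form ridge solution: w = (X^T X + lam*I)^{-1} X^T y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=100)
w = ridge_fit(X, y)
print(w)  # close to [1, -2, 0.5]; lam biases the estimate toward zero
```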
The second category is based on the premise that the robustness of the regressor can be enhanced by explicitly accounting for measurement errors in the independent variables. This is known as the errors-in-variables model in statistics and is relatively new to the machine learning community. Based on such models, we have developed an approach named the perturbation-regularized (PR) regressor. (1) It yields a desirable smoothing effect on the regression result. (2) It enhances the robustness of classification results. (3) It facilitates the identification and removal of outliers from the training dataset (a notion closely related to PPDA).
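The abstract does not spell out the PR regressor, so the sketch below only illustrates the errors-in-variables setting it builds on: the learner observes x plus noise rather than x itself, a naive least-squares fit is attenuated toward zero, and the classical moment correction (subtracting the known input-noise variance) removes that bias. This correction is the textbook errors-in-variables fix, not the PR regressor; all names and noise levels are assumptions.

```python
# Errors-in-variables sketch: the regressor only sees noisy inputs x_tilde.
# The moment-corrected estimator is a classical EIV fix, shown only to
# illustrate the setting; it is not the PR regressor from the talk.
import numpy as np

rng = np.random.default_rng(1)
n, d, sigma = 2000, 3, 0.5
w_true = np.array([1.0, -2.0, 0.5])
X = rng.normal(size=(n, d))                     # true (unobserved) inputs
X_tilde = X + sigma * rng.normal(size=(n, d))   # what is actually measured
y = X @ w_true + 0.1 * rng.normal(size=n)

w_naive = np.linalg.solve(X_tilde.T @ X_tilde, X_tilde.T @ y)
w_corrected = np.linalg.solve(
    X_tilde.T @ X_tilde / n - sigma**2 * np.eye(d),  # subtract input-noise variance
    X_tilde.T @ y / n,
)
print(w_naive)      # attenuated toward zero by the input noise
print(w_corrected)  # roughly recovers w_true
```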
There is no doubt that a regressor would yield a better estimate if the original input were directly available. Additional estimation error inevitably arises because, under the errors-in-variables model, the input information is only indirectly available. Our PR regression analysis is founded on an effective decoupling between the uses of the direct and indirect information. Our main result is a "Two-Projection Theorem", which facilitates the error analysis by dividing the estimation into two stages. More precisely, the first projection reveals the effect of output noise and of the model-induced error (caused by under-represented regressors); the second projection then leads to a tradeoff analysis between order and error, which in turn guides the choice of a practical order for the kernel regressor (under the Gaussian assumption). By exploiting the properties of orthogonal polynomials, the regressor may be expressed as a linear combination of simple Hermite estimators, each focusing on one (and only one) orthogonal polynomial.
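To make the orthogonal-polynomial idea concrete: for standard Gaussian inputs, the probabilists' Hermite polynomials He_k satisfy E[He_j(x) He_k(x)] = k! when j = k and 0 otherwise, so each expansion coefficient can be estimated by its own simple sample average, one estimator per polynomial. The sketch below is a generic illustration of that decomposition under the Gaussian assumption, not the talk's exact estimator; the target function and the order are assumptions.

```python
# One "Hermite estimator" per orthogonal polynomial: for x ~ N(0,1) the
# probabilists' Hermite polynomials are orthogonal with E[He_k(x)^2] = k!,
# so each coefficient of the target's expansion is a simple sample mean.
import numpy as np
from numpy.polynomial.hermite_e import hermeval
from math import factorial

rng = np.random.default_rng(2)
n, order = 20000, 6
x = rng.normal(size=n)                    # Gaussian inputs, per the assumption
f = lambda t: np.sin(t) + 0.3 * t**2      # unknown target (illustrative)
y = f(x) + 0.1 * rng.normal(size=n)       # noisy outputs

# k-th coefficient: c_k = E[y * He_k(x)] / k!, estimated by a sample mean.
coeffs = np.array([
    np.mean(y * hermeval(x, [0.0] * k + [1.0])) / factorial(k)
    for k in range(order + 1)
])

x_test = np.linspace(-2, 2, 5)
print(hermeval(x_test, coeffs))  # truncated-order regressor
print(f(x_test))                 # target values for comparison
```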
Ultimately, the two-projection analysis leads to closed-form answers to two frequently asked questions: "What is the error for a given regressor order?" and "What order should be adopted to achieve a specified error?" Based on simulations on synthetic data (nonlinear inverse system identification), the performances of the ridge and PR regressors are compared, and several examples of the order/error tradeoff are highlighted. The issues raised by outliers also prompt a PPDA classifier, which enhances inference accuracy by removing "anti-support" training vectors. Simulations on the MIT-BIH ECG dataset demonstrate the effectiveness of the proposed methods.
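An empirical counterpart of those two questions can be read off by sweeping the truncation order and measuring held-out error on synthetic data. The talk derives this tradeoff in closed form, which is not reproduced here; the target function, noise level, and order range below are assumptions.

```python
# Empirical order/error tradeoff: fit a Hermite-basis regressor at each
# order and report held-out MSE, so one can read off "the error for a
# given order" or the smallest order meeting an error target.
import numpy as np
from numpy.polynomial.hermite_e import hermefit, hermeval

rng = np.random.default_rng(3)
f = lambda t: np.sin(2 * t)                      # illustrative target
x_tr = rng.normal(size=500)
y_tr = f(x_tr) + 0.2 * rng.normal(size=500)
x_te = rng.normal(size=2000)
y_te = f(x_te)                                   # noiseless reference

for order in range(1, 9):
    c = hermefit(x_tr, y_tr, order)              # least-squares fit in He basis
    err = np.mean((hermeval(x_te, c) - y_te) ** 2)
    print(f"order {order}  test MSE {err:.4f}")  # drops, then degrades
```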


http://www.princeton.edu/~kung/
