Seminar 2021-09-03

Algorithmic Robustness in Classification

Jie Shen
Assistant Professor, Department of Computer Science, Stevens Institute of Technology

Date: Friday, September 3, 2021
Time: 11am–noon
Location: JC Gold Room

To attend this seminar, please RSVP by the time of the event!

Abstract

Learning linear classifiers (i.e., halfspaces) is one of the fundamental problems in machine learning, dating back to the 1950s. In the presence of benign label noise such as random classification noise, the problem is well understood. However, when the data are corrupted by more realistic noise, even establishing polynomial-time learnability can be nontrivial. In this talk, I will introduce our recent work on learning with Massart noise and with malicious noise, which significantly advances the state of the art. In particular, for Massart noise, where each label is flipped with an unknown probability that may vary across the domain, we present the first polynomial-time algorithm that is robust to any noise rate below 1/2. For malicious noise, where an adversary may inspect the learning algorithm and inject malicious data, we present the first sample-optimal learning algorithm that achieves information-theoretic noise tolerance. In both works, the developed algorithms are active in nature and are nearly label-optimal. Finally, I will discuss some important directions such as list-decodable classification, where the majority of the data are contaminated.

About the speaker

Dr. Jie Shen is an Assistant Professor in the Computer Science Department at Stevens Institute of Technology and a faculty member of the Stevens AI Institute. The goal of his research is to understand the fundamental limits of learning under real-world constraints, such as limited availability of labeled data and the presence of high levels of noise, and to design efficient algorithms with provable guarantees. His recent work investigates interactive learning from untrusted data, where learning algorithms are involved in data acquisition for optimal data efficiency and robustness. Over the past few years, he has published around 15 papers in top machine learning conferences such as ICML, NeurIPS, and ALT. He has served as a senior program committee member for IJCAI and as a program committee member or journal reviewer for ICML, NeurIPS, COLT, ICLR, AISTATS, AAAI, JMLR, ML, TIT, TPAMI, TSP, PR, etc. He obtained his BS degree in Mathematics from Shanghai Jiao Tong University and completed his Ph.D. in Computer Science at Rutgers University in 2018. He was a visiting scholar at the National University of Singapore and Duke University. He received the NSF CRII award in 2020.