## Department of Statistics Launches Major Initiative in Data Analytics

The Department of Statistics has recently made several targeted hires of faculty working in modern data analytics and will incorporate statistical analytics into its Statistical Science academic programs at the undergraduate, master's, and doctoral levels. Students can also earn an M.S. in Data Analytics Engineering with a concentration in Statistical Analytics. Data analytics is the analysis of high-dimensional data sets, or "Big Data"; such data sets are generated in the technology, banking, and government sectors, among many others. Statisticians play a particularly important role in applying data reduction techniques, visualization, and estimation procedures to interpret massive data accurately.

## Faculty of the Department include:

Dan Carr, Professor of Statistics. Dr. Carr's research includes exploratory visualization of moderate-dimensional aggregated data summaries. For example, his research has addressed the use of NASA's cluster-compressed summaries of global multivariate multi-altitude data from the Atmospheric Infrared Sounder.

The cluster-compressed summaries were aggregated in geospatial regions. The distance between each pair of regions was computed using the earth mover's distance as applied to the constituent cluster-compressed summaries. The resulting large distance matrix supported clustering and visualization of geospatial region clusters on the globe for chosen time periods.
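The pipeline described above (pairwise earth mover's distance, then clustering on the distance matrix) can be illustrated with a small sketch. This is not the NASA or Carr implementation: the data are synthetic, each region's summary is assumed to be a set of cluster centers with weights that sum to 1, and the names `emd`, `region_distance_matrix`, and `toy_summary` are hypothetical. The earth mover's distance is solved here as a small transportation linear program with SciPy, and average-linkage hierarchical clustering is an assumed choice of clustering method.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.optimize import linprog
from scipy.spatial.distance import cdist, squareform


def emd(centers_a, weights_a, centers_b, weights_b):
    """Earth mover's distance between two weighted point sets (weights sum to 1)."""
    m, n = len(weights_a), len(weights_b)
    cost = cdist(centers_a, centers_b)  # Euclidean ground distance between centers
    # Flow variables f[i, j] >= 0, flattened row-major into a vector of length m * n.
    a_eq = np.zeros((m + n, m * n))
    for i in range(m):
        a_eq[i, i * n:(i + 1) * n] = 1.0  # total mass leaving cluster i of A
    for j in range(n):
        a_eq[m + j, j::n] = 1.0           # total mass arriving at cluster j of B
    b_eq = np.concatenate([weights_a, weights_b])
    res = linprog(cost.ravel(), A_eq=a_eq, b_eq=b_eq, bounds=(0, None), method="highs")
    return res.fun


def region_distance_matrix(summaries):
    """Symmetric matrix of pairwise EMDs between region summaries."""
    k = len(summaries)
    dist = np.zeros((k, k))
    for i in range(k):
        for j in range(i + 1, k):
            dist[i, j] = dist[j, i] = emd(*summaries[i], *summaries[j])
    return dist


rng = np.random.default_rng(0)


def toy_summary(offset):
    """Synthetic stand-in for one region: 4 cluster centers in 3-D plus weights."""
    centers = rng.normal(loc=offset, scale=1.0, size=(4, 3))
    weights = rng.random(4)
    return centers, weights / weights.sum()


# Six synthetic regions: three near the origin, three shifted far away.
summaries = [toy_summary(0.0) for _ in range(3)] + [toy_summary(10.0) for _ in range(3)]
dist = region_distance_matrix(summaries)

# Hierarchical clustering on the condensed distance matrix, cut into two groups.
tree = linkage(squareform(dist, checks=False), method="average")
labels = fcluster(tree, t=2, criterion="maxclust")
```

Because the clustering step needs only the distance matrix, any method that accepts precomputed distances could replace the average-linkage step.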

Wanli Qiao, Assistant Professor of Statistics. Dr. Qiao works on data in the form of point clouds embedded in high-dimensional Euclidean space, which are numerous in many scientific fields such as geoscience, astronomy and neuroscience. The geometric objects of the system that produces the data are particularly interesting to many researchers. Deep study of their estimation raises challenges in modern statistics and machine learning. His research interests involve algorithms, statistical inference and probability theory related to these geometric objects.

Martin Slawski, Assistant Professor of Statistics. Dr. Slawski works on random projections, which provide a computationally convenient approach to dimensionality reduction for high-dimensional data. He explores the use of subsequent quantization of the projected data to achieve additional data compression. This gives rise to an interesting trade-off between dimension and bit rate that depends on the quantity one wants to infer and the statistical estimation procedure used to do so. Specifically, he has analyzed this trade-off for linear signal recovery and for similarity estimation and search, and he plans to investigate other common problems such as linear classification and clustering in the future.
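As a rough illustration of quantized random projections for similarity estimation, the sketch below uses the best-known extreme case: a Gaussian random projection followed by 1-bit (sign) quantization. It relies on the standard fact that, for a Gaussian projection direction, two vectors receive different signs with probability equal to their angle divided by π. This is a generic textbook construction on synthetic data, not Dr. Slawski's specific estimators, and the function names are hypothetical.

```python
import numpy as np


def sign_random_projection(data, n_projections, rng):
    """Project the rows of `data` with a Gaussian matrix and keep one sign bit each."""
    d = data.shape[1]
    proj = rng.standard_normal((d, n_projections))
    return data @ proj >= 0  # boolean array: one bit per projection


def estimate_angle(bits_x, bits_y):
    """Estimate the angle between two vectors from their sign bits.

    Uses P(bits disagree) = angle / pi for Gaussian projections.
    """
    return np.pi * np.mean(bits_x != bits_y)


rng = np.random.default_rng(0)
d, k = 200, 4096  # original dimension, number of 1-bit projections

x = rng.standard_normal(d)
y = 0.8 * x + 0.6 * rng.standard_normal(d)  # correlated with x by construction

bits = sign_random_projection(np.vstack([x, y]), k, rng)
true_angle = np.arccos(x @ y / (np.linalg.norm(x) * np.linalg.norm(y)))
est_angle = estimate_angle(bits[0], bits[1])

print(f"true angle: {true_angle:.3f} rad, 1-bit estimate: {est_angle:.3f} rad")
```

Here each vector is stored in `k` bits. Keeping more bits per projection would allow fewer projections for the same accuracy, which is the dimension versus bit-rate trade-off described above.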

Anand Vidyashankar, Associate Professor of Statistics. Dr. Vidyashankar is currently focusing on statistical problems arising in privacy and security analytics. He is collaborating with scientists from McKesson Corporation to identify sources of risk and statistical methods to measure and mitigate risk in real-time environments. The work involves integrating aspects of regulatory guidelines with novel statistical methods in ultra-high dimensions to develop next-generation privacy and security guidelines.

Yunpeng Zhao, Assistant Professor of Statistics. Dr. Zhao's primary research interest is machine learning methodology and theory in network analysis, with applications in biology and the social sciences. His areas of interest include community detection and extraction in networks, link prediction for partially observed and time-varying networks, and inference of implicit network structure. Dr. Zhao is also working on high-dimensional data analysis with applications in genomics.