I am late in congratulating Xiaoliu Wu (graduated Winter 2022) and right on time in congratulating Ran Sun on completing their PhDs at UC Davis. They did outstanding, impactful work!

Xiaoliu Wu

Thesis: Conditional Independence Testing for Neural Networks

While the title of Xiaoliu’s thesis is quite specific, the thesis itself was actually a diverse collection of practical and methodological work. This is because Xiaoliu was instrumental in the Healthy Davis Together effort, which pulled him away from the Machine Learning methods work that is usual for my students and into the fast-paced world of doing data science in real time for Covid response. With his help we were able to do the following…

  1. Provide real-time analyses for the Covid response, which required some quick data analysis. Good judgement calls require statistical experience, and Xiaoliu demonstrated that he has it.
  2. Monitor wastewater with Prof. Bischel’s group, where he played a fundamental role. He was responsible for imputing the missing entries in the qPCR data from the wastewater network.


The imputed concentration time series for the wastewater treatment plant compared to simpler imputation methods.

His work on Conditional Independence Testing proposed and studied a new way to test the independence of binary variables (think treatment and effect) given a complex confounding variable (think medical images). His solution actually worked, and was not just a theoretical upper bound. I am still convinced that it is one of the few practical approaches to the problem (as far as I know). The publication remains to be completed, however! (cough, cough)

Ran Sun

Thesis: Statistical Learning for High-Dimensional Networked Data in Transportation Systems

Ran is a researcher in transportation science and will be going on to do a postdoc at U of Michigan! Ran has a clear research direction: finding and exploiting low-dimensional representations of traffic data in road networks. It is surprising how complicated analyses of traffic data over networks can get, and there are many complications (in contrast to the simplified problems that we typically study in graph signal processing). For example, the data may be traffic flows from sensors in traffic lanes. These flows are directed, noisy, and measured on the edges of the network. Also, traffic networks are heterogeneous and only partially observed (you have to do some work to get from the road network and traffic to the actual graph). He has three distinct chapters that address real and important problems in transportation science using new graph signal processing methods. Here is an image of a road network in LA…