The Machine Learning Lunch Seminar is a weekly series covering all areas of machine learning theory, methods, and applications. Each week, over 70 students and faculty from across Rice gather for a catered lunch, ML-related conversation, and a 45-minute research presentation. If you’re a member of the Rice community and interested in machine learning, please join us! Unless otherwise announced, the ML lunch occurs every Wednesday of the academic year at 12:00 pm in Duncan Hall 3092 (the large room at the top of the stairs on the third floor).

The student coordinators are Michael Weylandt, Tan Nguyen, Chen Luo, Lorenzo Luzi, and Cannon Lewis, and the faculty coordinator is Reinhard Heckel. The ML Lunch Seminar is sponsored by Marathon Oil.

Announcements about the ML Lunch Seminars and other ML events on campus are sent to the mailing list. Click here to join.

Leveraging structure in cancer imaging data to predict clinical outcomes
September 18th, 12:00 pm - 1:00 pm in DCH 3092
Speaker: Souptik Barua (ELEC)

Please indicate interest, especially if you want lunch, here.

Immunotherapy and radiation therapy are two of the most prominent strategies used to treat cancer. While both of these treatments have succeeded in removing the disease in many patients and cancer types, they are known not to work well for all patients, sometimes even leading to adverse side effects. There is thus a critical need to predict how patients might respond to these therapies and to design optimal treatment plans accordingly. In this talk, I describe data-driven frameworks that leverage structure in two types of cancer imaging data (multiplexed immunofluorescence, or mIF, images and CT scans) to predict clinical outcomes of interest. In the case of mIF images, I use ideas from spatial statistics and functional data analysis to design metrics that describe the spatial distributions of immune cells in a tumor. I then show that these metrics can predict outcomes such as survival and risk of progression in pancreatic cancer. In the case of CT scans, I use a functional data analysis technique to capture temporal changes in scans acquired at multiple time points, and use those changes to predict two clinical outcomes: a) whether patients undergoing radiation treatment are likely to have a complete response, and b) whether patients are going to develop long-term side effects of radiation treatment, such as osteoradionecrosis.

Strong mixed-integer programming formulations for trained neural networks
September 11th, 12:00 pm - 1:00 pm in DCH 3092
Speaker: Joey Huchette (CAAM)

Please indicate interest, especially if you want lunch, here.

We present mixed-integer programming (MIP) formulations for high-dimensional piecewise linear functions that correspond to trained neural networks. These formulations can be used for a number of important tasks, such as: 1) verifying that an image classification network is robust to adversarial inputs, 2) designing DNA sequences that exhibit desirable therapeutic properties, 3) producing good candidate policies as a subroutine in deep reinforcement learning algorithms, and 4) solving decision problems with machine learning models embedded inside (i.e. the “predict, then optimize” paradigm). We provide formulations for networks with many of the most popular nonlinear operations (e.g. ReLU and max pooling) that are strictly stronger than other approaches from the literature. We corroborate this computationally on image classification verification tasks, where we show that our formulations are able to solve to optimality in orders of magnitude less time than existing methods.

Data Integration: Data-Driven Discovery from Diverse Data Sources
September 4th, 12:00 pm - 1:00 pm in DCH 3092
Speaker: Genevera Allen (ECE)

Please indicate interest, especially if you want lunch, here.

Data integration, or the strategic analysis of multiple sources of data simultaneously, can often lead to discoveries that may be hidden in individual analyses of a single data source. In this talk, we present several new techniques for data integration of mixed, multi-view data where multiple sets of features, possibly each of a different domain, are measured for the same set of samples. This type of data is common in healthcare, biomedicine, national security, multi-sensor recordings, multi-modal imaging, and online advertising, among others. We specifically highlight how mixed graphical models and new feature selection techniques for mixed, multi-view data allow us to explore relationships amongst features from different domains. Next, we present new frameworks for integrated principal components analysis and integrated generalized convex clustering that leverage diverse data sources to discover joint patterns amongst the samples. We apply these techniques to integrative genomic studies in cancer and neurodegenerative diseases to make scientific discoveries that would not be possible from analysis of a single data set.
