The Machine Learning Lunch Seminar is a weekly series, covering all areas of machine learning theory, methods, and applications. Each week, over 70 students and faculty from across Rice gather for a catered lunch, ML-related conversation, and a 45-minute research presentation. If you’re a member of the Rice community and interested in machine learning, please join us! Unless otherwise announced, the ML lunch occurs every Wednesday of the academic year at 12:00pm in Duncan Hall 3092 (the large room at the top of the stairs on the third floor).

The student coordinators are Michael Weylandt, Tan Nguyen, Chen Luo, Lorenzo Luzi, and Cannon Lewis, and the faculty coordinator is Reinhard Heckel. The ML Lunch Seminar is sponsored by EOG Resources.

Announcements about the ML Lunch Seminars and other ML events on campus are sent to the ml-l@rice.edu mailing list. Click here to join.


Learning-Optimized Phase Imaging via Unrolled Algorithms

Nov. 14, 12:00–1:00 pm in DCH 3092
Speaker: Emrah Bostan
Please indicate interest, especially if you want lunch, here.
Abstract: Computational imaging systems aim to co-design algorithms and imaging hardware to go beyond what is achievable by their conventional counterparts. As a result of the recent success of deep learning in various applications, design paradigms in computational imaging are now centered around data-driven approaches, where one obtains a parametrized relationship between given inputs and outputs by training. However, such end-to-end frameworks are typically model-specific and do not necessarily consider physical or mathematical attributes of the imaging system. In this talk, we develop data-driven frameworks that can incorporate our prior information about the system physics and the reconstruction algorithm into the learning process as design constraints. To illustrate the advantages of our method, we consider phase retrieval from coded-illumination measurements and propose a physics-based learning scheme for optimizing the illumination. In particular, using only 2 measurements recorded with the learned illumination designs, we image 3T3 embryonic fibroblast cells and obtain phase reconstructions that are similar to those retrieved by Fourier ptychographic techniques with 69 measurements.
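The unrolling idea behind such physics-based learned reconstruction can be sketched in a few lines: the iterations of a classical solver become the "layers" of a network, and per-iteration quantities become learnable parameters. The least-squares problem, the step sizes playing the role of learned parameters, and all dimensions below are illustrative assumptions, not the speaker's actual imaging setup:

```python
import numpy as np

def unrolled_gradient_descent(A, y, step_sizes):
    """Unroll K iterations of gradient descent on ||Ax - y||^2.
    Each entry of step_sizes acts as one "layer"; in a learned
    unrolled scheme, these would be optimized by training."""
    x = np.zeros(A.shape[1])
    for t in step_sizes:                  # one layer per iteration
        x = x - t * (A.T @ (A @ x - y))   # gradient step on the data fit
    return x

rng = np.random.default_rng(0)
A = rng.normal(size=(8, 4))               # toy forward model
x_true = rng.normal(size=4)
y = A @ x_true                            # noiseless measurements
x_hat = unrolled_gradient_descent(A, y, step_sizes=[0.02] * 300)
print(np.linalg.norm(A @ x_hat - y) < np.linalg.norm(y))  # residual shrinks
```

In a learned variant, one would backpropagate through the unrolled iterations to fit the step sizes (or, as in the talk, the illumination patterns) to training data.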

Systems for Large-Scale Machine Learning: Where We Are, and Where We Need to Go

Nov. 7, 12:00–1:00 pm in DCH 3092
Speaker: Chris Jermaine
Please indicate interest, especially if you want lunch, here.
Abstract: Modern, deep neural networks can do amazing things, performing tasks that would have seemed impossible a decade ago. While there have been significant advances in modeling and algorithms, arguably much of the expansion in the capabilities of deep learning systems has been enabled by advances in computer systems. On the hardware side, GPUs have made it possible to train very large and deep networks in reasonable time. On the software side, modern systems for automatic differentiation such as Google’s TensorFlow make it easy for a programmer to implement learning algorithms for even the most complex networks.

However, one area in which systems for machine learning are wanting is in their support for distributed learning—scaling machine learning algorithms to clusters of machines or multiple GPUs. Getting machine learning computations to work in a distributed setting is often very challenging. The dominant abstraction for distributed learning is the so-called “parameter server”, which is essentially a key-value store. A programmer wishing to distribute a learning computation on top of a parameter server often ends up manually distributing computation and data to machines and/or processors. This is not always easy, and can result in brittle codes that are not easily adapted to changes in hardware and/or data. In this talk, I will argue that designers of machine learning systems are making some of the same errors made by the designers of early data management systems. ML system designers would be better off trying to develop with a basic theory of distributed learning system design, and using that as a guide to building real-life systems. It may make sense to start with a relational-style system, utilizing decades of theory and practice related to ensuring the independence of code from the underlying data and hardware.
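For readers unfamiliar with the abstraction, a parameter server is essentially a key-value store that workers pull current parameters from and push gradients to. The toy class, the key name "w", the learning rate, and the least-squares task below are illustrative assumptions, not the API of any particular system:

```python
import numpy as np

class ParameterServer:
    """Minimal key-value parameter server: workers pull the current
    parameters and push gradients, which the server applies via SGD."""
    def __init__(self, lr=0.05):
        self.store = {}   # key -> parameter array
        self.lr = lr

    def pull(self, key):
        return self.store[key].copy()

    def push(self, key, grad):
        # Apply a (possibly stale) gradient from a worker.
        self.store[key] -= self.lr * grad

# Two "workers" fitting y = w*x by least squares on disjoint data shards.
server = ParameterServer(lr=0.05)
server.store["w"] = np.zeros(1)
rng = np.random.default_rng(0)
x = rng.normal(size=100)
y = 3.0 * x
shards = [(x[:50], y[:50]), (x[50:], y[50:])]
for step in range(200):
    for xs, ys in shards:                        # each worker in turn
        w = server.pull("w")                     # pull parameters
        grad = -2 * np.mean((ys - w * xs) * xs)  # local gradient
        server.push("w", np.array([grad]))       # push update
print(round(float(server.store["w"][0]), 2))     # converges near 3.0
```

Note how even this toy forces the programmer to decide manually how data is sharded and how workers are scheduled; that is exactly the brittleness the talk argues a higher-level, relational-style abstraction could remove.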

 

Some Results (and Open Problems) in Optimization and Generalization for Neural Networks

Oct. 31, 12:00–1:00 pm in DCH 3092
Speaker: Hamid Javadi
Please indicate interest, especially if you want lunch, here.
Abstract: In the first part of this talk I will present some ideas and results regarding optimization in neural networks using random features. In the second part, I will discuss the problem of compressed sensing (denoising) using a learned (approximate) deep prior and I will make some connections to a few important features of a neural network which are related to some open problems regarding generalization in neural networks.

 

Three Concepts in ML that Require Rethinking: Implicit Regularization, Over-Parameterization, and Momentum Acceleration

Oct. 24, 12:00–1:00 pm in DCH 3092
Speaker: Anastasios Kyrillidis
Please indicate interest, especially if you want lunch, here.
Abstract: In theory, most machine learning systems are hard to train. Take as an example deep neural networks, which prevail in modern computer science research. Despite these difficulties, such systems have driven the recent success of machine learning and artificial intelligence in real-life applications. This antithesis between theory and practice has sparked the algorithmic research community’s interest in better understanding how training algorithms generate models that generalize well on unseen data.

In this talk, we will discuss three concepts in machine learning that have attracted increased interest lately: algorithmic implicit regularization, model over-parameterization, and the use of momentum acceleration. In each case, we will succinctly review the most recent literature and examine the prevailing perspectives on these matters. For each case, though, we will propose alternatives in theory and practice showing that all three concepts still require further study and that many questions remain wide open.

 

Deep Decoder: Concise Image Representations From Untrained Non-Convolutional Networks

Oct. 17, 12:00–1:00 pm in DCH 3092
Speaker: Reinhard Heckel
Please indicate interest, especially if you want lunch, here.
Abstract: Deep neural networks, in particular convolutional neural networks, have become highly effective tools for compressing images and solving inverse problems including denoising, inpainting, and reconstruction of images from few and noisy measurements. This success can be attributed in part to their ability to generate or represent images well. However, contrary to methods based on classical image models such as wavelets, image-generating deep neural networks have a large number of parameters—typically a multiple of their output dimension—and need to be trained on large datasets. In this work, we propose a simple, untrained image model in the form of a deep neural network that can generate natural images from very few parameters, and demonstrate that as a consequence, this network enables efficient compression of images and solving of inverse problems like denoising. Contrary to previous deep image generators (trained or not), the network is underparameterized and thus conforms with the classical perspective that an efficient model maps a low-dimensional parameter space to a high-dimensional image space. In addition, the model highlights aspects of deep networks that enable efficient image representations: it consists of upsampling units, linear combinations of channels, and ReLU non-linearities, and most importantly, does not require convolutional layers.
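The architecture described above is simple enough to sketch directly: each layer upsamples, mixes channels with a pixel-wise linear map, and applies a ReLU, with no convolutions. The sketch below uses nearest-neighbor upsampling and arbitrary dimensions for illustration; the actual deep decoder's design choices (e.g., bilinear upsampling and channel normalization) may differ:

```python
import numpy as np

def decoder_layer(x, W):
    """One layer of the recipe: upsampling, a per-pixel linear
    combination of channels (a 1x1 mixing, not a convolution), ReLU."""
    # Nearest-neighbor upsample by 2 in both spatial dimensions.
    x = x.repeat(2, axis=0).repeat(2, axis=1)
    # Linear combination of channels at every pixel.
    x = x @ W
    return np.maximum(x, 0.0)  # ReLU

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 4, 16))            # low-dimensional parameter tensor
weights = [rng.normal(size=(16, 16)) for _ in range(3)]
for W in weights:                          # 4 -> 8 -> 16 -> 32 spatially
    x = decoder_layer(x, W)
out = x @ rng.normal(size=(16, 1))         # final channel mix to one image
print(out.shape)                           # (32, 32, 1)
```

The parameter count here is just the channel-mixing matrices plus the input tensor, far fewer than the output pixels, which is the underparameterization the abstract emphasizes.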

 

ML Research Careers in Industrial R&D and National Labs Environments

Oct. 10, 12:00–1:00 pm in DCH 3092
Speaker: Senior Students from Rice
Please indicate interest, especially if you want lunch, here.
Abstract: We will have presentations from students who have done internships at ML-focused start-ups, ML research groups within large corporations, and national labs, followed by a Q&A session.
We hope that this discussion will be particularly useful for junior students who are considering ML internships next summer and for more senior students who are deciding what to do after they graduate.
On the topic of career paths in ML, I’ve included two great opportunities below – one in Palo Alto and one in Oslo. You can travel the world with ML!

 

Programmatically Interpretable Reinforcement Learning

Oct. 3, 12:00–1:00 pm in DCH 3092
Speaker: Abhinav Verma
Please indicate interest, especially if you want lunch, here.
Abstract:  We present a reinforcement learning framework, called Programmatically Interpretable Reinforcement Learning (PIRL), that is designed to generate interpretable and verifiable agent policies. Unlike the popular Deep Reinforcement Learning (DRL) paradigm, which represents policies by neural networks, PIRL represents policies using a high-level, domain-specific programming language. Such programmatic policies have the benefits of being more easily interpreted than neural networks, and being amenable to verification by symbolic methods. We propose a new method, called Neurally Directed Program Search (NDPS), for solving the challenging nonsmooth optimization problem of finding a programmatic policy with maximal reward. NDPS works by first learning a neural policy network using DRL, and then performing a local search over programmatic policies that seeks to minimize a distance from this neural “oracle”. We evaluate NDPS on the task of learning to drive a simulated car in the TORCS car-racing environment. We demonstrate that NDPS is able to discover human-readable policies that pass some significant performance bars. We also show that PIRL policies can have smoother trajectories, and can be more easily transferred to environments not encountered during training, than corresponding policies discovered by DRL.
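The two-stage structure of NDPS—train a neural oracle, then search programmatic policies for the one closest to it—can be illustrated on a toy 1-D control task. The proportional-controller program space, the oracle, and the L2 distance below are illustrative assumptions, not the actual TORCS setup or policy language:

```python
import numpy as np

def ndps_sketch(oracle, states, candidate_programs):
    """Toy NDPS search step: among candidate programmatic policies,
    pick the one closest (mean squared distance) to the neural
    'oracle' on a set of sampled states."""
    targets = np.array([oracle(s) for s in states])
    best, best_dist = None, np.inf
    for prog in candidate_programs:
        outs = np.array([prog(s) for s in states])
        dist = np.mean((outs - targets) ** 2)
        if dist < best_dist:
            best, best_dist = prog, dist
    return best

# Hypothetical 1-D steering task: the "oracle" steers proportionally.
oracle = lambda s: 0.5 * s
# Program space: proportional controllers with different gains.
candidates = [lambda s, k=k: k * s for k in (0.1, 0.4, 0.5, 0.9)]
states = np.linspace(-1, 1, 21)
best = ndps_sketch(oracle, states, candidates)
print(best(1.0))  # the k=0.5 controller matches the oracle
```

The real method searches a much richer, domain-specific program space and interleaves search with rollouts, but the key idea—using the neural policy as a distance target rather than optimizing reward directly over programs—is the same.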

 

Algorithmic Regularization for Efficient Large-Scale Statistical Machine Learning

Sep. 26, 12:00–1:00 pm in DCH 3092
Speaker: Michael Weylandt
Please indicate interest, especially if you want lunch, here.
Abstract:  In this talk, I introduce algorithmic regularization, a novel approach to efficient computation of the solution path of penalized estimators. The power of algorithmic regularization is illustrated with an extended application to convex clustering, where it reduces computation time from almost a day to under a minute. This speed-up allows us to develop `clustRviz`, a tool for interactive and dynamic visualization of convex clustering solutions. Next, I will show some theoretical guarantees for algorithmic regularization and demonstrate its accuracy with a simulation study. Finally, I will discuss the application of algorithmic regularization to other problems in machine learning, and discuss some future directions. This talk is based on joint work with Genevera Allen, John Nagorski, and Yue Hu.

 

Building Tools for Data-Driven Discovery: Graphical Models & Data Integration

Sep. 19, 12:00–1:00 pm in DCH 3092
Speaker: Genevera Allen
Please indicate interest, especially if you want lunch, here.
Abstract:  In this two-part talk, I will present two major research projects motivated by large and complex data sets in neuroscience and genomics.  First, and following up on last week’s presentation, I will present current work in my group on graphical models for estimating functional connectivity from large-scale neural recordings.  Second, I will present recent work on graphical models and dimension reduction methods for integrating data from diverse sources.

 

Inferring Interactions between Neurons, Stimuli, and Behavior

Sep. 12, 12:00–1:00 pm in DCH 3092
Speaker: Genevera Allen (STAT + BCM), Ankit Patel (BCM + ECE), Xaq Pitkow (ECE + BCM), Krešimir Josić (UH + BIOS), and Andreas Tolias (BCM + ECE)
Please indicate interest, especially if you want lunch, here.
Abstract:  Our understanding of the properties of individual neurons in the neocortex and their role in brain computations has advanced significantly over the last decades. However, we are still far from understanding how large assemblies of cells interact to process stimulus information and generate behavior. These causal interactions are not a static property of neural circuits, but can change rapidly depending on the current task and global brain state. As the BRAIN Initiative develops critical experimental tools to record massive numbers of neural activities in functioning brains, we need mathematical tools to analyze and interpret these data. Currently there are no high-throughput methods available to measure neural interactions in cortical circuits in vivo and relate them to stimuli and behavior. We will develop statistical tools to achieve this goal. We propose to infer the interactions between neurons, stimuli, and behavior, while incorporating sparse connectivity constraints and accounting for latent common inputs. Our first aim is to optimize our statistical method of inferring synaptic efficacy in silico by applying it to large, realistic artificial neural networks. We will empirically validate the method in mouse cortex by combining high-speed 3D two-photon imaging in vivo with multi-cell patching experiments on the same tissue in vitro. Our second aim is to generalize this statistical tool to measure neural computation, by inferring nonlinear interactions simultaneously between the stimulus and neural activity (encoding), and between neural activity and behavior (decoding). We will test these techniques using a variety of models that perform realistic inference in the presence of nuisance variables, including deep convolutional networks that achieve or exceed human performance. While related statistical methods have been used in other disciplines, our approach is novel in neuroscience.
To achieve these goals we have assembled a collaborative team of systems neuroscientists, mathematicians, computational neuroscientists, and statisticians. We are committed to sharing code and data, and we have also assembled a group of End Users to be early adopters and testers of our methods. Our combination of revolutionary experimental and mathematical tools will provide neuroscience with unprecedented insights into the brain’s distributed neural computations.

 

Data Science for Networks

Sep. 5, 12:00–1:00 pm in DCH 3092
Speaker: Santiago Segarra
Please indicate interest, especially if you want lunch, here.
Abstract:  Understanding networks and networked behavior has emerged as one of the foremost intellectual challenges of the 21st century. While we obviously master the technology to engineer transformational networks — from communication infrastructure to online social networks — our theoretical understanding of fundamental phenomena that arise in networked systems remains limited. My goal is to combine network science and signal processing in order to leverage the structure of networks to better understand data defined on them. In this context, the term Data Science for Networks can be understood as a joint effort to understand both network structures and network data. I will introduce the fundamental building blocks of graph signal processing (GSP) as a toolbox to study network data, and delve deeper into the problem of inferring a graph from nodal observations. I will then discuss ongoing work and potential future research directions.
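One of the GSP building blocks the talk introduces, the graph Fourier transform, is defined by the eigendecomposition of the graph Laplacian: eigenvalues play the role of frequencies and eigenvectors the role of Fourier modes. A minimal sketch on a 4-node cycle (the particular graph and signal are illustrative choices):

```python
import numpy as np

# Graph Fourier transform via the combinatorial Laplacian L = D - A.
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)   # 4-node cycle graph
L = np.diag(A.sum(axis=1)) - A              # Laplacian
evals, evecs = np.linalg.eigh(L)            # frequencies and modes

signal = np.array([1.0, -1.0, 1.0, -1.0])   # alternating graph signal
spectrum = evecs.T @ signal                  # graph Fourier transform
# A sign-alternating signal on a cycle is a highest-frequency mode, so
# its energy concentrates on the largest Laplacian eigenvalue (here 4).
print(round(evals[int(np.argmax(np.abs(spectrum)))], 1))
```

With this transform in hand, notions like smoothness, filtering, and sampling of network data carry over from classical signal processing, which is the toolbox perspective the abstract describes.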

 

Predicting mental health and measuring sleep using machine learning and wearable sensors/mobile phones

Aug. 28, 12:00–1:00 pm in DCH 3092
Speaker: Akane Sano
Please indicate interest, especially if you want lunch, here.
Abstract:  Sleep, stress, and mental health have become major health issues in modern society. Poor sleep habits, high stress, and reactions to stressors can depend on many factors: internal factors include personality types and physiological factors; external factors include behavioral, environmental, and social factors. What if 24/7 rich data from mobile devices could identify which factors influence your poor sleep or stress problem and provide personalized early warnings to help you change behaviors before sliding from a good to a bad health condition such as depression? In my talk, I will present a series of studies and systems we have developed to investigate how to leverage multi-modal data from mobile/wearable devices to measure, understand, and improve mental well-being. First, I will talk about methodology and tools I developed for the SNAPSHOT study, which seeks to measure Sleep, Networks, Affect, Performance, Stress, and Health using Objective Techniques. To learn about behaviors and traits that impact health and well-being, we have measured over 200,000 hours of multi-sensor and smartphone use data, as well as trait data such as personality, from about 300 college students exposed to sleep deprivation and high stress. Second, I will describe statistical analysis and machine learning models to characterize, model, and forecast mental well-being using the SNAPSHOT study data. I will discuss behavioral and physiological markers and models that may provide early detection of a changing mental health condition. Third, I will introduce recent projects that might help people to reflect on and change their behaviors to improve their well-being.

 

The Spatial Proximity and Connectivity (SPC) Method for Measuring and Analyzing Residential Segregation

Aug. 22, 12:00–1:00 pm in DCH 3092
Speaker: Elizabeth Roberto
Please indicate interest, especially if you want lunch, here.
Abstract: In recent years, there has been increasing attention to the spatial dimensions of residential segregation—including the spatial arrangement of segregated neighborhoods and the geographic scale or relative size of segregated areas. However, the methods used to measure segregation do not incorporate features of the built environment, such as the road connectivity between locations or the physical barriers that divide groups. My talk will introduce the Spatial Proximity and Connectivity (SPC) method for measuring and analyzing segregation. The method addresses the limitations of current approaches by taking into account how the physical structure of the built environment affects the proximity and connectivity of locations. I will describe the method and demonstrate its application for studying segregation and spatial inequality. The SPC method contributes to scholarship on residential segregation by capturing the effect of an important yet understudied mechanism of segregation—the connectivity, or physical barriers, between locations—on the level and spatial pattern of segregation, and it enables further consideration of the role of the built environment in segregation processes.

 

Photo Gallery: