The Machine Learning Lunch Seminar is a weekly series, covering all areas of machine learning theory, methods, and applications. Each week, over 70 students and faculty from across Rice gather for a catered lunch, ML-related conversation, and a 45-minute research presentation. If you’re a member of the Rice community and interested in machine learning, please join us! Unless otherwise announced, the ML lunch occurs every Wednesday of the academic year at 12:00pm in Duncan Hall 3092 (the large room at the top of the stairs on the third floor).

The student coordinators are Michael Weylandt, Tan Nguyen, Chen Luo, Lorenzo Luzi, and Cannon Lewis, and the faculty coordinator is Reinhard Heckel. The ML Lunch Seminar is sponsored by Marathon Oil.

Announcements about the ML Lunch Seminars and other ML events on campus are sent to the mailing list. Click here to join.

Catching the Flu: Tracking the Spread of Acute Respiratory Infections in Real Time on a University Campus

April 17, 12:00–1:00 pm in DCH 3092
Speaker: Todd J. Treangen
Please indicate interest, especially if you want lunch, here.
Abstract: Acute respiratory infections (ARIs), such as influenza, account for over 4 million deaths per year worldwide. Limiting the morbidity and mortality of ARIs requires an improved understanding of ARI transmission; however, current diagnostic and screening methods are outpaced by active ARI outbreaks and are agnostic to covariates of transmission. Furthermore, previous studies have yet to combine the relative importance of the social, behavioral, and physical environment in the transmission of ARIs. In this talk, I will present preliminary findings from this interdisciplinary study tracking ARIs on the University of Maryland campus during the current academic year. I will also cover the use of a novel Model of Contagiousness (MoC), in addition to weighted gene correlation network analysis (WGCNA), in this college-dormitory ARI transmission study for the discovery of key ARI transmission covariates.


A simple visual task demonstrating the benefit of lateral or feedback connections

April 10, 12:00–1:00 pm in DCH 3092
Speaker: Josue Ortega Caro
Please indicate interest, especially if you want lunch, here.
Abstract: Currently, the best predictive models of cortical responses are convolutional neural networks. However, these models still have difficulty learning simple visual tasks that involve more abstract or global elements akin to Gestalt rules. The visual system often uses these elements to learn and achieve superior performance on complex visual tasks. How could we introduce these elements into a model? One way could be via an inductive bias realized through architectural features of the underlying neuronal circuits, such as feedback or lateral connections. To test this, we design a classification task that requires distinguishing between open and closed polygons of different gray scales. After extensive training, state-of-the-art feed-forward architectures like ResNet-18 perform poorly on the task, despite having hundreds of thousands of training examples. In contrast, adding minimal feedback or lateral connections enables the network to dramatically improve performance. In addition, by using image sampling via feature matching, we found that sensitivity to the open polygon decreases over time. Together, these results strongly suggest that feedback connections suppress information about task-irrelevant features. Our work shows how a minimal neurally inspired architectural bias can dramatically affect a network’s performance and learnability on a perceptual task.


FlatCam: Thin Lensless Camera

April 3, 12:00–1:00 pm in DCH 3092
Speaker: Jasper Tan
Please indicate interest, especially if you want lunch, here.
Abstract: In this talk, I will discuss the FlatCam, our lab’s design for an ultra-thin lensless camera. Many applications, such as biomedical settings or the Internet of Things, would benefit from having very thin and inexpensive imaging systems. However, most imaging system designs require physical lenses, which place restrictions on the overall device size. Our solution is to design a camera wherein we completely remove the physical lens and replace it with a thin mask. In this talk, I will explain how such a lensless mask-based camera works and discuss the FlatCam’s applications in microscopy and inference. I will also discuss opportunities for collaboration, especially those involving machine learning. As with any imaging talk, I will show many pretty pictures.
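For intuition, a mask-based camera can be modeled as a linear system; FlatCam in particular uses a separable model of roughly the form Y = Φ_L · X · Φ_Rᵀ. The NumPy toy below, with made-up dimensions and random matrices standing in for the calibrated mask response, shows why separability makes reconstruction tractable; the real pipeline additionally handles noise, regularization, and calibration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: an n x n scene measured by an m x m sensor.
n, m = 32, 48
Phi_L = rng.standard_normal((m, n))   # stand-ins for the calibrated
Phi_R = rng.standard_normal((m, n))   # row/column transfer matrices

X = rng.random((n, n))                # unknown scene
Y = Phi_L @ X @ Phi_R.T               # simulated (noiseless) measurements

# Separability lets us invert two small m x n systems instead of one
# giant (m*m) x (n*n) system.
X_hat = np.linalg.pinv(Phi_L) @ Y @ np.linalg.pinv(Phi_R).T

print(np.max(np.abs(X - X_hat)))      # essentially zero here (no noise)
```

With noisy measurements, the two pseudo-inverses would be replaced by a regularized least-squares solve, but the separable structure carries over unchanged.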


Causal Inference

March 19, 12:00–1:00 pm in DCH 3092
Speaker: Ashutosh Sabharwal
Please indicate interest, especially if you want lunch, here.
Abstract: Prof. Sabharwal will be speaking on causal inference, with a focus on the specific challenges associated with establishing links between health interventions and outcomes, as well as work the Scalable Health Labs are doing in this domain.


Taking Assortment Optimization From Theory to Practice: Evidence From Large Field Experiments on Alibaba

March 19, 12:00–1:00 pm in DCH 3092
Speaker: Jacob Feldman
Please indicate interest, especially if you want lunch, here.
Abstract: We compare the performance of two approaches for finding the optimal set of products to display to customers landing on Alibaba’s two online marketplaces, Tmall and Taobao. Both approaches were placed online simultaneously and tested on real customers for one week. The first approach we test is Alibaba’s current practice. This procedure embeds hundreds of product and customer features within a sophisticated machine learning algorithm that is used to estimate the purchase probabilities of each product for the customer at hand. The products with the largest expected revenue (revenue * predicted purchase probability) are then made available for purchase. The downside of this approach is that it does not incorporate customer substitution patterns; the estimates of the purchase probabilities are independent of the set of products that eventually are displayed. Our second approach uses a featurized multinomial logit (MNL) model to predict purchase probabilities for each arriving customer. In this way we use less sophisticated machinery to estimate purchase probabilities, but we employ a model that was built to capture customer purchasing behavior and, more specifically, substitution patterns. We use historical sales data to fit the MNL model and then, for each arriving customer, we solve the cardinality-constrained assortment optimization problem under the MNL model online to find the optimal set of products to display. Our experiments show that despite the lower prediction power of our MNL-based approach, it generates 28% higher revenue per visit compared to the current machine learning algorithm with the same set of features. We also conduct various heterogeneous-treatment-effect analyses to demonstrate that the MNL approach performs best for sellers whose customers generally only make a single purchase.
In addition to developing the first full-scale, choice-model-based product recommendation system, we also shed light on new directions for improving such systems for future use.
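For context, here is a minimal sketch of the MNL machinery the second approach rests on: choice probabilities with a no-purchase option, the expected revenue of an assortment, and a brute-force solution of the cardinality-constrained problem. The attraction weights and revenues are made up, and real systems replace the brute force with revenue-ordered or LP-based methods:

```python
from itertools import combinations

def mnl_choice_prob(j, S, v):
    """P(customer buys j | assortment S) under MNL; v[j] = exp(utility_j).
    The outside (no-purchase) option has weight 1."""
    return v[j] / (1.0 + sum(v[k] for k in S))

def expected_revenue(S, v, r):
    return sum(r[j] * mnl_choice_prob(j, S, v) for j in S)

def best_assortment(v, r, K):
    """Brute-force cardinality-constrained assortment optimization.
    Fine for small n; impractical at marketplace scale."""
    n = len(v)
    best, best_rev = (), 0.0
    for k in range(1, K + 1):
        for S in combinations(range(n), k):
            rev = expected_revenue(S, v, r)
            if rev > best_rev:
                best, best_rev = S, rev
    return best, best_rev

# Hypothetical attraction weights and revenues for 5 products.
v = [1.0, 0.8, 0.5, 2.0, 0.3]
r = [10.0, 12.0, 30.0, 4.0, 25.0]
S, rev = best_assortment(v, r, K=2)
print(S, rev)  # (2, 4) 12.5
```

Note that the optimum here is not simply the two products with the highest individual expected revenue: displaying a high-attraction, low-revenue product (like product 3) can cannibalize purchases of more profitable ones, which is exactly the substitution effect the MNL model captures.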


What’s So Hard About Natural Language Understanding?

March 6, 12:00–1:00 pm in DCH 3092
Speaker: Alan Ritter
Please indicate interest, especially if you want lunch, here.
Abstract: In recent years, advances in speech recognition and machine translation (MT) have led to wide adoption, for example, helping people issue voice commands to their phones and talk with people who do not speak the same language. These advances are made possible by the use of neural network methods on large, high-quality datasets. However, computers still struggle to understand the meaning of natural language. In this talk, I will present two efforts to scale up natural language understanding, drawing inspiration from recent successes in speech and MT. First, I will discuss conversational agents that are learned from scratch in a purely data-driven way, by adapting techniques from statistical machine translation. In the second part of the talk, I will describe an effort to extract structured knowledge from text, without relying on slow and expensive human labeling. Our approach combines the benefits of structured learning and neural networks and accurately predicts sentence-level relation mentions given only indirect supervision from a knowledge base. By explicitly reasoning about missing data during learning, this method enables large-scale training of convolutional neural networks while mitigating the issue of label noise inherent in distant supervision. Our method achieves state-of-the-art results on minimally supervised sentence-level relation extraction, outperforming several baselines, including a competitive approach that uses the attention layer of a purely neural model.


Automated Characterization of Subannual Urbanization Dynamics in Houston using Satellite Remote-Sensing

Feb. 27, 12:00–1:00 pm in DCH 3092
Speaker: Chris Hakkenberg
Please indicate interest, especially if you want lunch, here.
Abstract: The steady deployment of space-borne remote sensing platforms in recent decades, coupled with the development of burgeoning technologies in computer science and statistics, is enabling novel insights into the mechanics of urban socio-ecological systems at increasingly fine spatio-temporal resolutions and vast extents. Land cover time series generated from remotely-sensed imagery offer the potential to characterize increasingly precise landscape dynamics previously elusive with temporally-sparse time series. However, the derivation of temporally-dense urbanization time series is constrained due to sparse acquisitions of high-quality imagery, inter-scene variation and inconsistency, as well as subpixel variability in heterogeneous environments. In this study, we address all three concerns, using the novel automatic adaptive signature generalization and regression (AASGr) algorithm to generate a unique product: a subannual, continuous fields fractional cover time series of the rapidly urbanizing Greater Houston area from 1997-2018. Automated characterization of continuous fields land cover change offers several advantages over discrete classification approaches by more precisely characterizing heterogeneity and intensity among cover types in spatially complex areas. We employ this unique data product for continuous landscape gradient analysis to precisely quantify higher-order spatio-temporal dynamics of urbanization trends like periodicity, abrupt transitions, and time lags at a level of thematic and temporal precision (e.g. onset, duration, etc.) unachievable with discrete class, annual products. 
In addition to providing unique insights into the relationship between Houston’s spatio-temporal urbanization morphologies and their underlying socio-economic drivers, results shed light on climate change mitigation efforts, landscape urban hydrology, and landscape conservation planning for one of the United States’ largest, most diverse, and fastest growing urban systems.


Will Machine Learning help my Underground Dark Matter Telescopes? 

Feb. 20, 12:00–1:00 pm in DCH 3092
Speaker: Chris Tunnell
Please indicate interest, especially if you want lunch, here.
Abstract: The new Artificial Intelligence (AI) spring has borne the fruits of ‘deep learning’ with neural networks, in tandem with a more general, rejuvenated interest in machine learning (ML) techniques. In response, scientists are exploring which classes of problems are now tractable, and this is an active area of interdisciplinary research. Within particle physics, there is a long history of adopting ML techniques, even during the long so-called AI winter. ML went mainstream in particle physics with the 2006 ML-enabled discovery of single top-quark production. Since then, the field of particle physics, which used to pioneer new computational techniques (e.g., the CERN LHC), has been overtaken in various ways by astronomy (e.g., LSST) on a range of issues, from cyberinfrastructure limitations to the types of analyses that are performed. For example, the major breakthroughs in ML technology appear to have come from the field of computer vision (the dog/cat photo classifier), which is more easily applicable to astronomy, given that astronomical data are also imaging data. The concept of ‘seeing’ has a different meaning when discussing particles other than photons.

In this talk, I’ll give a subjective history of machine learning’s place within particle physics and astronomy, especially in relation to this post-TensorFlow era in applied machine learning. I will then discuss, at a high level, the kind of science and experimentation that astroparticle physicists are doing to try to understand the nature of dark matter, so that we understand the types of datasets and problems that are encountered. Subsequently, I’ll discuss the types of challenges, including computational ones, that we face in applying ML techniques to our field. Then I will optimistically present different classes of problems we face where ML (including manifold learning) appears to yield major advances in how we do our science.


Deep Neural Networks in Geo-Inverse Problems 

Feb. 13, 12:00–1:00 pm in DCH 3092
Speaker: Maarten de Hoop
Please indicate interest, especially if you want lunch, here.
Abstract: Prof. de Hoop’s work touches a wide range of areas, including inverse problems, multi-dimensional imaging, non-linear boundary value problems, multi-scale methods, and massively parallel algorithms for structured matrix operations, all with applications in seismology. He will be speaking to us on applications of machine learning to (some of) these areas.


K-nearest-neighbors and Graph Thresholding with Multi-armed Bandits

Feb. 6, 12:00–1:00 pm in DCH 3092
Speaker: Daniel LeJeune
Please indicate interest, especially if you want lunch, here.
Abstract: Many problems of interest can be reduced to sequential decision making or multi-armed bandit problems, in which we seek to optimally handle the exploration-exploitation tradeoff. In this talk, I will present two recent works that fall into this category. In the first, we show that by considering the k-nearest-neighbor problem as a multi-armed bandit problem, we can adapt our computation to the specific problem instance rather than to the worst case. In the second, we address the problem of identifying level sets of functions defined on the nodes of a graph and show how leveraging the information sharing between nodes allows us to converge much more quickly to the correct level set compared to algorithms that ignore the graph structure.
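The first work’s central idea, treating each candidate point in nearest-neighbor search as a bandit arm whose distance is estimated from a subsample of coordinates, can be sketched as follows. This is a hedged illustration on made-up data, using a generic successive-elimination rule with empirical confidence radii, not the authors’ exact algorithm or bounds:

```python
import numpy as np

rng = np.random.default_rng(1)

def bandit_nearest_neighbor(X, q, batch=32, delta=0.01):
    """Find argmin_i ||X[i] - q||^2 by adaptively sampling coordinates.
    Each candidate is an 'arm'; sampled per-coordinate squared
    differences estimate its mean squared distance. Arms whose lower
    confidence bound exceeds the best upper bound are eliminated."""
    n, d = X.shape
    perm = rng.permutation(d)          # sample coordinates without replacement
    active = np.arange(n)
    sums, sumsq, used = np.zeros(n), np.zeros(n), 0
    while used < d and len(active) > 1:
        coords = perm[used:used + batch]
        diffs = (X[np.ix_(active, coords)] - q[coords]) ** 2
        sums[active] += diffs.sum(axis=1)
        sumsq[active] += (diffs ** 2).sum(axis=1)
        used += len(coords)
        means = sums[active] / used
        var = np.maximum(sumsq[active] / used - means ** 2, 0.0)
        radii = (np.sqrt(2 * var * np.log(2 * n / delta) / used)
                 + 3 * np.log(2 * n / delta) / used)
        active = active[means - radii <= (means + radii).min()]
    if len(active) > 1:                # all coordinates used: exact tie-break
        dists = ((X[active] - q) ** 2).sum(axis=1)
        active = active[[np.argmin(dists)]]
    return int(active[0])

X = rng.standard_normal((200, 1000))
q = X[17] + 0.01 * rng.standard_normal(1000)   # query very close to point 17
idx = bandit_nearest_neighbor(X, q)
print(idx)  # 17
```

Because far-away candidates are eliminated after only a few coordinate batches, most of the 1000-dimensional distance computations are never completed, which is the source of the instance-adaptive speedup described in the talk.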


A Max Affine Spline Perspective on Recurrent Neural Networks

Jan. 30, 12:00–1:00 pm in DCH 3092
Speaker: Jack Wang
Please indicate interest, especially if you want lunch, here.
Abstract: We develop a framework for understanding and improving recurrent neural networks (RNNs) using max-affine spline operators (MASOs). We prove that an RNN using piecewise-affine and convex non-linearities can be written as a simple piecewise-affine spline operator. The resulting representation provides several new perspectives for analyzing RNNs, three of which we study in this paper. First, we show that an RNN internally partitions the input space during training and that it builds up the partition through time. Second, we show that the affine parameter of an RNN corresponds to an input-specific template, from which we can interpret an RNN as performing simple template matching (matched filtering) given the input. Third, by closely examining the MASO RNN formula, we prove that injecting Gaussian noise into the initial hidden state of an RNN corresponds to an explicit L2 regularization on the affine parameters, which relates to the exploding-gradient issue and improves generalization. Extensive experiments on several data sets of various modalities demonstrate and validate each of the above analyses. In particular, using noisy initial hidden states elevates simple RNNs to state-of-the-art performance on these data sets.
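The key MASO fact, that a ReLU RNN computes an input-dependent affine (“template”) map, can be checked numerically. The sketch below, with made-up weights and dimensions, runs a tiny Elman-style ReLU RNN and rebuilds its final hidden state as a single affine map A·x + c determined by the activation pattern:

```python
import numpy as np

rng = np.random.default_rng(0)
relu = lambda z: np.maximum(z, 0.0)

# A tiny ReLU (Elman) RNN with hypothetical dimensions and weights.
h_dim, x_dim, T = 8, 4, 5
W = 0.5 * rng.standard_normal((h_dim, h_dim))
U = rng.standard_normal((h_dim, x_dim))
b = rng.standard_normal(h_dim)
xs = rng.standard_normal((T, x_dim))

# 1) Ordinary forward pass.
h = np.zeros(h_dim)
for t in range(T):
    h = relu(W @ h + U @ xs[t] + b)

# 2) The same computation as one affine map (the MASO view): for the
#    activation pattern this input selects, relu(z) = D z with
#    D = diag(z > 0), so h_T = A @ vec(x_1..x_T) + c.
h2 = np.zeros(h_dim)
A = np.zeros((h_dim, T * x_dim))
c = np.zeros(h_dim)
for t in range(T):
    z = W @ h2 + U @ xs[t] + b
    D = np.diag((z > 0).astype(float))   # input-dependent ReLU mask
    A = D @ W @ A                        # propagate the affine map in time
    A[:, t * x_dim:(t + 1) * x_dim] = D @ U
    c = D @ (W @ c + b)
    h2 = relu(z)

# The affine ("template") form reproduces the network exactly.
print(np.allclose(h, A @ xs.ravel() + c))  # True
```

The rows of A are exactly the input-specific templates the abstract refers to: the network’s output is a matched filter of the input sequence against them, valid within the region of input space sharing this activation pattern.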



Data-Driven Energy Informatics

Jan. 23, 12:00–1:00 pm in DCH 3092
Speaker: Cristina Heghedus
Please indicate interest, especially if you want lunch, here.
Abstract: The large-scale automation of appliances and the adoption of energy-hungry utilities in homes have the potential to place a huge strain on the current energy infrastructure. Most energy supply-chain infrastructure around the world is incapable of handling large, concentrated energy demands. Electric energy suppliers are therefore challenged to make accurate, granular forecasts of future electricity demand, ensuring efficiency and preventing energy waste and theft. Within data analysis, machine learning and deep learning models are used effectively for electricity demand forecasting, price forecasting, peak prediction, and so on. Data analysis has already brought numerous advantages to the energy field and has begun to take hold in the oil and gas industry as well. That industry now operates huge numbers of sensors installed in different facilities, particularly in production and injection wells. These sensors provide millions of measurements, such as pressure, temperature, and flow rate, every year for every well. We apply deep learning models, which have already produced remarkable results in energy informatics, to such measurements. The reasonable performance of these models on the given data sets (electricity, oil and gas, transportation) brings new benefits to the energy field, facilitating decision making and efficient resource management.


Image data and analysis for tropical conservation

Jan. 16, 12:00–1:00 pm in DCH 3092
Speaker: Lydia Beaudrot
Please indicate interest, especially if you want lunch, here.
Abstract: Over the past two decades, trail cameras that photograph animals as they pass by have become a primary method for collecting data on wildlife. These images have enabled ecological and conservation research that was not previously possible. In this talk, I will 1) provide examples of how we have employed statistical models to address conservation questions, 2) describe limitations in current statistical approaches, and 3) discuss future directions for applications of machine learning.


Locality Sensitive Hashing for Large Scale Machine Learning

Jan. 9, 12:00–1:00 pm in DCH 3092
Speaker: Anshumali Shrivastava
Please indicate interest, especially if you want lunch, here.
Abstract: Anshu will introduce some recent work from the Rush Lab, whose mission is to push machine learning to extreme scale. The lab designs and implements exponentially resource-frugal and scalable machine learning (ML) algorithms using randomized hashing and sketching, suited to modern big-data constraints. Apart from being exponentially cheap, the algorithms are embarrassingly parallelizable. Their extremely low resource requirements also make these techniques ideal for IoT devices. Furthermore, the algorithms are naturally privacy-preserving, as they do not work directly with data attributes and instead operate only on secure hashes or sketches.
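As a concrete (textbook) instance of the kind of randomized hashing described, the sketch below uses signed random projections (SimHash), whose bits collide with probability 1 − angle/π, to estimate the angle between two vectors from their hashes alone. This is illustrative only, not one of the lab’s specific algorithms:

```python
import numpy as np

rng = np.random.default_rng(0)

def simhash_signatures(X, n_bits=256):
    """Signed-random-projection LSH (SimHash): each bit is the sign of a
    random projection, so two vectors agree on a bit with probability
    1 - angle/pi."""
    planes = rng.standard_normal((X.shape[1], n_bits))
    return X @ planes > 0

# Two random vectors; recover their angle from hash agreements alone.
X = rng.standard_normal((2, 100))
sig = simhash_signatures(X, n_bits=4096)

agree = np.mean(sig[0] == sig[1])
est_angle = np.pi * (1.0 - agree)
true_angle = np.arccos(X[0] @ X[1]
                       / (np.linalg.norm(X[0]) * np.linalg.norm(X[1])))
print(f"estimated angle {est_angle:.2f} rad vs true {true_angle:.2f} rad")
```

The point of the hash is that the compact bit signatures, not the raw 100-dimensional attributes, are all that need to be stored or compared, which is also what makes such schemes cheap and naturally privacy-preserving.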

