The next great ACM TechTalk, “Learning From Data: The Two Cultures,” will be presented on Friday, July 9.
In his influential 2001 paper “Statistical Modeling: The Two Cultures,” Leo Breiman identified and contrasted two approaches to statistical modeling: one that assumes there is a probabilistic model generating the data (the data modeling culture) and another that focuses on mapping inputs to outputs through a black box (the algorithmic modeling culture). Twenty years later, there is a growing community of researchers working on methodologies that embrace both cultures. However, when looking at the broader problem of learning from data, to which statistical modeling is one approach, we can identify two cultures held by two separate communities. The first is the statistical modeling culture itself, which starts with a question and/or data. The second, which is driving many of the AI breakthroughs, is the task modeling culture, which corresponds to a task-first approach. We revisit Breiman’s take on statistical modeling and highlight some of the works embracing the two cultures he identified. We then discuss task modeling, highlighting how the failure modes in this culture can be addressed by adopting principles and practices from statistical modeling, e.g., careful data selection and experimental design.
Adji Bousso Dieng is the founder of “The Africa I Know,” a researcher at Google, and an incoming tenure-track assistant professor of computer science at Princeton University.
Adji Bousso Dieng is a Senegalese computer scientist and statistician working in the field of artificial intelligence. She received her PhD in Statistics from Columbia University, where she was advised by David Blei and John Paisley. Her doctoral work, at the intersection of probabilistic graphical modeling and deep learning, received several recognitions, including a Google PhD Fellowship in Machine Learning.
David Blei, Professor, Columbia University; ACM Prize in Computing Recipient
David Blei is a Professor of Statistics and Computer Science at Columbia University, and a member of the Columbia Data Science Institute. He studies probabilistic machine learning and Bayesian statistics, including theory, algorithms, and application. David has received several awards for his research: a Sloan Fellowship (2010), an Office of Naval Research Young Investigator Award (2011), a Presidential Early Career Award for Scientists and Engineers (2011), a Blavatnik Faculty Award (2013), the ACM Prize in Computing (previously known as the ACM-Infosys Foundation Award, 2013), a Guggenheim Fellowship (2017), and a Simons Investigator Award (2019). He is the co-editor-in-chief of the Journal of Machine Learning Research. He is a fellow of the ACM and the IMS.
The event is free and you can register now.
Leave your comments and questions now and any time before the live event on ACM’s Discourse Page. And check out the page after the webcast for extended discussion with your peers in the computing community, as well as further resources on data science, statistical modeling, and more.
(If you can’t make it to the live virtual event, you still need to register to receive a recording of the TechTalk when it becomes available.)
Visit learning.acm.org/techtalks-archive for the full archive of past TechTalks.