DMLR Workshop: Data-centric Machine Learning Research
Ce Zhang · Praveen Paritosh · Newsha Ardalani · Nezihe Merve Gürel · William Gaviria Rojas · Yang Liu · Rotem Dror · Manil Maskey · Lilith Bat-Leah · Tzu-Sheng Kuo · Luis Oala · Max Bartolo · Ludwig Schmidt · Alicia Parrish · Daniel Kondermann · Najoung Kim

This is the third edition of a series of highly successful workshops on data-centric AI, following the Data-Centric AI workshop at NeurIPS 2021 and the DataPerf workshop at ICML 2022. Data and operations over data (e.g., cleaning, debugging, curation) have fueled the success of machine learning for decades. While the ML community has historically focused primarily on model development, the quality of data used for ML training and evaluation has recently attracted intense interest, as shown by the creation of the NeurIPS Datasets and Benchmarks track, several data-centric AI benchmarks (e.g., DataPerf), and the flourishing of data consortiums such as LAION. The goal of this workshop is to facilitate discussion of these important topics in what we call Data-centric Machine Learning Research, which includes not only datasets and benchmarks but also tooling and governance, as well as fundamental research on topics such as data quality and data acquisition for dataset creation and optimization.

Introduction and Opening (Opening Remarks)
Keynote 1: Andrew Ng (Landing AI) (Keynote)
Invited Talk 1: Dina Machuve (DevData Analytics) - Data for Agriculture (Talk)
Coffee break / networking break (Break)
Keynote 2: Mihaela van der Schaar (University of Cambridge) - Data quality (Keynote)
Invited Talk 2: Olga Russakovsky (Princeton University) (Talk)
Invited Talk 3: Masashi Sugiyama (RIKEN & UTokyo) - Data distribution shift (Talk)
Lunch break / networking break (Break)
Keynote 3: Isabelle Guyon (Google Brain) - Data creation (Keynote)
DataPerf Challenge - Peter Mattson (Google & MLCommons) (Talk)
Announcement and open discussion on DMLR (Selected members of DMLR Advisory Board) (Discussion Panel)
Panel Discussion (Discussion Panel)
Coffee break / networking break (Break)
Poster Session 1 (Poster Session - In Person)
Poster Session 2 (Virtual) (Poster Session - Virtual)