
DMLR Workshop: Data-centric Machine Learning Research
Ce Zhang · Praveen Paritosh · Newsha Ardalani · Nezihe Merve Gürel · William Gaviria Rojas · Yang Liu · Rotem Dror · Manil Maskey · Lilith Bat-Leah · Tzu-Sheng Kuo · Luis Oala · Max Bartolo · Ludwig Schmidt · Alicia Parrish · Daniel Kondermann · Najoung Kim

Sat Jul 29 12:00 PM -- 08:00 PM (PDT) @ Ballroom C
Event URL: https://dmlr.ai

This is the third edition of a highly successful series of workshops focused on data-centric AI, following the Data-Centric AI workshop at NeurIPS 2021 and the DataPerf workshop at ICML 2022. Data and operations over data (e.g., cleaning, debugging, curation) have fueled the success of machine learning for decades. While the ML community has historically focused primarily on model development, the quality of the data used for ML training and evaluation has recently attracted intense interest, as shown by the creation of the NeurIPS Datasets and Benchmarks track, several data-centric AI benchmarks (e.g., DataPerf), and the flourishing of data consortiums such as LAION. The goal of this workshop is to facilitate work on these topics in what we call Data-centric Machine Learning Research, which includes not only datasets and benchmarks but also tooling and governance, as well as fundamental research on topics such as data quality and data acquisition for dataset creation and optimization.

Author Information

Ce Zhang (ETH Zurich)
Praveen Paritosh (Google)
Newsha Ardalani (Meta AI Research (FAIR))
Nezihe Merve Gürel (TU Delft)
William Gaviria Rojas (Coactive AI)
Yang Liu (UC Santa Cruz/ByteDance Research)
Rotem Dror (University of Pennsylvania)
Manil Maskey (NASA)
Lilith Bat-Leah (dPrism Advisors)

Lilith Bat-Leah is Vice President, Data Services at dPrism, responsible for consulting on use cases for data analytics, data science, and machine learning. Lilith has over 11 years of experience managing, delivering, and consulting on the identification, preservation, collection, processing, review, annotation, analysis, and legal production of data. She also has experience in research and development of machine learning software for eDiscovery. She speaks and writes about various topics in eDiscovery, such as evaluation of machine learning systems, ESI protocols, and discovery of databases. Lilith holds a BSGS in Organization Behavior from Northwestern University, where she graduated magna cum laude. She is a current member of MLCommons/DataPerf/DynaBench and formerly served as Co-Trustee of the EDRM Analytics and Machine Learning project, as a member of the EDRM Global Advisory Council, as Vice President of the Chicago ACEDS Chapter, and as President of the New York Metro ACEDS Chapter.

Tzu-Sheng Kuo (CMU)
Luis Oala (Dotphoton)
Max Bartolo (Cohere, UCL)
Ludwig Schmidt (University of Washington)
Alicia Parrish (Google)
Daniel Kondermann (Quality Match GmbH)
Najoung Kim (Boston University)

More from the Same Authors