This is the third edition of a highly successful workshop series focused on data-centric AI, following the Data-Centric AI workshop at NeurIPS 2021 and the DataPerf workshop at ICML 2022. Data, and operations over data (e.g., cleaning, debugging, curation), have been fueling the success of machine learning for decades. While the ML community has historically focused primarily on model development, the quality of data used for ML training and evaluation has recently attracted intense interest, as shown by the creation of the NeurIPS Datasets and Benchmarks track, several data-centric AI benchmarks (e.g., DataPerf), and the flourishing of data consortiums such as LAION. The goal of this workshop is to facilitate discussion of these important topics in what we call Data-centric Machine Learning Research, which includes not only datasets and benchmarks but also tooling and governance, as well as fundamental research on topics such as data quality and data acquisition for dataset creation and optimization.
Sat 12:00 p.m. - 12:05 p.m. | Introduction and Opening (Opening Remarks)
Sat 12:05 p.m. - 12:40 p.m. | Keynote 1: Andrew Ng (Landing AI) (Keynote) | Andrew Ng
Sat 12:40 p.m. - 1:10 p.m. | Invited Talk 1: Dina Machuve (DevData Analytics) - Data for Agriculture (Talk)
Sat 1:10 p.m. - 1:25 p.m. | Coffee break / networking break
Sat 1:25 p.m. - 2:00 p.m. | Keynote 2: Mihaela van der Schaar (University of Cambridge) - Data quality (Keynote)
Sat 2:00 p.m. - 2:30 p.m. | Invited Talk 2: Olga Russakovsky (Princeton University) (Talk) | Olga Russakovsky
Sat 2:30 p.m. - 3:00 p.m. | Invited Talk 3: Masashi Sugiyama (RIKEN & UTokyo) - Data distribution shift (Talk) | Masashi Sugiyama
Sat 3:00 p.m. - 4:00 p.m. | Lunch Break / networking break
Sat 4:00 p.m. - 4:35 p.m. | Keynote 3: Isabelle Guyon (Google Brain) - Data creation (Keynote)
Sat 4:35 p.m. - 5:15 p.m. | DataPerf Challenge - Peter Mattson (Google & MLCommons) (Talk) | Peter Mattson
Sat 5:15 p.m. - 5:30 p.m. | Announcement and open discussion on DMLR (Selected members of DMLR Advisory Board) (Discussion Panel)
Sat 5:30 p.m. - 6:15 p.m. | Panel Discussion (Discussion Panel)
Sat 6:15 p.m. - 6:30 p.m. | Coffee break / networking break
Sat 6:30 p.m. - 7:30 p.m. | Poster Session 1 (Poster Session - In Person)
Sat 7:30 p.m. - 8:00 p.m. | Poster Session 2 (Virtual) (Poster Session - Virtual)
Author Information
Ce Zhang (ETH Zurich)
Praveen Paritosh (Google)
Newsha Ardalani (Meta AI Research (FAIR))
Nezihe Merve Gürel (TU Delft)
William Gaviria Rojas (Coactive AI)
Yang Liu (UC Santa Cruz/ByteDance Research)
Rotem Dror (University of Pennsylvania)
Manil Maskey (NASA)
Lilith Bat-Leah (dPrism Advisors)
Lilith Bat-Leah is Vice President, Data Services at dPrism, responsible for consulting on use cases for data analytics, data science, and machine learning. Lilith has over 11 years of experience managing, delivering, and consulting on identification, preservation, collection, processing, review, annotation, analysis, and legal production of data. She also has experience in research and development of machine learning software for eDiscovery. She speaks and writes about various topics in eDiscovery, such as evaluation of machine learning systems, ESI protocols, and discovery of databases. Lilith holds a BSGS in Organization Behavior from Northwestern University, where she graduated magna cum laude. She is a current member of MLCommons/DataPerf/DynaBench and formerly served as Co-Trustee of the EDRM Analytics and Machine Learning project, as a member of the EDRM Global Advisory Council, as Vice President of the Chicago ACEDS chapter, and as President of the New York Metro ACEDS Chapter.
Tzu-Sheng Kuo (CMU)
Luis Oala (Dotphoton AG)
Max Bartolo (Cohere, UCL)
Ludwig Schmidt (University of Washington)
Alicia Parrish (Google)
Daniel Kondermann (Quality Match GmbH)
Najoung Kim (Boston University)
More from the Same Authors
-
2020 : Contributed Talk: Incentives for Federated Learning: a Hypothesis Elicitation Approach »
Yang Liu · Jiaheng Wei -
2020 : Contributed Talk: Linear Models are Robust Optimal Under Strategic Behavior »
Wei Tang · Chien-Ju Ho · Yang Liu -
2021 : Linear Classifiers that Encourage Constructive Adaptation »
Yatong Chen · Jialu Wang · Yang Liu -
2021 : When Optimizing f-divergence is Robust with Label Noise »
Jiaheng Wei · Yang Liu -
2022 : Adaptive Data Debiasing Through Bounded Exploration »
Yifan Yang · Yang Liu · Parinaz Naghizadeh -
2023 Poster: Deja Vu: Contextual Sparsity for Efficient LLMs at Inference Time »
Zichang Liu · Jue Wang · Tri Dao · Tianyi Zhou · Binhang Yuan · Zhao Song · Anshumali Shrivastava · Ce Zhang · Yuandong Tian · Christopher Re · Beidi Chen -
2023 Poster: CocktailSGD: Fine-tuning Foundation Models over 500Mbps Networks »
Jue Wang · Yucheng Lu · Binhang Yuan · Beidi Chen · Percy Liang · Chris De Sa · Christopher Re · Ce Zhang -
2023 Poster: FedHPO-Bench: A Benchmark Suite for Federated Hyperparameter Optimization »
Zhen WANG · Weirui Kuang · Ce Zhang · Bolin Ding · Yaliang Li -
2023 Poster: Identifiability of Label Noise Transition Matrix »
Yang Liu · Hao Cheng · Kun Zhang -
2023 Poster: Model Transferability with Responsive Decision Subjects »
Yatong Chen · Zeyu Tang · Kun Zhang · Yang Liu -
2023 Poster: High-throughput Generative Inference of Large Language Models with a Single GPU »
Ying Sheng · Lianmin Zheng · Binhang Yuan · Zhuohan Li · Max Ryabinin · Beidi Chen · Percy Liang · Ce Zhang · Ion Stoica · Christopher Re -
2023 Poster: Weak Proxies are Sufficient and Preferable for Fairness with Missing Sensitive Attributes »
Zhaowei Zhu · Yuanshun Yao · Jiankai Sun · Hang Li · Yang Liu -
2023 Oral: Deja Vu: Contextual Sparsity for Efficient LLMs at Inference Time »
Zichang Liu · Jue Wang · Tri Dao · Tianyi Zhou · Binhang Yuan · Zhao Song · Anshumali Shrivastava · Ce Zhang · Yuandong Tian · Christopher Re · Beidi Chen -
2023 Oral: High-throughput Generative Inference of Large Language Models with a Single GPU »
Ying Sheng · Lianmin Zheng · Binhang Yuan · Zhuohan Li · Max Ryabinin · Beidi Chen · Percy Liang · Ce Zhang · Ion Stoica · Christopher Re -
2023 : Practice »
Lilith Bat-Leah -
2023 Workshop: Knowledge and Logical Reasoning in the Era of Data-driven Learning »
Nezihe Merve Gürel · Bo Li · Theodoros Rekatsinas · Beliz Gunel · Alberto Sangiovanni Vincentelli · Paroma Varma -
2022 : Data Valuation »
Newsha Ardalani -
2022 : Model Transferability With Responsive Decision Subjects »
Yang Liu · Yatong Chen · Zeyu Tang · Kun Zhang -
2022 Workshop: DataPerf: Benchmarking Data for Data-Centric AI »
Lora Aroyo · Newsha Ardalani · Colby Banbury · Gregory Diamos · William Gaviria Rojas · Tzu-Sheng Kuo · Mark Mazumder · Peter Mattson · Praveen Paritosh -
2022 Poster: Estimating Instance-dependent Bayes-label Transition Matrix using a Deep Neural Network »
Shuo Yang · Erkun Yang · Bo Han · Yang Liu · Min Xu · Gang Niu · Tongliang Liu -
2022 Poster: Detecting Corrupted Labels Without Training a Model to Predict »
Zhaowei Zhu · Zihao Dong · Yang Liu -
2022 Poster: Understanding Instance-Level Impact of Fairness Constraints »
Jialu Wang · Xin Eric Wang · Yang Liu -
2022 Spotlight: Understanding Instance-Level Impact of Fairness Constraints »
Jialu Wang · Xin Eric Wang · Yang Liu -
2022 Spotlight: Estimating Instance-dependent Bayes-label Transition Matrix using a Deep Neural Network »
Shuo Yang · Erkun Yang · Bo Han · Yang Liu · Min Xu · Gang Niu · Tongliang Liu -
2022 Poster: Metric-Fair Classifier Derandomization »
Jimmy Wu · Yatong Chen · Yang Liu -
2022 Poster: Beyond Images: Label Noise Transition Matrix Estimation for Tasks with Lower-Quality Features »
Zhaowei Zhu · Jialu Wang · Yang Liu -
2022 Spotlight: Detecting Corrupted Labels Without Training a Model to Predict »
Zhaowei Zhu · Zihao Dong · Yang Liu -
2022 Spotlight: Metric-Fair Classifier Derandomization »
Jimmy Wu · Yatong Chen · Yang Liu -
2022 Spotlight: Beyond Images: Label Noise Transition Matrix Estimation for Tasks with Lower-Quality Features »
Zhaowei Zhu · Jialu Wang · Yang Liu -
2022 Poster: To Smooth or Not? When Label Smoothing Meets Noisy Labels »
Jiaheng Wei · Hangyu Liu · Tongliang Liu · Gang Niu · Masashi Sugiyama · Yang Liu -
2022 Poster: Certifying Out-of-Domain Generalization for Blackbox Functions »
Maurice Weber · Linyi Li · Boxin Wang · Zhikuan Zhao · Bo Li · Ce Zhang -
2022 Oral: To Smooth or Not? When Label Smoothing Meets Noisy Labels »
Jiaheng Wei · Hangyu Liu · Tongliang Liu · Gang Niu · Masashi Sugiyama · Yang Liu -
2022 Spotlight: Certifying Out-of-Domain Generalization for Blackbox Functions »
Maurice Weber · Linyi Li · Boxin Wang · Zhikuan Zhao · Bo Li · Ce Zhang -
2021 Poster: Knowledge Enhanced Machine Learning Pipeline against Diverse Adversarial Attacks »
Nezihe Merve Gürel · Xiangyu Qi · Luka Rimanic · Ce Zhang · Bo Li -
2021 Spotlight: Knowledge Enhanced Machine Learning Pipeline against Diverse Adversarial Attacks »
Nezihe Merve Gürel · Xiangyu Qi · Luka Rimanic · Ce Zhang · Bo Li -
2021 Poster: Clusterability as an Alternative to Anchor Points When Learning with Noisy Labels »
Zhaowei Zhu · Yiwen Song · Yang Liu -
2021 Spotlight: Clusterability as an Alternative to Anchor Points When Learning with Noisy Labels »
Zhaowei Zhu · Yiwen Song · Yang Liu -
2021 Poster: Understanding Instance-Level Label Noise: Disparate Impacts and Treatments »
Yang Liu -
2021 Oral: Understanding Instance-Level Label Noise: Disparate Impacts and Treatments »
Yang Liu -
2021 Poster: 1-bit Adam: Communication Efficient Large-Scale Training with Adam's Convergence Speed »
Hanlin Tang · Shaoduo Gan · Ammar Ahmad Awan · Samyam Rajbhandari · Conglong Li · Xiangru Lian · Ji Liu · Ce Zhang · Yuxiong He -
2021 Spotlight: 1-bit Adam: Communication Efficient Large-Scale Training with Adam's Convergence Speed »
Hanlin Tang · Shaoduo Gan · Ammar Ahmad Awan · Samyam Rajbhandari · Conglong Li · Xiangru Lian · Ji Liu · Ce Zhang · Yuxiong He -
2021 Poster: Evolving Attention with Residual Convolutions »
Yujing Wang · Yaming Yang · Jiangang Bai · Mingliang Zhang · Jing Bai · JING YU · Ce Zhang · Gao Huang · Yunhai Tong -
2021 Spotlight: Evolving Attention with Residual Convolutions »
Yujing Wang · Yaming Yang · Jiangang Bai · Mingliang Zhang · Jing Bai · JING YU · Ce Zhang · Gao Huang · Yunhai Tong -
2020 Workshop: Incentives in Machine Learning »
Boi Faltings · Yang Liu · David Parkes · Goran Radanovic · Dawn Song -
2020 : Spotlight Talk 5: Detecting Failure Modes in Image Reconstructions with Interval Neural Network Uncertainty »
Luis Oala -
2020 Poster: Don't Waste Your Bits! Squeeze Activations and Gradients for Deep Neural Networks via TinyScript »
Fangcheng Fu · Yuzheng Hu · Yihan He · Jiawei Jiang · Yingxia Shao · Ce Zhang · Bin Cui -
2020 Poster: Peer Loss Functions: Learning from Noisy Labels without Knowing Noise Rates »
Yang Liu · Hongyi Guo -
2019 : Networking Lunch (provided) + Poster Session »
Abraham Stanway · Alex Robson · Aneesh Rangnekar · Ashesh Chattopadhyay · Ashley Pilipiszyn · Benjamin LeRoy · Bolong Cheng · Ce Zhang · Chaopeng Shen · Christian Schroeder · Christian Clough · Clement DUHART · Clement Fung · Cozmin Ududec · Dali Wang · David Dao · di wu · Dimitrios Giannakis · Dino Sejdinovic · Doina Precup · Duncan Watson-Parris · Gege Wen · George Chen · Gopal Erinjippurath · Haifeng Li · Han Zou · Herke van Hoof · Hillary A Scannell · Hiroshi Mamitsuka · Hongbao Zhang · Jaegul Choo · James Wang · James Requeima · Jessica Hwang · Jinfan Xu · Johan Mathe · Jonathan Binas · Joonseok Lee · Kalai Ramea · Kate Duffy · Kevin McCloskey · Kris Sankaran · Lester Mackey · Letif Mones · Loubna Benabbou · Lynn Kaack · Matthew Hoffman · Mayur Mudigonda · Mehrdad Mahdavi · Michael McCourt · Mingchao Jiang · Mohammad Mahdi Kamani · Neel Guha · Niccolo Dalmasso · Nick Pawlowski · Nikola Milojevic-Dupont · Paulo Orenstein · Pedram Hassanzadeh · Pekka Marttinen · Ramesh Nair · Sadegh Farhang · Samuel Kaski · Sandeep Manjanna · Sasha Luccioni · Shuby Deshpande · Soo Kim · Soukayna Mouatadid · Sunghyun Park · Tao Lin · Telmo Felgueira · Thomas Hornigold · Tianle Yuan · Tom Beucler · Tracy Cui · Volodymyr Kuleshov · Wei Yu · yang song · Ydo Wexler · Yoshua Bengio · Zhecheng Wang · Zhuangfang Yi · Zouheir Malki -
2019 Poster: Fairness without Harm: Decoupled Classifiers with Preference Guarantees »
Berk Ustun · Yang Liu · David Parkes -
2019 Poster: Distributed Learning over Unreliable Networks »
Chen Yu · Hanlin Tang · Cedric Renggli · Simon Kassing · Ankit Singla · Dan Alistarh · Ce Zhang · Ji Liu -
2019 Oral: Fairness without Harm: Decoupled Classifiers with Preference Guarantees »
Berk Ustun · Yang Liu · David Parkes -
2019 Oral: Distributed Learning over Unreliable Networks »
Chen Yu · Hanlin Tang · Cedric Renggli · Simon Kassing · Ankit Singla · Dan Alistarh · Ce Zhang · Ji Liu -
2019 Poster: Exploring the Landscape of Spatial Robustness »
Logan Engstrom · Brandon Tran · Dimitris Tsipras · Ludwig Schmidt · Aleksander Madry -
2019 Oral: Exploring the Landscape of Spatial Robustness »
Logan Engstrom · Brandon Tran · Dimitris Tsipras · Ludwig Schmidt · Aleksander Madry -
2019 Poster: DL2: Training and Querying Neural Networks with Logic »
Marc Fischer · Mislav Balunovic · Dana Drachsler-Cohen · Timon Gehr · Ce Zhang · Martin Vechev -
2019 Oral: DL2: Training and Querying Neural Networks with Logic »
Marc Fischer · Mislav Balunovic · Dana Drachsler-Cohen · Timon Gehr · Ce Zhang · Martin Vechev -
2018 Poster: On the Limitations of First-Order Approximation in GAN Dynamics »
Jerry Li · Aleksander Madry · John Peebles · Ludwig Schmidt -
2018 Oral: On the Limitations of First-Order Approximation in GAN Dynamics »
Jerry Li · Aleksander Madry · John Peebles · Ludwig Schmidt -
2018 Poster: A Classification-Based Study of Covariate Shift in GAN Distributions »
Shibani Santurkar · Ludwig Schmidt · Aleksander Madry -
2018 Poster: Asynchronous Decentralized Parallel Stochastic Gradient Descent »
Xiangru Lian · Wei Zhang · Ce Zhang · Ji Liu -
2018 Poster: $D^2$: Decentralized Training over Decentralized Data »
Hanlin Tang · Xiangru Lian · Ming Yan · Ce Zhang · Ji Liu -
2018 Oral: A Classification-Based Study of Covariate Shift in GAN Distributions »
Shibani Santurkar · Ludwig Schmidt · Aleksander Madry -
2018 Oral: $D^2$: Decentralized Training over Decentralized Data »
Hanlin Tang · Xiangru Lian · Ming Yan · Ce Zhang · Ji Liu -
2018 Oral: Asynchronous Decentralized Parallel Stochastic Gradient Descent »
Xiangru Lian · Wei Zhang · Ce Zhang · Ji Liu -
2017 Poster: ZipML: Training Linear Models with End-to-End Low Precision, and a Little Bit of Deep Learning »
Hantian Zhang · Jerry Li · Kaan Kara · Dan Alistarh · Ji Liu · Ce Zhang -
2017 Talk: ZipML: Training Linear Models with End-to-End Low Precision, and a Little Bit of Deep Learning »
Hantian Zhang · Jerry Li · Kaan Kara · Dan Alistarh · Ji Liu · Ce Zhang