Deep extreme multi-label learning (XML) requires training deep architectures that can tag a data point with its most relevant subset of labels from an extremely large label set. XML applications such as ad and product recommendation involve labels that are rarely seen during training but nevertheless hold the key to recommendations that delight users. Effective utilization of label metadata and high-quality predictions for rare labels at the scale of millions of labels are thus key challenges in contemporary XML research. To address these, this paper develops the SiameseXML framework, based on a novel probabilistic model that naturally motivates a modular approach melding Siamese architectures with high-capacity extreme classifiers, together with a training pipeline that effortlessly scales to tasks with 100 million labels. SiameseXML offers predictions 2--13% more accurate than those of leading XML methods on public benchmark datasets; in live A/B tests on the Bing search engine, it also yields significant gains in click-through rates, coverage, revenue, and other online metrics over state-of-the-art techniques currently in production. Code for SiameseXML is available at https://github.com/Extreme-classification/siamesexml
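The two-stage idea in the abstract (a Siamese encoder that embeds queries and labels in one space, followed by high-capacity per-label extreme classifiers) can be illustrated with a minimal sketch. This is not the authors' code: the encoder, dimensions, and initialization below are illustrative assumptions, with random vectors standing in for learned representations.

```python
# Minimal sketch of a Siamese-shortlist + extreme-classifier re-ranking pipeline.
# Stage 1: a shared encoder embeds the query and shortlists nearby labels.
# Stage 2: per-label classifier vectors re-rank the shortlist.
import numpy as np

rng = np.random.default_rng(0)
D, L = 16, 1000  # embedding dimension and number of labels (illustrative)

def encode(x):
    """Stand-in for the shared Siamese encoder: L2-normalise a raw vector."""
    return x / np.linalg.norm(x)

# Label embeddings produced by the shared encoder (random stand-ins here).
label_emb = np.stack([encode(v) for v in rng.standard_normal((L, D))])

# Extreme classifiers: one refinement vector per label, initialised from the
# label embedding and then (in a real system) trained further.
classifiers = label_emb + 0.1 * rng.standard_normal((L, D))

def predict(query_raw, shortlist_k=50, top_k=5):
    q = encode(query_raw)
    # Stage 1: shortlist labels by cosine similarity to the query embedding.
    shortlist = np.argsort(label_emb @ q)[-shortlist_k:]
    # Stage 2: re-rank only the shortlisted labels with their classifiers.
    scores = classifiers[shortlist] @ q
    return shortlist[np.argsort(scores)[-top_k:][::-1]]

preds = predict(rng.standard_normal(D))
print(len(preds))  # top_k predicted label indices
```

The modular split is what enables scale: the shortlist step touches all labels only through a cheap nearest-neighbour search over embeddings, while the expensive classifiers score just a few candidates per query.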
Author Information
Kunal Dahiya (IIT Delhi)
Ananye Agarwal (IIT Delhi)
Deepak Saini (Microsoft Research India)
Gururaj K (Microsoft)
Jian Jiao (Microsoft)
Amit Singh (Microsoft)
Sumeet Agarwal (IIT Delhi)
Purushottam Kar (IIT Kanpur)
Manik Varma (Microsoft Research)
Related Events (a corresponding poster, oral, or spotlight)
- 2021 Spotlight: SiameseXML: Siamese Networks meet Extreme Classifiers with 100M Labels
  Wed. Jul 21st, 12:20--12:25 PM
More from the Same Authors
- 2021 Poster: BANG: Bridging Autoregressive and Non-autoregressive Generation with Large Scale Pretraining
  Weizhen Qi · Yeyun Gong · Jian Jiao · Yu Yan · Weizhu Chen · Dayiheng Liu · Kewen Tang · Houqiang Li · Jiusheng Chen · Ruofei Zhang · Ming Zhou · Nan Duan
- 2021 Spotlight: BANG: Bridging Autoregressive and Non-autoregressive Generation with Large Scale Pretraining
  Weizhen Qi · Yeyun Gong · Jian Jiao · Yu Yan · Weizhu Chen · Dayiheng Liu · Kewen Tang · Houqiang Li · Jiusheng Chen · Ruofei Zhang · Ming Zhou · Nan Duan
- 2020: Invited Talk 1 Q&A - Manik Varma
  Manik Varma
- 2020: Invited Talk 1 - DeepXML: A Framework for Deep Extreme Multi-label Learning - Manik Varma
  Manik Varma
- 2020: Introduction to Extreme Classification
  Manik Varma · Yashoteja Prabhu
- 2017 Workshop: ML on a budget: IoT, Mobile and other tiny-ML applications
  Manik Varma · Venkatesh Saligrama · Prateek Jain
- 2017 Poster: ProtoNN: Compressed and Accurate kNN for Resource-scarce Devices
  Chirag Gupta · Arun Suggala · Ankit Goyal · Saurabh Goyal · Ashish Kumar · Bhargavi Paranjape · Harsha Vardhan Simhadri · Raghavendra Udupa · Manik Varma · Prateek Jain
- 2017 Talk: ProtoNN: Compressed and Accurate kNN for Resource-scarce Devices
  Chirag Gupta · Arun Suggala · Ankit Goyal · Saurabh Goyal · Ashish Kumar · Bhargavi Paranjape · Harsha Vardhan Simhadri · Raghavendra Udupa · Manik Varma · Prateek Jain
- 2017 Poster: On Context-Dependent Clustering of Bandits
  Claudio Gentile · Shuai Li · Purushottam Kar · Alexandros Karatzoglou · Giovanni Zappella · Evans Etrue Howard
- 2017 Poster: Resource-efficient Machine Learning in 2 KB RAM for the Internet of Things
  Ashish Kumar · Saurabh Goyal · Manik Varma
- 2017 Talk: On Context-Dependent Clustering of Bandits
  Claudio Gentile · Shuai Li · Purushottam Kar · Alexandros Karatzoglou · Giovanni Zappella · Evans Etrue Howard
- 2017 Talk: Resource-efficient Machine Learning in 2 KB RAM for the Internet of Things
  Ashish Kumar · Saurabh Goyal · Manik Varma