Sat Jul 18 09:10 AM -- 04:00 PM (PDT)
Machine Learning for Media Discovery
Erik Schmidt · Oriol Nieto · Fabien Gouyon · Yves Raimond · Katherine Kinnaird · Gert Lanckriet

The ever-increasing size and accessibility of vast media libraries have created a greater demand than ever for AI-based systems that are capable of organizing, recommending, and understanding such complex data.

While this topic has received only limited attention within the core machine learning community, it has been an area of intense focus within applied communities such as Recommender Systems (RecSys), Music Information Retrieval (MIR), and Computer Vision. At the same time, these domains have surfaced nebulous problem spaces and rich datasets that are of tremendous potential value to the machine learning and AI communities at large.

This year's Machine Learning for Media Discovery (ML4MD) workshop aims to build upon the five previous Machine Learning for Music Discovery editions at ICML, broadening the topic area from music discovery to media discovery. The added topic diversity is intended to foster a broader conversation with the machine learning community and to encourage cross-pollination across the various media domains.

One of the largest areas of focus in the media discovery space is content understanding. The recommender systems community has made great advances in collaborative feedback recommenders, but these approaches suffer strongly from the cold-start problem. As such, recommendation techniques often fall back on content-based machine learning systems, but defining the similarity of media items is extremely challenging, as myriad features all play some role (e.g., cultural, emotional, or content features). While significant progress has been made, these problems remain far from solved.

In addition, these complex data present many challenges beyond the development of machine learning systems to model and understand them. One of the largest challenges is scale. Commercial music libraries, for example, run into the tens of millions of items. User-generated content platforms such as YouTube and Pinterest, however, have libraries stretching into the billions, a scale at which many of the traditional approaches discussed in the literature simply cannot perform.

On the other side of this problem sits the recent explosion of work in the area of Creative AI. Relevant examples include Google Magenta and Amazon's DeepComposer, which seek to develop algorithms capable of composing and performing completely original (and compelling) works of music. The same is happening in the world of visual media creation (e.g., DeepDream, Deep Fakes). Work in this area adds an interesting dimension to the conversation, as understanding how content is created is a prerequisite to generating it.

This workshop proposal is timely in that it will bridge these separate pockets of otherwise closely related research. In addition to making progress on the challenges above, we hope to engage the wider AI and machine learning community with our rich problem space, and connect them with the many available datasets the community has to offer.

Welcome Remarks (Welcome)
Graph Neural Networks for Reasoning over Multimodal Content (Invited Talk)
Novel Audio Embeddings for Personalized Recommendations on Newly Released Tracks (Accepted Talk)
Musical Word Embedding: Bridging the Gap between Listening Contexts and Music (Accepted Talk)
Poster Session #1 (Posters)
Graphs for music analysis (Invited Talk)
Deep Active Learning Toward Crisis-related Tweets Classification (Accepted Talk)
The Unsung Heroes of Music Recommendation: an Essay (Invited Talk)
Lunch (Break)
Beyond Being Accurate: Solving Real-World Recommendation Problems with Neural Modeling (Invited Talk)
Character-focused Video Thumbnail Retrieval (Accepted Talk)
HitPredict: Using Spotify Data to Predict Billboard Hits (Accepted Talk)
Poster Session #2 (Posters)
Hit Song Prediction (Invited Talk)
I know why you like this movie: Interpretable Efficient Multimodal Recommender (Accepted Talk)
Content-based Music Similarity with Siamese Networks (Accepted Talk)