Despite the recent success of deep learning, the nature of the transformations that deep neural networks apply to their input features remains poorly understood. This study provides an empirical framework for studying the encoding properties of node activations in the various layers of a network, and for constructing the exact function applied to each data point in the form of a linear transform. We use these methods to discern and quantify properties of feed-forward neural networks trained to map acoustic features to phoneme labels. We show a selective and nonlinear warping of the feature space, achieved by forming prototypical functions to account for the possible variation of each class. The framework thereby links the properties of node activations to the functions implemented by the network.
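As an illustration of what "the exact function applied to each data point in the form of a linear transform" can mean, the following is a minimal sketch and not the authors' code. It assumes ReLU hidden units and a linear output layer (logits), neither of which is stated in the abstract, and it uses small random weights in place of a trained acoustic model. Under those assumptions the network is piecewise linear, so for any single input x there is a matrix A and offset c with network(x) = A x + c. The names `forward` and `local_affine_map` are hypothetical.

```python
import numpy as np

def forward(x, weights, biases):
    """Ordinary feed-forward pass: ReLU hidden layers, linear output layer."""
    h = x
    for W, b in zip(weights[:-1], biases[:-1]):
        h = np.maximum(W @ h + b, 0.0)
    return weights[-1] @ h + biases[-1]

def local_affine_map(x, weights, biases):
    """Return (A, c) such that forward(x) equals A @ x + c for this x.

    Assumption: ReLU hidden units. Each ReLU either passes its input
    (slope 1) or blocks it (slope 0), so for a fixed x every hidden layer
    acts as a 0/1 mask applied to an affine map.
    """
    A = np.eye(x.shape[0])
    c = np.zeros(x.shape[0])
    for W, b in zip(weights[:-1], biases[:-1]):
        # Compose the layer's affine map with the map accumulated so far.
        A = W @ A
        c = W @ c + b
        # Units whose pre-activation is positive for this x stay active.
        mask = (A @ x + c > 0).astype(float)
        A = mask[:, None] * A
        c = mask * c
    # Linear output layer (assumed), no masking.
    A = weights[-1] @ A
    c = weights[-1] @ c + biases[-1]
    return A, c

# Toy network with random weights: 6 inputs, two hidden layers of 8, 4 outputs.
rng = np.random.default_rng(0)
sizes = [6, 8, 8, 4]
weights = [rng.standard_normal((n_out, n_in))
           for n_in, n_out in zip(sizes[:-1], sizes[1:])]
biases = [rng.standard_normal(n_out) for n_out in sizes[1:]]

x = rng.standard_normal(sizes[0])
A, c = local_affine_map(x, weights, biases)
y = forward(x, weights, biases)
```

Here `A @ x + c` reproduces `forward(x, weights, biases)` for the chosen `x`. Inputs that activate the same set of hidden units share the same `(A, c)`, while inputs with a different activation pattern get a different pair.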
Tasha Nagamine (Columbia University)
Nima Mesgarani (Columbia University)
Related Events (a corresponding poster, oral, or spotlight)
2017 Talk: Visualizing and Understanding Multilayer Perceptron Models: A Case Study in Speech Processing
Tue Aug 8th 06:06 -- 06:24 AM Room Darling Harbour Theatre