MATRIX ANALYSIS OF NEURAL NETWORK ARCHITECTURES FOR AUDIO SIGNAL CLASSIFICATION

Authors
V S Paul, P A Nelson

Growth in computational power in recent years has enabled the use of large training datasets in machine learning. Training neural networks can still be very time consuming, and this has led to the development of computationally efficient neural network implementations such as TensorFlow [1], PyTorch [2], Keras and Scikit-Learn [3], or the MATLAB Deep Learning Toolbox [4]. With such powerful deep learning frameworks available, research experiments in machine learning are often conducted using the software as “black boxes”, with little focus on the properties of the networks and their evolution during training. Understanding the mathematics of neural network models is a first step towards better understanding how information propagates through a network architecture.

Besides the original papers describing the different neural network architectures, such as Multi-Layer Perceptrons (MLPs) [5], Recurrent Neural Networks (RNNs) [6,7], Long Short-Term Memory Networks (LSTMs) [8] or Convolutional Neural Networks (CNNs) [9], there are few accessible introductions that focus on the basic neural network models. There are also several papers and books that discuss the implementation of the underlying mathematics, although most of these publications either discuss only one network architecture in detail [10–13] or analyse the most popular network models without going into the mathematical details [14,15].

The main contribution of this paper is a derivation in matrix form of the forward and backward propagation equations for an MLP with any number of hidden layers. The analysis described can also be applied to other architectures. The equations derived will be implemented in MATLAB and, as an initial example of an application to audio signal classification, the network will be trained to distinguish between two closely related spectra. The singular value decomposition (SVD) is applied to the weight matrices during the weight update process to better understand the behaviour of the network.
The initial results are discussed, and areas for future work are proposed.
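
For orientation, the matrix-form relations referred to above can be sketched in their standard textbook form for a fully connected network with L layers. The notation is assumed here for illustration only and may differ from that adopted in the derivation itself: x is the input vector, W^(l) and b^(l) are the weight matrix and bias vector of layer l, f is an activation function applied element by element, E is the loss, and ⊙ denotes the element-wise product.

```latex
% Forward propagation (assumed notation), for l = 1, ..., L with a^{(0)} = x
\mathbf{z}^{(l)} = \mathbf{W}^{(l)} \mathbf{a}^{(l-1)} + \mathbf{b}^{(l)},
\qquad
\mathbf{a}^{(l)} = f\!\left(\mathbf{z}^{(l)}\right)

% Backward propagation: error term at the output layer, then recursion for l = L-1, ..., 1
\boldsymbol{\delta}^{(L)} = \nabla_{\mathbf{a}^{(L)}} E \odot f'\!\left(\mathbf{z}^{(L)}\right),
\qquad
\boldsymbol{\delta}^{(l)} = \left(\mathbf{W}^{(l+1)}\right)^{\mathrm{T}} \boldsymbol{\delta}^{(l+1)} \odot f'\!\left(\mathbf{z}^{(l)}\right)

% Gradients used in the weight update
\frac{\partial E}{\partial \mathbf{W}^{(l)}} = \boldsymbol{\delta}^{(l)} \left(\mathbf{a}^{(l-1)}\right)^{\mathrm{T}},
\qquad
\frac{\partial E}{\partial \mathbf{b}^{(l)}} = \boldsymbol{\delta}^{(l)}

% Singular value decomposition of a (real) weight matrix
\mathbf{W}^{(l)} = \mathbf{U} \boldsymbol{\Sigma} \mathbf{V}^{\mathrm{T}}
```

In this form, tracking the diagonal entries of Σ after each weight update gives one way of observing how a layer's weight matrix evolves during training.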