- Activation function: In artificial neural networks, the activation function of a node defines the output of that node given an input or set of inputs. (The ReLU, sigmoid, and tanh entries below are implemented in the first sketch after this list.)
- Backpropagation: In machine learning, backpropagation is a widely used algorithm for training feedforward neural networks, such as an MLP.
- Hidden layer (MLP): Layers of nodes between the input and output layers; there may be one or more of these layers. These interior layers of an MLP are typically called “hidden layers” because they are not directly observable from the system's inputs and outputs.
- Input layer (MLP): The layer that receives the input variables; sometimes called the visible layer.
- Mean: The arithmetic mean is the calculated "central" value of a set of numbers: the sum of the values divided by how many there are. You can find it among the values in the sound descriptors from Freesound. (Computed in the mean/variance sketch after this list.)
- Mel-frequency cepstral coefficients (MFCCs): In sound processing, the mel-frequency cepstrum (MFC) is a representation of the short-term power spectrum of a sound, and MFCCs are the coefficients that collectively make up an MFC. They are features widely used in automatic speech and speaker recognition. (See the MFCC sketch after this list.)
- Multilayer perceptron (MLP): A multilayer perceptron is a class of feedforward artificial neural network (ANN) consisting of at least three layers of nodes: an input layer, a hidden layer, and an output layer. Its nodes use non-linear activation functions, and it is trained with a supervised learning technique called backpropagation. Multilayer perceptrons are sometimes colloquially referred to as "vanilla" neural networks, especially when they have a single hidden layer. (A minimal training sketch appears after this list.)
- Neural network (NN): A network or circuit of artificial neurons, or nodes; also known as an artificial neural network.
- Neurons (MLP): See nodes.
- Nodes (MLP): A neural network is a series of nodes, or neurons. Each node combines a set of inputs with weights and a bias value (see the node arithmetic sketch after this list).
- Output layer (MLP): A layer of nodes that produce the output variables.
- Principal component analysis (PCA): A method often used to reduce the dimensionality of large data sets by transforming a large set of variables into a smaller one that still contains most of the information in the original set. (See the PCA sketch after this list.)
- ReLU (activation function): Short for Rectified Linear Unit; a piecewise linear function that outputs the input directly if it is positive and zero otherwise.
- Sigmoid (activation function): The sigmoid (s-shaped) activation function, also called the logistic function, is traditionally a very popular activation function for neural networks. It transforms its input into a value between 0.0 and 1.0.
- Sound descriptors (Freesound): Analysis descriptors that characterise a sound. They include low-level descriptors, rhythm descriptors, tonal descriptors, and so on.
- Spectral centroid: A measure (sound descriptor) used in digital signal processing to characterise a spectrum. Perceptually, it has a robust connection with the impression of brightness of a sound.
- Spectral flatness: A measure (sound descriptor) that quantifies how tonal or noisy a sound is. (Both spectral descriptors are computed in a sketch after this list.)
- Supervised learning: Supervised learning is the machine learning task of learning a function that maps an input to an output based on example input-output pairs. It infers a function from labeled training data consisting of a set of training examples.
- Tanh (activation function): The tanh activation function outputs values in the range (-1, 1) and is also sigmoidal (s-shaped).
- Test dataset: The test dataset is a dataset used to provide an unbiased evaluation of a final model fit on the training dataset.
- Training dataset: The model is initially fit on a training dataset, which is a set of examples used to fit the parameters (e.g. the weights of connections between neurons in an artificial neural network) of the model.
- Unsupervised learning: Unsupervised learning is a type of machine learning that looks for previously undetected patterns in a data set with no pre-existing labels and with a minimum of human supervision.
- Validation dataset: After training, the fitted model is used to predict the responses for the observations in a second dataset, called the validation dataset. The validation dataset provides an unbiased evaluation of a model fit on the training dataset while tuning the model's hyperparameters (e.g. the number and width of the hidden layers in a neural network). (The train/validation/test split is sketched after this list.)
- Variance: In probability theory and statistics, variance is the expectation of the squared deviation of a random variable from its mean; informally, it measures how far a set of numbers is spread out from their average value. You can find it among the values in the sound descriptors from Freesound.
- Weights (MLP): Weights are the parameters within a neural network that transform input data within the network's hidden layers. As an input enters a node, it is multiplied by a weight value, and the resulting output is either observed or passed to the next layer in the neural network.
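The sketches below illustrate several of the terms above in Python. They are minimal illustrations under stated assumptions, not code from any particular project. First, the three activation functions (ReLU, sigmoid, tanh), assuming NumPy is available; the function names are my own:

```python
import numpy as np

def relu(x):
    # Output the input directly if positive, zero otherwise.
    return np.maximum(0.0, x)

def sigmoid(x):
    # Squash any real input into a value between 0.0 and 1.0.
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    # Squash any real input into the range (-1, 1); also s-shaped.
    return np.tanh(x)

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(relu(x))     # [0.  0.  0.  0.5 2. ]
print(sigmoid(x))  # values strictly between 0 and 1
print(tanh(x))     # values strictly between -1 and 1
```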
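Mean and variance, computed directly from their definitions (the numbers are invented):

```python
import numpy as np

values = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
mean = values.sum() / len(values)         # sum divided by count: 5.0
variance = ((values - mean) ** 2).mean()  # mean squared deviation: 4.0
print(mean, variance)
```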
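Extracting MFCCs from an audio file, assuming the librosa library is installed; the file name and the choice of 13 coefficients are illustrative:

```python
import librosa

# "sound.wav" is a placeholder path; substitute your own file.
y, sr = librosa.load("sound.wav")                    # samples and sample rate
mfccs = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)  # 13 coefficients per frame
print(mfccs.shape)                                   # (13, number_of_frames)
```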
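Training an MLP by backpropagation is a supervised learning task: the model learns from example input-output pairs. A minimal sketch with scikit-learn, where the toy data, layer size, and other settings are arbitrary choices:

```python
from sklearn.neural_network import MLPClassifier

X = [[0, 0], [0, 1], [1, 0], [1, 1]]  # example inputs
y = [0, 0, 1, 1]                      # example labels (here, the first coordinate)

# One hidden layer of 8 nodes with tanh activation;
# fit() runs backpropagation internally.
model = MLPClassifier(hidden_layer_sizes=(8,), activation="tanh",
                      max_iter=2000, random_state=1)
model.fit(X, y)
print(model.predict([[1, 0]]))  # predicted class for one input
```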
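Node arithmetic: inside a node, each input is multiplied by its weight, the products are summed with the bias, and the result passes through an activation function. A toy example with invented numbers:

```python
import numpy as np

def node_output(inputs, weights, bias):
    z = np.dot(inputs, weights) + bias  # weighted sum of inputs plus bias
    return max(0.0, z)                  # ReLU activation

x = np.array([0.2, 0.7])   # inputs entering the node
w = np.array([0.5, -0.3])  # one weight per input
b = 0.1                    # bias value
# 0.2*0.5 + 0.7*(-0.3) + 0.1 = -0.01, which ReLU clamps to 0.0
print(node_output(x, w, b))
```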
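PCA with scikit-learn, reducing invented 3-dimensional points to 2 dimensions:

```python
import numpy as np
from sklearn.decomposition import PCA

X = np.array([[2.0, 0.1, 4.1],
              [1.0, 0.2, 2.1],
              [3.0, 0.1, 6.2],
              [4.0, 0.3, 8.0]])
pca = PCA(n_components=2)             # keep the 2 highest-variance directions
X_reduced = pca.fit_transform(X)
print(X_reduced.shape)                # (4, 2): 4 points, 2 variables each
print(pca.explained_variance_ratio_)  # share of variance each component retains
```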
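The two spectral descriptors, again assuming librosa and a placeholder file name:

```python
import librosa

y, sr = librosa.load("sound.wav")  # "sound.wav" is a placeholder path
centroid = librosa.feature.spectral_centroid(y=y, sr=sr)  # relates to brightness
flatness = librosa.feature.spectral_flatness(y=y)  # near 1.0 = noisy, near 0.0 = tonal
print(centroid.mean(), flatness.mean())
```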
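Finally, the three-way data split: carve off a test set first, then divide the remainder into training and validation sets. A sketch with scikit-learn, where the sizes and the random seed are arbitrary:

```python
import numpy as np
from sklearn.model_selection import train_test_split

X = np.arange(100).reshape(50, 2)  # 50 toy examples with 2 variables each
y = np.arange(50) % 2              # toy labels

X_rest, X_test, y_rest, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)  # 10 examples held out for testing
X_train, X_val, y_train, y_val = train_test_split(
    X_rest, y_rest, test_size=0.25, random_state=0)  # 10 of the remaining 40
print(len(X_train), len(X_val), len(X_test))  # 30 10 10
```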