# Machine learning
Understand the theoretical basis for neural networks and practise with the tools used to build deep learning models.
## Covers...
- Feed Forward Neural Networks
- Convolutional Neural Networks
- Recurrent Neural Networks
- Autoencoders
- Reinforcement Learning
- Attention (through transformers)
## Definitions
- Neural network: model composed of many artificial neurons that takes in training examples as input and infers rules to arrive at a specified output (accuracy increases as the number of training examples grows)
- Artificial neuron: basic building block of neural networks, of which there are 2 main types
- Perceptron
- older model developed in the 1950s and 1960s by Frank Rosenblatt
- each perceptron receives one or more binary inputs and returns a single binary output
- the binary output is determined by whether the weighted sum of the inputs surpasses a designated threshold value
- the perceptron model is too brittle for learning, since even a small $\Delta$ in a perceptron's weight can flip its binary output entirely
- Sigmoid neuron
- deep learning models required an artificial neuron in which a small $\Delta$ in a weight results in a correspondingly small $\Delta$ in the output
- each sigmoid neuron receives one or more floating-point inputs and returns a single floating-point output
- the output is calculated by applying the sigmoid function $\sigma(z) = \frac{1}{1 + e^{-z}}$ to the weighted sum of the inputs plus the bias (a NumPy sketch of both neuron types follows this list)
- Weights: real numbers ($\mathbb{R}$) that express the importance each input has to the corresponding output
- Bias: the negative of the threshold value, so that a neuron fires exactly when $w \cdot x + b > 0$
- Input layer: first layer of neurons in a neural network, which receives the raw input fed to the model
- Hidden layer(s): any number of intermediary layers of neurons in a neural network whose outputs are fed as inputs to the next layer of neurons
- Output layer: final layer in a neural network, whose neurons' outputs form the returned value of the entire model
- Feed forward neural network: output from one layer is input for another layer, modelled mathematically as $f(g(h(x)))$ where information is only fed forward
- the more widely used architecture for deep learning models
- Recurrent neural network: output from one layer is fed as staggered input to the same layer, modelled mathematically as $f(f(f(x)))$ where recursive feedback loops are allowed
- less widely used for deep learning models
- more accurately simulates how the human brain handles and reinforces information
- Machine learning: process by which machines learn to perform tasks they were not explicitly programmed for, of which there are 4 variants
- supervised: model trains on inputs paired with known outputs and learns to predict the desired output for new inputs
- unsupervised: model takes in unlabelled input and derives/describes patterns observed in the data
- parametric: model has a fixed number of parameters, regardless of how many training examples it sees
- non-parametric: model's number of parameters is unspecified and can grow with the amount of training data
> [!NOTE]
> Machine learning models are either Parametric OR Non-parametric, and either Supervised OR Unsupervised.
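To make the neuron definitions concrete, here is a minimal NumPy sketch of a perceptron, a sigmoid neuron, a feed forward composition $f(g(h(x)))$, and a recurrent composition $f(f(f(x)))$. The function names, weights, and layer sizes are illustrative assumptions, not a standard API; the NAND weights follow the example in Nielsen's Neural Networks and Deep Learning.

```python
import numpy as np

def perceptron(x, w, b):
    """Binary output: fires iff the weighted sum plus bias exceeds 0."""
    return 1 if np.dot(w, x) + b > 0 else 0

def sigmoid(z):
    """Squashes any real number smoothly into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_neuron(x, w, b):
    """Floating-point output: a small change in w or b now yields a
    correspondingly small change in the output."""
    return sigmoid(np.dot(w, x) + b)

# A perceptron computing NAND (weights taken from Nielsen's book).
for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, perceptron(np.array(x), w=np.array([-2.0, -2.0]), b=3.0))

# Feed forward composition f(g(h(x))): each layer's output becomes the
# next layer's input; information only flows forward.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)  # hidden layer 1
W2, b2 = rng.normal(size=(2, 4)), np.zeros(2)  # hidden layer 2
W3, b3 = rng.normal(size=(1, 2)), np.zeros(1)  # output layer
x = np.array([0.5, -1.0, 2.0])
y = sigmoid(W3 @ sigmoid(W2 @ sigmoid(W1 @ x + b1) + b2) + b3)
print("feed forward output:", y)

# Recurrent composition f(f(f(x))): the same layer is reapplied,
# feeding its own previous output back in alongside each new input.
Wh, Wx = rng.normal(size=(4, 4)), rng.normal(size=(4, 3))
h = np.zeros(4)
for x_t in [x, x, x]:  # a length-3 input sequence
    h = sigmoid(Wh @ h + Wx @ x_t)
print("recurrent state:", h)
```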
- Mean squared error (MSE): measures the degree of inaccuracy of predicted outputs against the actual outputs, $\text{MSE} = \frac{1}{n} \sum_{i=1}^{n} (\hat{y}_i - y_i)^2$
- Gradient descent: attributes error by computing how much each neuron's weight contributes to a non-zero MSE, then tweaking the weights in the direction that decreases the MSE, in one of three ways (see the sketch after the tip below)
- Full gradient descent: the neural network calculates the AVERAGE gradient over the entire training example dataset, and the weights are tweaked only after the FULL AVERAGE has been computed
- Stochastic gradient descent: repeatedly iterates through the entire training example dataset, tweaking the weights after EACH training example based on its error, until a weight configuration that works for ALL training examples is arrived at
- Batch gradient descent: a BATCH size of $n$ is specified beforehand, and the neural network updates the weights after every $n$ training examples have been fed to the model as input
> [!TIP]
> Most deep learning training setups can be modelled as matrix operations using NumPy and pandas.
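The three gradient descent variants differ only in how many training examples are consumed per weight update. Here is a minimal sketch, assuming a toy one-weight linear model $\hat{y} = wx$ and a made-up dataset; nothing here is from a library beyond NumPy itself.

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy dataset: y = 3x + noise; we fit a single weight w to minimise MSE.
X = rng.uniform(-1, 1, size=100)
y = 3.0 * X + rng.normal(scale=0.1, size=100)

def mse(w):
    return np.mean((w * X - y) ** 2)

def grad(w, xs, ys):
    """Gradient of the MSE with respect to w over the examples (xs, ys)."""
    return np.mean(2 * (w * xs - ys) * xs)

lr, epochs = 0.1, 50

# Full gradient descent: one update per pass, using the AVERAGE
# gradient over the whole dataset.
w = 0.0
for _ in range(epochs):
    w -= lr * grad(w, X, y)
print("full:       w =", w, " mse =", mse(w))

# Stochastic gradient descent: one update per training example.
w = 0.0
for _ in range(epochs):
    for xi, yi in zip(X, y):
        w -= lr * grad(w, xi, yi)
print("stochastic: w =", w, " mse =", mse(w))

# Batch gradient descent: one update per batch of n examples.
w, n = 0.0, 10
for _ in range(epochs):
    for i in range(0, len(X), n):
        w -= lr * grad(w, X[i:i + n], y[i:i + n])
print("batch n=10: w =", w, " mse =", mse(w))
```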
- Natural language processing (NLP): parses text for the following three purposes (a toy sketch of the first follows this list)
- Label a region of text (speech tagging, sentiment classification, named-entity recognition)
- Link 2 or more regions of text (co-reference)
- Fill in missing information based on context
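As a toy illustration of the first purpose, labelling a region of text, here is a minimal sentiment-classification sketch using bag-of-words features and logistic regression in NumPy. The corpus, vocabulary, and hyperparameters are made up purely for this example.

```python
import numpy as np

# Tiny labelled corpus (hypothetical data, purely illustrative).
texts = ["good great film", "great acting", "bad boring film",
         "boring bad plot", "good plot great", "bad acting"]
labels = np.array([1, 1, 0, 0, 1, 0])  # 1 = positive, 0 = negative

# Bag-of-words features: one column per vocabulary word.
vocab = sorted({w for t in texts for w in t.split()})
X = np.array([[t.split().count(w) for w in vocab] for t in texts], float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Logistic regression trained by full gradient descent.
w, b, lr = np.zeros(len(vocab)), 0.0, 0.5
for _ in range(500):
    p = sigmoid(X @ w + b)               # predicted P(positive) per text
    w -= lr * X.T @ (p - labels) / len(texts)
    b -= lr * np.mean(p - labels)

# Label an unseen region of text.
test = "great film"
x = np.array([test.split().count(wd) for wd in vocab], float)
print(test, "->", "positive" if sigmoid(x @ w + b) > 0.5 else "negative")
```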
## More on
- Neural networks YouTube playlist by 3Blue1Brown
- But what is a convolution? by 3Blue1Brown
- No bullshit guide to math and physics by Ivan Savov
- No bullshit guide to linear algebra by Ivan Savov
- Data Science from Scratch: First Principles with Python by Joel Grus
- The StatQuest Illustrated Guide To Machine Learning by Josh Starmer
- Neural Networks and Deep Learning by Michael Nielsen
- Grokking Deep Learning by Andrew W. Trask
- The Hundred-page Machine Learning Book by Andriy Burkov
- Stanford CS25: V2 Introduction to Transformers by Andrej Karpathy
- How to learn machine learning as a complete beginner: a self-study guide