Machine Learning by Andrew Ng - Week 4
Motivations
Non-linear Hypothesis
- Representation
- Problem
- For non-linear classification, the hypothesis is a high-order polynomial
- With a quadratic function of 100 features, the hypothesis will have close to 5,000 features
- The number of quadratic terms grows as O( n^2 )
- With a cubic function of 100 features, the hypothesis will have close to 170,000 features
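As a rough check on these counts, here is a minimal sketch that counts the distinct polynomial terms of each degree ( Python/itertools is my choice of tooling, not the course's; the lectures use Octave ):

```python
from itertools import combinations_with_replacement

def num_poly_terms(n_features, degree):
    # Count distinct monomials of exactly `degree`, e.g. x1*x2 or x3^2 for degree 2
    return sum(1 for _ in combinations_with_replacement(range(n_features), degree))

print(num_poly_terms(100, 2))  # 5050   ~ n^2 / 2, the "close to 5,000" quadratic case
print(num_poly_terms(100, 3))  # 171700 ~ n^3 / 6, the "close to 170,000" cubic case
```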
Computer Vision
- Why is it hard?
- The computer only sees the matrix of pixel intensity values of the image
- How does it work?
- We give the classifier images of cars and images of non-cars, each with labels, and it trains on them.
- We then give it a test image and it predicts whether it is a car or not.
- If we have 50 x 50 pixel images, then n will be 2500 ( greyscale ) or 7500 ( RGB )
Neurons and the Brain
- Neural Networks
- Origin: Algorithms that try to mimic the brain
- Neural Rewiring Experiments
- Rewiring the auditory cortex to receive input from the eyes rather than the ears, the animal learns to see, i.e. to perform visual discrimination with that tissue
- In the same way, rewiring the somatosensory cortex to receive input from the eyes rather than the hands, it learns to perform visual discrimination with that tissue
- This suggests the brain uses "one learning algorithm": whatever input a region receives, it adapts and learns to perform that particular task.
- Examples
- Seeing with your tongue
- Human echolocation
- Haptic belt
- Implanting a third eye in a frog
Neural Networks
Model Representation 1
- Neurons
- There are input wires called 'dendrites'.
- There is an output wire called the 'axon'.
- There is also a cell body and a nucleus.
- Working of a Neuron
- One neuron sends information to another neuron via pulses of electricity ( called "spikes" )
- The axon terminal of one neuron is connected to the dendrites of the other neuron
- Neuron Model
- In our model, the dendrites are like the input features ( x_1, ..., x_n ) and the output is the result of our hypothesis function.
- In this model the x_0 input node is sometimes called the "bias unit." It is always equal to 1.
- Parameters are also called weights
- Sigmoid ( logistic ) activation function
- The activation function is the same sigmoid as the logistic regression hypothesis: g(z) = 1 / ( 1 + e^(-z) )
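A minimal sketch of this activation function ( NumPy is my assumption; the course itself works in Octave ):

```python
import numpy as np

def sigmoid(z):
    # Sigmoid / logistic activation: g(z) = 1 / (1 + e^(-z))
    return 1.0 / (1.0 + np.exp(-z))

print(sigmoid(0.0))    # 0.5
print(sigmoid(10.0))   # ~1.0 (saturates for large positive z)
print(sigmoid(-10.0))  # ~0.0 (saturates for large negative z)
```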
- Artificial Neural Network
- The first layer is called the input layer ( x )
- The last layer is called the output layer ( y )
- The layers between the first and the last layer are called hidden layers
- The first unit of each layer is the bias unit
- a_i^(j) = "activation" of unit i in layer j
- Θ^(j) = matrix of weights controlling the function mapping from layer j to layer j + 1
- If the network has s_j units in layer j and s_(j+1) units in layer j + 1, then Θ^(j) will be of dimension s_(j+1) x ( s_j + 1 ); e.g. with s_1 = 2 and s_2 = 4, Θ^(1) is 4 x 3
- We apply each row of the parameters to our inputs to obtain the value for one activation node.
- Our hypothesis output is the logistic function applied to the sum of the values of our activation nodes, which have been multiplied by yet another parameter matrix Θ^(2) containing the weights for our second layer of nodes.
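A minimal sketch of that two-step computation for a hypothetical 3-input, 3-hidden-unit, 1-output network ( the shapes and random weights are my assumptions ):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([1.0, 0.5, -1.2])   # input features x_1..x_3
Theta1 = np.random.randn(3, 4)   # layer 1 -> 2: s_2 x (s_1 + 1) = 3 x 4
Theta2 = np.random.randn(1, 4)   # layer 2 -> 3: s_3 x (s_2 + 1) = 1 x 4

a1 = np.concatenate(([1.0], x))  # prepend bias unit x_0 = 1
a2 = sigmoid(Theta1 @ a1)        # each row of Theta1 yields one activation node
a2 = np.concatenate(([1.0], a2)) # prepend bias unit a_0^(2) = 1
h = sigmoid(Theta2 @ a2)         # hypothesis output h_Theta(x)
print(h)
```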
Model Representation 2
- Forward Propagation
- Activations flow from the input layer to the output layer
- The vectorised implementation of the above computation: z^(j) = Θ^(j-1) a^(j-1) and a^(j) = g( z^(j) )
- Notice that in this last step, between layer j and layer j + 1, we are doing exactly the same thing as we did in logistic regression.
- Adding all these intermediate layers in neural networks allows us to more elegantly produce interesting and more complex non-linear hypotheses.
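Putting the vectorised steps together, a minimal sketch of forward propagation over an arbitrary list of weight matrices ( the layer sizes below are made up for illustration ):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward_propagate(x, thetas):
    # a^(1) = x; then repeatedly z^(j) = Theta^(j-1) a^(j-1), a^(j) = g(z^(j))
    a = x
    for theta in thetas:
        a = np.concatenate(([1.0], a))  # add the bias unit before each mapping
        a = sigmoid(theta @ a)
    return a  # activations of the output layer

# hypothetical architecture: 3 inputs -> 5 hidden units -> 1 output
thetas = [np.random.randn(5, 4), np.random.randn(1, 6)]
print(forward_propagate(np.array([1.0, 0.5, -1.2]), thetas))
```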
- Neural Network learning its own features
- Other Neural Network architectures
Applications
Examples and Intuitions 1
- XOR / XNOR
- AND
- A simple example of applying neural networks is predicting x_1 AND x_2, the logical 'and' operator, which is only true if both x_1 and x_2 are 1 ( see the sketch after the OR example below )
- OR
- Neural networks can also be used to simulate all the other logical gates.
- The following is an example of the logical operator 'OR', meaning either x_1 is true or x_2 is true, or both
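A minimal sketch of these single-neuron gates, using the weight values from the lecture ( -30, 20, 20 for AND; -10, 20, 20 for OR ); NumPy is my assumption:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def neuron(theta, x1, x2):
    # Single neuron: g(theta_0 + theta_1 * x1 + theta_2 * x2); leading 1.0 is the bias unit
    return sigmoid(theta @ np.array([1.0, x1, x2]))

AND = np.array([-30.0, 20.0, 20.0])  # ~1 only when x1 = x2 = 1
OR  = np.array([-10.0, 20.0, 20.0])  # ~1 when either input is 1

for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x1, x2, round(neuron(AND, x1, x2)), round(neuron(OR, x1, x2)))
```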
Examples and Intuitions 2
- Negation
- Negation is achieved by putting a large negative weight on the variable to be negated, e.g. Θ = [ 10, -20 ] computes NOT x_1
- XNOR
- Combining the AND, ( NOT x_1 ) AND ( NOT x_2 ), and OR functions gives us an XNOR operator
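A minimal sketch of that combination, reusing the single-neuron helper from the AND/OR example; 10, -20, -20 are the lecture's weights for ( NOT x_1 ) AND ( NOT x_2 ):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def neuron(theta, x1, x2):
    return sigmoid(theta @ np.array([1.0, x1, x2]))

AND      = np.array([-30.0, 20.0, 20.0])
NOT_BOTH = np.array([ 10.0, -20.0, -20.0])  # (NOT x1) AND (NOT x2)
OR       = np.array([-10.0, 20.0, 20.0])

def xnor(x1, x2):
    a1 = neuron(AND, x1, x2)       # hidden unit a_1^(2)
    a2 = neuron(NOT_BOTH, x1, x2)  # hidden unit a_2^(2)
    return neuron(OR, a1, a2)      # output layer: a_1 OR a_2

for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x1, x2, round(xnor(x1, x2)))  # prints 1, 0, 0, 1: true when x1 == x2
```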
- Intuition
Multiclass Classification
- One-vs-All
- To classify data into multiple classes, we let our hypothesis function return a vector of values.
- Say we wanted to classify our data into one of four categories ( in the lecture: pedestrian, car, motorcycle, truck ). We will use the following example to see how this classification is done.
- This algorithm takes as input an image and classifies it accordingly:
- Our resulting hypothesis for one set of inputs may look like h_Θ(x) = [ 0 0 1 0 ]^T, meaning the input belongs to the third class ( motorcycle ).
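A minimal sketch of reading off the predicted class from such an output vector ( the activation values here are made up for illustration ):

```python
import numpy as np

classes = ["pedestrian", "car", "motorcycle", "truck"]  # the lecture's four categories

h = np.array([0.01, 0.03, 0.97, 0.02])  # hypothetical output activations, roughly [0 0 1 0]
print(classes[int(np.argmax(h))])       # "motorcycle": pick the unit with the largest activation
```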