Machine Learning By Andrew Ng - Week 4
Motivations
Non-linear Hypothesis
- Representation -
    Problem -
        For non-linear classification, the hypothesis is a high-order polynomial
        - If it is a quadratic function of 100 features, the hypothesis will have close to 5,000 features - the number of terms grows as O(n^2)
        - If it is a cubic function of 100 features, the hypothesis will have close to 170,000 features - the number of terms grows as O(n^3)
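As a rough check of those counts, here is a minimal sketch (Python standard library only; the helper `num_terms` is my own, not from the course) counting the distinct monomials of a given degree:

```python
# Count distinct monomials of exactly degree d in n features:
# combinations with repetition, C(n + d - 1, d).
from math import comb

def num_terms(n, d):
    return comb(n + d - 1, d)

n = 100
print(num_terms(n, 2))  # 5050   -> "close to 5,000" quadratic terms, O(n^2) growth
print(num_terms(n, 3))  # 171700 -> "close to 170,000" cubic terms, O(n^3) growth
```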
- Computer Vision -
    Why is it hard? - The computer sees only a matrix of pixel intensity values for the image
    How does it work? -
        We give the classifier labelled images of cars and labelled images that are not cars, and it trains on them
        We then give it a test image to predict whether it is a car or not a car
    If we have 50 x 50 pixel images, then n will be 2500 (grayscale) or 7500 (RGB)
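A minimal sketch (assuming NumPy; the images are random stand-ins) of where those feature counts come from:

```python
# Flattening a 50 x 50 image into a feature vector.
import numpy as np

gray = np.random.rand(50, 50)     # stand-in for a 50 x 50 grayscale image
rgb = np.random.rand(50, 50, 3)   # stand-in for a 50 x 50 RGB image

print(gray.reshape(-1).shape)     # (2500,) -> n = 2500
print(rgb.reshape(-1).shape)      # (7500,) -> n = 7500
```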
Neurons and the Brain
- Neural Networks - Origin: algorithms that try to mimic the brain
- Neural Rewiring Experiments -
    - If the auditory cortex is rewired to receive input from the eyes rather than the ears, it learns to see, i.e. to perform visual discrimination with that tissue
    - In the same way, if the somatosensory cortex is rewired to the eyes rather than the hands, it learns visual discrimination with that tissue
    - This suggests the brain uses "one learning algorithm": whatever input a region receives, it learns to perform that particular task
- Examples -
    - Seeing with your tongue
    - Human echolocation
    - Haptic belt
    - Implanting a third eye in a frog
Neural Networks
Model Representation 1
- Neurons -
    - There are input wires called 'dendrites'
    - There is an output wire called the 'axon'
    - There is also a cell body and a nucleus
    - Working of a neuron -
        - One neuron sends information to another neuron via electric pulses (called "spikes")
        - The axon terminal of one neuron is connected to the dendrites of the other neuron
- Neuron Model -
    - In our model, the dendrites are like the input features (x_1, ..., x_n) and the output is the result of our hypothesis function
    - The x_0 input node is sometimes called the "bias unit"; it is always equal to 1
    - Parameters are also called weights
    - Sigmoid (logistic) activation function - the activation function g(z) = 1 / (1 + e^(-z)) is the same function used as the hypothesis in logistic regression
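A minimal sketch (assuming NumPy; the weights and inputs are made up for illustration) of this single-neuron model, computing h_theta(x) = g(theta^T x):

```python
import numpy as np

def sigmoid(z):
    # The sigmoid (logistic) activation function g(z) = 1 / (1 + e^(-z)).
    return 1.0 / (1.0 + np.exp(-z))

theta = np.array([-1.0, 0.5, 0.5])  # weights, including the bias weight
x = np.array([1.0, 2.0, 3.0])       # inputs, with x_0 = 1 as the bias unit

print(sigmoid(theta @ x))           # the neuron's output, h_theta(x)
```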
- Artificial Neural Network -
    - The first layer is called the input layer (x)
    - The last layer is called the output layer (y)
    - Layers between the first and the last layer are called hidden layers
    - The first unit of each layer is the bias unit
    - a_i^(j) = "activation" of unit i in layer j
    - theta^(j) = matrix of weights controlling the function mapping from layer j to layer j + 1
    - If the network has s_j units in layer j and s_(j+1) units in layer j + 1, then theta^(j) will be of dimension s_(j+1) x (s_j + 1)
    - We apply each row of the parameters to our inputs to obtain the value for one activation node
    - Our hypothesis output is the logistic function applied to the sum of the values of our activation nodes, which have been multiplied by yet another parameter matrix theta^(2) containing the weights for our second layer of nodes
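A minimal sketch (assuming NumPy; the 3-unit and 5-unit layer sizes are made up) of the dimension rule and of applying each row of theta to the inputs:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

s1, s2 = 3, 5                          # units in layer 1 and layer 2
theta1 = np.random.randn(s2, s1 + 1)   # dimension s_(j+1) x (s_j + 1) = 5 x 4

x = np.concatenate(([1.0], np.random.randn(s1)))  # prepend bias unit x_0 = 1
a2 = sigmoid(theta1 @ x)               # each row of theta1 gives one activation
print(theta1.shape, a2.shape)          # (5, 4) (5,)
```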
Model Representation 2
- Forward Propagation -
    - Activation flows from the input layer to the output layer
    - Setting z^(j) = theta^(j-1) a^(j-1) and a^(j) = g(z^(j)) gives a vectorised implementation of the functions above
    - Notice that in this last step, between layer j and layer j + 1, we are doing exactly the same thing as we did in logistic regression
    - Adding all these intermediate layers in neural networks allows us to more elegantly produce interesting and more complex non-linear hypotheses
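A minimal sketch (assuming NumPy; the layer sizes and random weights are made up) of vectorised forward propagation, z^(j) = theta^(j-1) a^(j-1) and a^(j) = g(z^(j)):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(thetas, x):
    a = x
    for theta in thetas:
        a = np.concatenate(([1.0], a))  # add the bias unit a_0 = 1
        a = sigmoid(theta @ a)          # z = theta a, then a = g(z)
    return a                            # output-layer activation, h_theta(x)

thetas = [np.random.randn(5, 4), np.random.randn(1, 6)]  # a 3 -> 5 -> 1 network
print(forward(thetas, np.random.randn(3)))
```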
- Neural network learning its own features - the hidden layer learns its own features from the inputs, rather than being restricted to the raw features x_1, ..., x_n
- Other neural network architectures - networks may have more than one hidden layer, with each layer feeding into the next
Applications
Examples and Intuitions 1
- XOR / XNOR - non-linear functions that a single linear decision boundary cannot separate, but a small network can compute
- AND - A simple example of applying neural networks is predicting x_1 AND x_2, the logical 'and' operator, which is true only if both x_1 and x_2 are 1; the lecture uses the weights theta = [-30, 20, 20], as in the sketch below
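A minimal sketch (assuming NumPy) of that single-neuron AND gate with the lecture's weights:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

theta = np.array([-30.0, 20.0, 20.0])  # bias weight, then weights for x_1, x_2

for x1 in (0, 1):
    for x2 in (0, 1):
        h = sigmoid(theta @ np.array([1.0, x1, x2]))
        print(x1, x2, round(h))        # prints the truth table of x_1 AND x_2
```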
- OR -
    - Neural networks can also be used to simulate all the other logical gates
    - The following is an example of the logical operator 'OR', meaning either x_1 is true or x_2 is true, or both; the lecture uses the weights theta = [-10, 20, 20]
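The same single-neuron sketch (assuming NumPy) with the lecture's OR weights reproduces the OR truth table:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

theta = np.array([-10.0, 20.0, 20.0])  # OR weights from the lecture

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, round(sigmoid(theta @ np.array([1.0, x1, x2]))))  # x_1 OR x_2
```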
Examples and Intuitions 2
- Negation - to negate an input, put a large negative weight on it; e.g. NOT x_1 can be computed with weights theta = [10, -20]
- XNOR - combining the AND, (NOT x_1) AND (NOT x_2), and OR networks gives an XNOR operator, as in the sketch below
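A minimal sketch (assuming NumPy) of the lecture's two-layer XNOR network: the hidden units compute x_1 AND x_2 and (NOT x_1) AND (NOT x_2), and the output unit ORs them:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

theta1 = np.array([[-30.0, 20.0, 20.0],     # a_1 = x_1 AND x_2
                   [10.0, -20.0, -20.0]])   # a_2 = (NOT x_1) AND (NOT x_2)
theta2 = np.array([-10.0, 20.0, 20.0])      # h = a_1 OR a_2

for x1 in (0, 1):
    for x2 in (0, 1):
        a = sigmoid(theta1 @ np.array([1.0, x1, x2]))
        h = sigmoid(theta2 @ np.concatenate(([1.0], a)))
        print(x1, x2, round(h))             # prints the truth table of x_1 XNOR x_2
```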
- Intuition - each layer computes gradually more complex functions of the previous layer's outputs
Multiclass Classification
- One-vs-All -
    - To classify data into multiple classes, we let our hypothesis function return a vector of values
    - Say we wanted to classify our data into one of four categories; we will use the following example to see how this classification is done
    - This algorithm takes an image as input and classifies it accordingly
    - Our resulting hypothesis for one set of inputs may look like h_theta(x) = [0; 0; 1; 0], indicating that the input belongs to the third class
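A minimal sketch (assuming NumPy; the output values are made up) of reading off a one-vs-all prediction from the hypothesis vector:

```python
import numpy as np

h = np.array([0.1, 0.2, 0.9, 0.05])  # hypothesis output, approximately [0; 0; 1; 0]
print(np.argmax(h) + 1)              # 3 -> the input is assigned to the third class
```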