The delta rule: Learn from your mistakes



Date: 17.09.2018





If it ain’t broke, don’t fix it.



Outline

  • Supervised learning problem

  • Delta rule

  • Delta rule as gradient descent

  • Hebb rule



Supervised learning

  • Given examples of inputs paired with desired outputs, find a perceptron that reproduces them.



Example: handwritten digits

  • Find a perceptron that detects “two”s.



Delta rule

  • Learning from mistakes.

  • “delta”: difference between desired and actual output.

  • Also called “perceptron learning rule”
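A single update of the rule can be sketched in Python as follows. This is a minimal illustration, not the lecture's own code: the 0/1 output coding, the bias term `b`, the learning rate `eta`, and the function names are all assumptions introduced here.

```python
def step(u):
    """Threshold activation: 1 if u >= 0, else 0 (assumed output coding)."""
    return 1 if u >= 0 else 0


def delta_rule_update(w, b, x, y, eta=1.0):
    """Apply one delta-rule update for a single example (x, y).

    w, x : lists of floats (weights and inputs); b : bias (an assumption here)
    y    : desired output, 0 or 1
    Returns the new weights, the new bias, and delta.
    """
    u = sum(wi * xi for wi, xi in zip(w, x)) + b
    delta = y - step(u)  # desired minus actual output: 0, +1, or -1
    new_w = [wi + eta * delta * xi for wi, xi in zip(w, x)]
    new_b = b + eta * delta
    return new_w, new_b, delta
```

When the actual output already matches the desired output, `delta` is zero and the weights are returned unchanged, which is the "if it ain't broke, don't fix it" behavior.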



Two types of mistakes
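
The two cases can be written out as below. The notation is assumed rather than taken from the slide: input vector x, weight vector w, learning rate η, desired output y ∈ {0, 1}, and actual output H(w · x), where H is the step function.

```latex
\begin{align*}
\text{False negative } (y = 1,\ H(w \cdot x) = 0):&\quad \Delta w = +\eta\, x \\
\text{False positive } (y = 0,\ H(w \cdot x) = 1):&\quad \Delta w = -\eta\, x \\
\text{Correct } (y = H(w \cdot x)):&\quad \Delta w = 0
\end{align*}
```

Both mistake cases are covered by the single expression Δw = η [y − H(w · x)] x.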



Objective function
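
One per-example objective whose gradient reproduces the rule is sketched below. The slide's own formula is not recoverable from the text, so this form and its notation (step function H, learning rate η, desired output y ∈ {0, 1}) are assumptions.

```latex
\begin{align*}
e(w; x, y) &= \bigl[H(w \cdot x) - y\bigr]\,(w \cdot x) \;\ge\; 0, \\
-\eta\,\frac{\partial e}{\partial w} &= \eta\,\bigl[y - H(w \cdot x)\bigr]\,x
  \qquad \text{(treating } H(w \cdot x) \text{ as locally constant)}.
\end{align*}
```

Under this form, e is zero for a correctly classified example and equals |w · x| for a misclassified one, so it measures how wrong a mistake is rather than counting mistakes.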



Perceptron convergence theorem

  • Cycle through a set of examples.

  • Suppose a solution with zero error exists.

  • The perceptron learning rule finds a solution in finite time.
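
The procedure in the theorem can be tried on a small separable problem. The sketch below cycles through the four examples of logical AND until a full pass produces no mistakes; the data set, the bias term `b`, the epoch cap, and the function name are illustrative assumptions.

```python
def train_perceptron(examples, eta=1.0, max_epochs=100):
    """Cycle through the examples, applying the perceptron learning rule.

    Returns (w, b, epoch), where epoch is the first pass with zero
    mistakes, or None if max_epochs passes were not enough.
    """
    n = len(examples[0][0])
    w, b = [0.0] * n, 0.0
    for epoch in range(1, max_epochs + 1):
        mistakes = 0
        for x, y in examples:
            u = sum(wi * xi for wi, xi in zip(w, x)) + b
            out = 1 if u >= 0 else 0
            delta = y - out
            if delta != 0:  # update only on mistakes
                mistakes += 1
                w = [wi + eta * delta * xi for wi, xi in zip(w, x)]
                b += eta * delta
        if mistakes == 0:
            return w, b, epoch
    return w, b, None


# Logical AND is linearly separable, so a zero-error solution exists.
AND_EXAMPLES = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
w, b, epochs = train_perceptron(AND_EXAMPLES)
```

The theorem guarantees that the loop stops after finitely many updates on separable data; it does not say which of the many zero-error solutions is found.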



If examples are nonseparable

  • The delta rule does not converge.

  • Objective function is not equal to the number of mistakes.

  • No reason to believe that the delta rule minimizes the number of mistakes.
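
Non-convergence can be seen on the smallest nonseparable problem, XOR. The sketch below records the number of mistakes in each pass; the data set, the bias term `b`, the number of passes, and the function name are assumptions chosen for illustration.

```python
def mistakes_per_epoch(examples, epochs, eta=1.0):
    """Run the delta rule for a fixed number of passes over the examples.

    Returns a list holding the number of mistakes made in each pass.
    """
    w, b = [0.0] * len(examples[0][0]), 0.0
    history = []
    for _ in range(epochs):
        mistakes = 0
        for x, y in examples:
            u = sum(wi * xi for wi, xi in zip(w, x)) + b
            delta = y - (1 if u >= 0 else 0)
            if delta != 0:
                mistakes += 1
                w = [wi + eta * delta * xi for wi, xi in zip(w, x)]
                b += eta * delta
        history.append(mistakes)
    return history


# XOR cannot be computed by a single perceptron.
XOR_EXAMPLES = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 0)]
history = mistakes_per_epoch(XOR_EXAMPLES, epochs=50)
```

Every entry of `history` is at least 1: a pass with no mistakes would leave the weights fixed at a zero-error solution, and none exists for XOR. The weights therefore keep changing indefinitely.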



Memorization & generalization

  • Prescription: minimize error on the training set of examples

  • What is the error on a test set of examples?

  • Vapnik-Chervonenkis theory

    • assumption: examples are drawn from a probability distribution
    • conditions for generalization


Contrast with Hebb rule

  • Assume that the teacher can drive the perceptron to produce the desired output.

  • What are the objective functions?
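
A possible answer is sketched below, under assumptions not fixed by the slide text: the teacher clamps the output to the desired value y, H is the step function, and η is the learning rate. Each rule is written as gradient descent on a per-example objective.

```latex
\begin{align*}
\text{Hebb rule:}\quad
  \Delta w &= \eta\, y\, x
  = -\eta\,\frac{\partial}{\partial w}\Bigl[-\,y\,(w \cdot x)\Bigr], \\
\text{Delta rule:}\quad
  \Delta w &= \eta\,\bigl[y - H(w \cdot x)\bigr]\,x
  = -\eta\,\frac{\partial}{\partial w}\Bigl[\bigl(H(w \cdot x) - y\bigr)(w \cdot x)\Bigr]
  \quad \text{(with } H \text{ held locally constant)}.
\end{align*}
```

In this form the Hebbian objective −y (w · x) is unbounded below, so the weights keep growing unless they are constrained, while the delta-rule objective is bounded below by zero and the updates stop once the example is classified correctly.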



Is the delta rule biological?

  • Actual output: anti-Hebbian

  • Desired output: Hebbian

  • Contrastive
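
The three bullets correspond to splitting the update into two terms. The notation is assumed: desired output y, actual output H(w · x), learning rate η.

```latex
\[
\Delta w
  = \underbrace{\eta\, y\, x}_{\text{Hebbian, desired output}}
  \;-\; \underbrace{\eta\, H(w \cdot x)\, x}_{\text{anti-Hebbian, actual output}}
\]
```

The rule is contrastive in the sense that the two terms cancel exactly when the actual output equals the desired output.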



Objective function



Supervised vs. unsupervised

  • Classification vs. generation

  • I shall not today attempt further to define the kinds of material [pornography] … but I know it when I see it.

    • Justice Potter Stewart


Smooth activation function

  • Same as the delta rule, except for a factor of the slope of f.

  • The update is small when the argument of f has large magnitude.
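
The effect of the slope factor can be checked numerically. The sketch below assumes a logistic sigmoid for f, which the slides do not specify; the learning rate `eta` and the function names are likewise illustrative.

```python
import math


def sigmoid(u):
    """Logistic function, used here as an assumed choice of smooth f."""
    return 1.0 / (1.0 + math.exp(-u))


def smooth_delta_update(w, x, y, eta=0.1):
    """One delta-rule update with a smooth activation.

    The update is eta * (y - f(u)) * f'(u) * x, with u = w . x.
    Returns the new weights and the scalar factor multiplying x.
    """
    u = sum(wi * xi for wi, xi in zip(w, x))
    f = sigmoid(u)
    slope = f * (1.0 - f)  # derivative of the logistic function at u
    factor = eta * (y - f) * slope
    return [wi + factor * xi for wi, xi in zip(w, x)], factor


# Same desired output (0), same input; only the argument of f differs.
_, factor_near_zero = smooth_delta_update([0.0], [1.0], 0)   # u = 0
_, factor_saturated = smooth_delta_update([10.0], [1.0], 0)  # u = 10
```

At u = 10 the output is badly wrong, yet `factor_saturated` is far smaller in magnitude than `factor_near_zero`, because the slope of f is nearly zero in the saturated region.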



Objective function

  • Gradient update

  • Stochastic gradient descent on the objective function E

  • E=0 means zero error.
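
The gradient update can be derived for one common choice of objective, the squared error. That choice is an assumption, since the slide's formula is not recoverable; it is consistent with the slope factor in the smooth-activation rule and with E = 0 meaning zero error. Here u = w · x, f is the smooth activation, y is the desired output, and η is the learning rate.

```latex
\begin{align*}
E &= \tfrac{1}{2}\,\bigl[y - f(u)\bigr]^2, \qquad u = w \cdot x, \\
\frac{\partial E}{\partial w_i} &= -\bigl[y - f(u)\bigr]\, f'(u)\, x_i, \\
\Delta w_i &= -\eta\,\frac{\partial E}{\partial w_i}
  = \eta\,\bigl[y - f(u)\bigr]\, f'(u)\, x_i .
\end{align*}
```

Applying this update to one example at a time, rather than to the sum of E over all examples, is what makes the descent stochastic.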



Smooth activation functions are important for generalizing the delta rule to multilayer perceptrons.


