Geoffrey Hinton (Google Scientific Advisor) – Neural Networks for Machine Learning, Lecture 1/16 (Dec 2016)


Chapters

00:00:29 Machine Learning: From Basic Concepts to Advanced Examples
00:10:55 Deep Neural Networks Revolutionizing Speech Recognition
00:13:21 How Neurons Work
00:17:55 Neural Networks: Learning and Flexibility in the Brain
00:22:00 Neurons in Neural Networks: Linear, Binary Threshold, and Rectified Linear
00:27:19 Neurons and Learning Algorithms in Artificial Neural Networks
00:34:21 Types of Machine Learning: Supervised, Reinforcement, and Unsupervised
00:40:12 Unsupervised Learning: Goals and Techniques

Abstract

The Evolution and Applications of Machine Learning: From Patterns to Brain-like Computation




Machine learning, a branch of computer science that enables computers to learn without explicit programming, has significantly impacted various domains. Its proficiency lies in identifying patterns and irregularities in data, which is crucial in image and sensor pattern recognition, predictive analysis, and anomaly detection.



Machine learning algorithms are broadly categorized into supervised and unsupervised learning. Supervised learning algorithms use labeled data to learn a function mapping input data to output labels, while unsupervised learning algorithms uncover structures and patterns in unlabeled data. A classic example of supervised learning is the MNIST database of 70,000 grayscale images of handwritten digits, serving as a benchmark for machine learning performance.

Neural networks, drawing inspiration from the human brain, have advanced object recognition capabilities, successfully identifying various object classes under different conditions. In speech recognition, deep neural networks have improved accuracy by excelling at acoustic modeling. A notable result in this field is the system of George Dahl and Abdel-rahman Mohamed, which achieved a 20.7% error rate, surpassing the previous best of 24.4%.

The shift from artificial neural networks to biological neurons reveals a different approach to information processing. Neurons, the brain’s fundamental units, are vastly different from traditional serial processors, and synapses, the junctions between neurons, are key to learning and complex computations. These synapses offer compactness, efficiency, and adaptability but understanding their role in learning is a continuing challenge.

The human brain’s computational power is immense, with its network of neurons and synaptic weights facilitating high-bandwidth knowledge processing. The cortex is particularly flexible, capable of adapting to new sensory inputs and relocating functions in case of damage. This contrasts with conventional computers, which depend on fast central processors and stored programs for flexibility. The brain’s efficiency lies in its parallel computation and flexibility through learned synaptic weights.

Machine learning encompasses various types, including supervised, reinforcement, and unsupervised learning, each addressing unique challenges. Unsupervised learning, for instance, aims to transform high-dimensional inputs into more economical codes. The field seeks to develop internal representations that facilitate learning, moving beyond traditional clustering.

The article concludes by emphasizing the evolution of machine learning from simple pattern recognition to emulating brain-like computation. Despite the use of simplified neuron models for foundational understanding, ongoing research continues to reveal the complexities and capabilities of this expanding field.

Neurons and Communication:

Neurons in the brain receive inputs from other neurons or receptors and communicate through spikes of activity. Synaptic weights, which can be positive or negative, control the impact of these inputs.

Weight Adaptation and Learning:

By adjusting synaptic weights, the network learns to perform tasks like object recognition, language comprehension, and body movement control.

Storage Capacity:

The brain’s approximately 10^11 neurons, each with about 10^4 synaptic weights, yield on the order of 10^15 weights in total, giving immense bandwidth for stored knowledge.

Modularity and Functional Localization:

The cortex’s modularity means different parts learn different functions. Inputs from the senses influence the functionality of specific cortex areas. Local brain damage affects particular functions, such as language comprehension or object recognition.

Brain Scanning and Function Location:

Brain scanners reveal active brain areas during specific tasks through blood flow visualization. The cortex’s uniform appearance suggests a universal learning algorithm.

Functional Relocation and Adaptability:

Early brain damage can lead to functional relocation, showing the brain’s adaptability in function assignment. Experiments with baby ferrets demonstrate this adaptability through the learning of new functions when sensory inputs are rerouted.

General-Purpose Cortex and Flexibility:

The cortex transforms into specialized hardware for specific tasks based on experience, combining rapid parallel computation and flexibility for new function learning.

Comparison with Conventional Computers:

In contrast to conventional computers that rely on stored sequential programs, the brain’s flexibility stems from adapting synaptic weights, facilitating parallel computation and efficient learning.

Understanding Neural Idealization

Idealization simplifies complex systems like the brain, making them manageable for study. This approach allows the application of mathematics and familiar system analogies, with the understanding that models, while imperfect, can be valuable.

Linear Neurons

Linear neurons produce outputs based on a function of bias, weighted activities from input lines, and synaptic weights. The input-output relationship is represented graphically as a straight line.
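As a sketch, the linear neuron above can be written in a few lines of NumPy; the function name and example values are illustrative, not from the lecture:

```python
import numpy as np

def linear_neuron(x, w, b):
    """Output y = b + sum_i x_i * w_i: a straight line in each input."""
    return b + np.dot(x, w)

# Example: two inputs with weights 0.5 and -1.0 and bias 2.0.
y = linear_neuron(np.array([1.0, 3.0]), np.array([0.5, -1.0]), 2.0)
print(y)  # 2.0 + 0.5 - 3.0 = -0.5
```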

Binary Threshold Neurons

McCulloch and Pitts introduced binary threshold neurons, influencing the design of universal computers. These neurons compute a weighted sum of inputs, triggering a spike of activity if a threshold is exceeded. They initially represented the brain as a logic-based system, but current understanding views it as a combiner of various unreliable evidence sources.
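A minimal sketch of a McCulloch–Pitts binary threshold unit, assuming the threshold formulation described above (names and numbers are illustrative):

```python
import numpy as np

def binary_threshold_neuron(x, w, theta):
    """Spike (output 1) if the weighted sum of inputs reaches threshold theta, else 0.
    Equivalently, with a bias b = -theta: spike if b + sum_i x_i w_i >= 0."""
    z = np.dot(x, w)
    return 1 if z >= theta else 0

print(binary_threshold_neuron([1, 1], [0.6, 0.6], 1.0))  # 1 (1.2 >= 1.0)
print(binary_threshold_neuron([1, 0], [0.6, 0.6], 1.0))  # 0 (0.6 < 1.0)
```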

Rectified Linear Neurons (ReLUs)

ReLUs combine linear and binary threshold neurons’ features. They compute a linear weighted sum of inputs and apply a non-linear function to the result. Their input-output curve is linear above zero and zero otherwise, incorporating linear system properties while allowing for decisive outputs.
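The rectified linear unit's piecewise behavior is simple enough to sketch directly (names and example inputs are illustrative):

```python
import numpy as np

def relu(x, w, b):
    """Compute z = b + x.w, then output z if z > 0 and 0 otherwise:
    linear above zero, flat at zero below it."""
    z = b + np.dot(x, w)
    return max(z, 0.0)

print(relu([2.0, 1.0], [1.0, -0.5], 0.0))  # z = 1.5 > 0, so output 1.5
print(relu([2.0, 1.0], [-1.0, 0.5], 0.0))  # z = -1.5 < 0, so output 0.0
```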

Logistic Neurons

Logistic neurons, commonly used in artificial neural networks, output real values as a smooth function of their total input. The logistic function, ranging from 0 to 1, is used for output computation, with its derivatives being smooth and continuous, aiding learning algorithms.
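A sketch of the logistic neuron, assuming the standard logistic function described above (the function name is illustrative):

```python
import math

def logistic_neuron(x, w, b):
    """Output y = 1 / (1 + exp(-z)) with z = b + sum_i x_i w_i; smooth, in (0, 1)."""
    z = b + sum(xi * wi for xi, wi in zip(x, w))
    return 1.0 / (1.0 + math.exp(-z))

# The derivative dy/dz = y * (1 - y) is smooth and continuous,
# which is what gradient-based learning algorithms rely on.
print(logistic_neuron([0.0], [1.0], 0.0))  # 0.5 at total input z = 0
```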

Stochastic Binary Neurons

These neurons use logistic functions to calculate the probability of producing a spike, outputting a binary result based on probabilistic decisions. They introduce randomness in spike production timing, akin to a Poisson process.
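A sketch of a stochastic binary neuron, treating the logistic output as a spike probability as described above (names are illustrative):

```python
import math
import random

def stochastic_binary_neuron(x, w, b, rng=random):
    """Compute p via the logistic function, then emit a spike (1) with probability p."""
    z = b + sum(xi * wi for xi, wi in zip(x, w))
    p = 1.0 / (1.0 + math.exp(-z))
    return 1 if rng.random() < p else 0

# With a large positive total input the unit spikes almost every time.
spikes = sum(stochastic_binary_neuron([5.0], [1.0], 0.0) for _ in range(1000))
print(spikes)  # almost always above 980, since logistic(5) is about 0.993
```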

Stochastic Rectified Linear Units

Unlike the deterministic ReLUs above, these units treat the rectified linear output (when it is positive) as the rate at which spikes are produced, so the spike timing is random even though the underlying rate is determined by the input.

Example of Machine Learning:

A simple neural network trained to recognize handwritten shapes has input neurons representing pixel intensities and output neurons for shape classes. Pixels vote for shapes, with the most votes determining recognition.

Weight Display and Learning Algorithm:

The network’s weights are visualized in a map, with each output unit’s connections represented by black and white blobs indicating magnitude and sign. A learning algorithm adjusts weights based on training examples, incrementing weights for active pixels to the correct class and decrementing for the guessed class.
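The update rule described above can be sketched as follows; the tiny two-class example and all names are hypothetical. Note that when the guess is already correct, the two updates cancel and the weights are unchanged:

```python
import numpy as np

def train_step(W, pixels, correct_class):
    """One update of the simple algorithm from the lecture:
    add the active pixels to the correct class's incoming weights,
    and subtract them from the weights of the class the network guessed."""
    guess = np.argmax(W @ pixels)   # each class's total "votes" from its pixels
    W[correct_class] += pixels      # reinforce the right template
    W[guess] -= pixels              # weaken the guessed template
    return W

# Tiny illustration: 2 classes over 4 pixels.
W = np.zeros((2, 4))
W[1] = 0.5                          # class 1 starts ahead, so the net guesses wrong
x = np.array([1.0, 0.0, 1.0, 0.0])  # an "image" belonging to class 0
W = train_step(W, x, correct_class=0)
print(W[0])  # [ 1.   0.   1.   0. ]  -> active pixels now vote for class 0
print(W[1])  # [-0.5  0.5 -0.5  0.5]  -> and against the wrongly guessed class
```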

Results:

After training with several hundred examples, the network’s weights converge to templates resembling each shape, effectively learning shape recognition.

Overview of Supervised, Reinforcement, and Unsupervised Learning

Challenges of Simple Neural Networks for Handwritten Digit Recognition:

Geoffrey Hinton highlights the limitations of basic neural networks in recognizing handwritten digits. Capturing the allowable variations of a digit requires extracting features and analyzing how they are arranged, which these simple networks cannot do; the presented examples make clear that whole-shape templates are inadequate for the task.

Three Types of Machine Learning:

Hinton outlines three main groups of machine learning algorithms: supervised, reinforcement, and unsupervised learning. Supervised learning predicts outputs from input vectors, aiming to minimize target and actual output discrepancies. Reinforcement learning selects actions to maximize rewards, facing challenges from delayed and sparse rewards. Unsupervised learning discovers efficient internal representations of input data.

Supervised Learning:

Supervised learning is categorized into regression, predicting real numbers or vectors, and classification, predicting categories from alternatives. The goal is to adjust model parameters to fit the training data, commonly using the squared difference to minimize output discrepancies.
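As an illustration of minimizing the squared difference in a regression setting, here is a sketch of gradient descent on a one-parameter model y = w * x; the data points and learning rate are made up for the example:

```python
def squared_error(target, output):
    """The discrepancy measure from the lecture: (target - output)^2."""
    return (target - output) ** 2

# Fit the slope w of y = w * x by gradient descent on the total squared error.
data = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9)]   # (input, target) pairs
w, lr = 0.0, 0.05
for _ in range(200):
    # d/dw sum (t - w*x)^2 = sum -2 * x * (t - w*x)
    grad = sum(-2 * x * (t - w * x) for x, t in data)
    w -= lr * grad
print(round(w, 2))  # 1.99, the least-squares slope for this data
```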

Reinforcement Learning:

This learning type involves selecting actions based on rewards, prioritizing immediate rewards via discount factors. It is challenging due to delayed rewards and the difficulty of learning from such sparse information.

Unsupervised Learning:

Unsupervised learning’s goal is to discover efficient representations of inputs, moving from high-dimensional representations like images to more manageable forms. This includes finding manifolds in high-dimensional spaces and representing inputs in terms of learned features for economical representation.

Definition of Unsupervised Learning:

Traditionally seen as clustering, unsupervised learning aims to create internal representations useful for subsequent learning, going beyond mere clustering.

Two-Stage Learning:

This approach separates unsupervised learning from subsequent supervised or reinforcement learning, so that the many parameters of, say, a visual system need not be set by rewards and punishments; for example, the distance to a surface can be learned without relying on constant external feedback.

Goals of Unsupervised Learning:

The objective is to compress high-dimensional inputs into lower-dimensional representations. This involves finding manifolds in high-dimensional spaces and using features like binary or sparse real-valued features for representation.
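One familiar way to realize this goal, used here purely as an example (the lecture does not prescribe a method), is principal component analysis via the SVD, which finds a linear manifold and re-codes each point by its coordinate along it:

```python
import numpy as np

# 3-D points lying near a 1-D line, plus a little noise: high-dimensional
# data with a low-dimensional manifold hiding inside it.
rng = np.random.default_rng(0)
t = rng.normal(size=100)
X = np.outer(t, [1.0, 2.0, -1.0]) + 0.01 * rng.normal(size=(100, 3))

Xc = X - X.mean(axis=0)                  # center the data
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
codes = Xc @ Vt[0]                       # 1 number per point instead of 3
print(codes.shape)                       # (100,): a far more economical code
```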



Machine learning’s evolution from simple pattern recognition to brain-like computation underscores its vast potential in various applications. The field continues to grow, delving deeper into the complexities and capabilities of both artificial and biological computational models.


Notes by: datagram