On the challenge of learning complex functions

Prog Brain Res. 2007;165:521-34. doi: 10.1016/S0079-6123(06)65033-4.

Abstract

A common goal of computational neuroscience and of artificial intelligence research based on statistical learning algorithms is the discovery and understanding of computational principles that could explain what we consider adaptive intelligence, in animals as well as in machines. This chapter focuses on what is required for the learning of complex behaviors. We believe it involves the learning of highly varying functions, in a mathematical sense. We bring forward two types of arguments which convey the message that many currently popular machine learning approaches to learning flexible functions have fundamental limitations that render them inappropriate for learning highly varying functions. The first issue concerns the representation of such functions with what we call shallow model architectures. We discuss limitations of shallow architectures, such as so-called kernel machines, boosting algorithms, and one-hidden-layer artificial neural networks. The second issue is more focused and concerns kernel machines with a local kernel (the type used most often in practice) that act like a collection of template-matching units. We present mathematical results on such computational architectures showing that they have limitations similar to those already proved for older non-parametric methods, connected to the so-called curse of dimensionality. Though it has long been believed that efficient learning in deep architectures is difficult, recently proposed computational principles for such learning may offer a breakthrough.
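The following is a minimal sketch, not taken from the chapter, of the "local kernel machine as a collection of template matchers" idea. It assumes a Nadaraya-Watson estimator with a Gaussian kernel as the local learner and d-bit parity as a stand-in for a highly varying function; the function names and parameters are illustrative choices, not the chapter's own code.

```python
# Illustrative sketch: a kernel machine with a local (Gaussian) kernel predicts by a
# similarity-weighted vote over stored training examples, i.e. template matching.
# The target is d-bit parity, a proxy for a "highly varying" function: its value
# flips between every pair of neighboring inputs (Hamming distance 1).
import itertools
import numpy as np

def gaussian_kernel(a, b, sigma=0.5):
    """Local similarity: near 1 for nearby points, near 0 for distant ones."""
    return np.exp(-np.sum((a - b) ** 2) / (2 * sigma ** 2))

def kernel_regression_predict(x, X_train, y_train, sigma=0.5):
    """Nadaraya-Watson estimate: a similarity-weighted vote over templates."""
    weights = np.array([gaussian_kernel(x, xi, sigma) for xi in X_train])
    if weights.sum() == 0:
        return 0.0
    return float(weights @ y_train / weights.sum())

def parity(bits):
    return bits.sum() % 2  # highly varying: flips with every single-bit change

d = 8
all_inputs = np.array(list(itertools.product([0.0, 1.0], repeat=d)))
all_targets = np.array([parity(x) for x in all_inputs])

# Train on a random subset of the 2^d configurations, test on the rest. For an
# unseen configuration, the closest stored templates (its Hamming-1 neighbors)
# all carry the opposite parity, so the local vote tends to predict wrongly:
# held-out accuracy sits at or below chance even as the training set grows.
rng = np.random.default_rng(0)
for n_train in (32, 128, 224):
    idx = rng.choice(len(all_inputs), size=n_train, replace=False)
    test = np.setdiff1d(np.arange(len(all_inputs)), idx)
    preds = [kernel_regression_predict(x, all_inputs[idx], all_targets[idx])
             for x in all_inputs[test]]
    acc = np.mean((np.array(preds) > 0.5) == all_targets[test])
    print(f"train templates: {n_train:3d}  held-out accuracy: {acc:.2f}")
```

Under these assumptions, the number of templates needed to track every variation of the target grows with the number of variations (here, on the order of 2^d), which is one way to read the curse-of-dimensionality argument against purely local learners; the chapter's own results are the mathematical treatment of this limitation.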

Publication types

  • Research Support, Non-U.S. Gov't
  • Review

MeSH terms

  • Artificial Intelligence*
  • Humans
  • Learning / physiology*
  • Models, Psychological*
  • Pattern Recognition, Automated