What is Perceptual Learning?

 

Perceptual learning (PL) refers to the ability to improve performance in a perceptual task with practice (Gibson, 1953). Training in a perceptual task enables one to become more efficient at distinguishing subtle differences in sensory input after repeated exposure. To perform more efficiently, an observer needs to detect or discriminate the difference between the most relevant and the least relevant information. In other words, an observer learns to distinguish signal from noise with greater facility. For example, a radiologist can easily detect anomalies in an X-ray whereas a novice cannot. Learned perceptual skills can also be found in other sensory modalities such as taste, touch and hearing (Gibson, 1953). Blind readers refine their sense of touch to learn Braille, and an oenologist refines the senses of taste and smell to sample the finest wines. The important element is that learning to perceive subtle differences in stimuli reflects a trainable ability to amplify the most relevant information and suppress the least informative. Moreover, it reflects experience-dependent modifications that persist for extended periods of time, from months to years (Ball & Sekuler, 1987; Fiorentini & Berardi, 1981; Karni & Sagi, 1993).

            In its simplest form, perceptual learning is the learned ability to perceive differences in visual stimuli with greater accuracy after extended practice. The field of perceptual learning, in vision in particular, concerns itself with the conditions under which improvement is observed, its underlying mechanisms, and its generalizability. For perceptual learning to be ultimately valuable, it must generalize to similar tasks and attributes. Generalizability, or transfer, is defined as the ability to apply what has been learned effectively to a new situation (e.g. a new perceptual task). Conversely, specificity is the extent to which performance in a testing phase is independent of the improvements gained in training. The issue of generalizability or transfer is still an open question in the field. In practice, evidence of transfer can be minimal (see Fahle & Poggio, 2002 for review). The observation of little or no transfer has generated important theoretical claims based on cortical properties of the visual system (e.g. Fiorentini & Berardi, 1980). For the most part, the literature has been dominated by the localization of perceptual learning in early visual cortex based on observations of specificity (Gilbert, Sigman, & Crist, 2001; although see Dosher & Lu, 1998, 1999; Mollon & Danilova, 1996).

            The focus of my research is to understand the factors that influence the observation of specificity (or transfer). Characterizing the conditions under which an observer can optimize performance in training and further generalize to similar tasks with different dimensions is an important step towards understanding the underlying mechanisms involved in PL. To do so, I employ controlled experimental learning conditions through manipulations in training and manipulations of stimulus attributes and/or dimensions. I then ask how the experimental manipulations contribute to specificity (or the lack thereof) in similar tasks under different conditions.

Characteristics of Perceptual Learning

            Experience-dependent changes in sensory cortex due to practice are functionally characterized as perceptual ‘plasticity’. Not too long ago, it was believed that the ability of early sensory cortex to change its functional response to environmental stimuli was limited to early childhood development. The classical study by Hubel and Wiesel (1959) revealed the capacity of the visual system to recover from limited visual input in early stages of development. This remarkable ability was restricted to a “critical period”, after which the visual system was unable to recover. This suggested that subsequent cortical changes due to environmental manipulations became less likely as an observer reached adulthood (see Gilbert et al., 2001 for review). This viewpoint changed when it was reported that practice in a perceptual task could substantially improve performance in adults, indicating experience-dependent plasticity (McKee & Westheimer, 1978; Ramachandran & Braddick, 1973). The focus in perceptual learning then shifted towards where in the visual system this learning might occur (Fiorentini & Berardi, 1980). This will be discussed further in the section on Specificity below.

            To further understand and characterize PL, researchers have used several experimental methods such as psychophysics, single-cell recording, and fMRI. Empirically, we use quantifiable, low-level stimuli to characterize PL through a variety of perceptual tasks. With practice, observers can resolve spatial offsets between visual stimuli that are a fraction of a photoreceptor’s diameter, an ability known as hyperacuity (Beard, Levi, & Reich, 1995; Fahle & Edelman, 1993; McKee & Westheimer, 1978; Poggio, Fahle, & Edelman, 1992; Westheimer, 1975). Observers can also learn to discriminate the direction of motion (Ball & Sekuler, 1982, 1987; Liu & Weinshall, 2000; Lu, Chu, Dosher, & Lee, 2005; Matthews & Welch, 1997), and to judge the orientation of simple line stimuli (Matthews, Liu, Geesaman, & Qian, 1999; Shiu & Pashler, 1992; Vogels & Orban, 1985) or simple spatial gratings (Dosher & Lu, 1998, 1999; Petrov, Dosher, & Lu, 2006; Rentschler, Juttner, & Caelli, 1994). Tasks such as visual search (Ahissar & Hochstein, 1997; Sigman & Gilbert, 2000), texture discrimination (Karni & Sagi, 1991, 1993), depth perception in random-dot stereograms (Ramachandran & Braddick, 1973) and spatial frequency discrimination (Fine & Jacobs, 2000; Fiorentini & Berardi, 1981) also often show robust learning effects after a period of training.

            Perceptual tasks vary in complexity, from the discrimination of simple line gratings (Shiu & Pashler, 1992) to the recognition of 3-D objects (Tanaka, Saito, Fukada, & Moriya, 1991). These tasks tend to exhibit a dynamic range of learning depending on several methodological factors (Fine & Jacobs, 2002). Physical manipulations of the stimuli or psychophysical procedures can influence the extent of the behavioral improvement observed. For example, observers are more sensitive to simple line stimuli at cardinal (principal) orientations than at oblique orientations (Appelle, 1972; Ball & Sekuler, 1982; Mayer, 1983). Performance for oblique stimuli generally starts out worse than that for cardinal orientations, but improves significantly with practice and eventually reaches the same level (Mayer, 1983; Vogels & Orban, 1985). Performance diminishes when the stimulus is presented in the periphery (at greater eccentricity) relative to foveal presentation (Crist, Kapadia, Westheimer, & Gilbert, 1997). Perceptual learning does not require feedback, but without it learning progresses at a slower pace (Fahle, 2004; Fahle & Edelman, 1993; Tsodyks & Gilbert, 2004). Better performance in terms of lower thresholds, higher percent correct, or faster learning rates has been observed for tasks that require “coarse” relative to “finer” angle discriminations, that is, for low-precision relative to high-precision tasks (Ahissar & Hochstein, 1997; Liu & Weinshall, 2000).

            Furthermore, presenting stimuli with and without external noise (masks) has revealed underlying mechanisms that further characterize PL. Varying the external noise makes it possible to characterize the internal noise that arises from processing inefficiencies in the visual system (Dosher & Lu, 1998, 1999). Dosher and Lu (1998) identified two key learning mechanisms that serve to 1) enhance the signal of relevant information and 2) exclude visual noise, elucidating dynamic patterns of learning. Observers demonstrate significant learning in clear displays as well as in noisy displays. However, the magnitude or level of performance may differ. More importantly, presenting stimuli with and without noise may reveal critical patterns of learning in transfer.
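The logic of comparing clear and noisy displays can be sketched numerically. The example below is a minimal illustration under stated assumptions, not the model of Dosher and Lu: it uses a simplified linear-amplifier form in which threshold contrast grows with the combined external and internal noise. The function name `contrast_threshold` and the parameters `n_int`, `a_f`, `a_a`, `d_prime` and `beta` are hypothetical and chosen only for illustration. Signal enhancement is treated as equivalent to scaling down internal noise, and noise exclusion as scaling down external noise.

```python
import math

def contrast_threshold(n_ext, n_int=0.1, a_f=1.0, a_a=1.0, d_prime=1.0, beta=1.0):
    """Signal contrast needed to reach d_prime (simplified, assumed form).

    a_f < 1 scales down external noise (noise exclusion);
    a_a < 1 scales down internal noise (treated here as signal enhancement).
    """
    total_noise = math.sqrt((a_f * n_ext) ** 2 + (a_a * n_int) ** 2)
    return d_prime * total_noise / beta

# Compare the two hypothetical mechanisms in a clear and a noisy display.
for n_ext in (0.0, 1.0):
    base = contrast_threshold(n_ext)
    enhanced = contrast_threshold(n_ext, a_a=0.5)
    excluded = contrast_threshold(n_ext, a_f=0.5)
    print(f"n_ext={n_ext}: baseline={base:.3f}, "
          f"enhancement={enhanced:.3f}, exclusion={excluded:.3f}")
```

Under these assumptions, enhancement lowers thresholds mainly in clear displays, while exclusion lowers them mainly in noisy displays, which is why testing at both noise levels can separate the two mechanisms.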

Specificity of Learned Attributes  

The lack of transfer of learned attributes, or specificity, has emerged as a key property of perceptual learning, which has led researchers to further explore the underlying mechanisms of learned plasticity. Specificity in a behavioral task reflects the extent to which performance in a transfer task is independent of performance in a training task. Figure 1 shows generic learning curves in which performance improves (e.g. thresholds decrease) as a function of time. Most perceptual learning studies follow a standard paradigm in which observers first train on a single attribute or task until performance reaches asymptotic levels, and are then tested on a separate task in the transfer stage. Specificity (S) occurs when performance returns to baseline in the testing or transfer phase (Figure 1, upper panel). If performance in the transfer stage picks up where it left off in the training stage, it is considered evidence of transfer (T) (Figure 1, lower panel). Often, reports in the literature show partial transfer, or partial specificity (Figure 1, middle).

 


Figure 1 Patterns of Specificity and Transfer

Generic learning curves in which performance improves (e.g. thresholds decrease) are plotted as a function of time in a training and a transfer phase. Specificity (S) occurs when performance returns to baseline in the testing or transfer phase (upper panel). If performance in the transfer stage picks up where it left off in the training stage, it is considered evidence of transfer (T) (lower panel). Often, reports in the literature indicate partial transfer, or partial specificity (middle). (Adapted from Dill, M. in Fahle & Poggio, 2002).
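The three patterns in Figure 1 can be summarized with a single number. The sketch below is one simple way to do so and is an assumption for illustration, not a measure taken from the studies cited here: it expresses the threshold at the start of the transfer phase as a fraction of the gain achieved in training. The function name `specificity_index` and the threshold values are hypothetical.

```python
def specificity_index(train_start, train_end, transfer_start):
    """Fraction of the training gain lost when the task or stimulus changes.

    Thresholds are in arbitrary units, lower is better.
    1.0 means full specificity (performance returns to baseline),
    0.0 means full transfer (performance picks up where training ended).
    """
    gain = train_start - train_end
    if gain <= 0:
        raise ValueError("no improvement during training; index is undefined")
    return (transfer_start - train_end) / gain

# Hypothetical thresholds for the three panels of Figure 1.
patterns = {
    "specificity (S)": specificity_index(10.0, 4.0, 10.0),
    "partial": specificity_index(10.0, 4.0, 7.0),
    "transfer (T)": specificity_index(10.0, 4.0, 4.0),
}
for name, value in patterns.items():
    print(f"{name}: {value:.2f}")
```

Values between 0 and 1 correspond to the partial transfer that is often reported in the literature.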

 

Groundbreaking work by Hubel and Wiesel (1959, 1962) demonstrating the orientation and positional selectivity of neurons in visual cortex has been highly influential in the field of perceptual learning. Specificity for low-level stimulus attributes such as retinal position, orientation and spatial frequency has been widely observed for many perceptual tasks (Fiorentini & Berardi, 1981; Schoups, Vogels, & Orban, 1995; Shiu & Pashler, 1992). Evidence of specificity after training on a stimulus attribute or task led many to conclude that an orientation- or position-specific result implies that improvement is localized to selective sites in early visual cortex (see Fahle & Poggio, 2000 for review; Fiorentini & Berardi, 1980; Karni & Sagi, 1991). This view argues, for example, for the recruitment of neurons with relatively small receptive fields located in early areas of visual cortex (e.g. V1) that respond selectively to simple oriented stimuli at specific locations in the visual field (see Gilbert et al., 2001 for review). This interpretation maintains that learning can be attributed to changes or modifications in sensory areas that are selective for these attributes early in the visual hierarchy (Ahissar & Hochstein, 2004; Karni & Sagi, 1991). For example, if learning occurs at a low level in the visual hierarchy, it will be specific to orientation and location. If learning occurs at higher levels, where broader encoding takes place, then learning generalizes (see Figure 2). In their Reverse Hierarchy Theory, Ahissar and Hochstein (1997) further contend that learning progresses in reverse order along this hierarchy. Learning begins with ‘high-level’ representations of the stimuli and later becomes more specific as the stimuli demand more detailed monitoring (Ahissar & Hochstein, 1997, 2004). Their theory therefore contends that training at higher levels tends to generalize and training at lower levels tends to be more specific.

 


Figure 2 Specificity vs. Transfer in the Visual Hierarchy

If training occurs at a low level in the visual hierarchy, learning will be specific to orientation and location. If learning occurs at higher levels, where broader encoding takes place, then learning generalizes. According to Ahissar & Hochstein, training at higher levels leads to generalization whereas training at lower levels leads to specificity. (Adapted from Ahissar & Hochstein, 2004).

 

            Dosher and Lu (1998, 1999; see also Petrov et al., 2006 and Mollon & Danilova, 1996) advocate an alternative hypothesis: what is learned may drive specificity more than where it is learned. The evidence of specificity often falls short of perfect independence, which in most cases indicates a degree of partial transfer (e.g. Liu & Weinshall, 2000). This alternative interpretation attributes perceptual learning to changes in the read-out connections (or weighting structures) from representations in early visual cortex to decision units, and is known as the “re-weighting hypothesis” (Dosher & Lu, 1998, 1999; Petrov et al., 2006). Here, learning involves the ongoing selection of the orientation and spatial frequency channels that carry the most relevant information for the task at hand (see Figure 3). Every channel that is selective for orientation and spatial frequency at some location in the visual field may be connected to a decision unit. This view allows that learning may be processed in different cortical areas depending on the complexity of the task and/or stimuli, thus activating “weights” associated with the stimulus that are connected to a shared decision unit. As learning continues, only the most relevant “weights” remain active. When an observer is presented with a transfer task, the system must recalibrate the most relevant weights to accommodate the change in stimulus or task attribute. Under this view, specificity for a given attribute arises from the learned weights: it reflects the weighting of inputs from early representations that are specific to orientation, spatial frequency, or location, and learning need not imply changes in those representations.
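The re-weighting idea can be illustrated with a toy simulation. The sketch below is a minimal illustration under stated assumptions and is not the model of Dosher and Lu or of Petrov et al.: it uses a bank of orientation-tuned channels whose tuning never changes, a single decision unit, and a simple delta-rule update of the read-out weights, which stands in for whatever learning rule the brain uses. All names and parameter values (`PREFERRED`, `TUNING_SD`, `run_block`, the 5 degree offset, the learning rate) are hypothetical.

```python
import math
import random

PREFERRED = [i * 15.0 for i in range(12)]  # assumed preferred orientations (deg)
TUNING_SD = 10.0                           # assumed tuning width (deg)

def circ_diff(a, b):
    """Smallest difference between two orientations (period 180 deg)."""
    d = abs(a - b) % 180.0
    return min(d, 180.0 - d)

def channel_responses(theta, rng, noise_sd=0.1):
    """Fixed early representation: tuned responses plus internal noise."""
    return [math.exp(-circ_diff(theta, p) ** 2 / (2 * TUNING_SD ** 2))
            + rng.gauss(0.0, noise_sd) for p in PREFERRED]

def run_block(weights, reference, rng, trials, offset=5.0, lr=0.0):
    """Judge whether each stimulus is above (+1) or below (-1) the reference.

    Returns proportion correct. Weights are updated in place when lr > 0;
    the channel tuning itself is never modified.
    """
    correct = 0
    for _ in range(trials):
        label = rng.choice([-1.0, 1.0])
        r = channel_responses(reference + label * offset, rng)
        out = math.tanh(sum(w * ri for w, ri in zip(weights, r)))
        correct += (out > 0) == (label > 0)
        for i, ri in enumerate(r):
            weights[i] += lr * (label - out) * ri  # delta-rule re-weighting
    return correct / trials

rng = random.Random(0)
weights = [0.0] * len(PREFERRED)
run_block(weights, 45.0, rng, 2000, lr=0.05)      # train around 45 deg
after = run_block(weights, 45.0, rng, 500)        # test, trained orientation
transfer = run_block(weights, 135.0, rng, 500)    # test, new orientation
run_block(weights, 135.0, rng, 2000, lr=0.05)     # retrain around 135 deg
retrained = run_block(weights, 135.0, rng, 500)
print(after, transfer, retrained)
```

In this sketch the learned weights sit on channels tuned near the trained orientation, so they carry little useful information about the new orientation until the weights are recalibrated. Specificity emerges even though the early representation is identical before and after training.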

 

 

Figure 3 Schematic of channels in early visual system

Separate visual channels that are selectively tuned to spatial frequencies and orientations process a Gabor patch embedded in visual noise. The Gabor patch denotes a representational unit that is connected to a decision unit via visual channels.  Prior to learning, many channels or weights are active.  After training, only the most relevant and informative weights are active, while the least informative are reduced.  (Adapted from Dosher & Lu, 1998)