Perception, according to the ninth definition in the Oxford English Dictionary (Simpson and Weiner, 1989), refers to ``the neurophysiological processes, including memory, by which an organism becomes aware of and interprets external stimuli or sensations.'' For Tolman (1932), perception is any expectation of an external object or situation ``when this expectation results primarily from present stimuli'' (p. 452).
It is still unclear how neural processes lead to an interpretation of external stimuli, and how `seeing' comes about. We do know, at least, that most of seeing has to be learned (see Gregory (1998) for a review). For example, humans who grow up blind and gain vision as adults have difficulty making sense of their visual experience.
To learn to interpret visual experience, active movement is important. Kittens that are only passively moved while observing their environment show impaired visually guided movements (Held and Hein, 1963). In humans, the influence of action on perceptual learning can be seen in visual adaptation studies, visual recognition studies, and fMRI experiments (section 1.4.2). This link between action and perception is also plausible from an evolutionary perspective. Our survival depends on our interaction with the world; therefore, the capacity to see things that are irrelevant to our behavior is likely to disappear through natural selection.
In this thesis, visual perception is explored from the perspective of sensorimotor models, that is, models providing a mapping from sensory data to motor commands, or inversely from motor commands to sensory data. According to this concept, for example, the shape of an object is understood (`perceived') by associating appropriate grasping postures with it. Such a sensorimotor approach is an alternative to pure sensory image analysis, which extracts labels such as `triangular' or `elongated'. In this thesis, robots are used to test this sensorimotor approach to perception.
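The idea of a sensorimotor mapping can be illustrated with a minimal sketch (the two-joint arm geometry and all function names here are hypothetical illustrations, not the models used in this thesis): a forward model predicts the sensory effect of a motor command, and an inverse mapping is obtained by nearest-neighbor lookup over stored sensorimotor experience.

```python
import math

# Hypothetical forward model: a planar two-joint arm with unit links;
# the "sensory effect" of a motor command (two joint angles)
# is the resulting hand position (x, y).
def forward_model(angles):
    a1, a2 = angles
    x = math.cos(a1) + math.cos(a1 + a2)
    y = math.sin(a1) + math.sin(a1 + a2)
    return (x, y)

# Collect sensorimotor pairs (command, effect) over a grid of commands.
experience = []
for i in range(20):
    for j in range(20):
        angles = (i * math.pi / 20, j * math.pi / 20)
        experience.append((angles, forward_model(angles)))

# Inverse direction: given a desired sensory state, pick the motor
# command whose remembered effect is closest (nearest neighbor).
def inverse_model(target):
    return min(experience,
               key=lambda pair: (pair[1][0] - target[0]) ** 2 +
                                (pair[1][1] - target[1]) ** 2)[0]

angles = inverse_model((1.0, 1.0))
print(forward_model(angles))  # close to (1.0, 1.0)
```

The same lookup structure carries over to the grasping example: a shape would be `perceived' by retrieving the grasping postures associated with similar sensory data.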
To acquire sensorimotor models, infants spend years learning the effects of their actions. They can do this without a teacher, by initially moving their limbs (seemingly) at random. The robots used in this work likewise collect training data by exploring the sensory effects of random motor commands. Then, machine learning techniques make sense of the sensorimotor data by finding simplified representations. Thus, relations within the data are learned in an `unsupervised' way. This thesis tackles problems that arise in the learning of sensorimotor models, for example, ambiguities and generalization.
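The exploration scheme described above can be sketched in a few lines (a toy assumption, not the thesis's setup: a single motor whose sensory reading is an unknown linear function of the command, plus noise). The robot babbles random commands, records the effects, and extracts the relation hidden in its own data without any external teacher.

```python
import random

random.seed(0)

# Hypothetical plant: the sensory reading is an unknown (to the
# learner) linear function of the motor command, plus sensor noise.
def sense(command):
    return 2.0 * command + 0.5 + random.gauss(0.0, 0.05)

# Motor babbling: issue random commands, record the sensory effects.
data = [(c := random.uniform(-1.0, 1.0), sense(c)) for _ in range(500)]

# Find a simplified representation of the collected data: here, a
# least-squares line fit; no teacher supplies labels, the robot's
# own experience is the training set.
n = len(data)
mc = sum(c for c, s in data) / n          # mean command
ms = sum(s for c, s in data) / n          # mean sensation
slope = (sum((c - mc) * (s - ms) for c, s in data)
         / sum((c - mc) ** 2 for c, s in data))
intercept = ms - slope * mc
# slope and intercept recover roughly 2.0 and 0.5
```

The fitted line is a compressed stand-in for the raw sensorimotor pairs; the ambiguities and generalization problems mentioned above arise when such relations are nonlinear, many-to-one, or only partially explored.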