The development of machines capable of natural linguistic interaction with humans has been an active and diverse area of research for decades. More recent frameworks, such as Cognitive Robotics, have made progress on many long-standing problems in the computational modeling of language acquisition -- such as symbol grounding -- through the application of the principles of embodied cognition. Many of these systems have focused on modeling grounded word learning through statistical mappings between sensor modalities, such as speech-to-vision or speech-to-motor control. However, such systems collectively capture only a tiny fraction of the developmental robustness and representational diversity observed in even the youngest human word-learners. Children can learn words in situations of extreme ambiguity, leveraging a variety of contextual knowledge to infer the targets of adults' references. And unlike children, few cognitive robotics systems have any understanding of the purpose of words beyond reference. The core premise of this thesis is that this gap is due, in part, to computational models that ignore the communicative and intentional (i.e., pragmatic) aspects of language. To address these issues, a computational framework for learning perceptually grounded word meanings is presented. Our model is based on a representation of language as a useful behavior, embedded within an intentionally structured social interaction. Using techniques for inverse planning and control, the algorithms we have developed seek to understand the goal or purpose driving the behaviors in the interaction. We describe the application of these techniques to a set of human-robot interaction experiments, modeled after developmental studies demonstrating specific skills children use to learn word meanings under referential ambiguity.
Through these experiments, we show how our framework allows the robotic agent to acquire knowledge about the physical and social task structure underlying the interaction, and to leverage this knowledge to learn word meanings in many different cases of ambiguity. These include novel situations in which the robot must make inferences based on the goal-directed actions of the speaker, or even on knowledge of its own embodiment and potential role in the interaction. Finally, we show how our robotic platform can be made to realize this role, actively taking part in its own learning experience and beginning to see language as something useful.
Robots as language users: a computational model for pragmatic word learning