Dataset Information

Humanoids Learning to Walk: A Natural CPG-Actor-Critic Architecture.


ABSTRACT: The identification of learning mechanisms for locomotion has long been a subject of research, but many challenges remain. Dynamic systems theory (DST) offers a novel approach to humanoid learning through environmental interaction. Reinforcement learning (RL) offers a promising method for adaptively linking the dynamic system to the environment it interacts with via a reward-based value system. In this paper, we propose a model that integrates these perspectives and apply it to the case of a humanoid (NAO) robot learning to walk, an ability that emerges from its value-based interaction with the environment. In the model, a simplified central pattern generator (CPG) architecture inspired by neuroscientific research and DST is integrated with an actor-critic approach to RL (cpg-actor-critic). In the cpg-actor-critic architecture, least-squares temporal-difference (LSTD) based learning converges to the optimal solution quickly by using natural gradient learning and balancing exploration and exploitation. Furthermore, rather than using a traditional (designer-specified) reward, it uses a dynamic value function as a stability indicator that adapts to the environment. The results obtained are analyzed using a novel DST-based embodied cognition approach. Learning to walk, from this perspective, is a process of integrating levels of sensorimotor activity and value.
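
The abstract names least-squares temporal-difference learning as the basis of the critic. As a rough illustration of that one ingredient only, the sketch below runs plain LSTD(0) on a hypothetical two-state chain with one-hot features. The chain, the rewards, the discount factor, and the helper names (`solve_linear`, `lstd`, `one_hot`) are all assumptions made for this example; it does not include the CPG, the actor, or the natural gradient step, and it should not be read as the authors' implementation.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    # Build the augmented matrix [a | b] on copies so the inputs stay untouched.
    m = [list(row) + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular matrix")
        m[col], m[pivot] = m[pivot], m[col]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def lstd(transitions, features, gamma):
    """LSTD(0): solve A w = b, where
    A = sum of phi(s) (phi(s) - gamma * phi(s'))^T and b = sum of phi(s) * r."""
    k = len(features(transitions[0][0]))
    a = [[0.0] * k for _ in range(k)]
    b = [0.0] * k
    for s, r, s_next in transitions:
        phi, phi_next = features(s), features(s_next)
        for i in range(k):
            b[i] += phi[i] * r
            for j in range(k):
                a[i][j] += phi[i] * (phi[j] - gamma * phi_next[j])
    return solve_linear(a, b)


def one_hot(state, n_states=2):
    """Tabular (one-hot) feature vector for a discrete state."""
    return [1.0 if i == state else 0.0 for i in range(n_states)]


# Hypothetical toy chain: state 0 -> 1 with reward 1, state 1 -> 0 with reward 0.
# Each sample is (state, reward, next_state).
samples = [(0, 1.0, 1), (1, 0.0, 0)]
weights = lstd(samples, one_hot, gamma=0.5)
print(weights)
```

With one-hot features the learned weights are the state values themselves, so they can be checked against the Bellman equations for this chain, V(0) = 1 + 0.5 V(1) and V(1) = 0.5 V(0), which give V(0) = 4/3 and V(1) = 2/3.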

SUBMITTER: Li C 

PROVIDER: S-EPMC3619089 | biostudies-literature | 2013

REPOSITORIES: biostudies-literature

Publications

Humanoids Learning to Walk: A Natural CPG-Actor-Critic Architecture.

Cai Li, Robert Lowe, Tom Ziemke

Frontiers in Neurorobotics, 2013-04-08



Similar Datasets

| S-EPMC4183130 | biostudies-literature
| S-EPMC9446087 | biostudies-literature
| S-EPMC7523734 | biostudies-literature
| S-EPMC7264096 | biostudies-literature
| S-EPMC4033619 | biostudies-literature
| S-EPMC11762915 | biostudies-literature
| S-EPMC9466407 | biostudies-literature
| S-EPMC11888913 | biostudies-literature
| S-EPMC9868235 | biostudies-literature
| S-EPMC11232580 | biostudies-literature