
Dataset Information


Reward-based option competition in human dorsal stream and transition from stochastic exploration to exploitation in continuous space.


ABSTRACT: Primates exploring and exploiting a continuous sensorimotor space rely on dynamic maps in the dorsal stream. Two complementary perspectives exist on how these maps encode rewards. Reinforcement learning models integrate rewards incrementally over time, efficiently resolving the exploration/exploitation dilemma. Working memory buffer models explain rapid plasticity of parietal maps but lack a plausible exploration/exploitation policy. The reinforcement learning model presented here unifies both accounts, enabling rapid, information-compressing map updates and efficient transition from exploration to exploitation. As predicted by our model, activity in human frontoparietal dorsal stream regions, but not in MT+, tracks the number of competing options, as preferred options are selectively maintained on the map, while spatiotemporally distant alternatives are compressed out. When valuable new options are uncovered, posterior β1/α oscillations desynchronize within 0.4 to 0.7 s, consistent with option encoding by competing β1-stabilized subpopulations. Together, outcomes matching locally cached reward representations rapidly update parietal maps, biasing choices toward often-sampled, rewarded options.
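The incremental reward integration contrasted with working-memory buffering in the abstract can be illustrated with a minimal delta-rule learner whose softmax policy anneals from stochastic exploration toward exploitation. This is a generic sketch under assumed settings (a discretized option space, hidden Gaussian-noise payoffs, an exponential temperature schedule), not the authors' actual task or model.

```python
import math
import random


def softmax(values, temperature):
    """Map values to choice probabilities; higher temperature -> more exploration."""
    exps = [math.exp(v / temperature) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def run(n_trials=500, n_options=10, alpha=0.1, seed=0):
    """Delta-rule reward integration over a discretized continuous space.

    The option count, payoff distribution, and annealing schedule are
    illustrative assumptions for this sketch only.
    """
    rng = random.Random(seed)
    true_reward = [rng.random() for _ in range(n_options)]  # hidden payoffs
    value = [0.0] * n_options                               # learned value map
    for t in range(n_trials):
        # Temperature decays each trial: early choices are near-random
        # (exploration); late choices concentrate on high-value options.
        temperature = max(0.05, 0.99 ** t)
        probs = softmax(value, temperature)
        choice = rng.choices(range(n_options), weights=probs)[0]
        reward = true_reward[choice] + rng.gauss(0, 0.1)    # noisy outcome
        value[choice] += alpha * (reward - value[choice])   # delta-rule update
    return value, true_reward
```

Because often-sampled, rewarded options accumulate value while rarely sampled alternatives stay near their initial estimates, repeated runs of this learner reproduce the qualitative pattern described above: choices progressively concentrate on a few preferred options.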

SUBMITTER: Hallquist MN 

PROVIDER: S-EPMC10889364 | biostudies-literature | 2024 Feb

REPOSITORIES: biostudies-literature


Publications

Reward-based option competition in human dorsal stream and transition from stochastic exploration to exploitation in continuous space.

Michael N. Hallquist, Kai Hwang, Beatriz Luna, Alexandre Y. Dombrovski

Science Advances, 2024-02-23 (8)


Similar Datasets

| S-EPMC5717252 | biostudies-literature
| S-EPMC5442137 | biostudies-literature
| S-EPMC6685008 | biostudies-literature
| S-EPMC3995913 | biostudies-literature
| S-EPMC5825268 | biostudies-other
| S-EPMC3955163 | biostudies-literature
| S-EPMC11226441 | biostudies-literature
| S-EPMC4443960 | biostudies-literature
| S-EPMC6505439 | biostudies-literature
| S-EPMC10947302 | biostudies-literature