47 Publications
In natural environments, animals must efficiently allocate their choices across multiple concurrently available resources when foraging, a complex decision-making process not fully captured by existing models. To understand how rodents learn to navigate this challenge we developed a novel paradigm in which untrained, water-restricted mice were free to sample from six options rewarded at a range of deterministic intervals and positioned around the walls of a large ( 2m) arena. Mice exhibited rapid learning, matching their choices to integrated reward ratios across six options within the first session. A reinforcement learning model with separate states for staying or leaving an option and a dynamic, global learning rate was able to accurately reproduce mouse learning and decision-making. Fiber photometry recordings revealed that dopamine in the nucleus accumbens core (NAcC), but not dorsomedial striatum (DMS), more closely reflected the global learning rate than local error-based updating. Altogether, our results provide insight into the neural substrate of a learning algorithm that allows mice to rapidly exploit multiple options when foraging in large spatial environments.
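As an illustration of what "matching choices to integrated reward ratios" means, the sketch below computes the choice fractions that strict matching would predict and how far a set of observed choice counts departs from them. It is a generic textbook formulation, not the model used in the paper; the function names and the example counts are hypothetical.

```python
def matching_fractions(rewards):
    """Choice fractions predicted by strict matching: each option's share of
    choices equals its share of the total reward obtained."""
    total = sum(rewards)
    if total <= 0:
        raise ValueError("at least one option must have yielded reward")
    return [r / total for r in rewards]


def matching_deviation(choices, rewards):
    """Largest absolute gap between observed choice fractions and the
    fractions strict matching predicts (0.0 means perfect matching)."""
    n_choices = sum(choices)
    if n_choices <= 0:
        raise ValueError("at least one choice must have been made")
    observed = [c / n_choices for c in choices]
    predicted = matching_fractions(rewards)
    return max(abs(o - p) for o, p in zip(observed, predicted))


# Hypothetical counts for three options (the study used six).
rewards = [1, 1, 2]
choices = [10, 10, 20]
print(matching_fractions(rewards))
print(matching_deviation(choices, rewards))
```

A deviation near zero indicates that the allocation of choices tracks the relative reward obtained from each option.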
The nervous system evolved to enable navigation throughout the environment in the pursuit of resources. Evolutionarily newer structures allowed increasingly complex adaptations but necessarily added redundancy. A dominant view of movement neuroscientists is that there is a one-to-one mapping between brain region and function. However, recent experimental data is hard to reconcile with the most conservative interpretation of this framework, suggesting a degree of functional redundancy during the performance of well-learned, constrained behaviors. This apparent redundancy likely stems from the bidirectional interactions between the various cortical and subcortical structures involved in motor control. We posit that these bidirectional connections enable flexible interactions across structures that change depending upon behavioral demands, such as during acquisition, execution or adaptation of a skill. Observing the system across both multiple actions and behavioral timescales can help isolate the functional contributions of individual structures, leading to an integrated understanding of the neural control of movement.
The interplay between two major forebrain structures - cortex and subcortical striatum - is critical for flexible, goal-directed action. Traditionally, it has been proposed that striatum is critical for selecting what type of action is initiated while the primary motor cortex is involved in the online control of movement execution. Recent data indicates that striatum may also be critical for specifying movement execution. These alternatives have been difficult to reconcile because when comparing very distinct actions, as in the vast majority of work to date, they make essentially indistinguishable predictions. Here, we develop quantitative models to reveal a somewhat paradoxical insight: only comparing neural activity during similar actions makes strongly distinguishing predictions. We thus developed a novel reach-to-pull task in which mice reliably selected between two similar, but distinct reach targets and pull forces. Simultaneous cortical and subcortical recordings were uniquely consistent with a model in which cortex and striatum jointly specify flexible parameters of action during movement execution.
Recent success in training artificial agents and robots derives from a combination of direct learning of behavioural policies and indirect learning through value functions. Policy learning and value learning use distinct algorithms that optimize behavioural performance and reward prediction, respectively. In animals, behavioural learning and the role of mesolimbic dopamine signalling have been extensively evaluated with respect to reward prediction; however, so far there has been little consideration of how direct policy learning might inform our understanding. Here we used a comprehensive dataset of orofacial and body movements to understand how behavioural policies evolved as naive, head-restrained mice learned a trace conditioning paradigm. Individual differences in initial dopaminergic reward responses correlated with the emergence of learned behavioural policy, but not the emergence of putative value encoding for a predictive cue. Likewise, physiologically calibrated manipulations of mesolimbic dopamine produced several effects inconsistent with value learning but predicted by a neural-network-based model that used dopamine signals to set an adaptive rate, not an error signal, for behavioural policy learning. This work provides strong evidence that phasic dopamine activity can regulate direct learning of behavioural policies, expanding the explanatory power of reinforcement learning models for animal learning.
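The distinction between dopamine acting as an adaptive learning rate and dopamine acting as an error signal can be made concrete with a toy policy-gradient update. The sketch below is a minimal REINFORCE-style step for a one-parameter Bernoulli policy, not the neural-network model described in the abstract; `policy_update`, its `mode` argument, and all numeric values are assumptions chosen for illustration.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def policy_update(theta, action, reward, baseline, dopamine,
                  base_rate=0.1, mode="rate"):
    """One REINFORCE-style step for a policy with p(action=1) = sigmoid(theta).

    mode="rate":  the dopamine-like scalar multiplies the learning rate.
    mode="error": the dopamine-like scalar is added to the error term.
    Both variants are illustrative assumptions, not the published model.
    """
    p = sigmoid(theta)
    grad_log_pi = action - p      # d/dtheta of log pi(action | theta)
    error = reward - baseline     # performance error for this trial
    if mode == "rate":
        return theta + base_rate * dopamine * error * grad_log_pi
    if mode == "error":
        return theta + base_rate * (error + dopamine) * grad_log_pi
    raise ValueError("mode must be 'rate' or 'error'")


# With zero performance error, extra dopamine changes the policy only
# when it is treated as an error signal.
print(policy_update(0.0, 1, 1.0, 1.0, dopamine=2.0, mode="rate"))
print(policy_update(0.0, 1, 1.0, 1.0, dopamine=2.0, mode="error"))
```

In this toy setting the two roles diverge most clearly when the outcome is fully expected: a rate signal has nothing to scale, whereas an error signal still drives a parameter change.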
Animals learn trajectories to rewards in both spatial, navigational contexts and relational, non-navigational contexts. Synchronous reactivation of hippocampal activity is thought to be critical for recall and evaluation of trajectories for learning. Do hippocampal representations differentially contribute to experience-dependent learning of trajectories across spatial and relational contexts? In this study, we trained mice to navigate to a hidden target in a physical arena or manipulate a joystick to a virtual target to collect delayed rewards. In a navigational context, calcium imaging in freely moving mice revealed that synchronous CA1 reactivation was retrospective and important for evaluation of prior navigational trajectories. In a non-navigational context, reactivation was prospective and important for initiation of joystick trajectories, even in the same animals trained in both contexts. Adaptation of trajectories to a new target was well-explained by a common learning algorithm in which hippocampal activity makes dissociable contributions to reinforcement learning computations depending upon spatial context.
Recent success in training artificial agents and robots derives from a combination of direct learning of behavioral policies and indirect learning via value functions. Policy learning and value learning employ distinct algorithms that optimize behavioral performance and reward prediction, respectively. In animals, behavioral learning and the role of mesolimbic dopamine signaling have been extensively evaluated with respect to reward prediction; however, to date there has been little consideration of how direct policy learning might inform our understanding. Here we used a comprehensive dataset of orofacial and body movements to understand how behavioral policies evolve as naive, head-restrained mice learned a trace conditioning paradigm. Individual differences in initial dopaminergic reward responses correlated with the emergence of learned behavioral policy, but not the emergence of putative value encoding for a predictive cue. Likewise, physiologically-calibrated manipulations of mesolimbic dopamine produced multiple effects inconsistent with value learning but predicted by a neural network-based model that used dopamine signals to set an adaptive rate, not an error signal, for behavioral policy learning. This work provides strong evidence that phasic dopamine activity can regulate direct learning of behavioral policies, expanding the explanatory power of reinforcement learning models for animal learning.
The interaction of descending neocortical outputs and subcortical premotor circuits is critical for shaping skilled movements. Two broad classes of motor cortical output projection neurons provide input to many subcortical motor areas: pyramidal tract (PT) neurons, which project throughout the neuraxis, and intratelencephalic (IT) neurons, which project within the cortex and subcortical striatum. It is unclear whether these classes are functionally in series or whether each class carries distinct components of descending motor control signals. Here, we combine large-scale neural recordings across all layers of motor cortex with cell type-specific perturbations to study cortically dependent mouse motor behaviors: kinematically variable manipulation of a joystick and a kinematically precise reach-to-grasp. We find that striatum-projecting IT neuron activity preferentially represents amplitude, whereas pons-projecting PT neurons preferentially represent the variable direction of forelimb movements. Thus, separable components of descending motor cortical commands are distributed across motor cortical projection cell classes.
Sensory cues that precede reward acquire predictive (expected value) and incentive (drive reward-seeking action) properties. Mesolimbic dopamine neurons' responses to sensory cues correlate with both expected value and reward-seeking action. This has led to the proposal that phasic dopamine responses may be sufficient to inform value-based decisions, elicit actions, and/or induce motivational states; however, causal tests are incomplete. Here, we show that direct dopamine neuron stimulation at the time of reward, calibrated to both physiological and greater intensities, can be sufficient to induce and maintain reward seeking (reinforcing), although replacement of a cue with stimulation is insufficient to induce reward seeking or act as an informative cue. Stimulation of descending cortical inputs, one synapse upstream, is sufficient both for reinforcement and to act as a cue to future reward. Thus, physiological activation of mesolimbic dopamine neurons can be sufficient for reinforcing properties of reward without being sufficient for the predictive and incentive properties of cues.
Measuring the dynamics of neural processing across time scales requires following the spiking of thousands of individual neurons over milliseconds and months. To address this need, we introduce the Neuropixels 2.0 probe together with newly designed analysis algorithms. The probe has more than 5000 sites and is miniaturized to facilitate chronic implants in small mammals and recording during unrestrained behavior. High-quality recordings over long time scales were reliably obtained in mice and rats in six laboratories. Improved site density and arrangement combined with newly created data processing methods enable automatic post hoc correction for brain movements, allowing recording from the same neurons for more than 2 months. These probes and algorithms enable stable recordings from thousands of sites during free behavior, even in small animals such as mice.
Optogenetic reagents allow for depolarization and hyperpolarization of cells with light. This provides unprecedented spatial and temporal resolution to the control of neuronal activity both in vitro and in vivo. In the intact animal this requires strategies to deliver light deep into the highly scattering tissue of the brain. A general approach that we describe here is to implant optical fibers just above brain regions targeted for light delivery. Because expression of optogenetic proteins is accomplished by techniques with inherent variability (e.g., viral expression levels), this approach also requires strategies to measure and calibrate the effect of stimulation. Here we describe general procedures that allow one to simultaneously stimulate neurons and use photometry with genetically encoded activity indicators to precisely calibrate stimulation.