Section: Ecology
Topic: Ecology, Evolution, Psychological and cognitive sciences

Bayesian reinforcement learning models reveal how great-tailed grackles improve their behavioral flexibility in serial reversal learning experiments

Corresponding author(s): Lukas, Dieter (dieter_lukas@eva.mpg.de)

10.24072/pcjournal.456 - Peer Community Journal, Volume 4 (2024), article no. e88.

Peer reviewed and recommended by PCI

Environments can change suddenly and unpredictably, and animals might benefit from being able to flexibly adapt their behavior by learning new associations. Serial (repeated) reversal learning experiments have long been used to investigate differences in behavioral flexibility among individuals and species. In these experiments, individuals initially learn that a reward is associated with a specific cue before the reward is reversed back and forth between cues, forcing individuals to reverse their learned associations. Cues are reliably associated with a reward, but the association between the reward and the cue frequently changes. Here, we apply and expand newly developed Bayesian reinforcement learning models to gain additional insights into how individuals might dynamically modulate their behavioral flexibility if they experience serial reversals. We derive mathematical predictions that, during serial reversal learning experiments, individuals will gain the most rewards if they 1) increase their *rate of updating associations* between cues and the reward to quickly change to a new option after a reversal, and 2) decrease their *sensitivity* to their learned association to explore the alternative option after a reversal. We reanalyzed reversal learning data from 19 wild-caught great-tailed grackles (Quiscalus mexicanus), eight of whom participated in the serial reversal learning experiment, and found that these predictions were supported. Their estimated association-updating rate was more than twice as high at the end of the serial reversal learning experiment as at the beginning, and their estimated sensitivities to their learned associations declined by about a third. The changes in behavioral flexibility that grackles showed through their experience of the serial reversals also influenced their behavior in a subsequent experiment, where individuals with more extreme rates or sensitivities solved more options on a multi-option puzzle box. Our findings offer new insights into how individuals react to uncertainty and changes in their environment, in particular showing how they can modulate their behavioral flexibility in response to their past experiences.
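The two quantities named in the abstract, the rate of updating associations and the sensitivity to learned associations, can be illustrated with a minimal simulation sketch. The abstract does not give the model equations, so everything below is an assumption made for illustration: a Rescorla-Wagner-style update, a softmax choice rule, the starting association of 0.1, fixed-length phases instead of a passing criterion, and all function and parameter names. It is not the authors' Stan implementation.

```python
import math
import random


def choice_probabilities(associations, sensitivity):
    """Softmax choice rule (assumed): a higher sensitivity means choices
    follow the learned associations more strictly, with less exploration."""
    weights = [math.exp(sensitivity * a) for a in associations]
    total = sum(weights)
    return [w / total for w in weights]


def update_association(association, reward, rate):
    """Rescorla-Wagner-style update (assumed): move the association of the
    chosen option toward the latest outcome by a fraction `rate`."""
    return (1.0 - rate) * association + rate * reward


def simulate_serial_reversal(rate, sensitivity, n_reversals=5,
                             trials_per_phase=40, seed=0):
    """Count rewards gained across an initial phase plus `n_reversals`
    reversals. Phases have a fixed length here (a simplification; real
    experiments typically reverse after a passing criterion is met)."""
    rng = random.Random(seed)
    associations = [0.1, 0.1]  # hypothetical weak starting associations
    rewarded_option = 0
    total_rewards = 0
    for _phase in range(n_reversals + 1):
        for _trial in range(trials_per_phase):
            probs = choice_probabilities(associations, sensitivity)
            choice = 0 if rng.random() < probs[0] else 1
            reward = 1.0 if choice == rewarded_option else 0.0
            associations[choice] = update_association(
                associations[choice], reward, rate)
            total_rewards += int(reward)
        rewarded_option = 1 - rewarded_option  # reverse the rewarded cue
    return total_rewards


if __name__ == "__main__":
    # Compare a few hypothetical parameter combinations.
    for rate, sensitivity in [(0.02, 6.0), (0.08, 6.0), (0.08, 4.0)]:
        rewards = simulate_serial_reversal(rate, sensitivity)
        print(f"rate={rate}, sensitivity={sensitivity}: {rewards} rewards")
```

Running the script prints the total rewards for each hypothetical parameter combination, which gives a rough way to explore how the two parameters interact under these simplified assumptions; the values chosen are illustrative and are not estimates from the grackle data.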

DOI: 10.24072/pcjournal.456
Type: Research article
Keywords: Behavioral flexibility, comparative cognition, grackle, innovativeness, multi-access box, problem solving, reversal learning

Lukas, Dieter 1; McCune, Kelsey 2, 3; Blaisdell, Aaron 4; Johnson-Ulrich, Zoe 2; MacPherson, Maggie 2; Seitz, Benjamin 4; Sevchik, August 5; Logan, Corina 1, 6

1 Department of Human Behavior, Ecology and Culture, Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
2 Institute for Social, Behavioral and Economic Research, University of California Santa Barbara, Santa Barbara, USA
3 College of Forestry, Wildlife and Environment, Auburn University, Auburn, USA
4 Department of Psychology & Brain Research Institute, University of California Los Angeles, Los Angeles, USA
5 Barrett, The Honors College, Arizona State University, Phoenix, USA
6 Neurosciences Research Institute, University of California Santa Barbara, Santa Barbara, USA
License: CC-BY 4.0
Copyrights: The authors retain unrestricted copyrights and publishing rights
@article{10_24072_pcjournal_456,
     author = {Lukas, Dieter and McCune, Kelsey and Blaisdell, Aaron and Johnson-Ulrich, Zoe and MacPherson, Maggie and Seitz, Benjamin and Sevchik, August and Logan, Corina},
     title = {Bayesian reinforcement learning models reveal how great-tailed grackles improve their behavioral flexibility in serial reversal learning experiments},
     journal = {Peer Community Journal},
     eid = {e88},
     publisher = {Peer Community In},
     volume = {4},
     year = {2024},
     doi = {10.24072/pcjournal.456},
     language = {en},
     url = {https://peercommunityjournal.org/articles/10.24072/pcjournal.456/}
}
Lukas, Dieter; McCune, Kelsey; Blaisdell, Aaron; Johnson-Ulrich, Zoe; MacPherson, Maggie; Seitz, Benjamin; Sevchik, August; Logan, Corina. Bayesian reinforcement learning models reveal how great-tailed grackles improve their behavioral flexibility in serial reversal learning experiments. Peer Community Journal, Volume 4 (2024), article no. e88. doi: 10.24072/pcjournal.456. https://peercommunityjournal.org/articles/10.24072/pcjournal.456/

PCI peer reviews and recommendation, and links to data, scripts, code and supplementary information: 10.24072/pci.ecology.100468

Conflict of interest of the recommender and peer reviewers:
The recommender in charge of the evaluation of the article and the reviewers declared that they have no conflict of interest (as defined in the code of conduct of PCI) with the authors or with the content of the article.

