Continuously evolving rewards in an open-ended environment
Paper Information
Journal: Journal of Machine Learning Research
Added to Tracker: Jul 15, 2025
Abstract
Unambiguous identification of the rewards driving the behaviours of entities operating in complex, open-ended real-world environments is difficult, in part because goals and associated behaviours emerge endogenously and are dynamically updated as environments change. Reproducing such dynamics in models would be useful in many domains, particularly where fixed reward functions limit the adaptive capabilities of agents. The simulation experiments described here assess a candidate algorithm for dynamically updating the reward function, RULE (Reward Updating through Learning and Expectation). The approach is tested in a simplified ecosystem-like setting where experiments challenge the entities' survival, calling for significant behavioural change. The population of entities successfully demonstrates the abandonment of an initially rewarded but ultimately detrimental behaviour, the amplification of beneficial behaviour, and appropriate responses to novel items added to their environment. These adjustments happen through endogenous modification of the entities' reward function during continuous learning, without external intervention.
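The abstract describes RULE only at the level of its effects; the update rule itself is not reproduced on this page. Purely as an illustrative sketch of the general idea of endogenous reward updating, the Python below lets per-behaviour reward weights drift with a learned expectation of each behaviour's survival outcome, so that a behaviour expected to harm survival loses its reward while a beneficial one is amplified. Every name here (EndogenousRewardUpdater, alpha, beta, the energy-style outcome) is an assumption for illustration, not the paper's algorithm.

import numpy as np

class EndogenousRewardUpdater:
    """Hypothetical sketch of endogenous reward updating (not the paper's
    RULE algorithm): per-behaviour reward weights drift with a learned
    expectation of each behaviour's survival outcome."""

    def __init__(self, n_behaviours: int, alpha: float = 0.05, beta: float = 0.1):
        self.weights = np.ones(n_behaviours)    # current reward function
        self.expected = np.zeros(n_behaviours)  # expected outcome per behaviour
        self.alpha = alpha                      # reward-weight learning rate
        self.beta = beta                        # expectation learning rate

    def update(self, behaviour: int, outcome: float) -> None:
        # Learn an expectation of this behaviour's survival outcome
        # (e.g. the change in an entity's energy level).
        self.expected[behaviour] += self.beta * (outcome - self.expected[behaviour])
        # Drift the reward weight with the learned expectation: behaviours
        # expected to harm survival lose reward; beneficial ones gain it.
        self.weights[behaviour] += self.alpha * self.expected[behaviour]
        # Clip at zero so a detrimental behaviour ends up unrewarded
        # (abandoned) rather than actively punished.
        self.weights = np.clip(self.weights, 0.0, None)

    def reward(self, behaviour: int) -> float:
        return float(self.weights[behaviour])

# Toy usage: behaviour 0 repeatedly costs energy, behaviour 1 gains it.
updater = EndogenousRewardUpdater(n_behaviours=2)
for _ in range(50):
    updater.update(behaviour=0, outcome=-1.0)
    updater.update(behaviour=1, outcome=+0.5)
print(updater.reward(0), updater.reward(1))  # harmful weight driven to 0

In this toy run the reward for the costly behaviour decays to zero while the reward for the beneficial one grows, mirroring the abandonment and amplification effects the abstract reports; the actual mechanism in the paper may differ substantially.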
Author Details
Richard M. Bailey
Citation Information
APA Format
Bailey, R. M. (2025). Continuously evolving rewards in an open-ended environment. Journal of Machine Learning Research, 26(62), 1–51. http://jmlr.org/papers/v26/24-0847.html
BibTeX Format
@article{JMLR:v26:24-0847,
author = {Richard M. Bailey},
title = {Continuously evolving rewards in an open-ended environment},
journal = {Journal of Machine Learning Research},
year = {2025},
volume = {26},
number = {62},
pages = {1--51},
url = {http://jmlr.org/papers/v26/24-0847.html}
}