Project: RL1/RL2 (obsolete)
Collection
Older models that are no longer useful for anything in RL1 or RL2, or that are unused now that the experiments have been discontinued. 16 items.
A series of models trained while varying duplication_factor when the cheese was always in the corner, meaning there were only 120 possible states for the environment to be in. No longer relevant as we now train with alpha, given the environmental distribution $\Lambda_{\alpha} = \alpha \Lambda_{1} + (1 - \alpha) \Lambda_{0}$ where
No longer relevant: duplication_factor has since been removed, as there are now ~14k states instead of 120.
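
For readers unfamiliar with the notation, the mixture $\Lambda_{\alpha} = \alpha \Lambda_{1} + (1 - \alpha) \Lambda_{0}$ can be read as: with probability $\alpha$ draw an environment from $\Lambda_{1}$, otherwise draw one from $\Lambda_{0}$. The sketch below illustrates only that reading. The function name `sample_mixture` and the two sampler arguments are hypothetical, and nothing here is taken from the project's training code; this card does not say how $\Lambda_{0}$ and $\Lambda_{1}$ are defined or sampled.

```python
import random
from typing import Callable, TypeVar

T = TypeVar("T")


def sample_mixture(
    alpha: float,
    sample_lambda_1: Callable[[random.Random], T],  # hypothetical sampler for Lambda_1
    sample_lambda_0: Callable[[random.Random], T],  # hypothetical sampler for Lambda_0
    rng: random.Random,
) -> T:
    """Draw one sample from alpha * Lambda_1 + (1 - alpha) * Lambda_0."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    # rng.random() is uniform on [0, 1), so this branch is taken with probability alpha.
    if rng.random() < alpha:
        return sample_lambda_1(rng)
    return sample_lambda_0(rng)
```

Setting `alpha` to 0 or 1 recovers the two component distributions exactly, and intermediate values interpolate between them.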