Embodied
Fast reinforcement learning research.
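Overview
Embodied aims to let researchers quickly implement new agents at scale. It does so by specifying interfaces for environments and agents, so users can mix and match agents, environments, and evaluation protocols. Embodied provides common building blocks that users are encouraged to fork when they need more control. The only dependency is NumPy, and agents can be implemented in any framework.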
Packages
embodied/
  core/    # Config, logging, checkpointing, simulation, wrappers
  run/     # Evaluation protocols that combine agents and environments
  envs/    # Environment suites such as Gym, Atari, DMC, Crafter
  agents/  # Agent implementations
Agent API
class Agent:
  __init__(obs_space, act_space, config)
  policy(obs, carry, mode='train') -> act, carry
  train(data, carry) -> metrics, carry
  report(data, carry) -> metrics, carry
  init_policy(batch_size) -> carry
  init_train(batch_size) -> carry
  init_report(batch_size) -> carry
  dataset(generator) -> generator
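To make the method list above concrete, here is a minimal sketch of a random agent written against it, using only NumPy. Several details are assumptions for illustration rather than part of Embodied: each space is represented as a plain dict mapping a key to a `(shape, dtype)` tuple, `config` is a plain dict, batches contain a `reward` key, and `dataset` receives an iterable of batches. The class name `RandomAgent` is hypothetical.

```python
import numpy as np


class RandomAgent:
  """Toy agent that follows the method list above and acts at random."""

  def __init__(self, obs_space, act_space, config):
    # Assumption: spaces are dicts of key -> (shape, dtype), config is a dict.
    self.obs_space = obs_space
    self.act_space = act_space
    self.config = config
    self.rng = np.random.default_rng(config.get('seed', 0))

  def init_policy(self, batch_size):
    # The carry is whatever state must persist between calls;
    # here it is only a per-environment step counter.
    return {'step': np.zeros(batch_size, np.int64)}

  def init_train(self, batch_size):
    return {'updates': 0}

  def init_report(self, batch_size):
    return {}

  def policy(self, obs, carry, mode='train'):
    batch_size = len(carry['step'])
    # Sample every action key uniformly from [-1, 1).
    act = {
        key: self.rng.uniform(-1, 1, (batch_size, *shape)).astype(dtype)
        for key, (shape, dtype) in self.act_space.items()}
    return act, {'step': carry['step'] + 1}

  def train(self, data, carry):
    # A learning agent would update its parameters here.
    metrics = {'mean_reward': float(np.mean(data['reward']))}
    return metrics, {'updates': carry['updates'] + 1}

  def report(self, data, carry):
    return {'batch_size': len(data['reward'])}, carry

  def dataset(self, generator):
    # Assumption: `generator` is an iterable of batches, passed through as is.
    for batch in generator:
      yield batch


# Usage with hypothetical spaces and a batch of four environments.
obs_space = {'reward': ((), np.float32)}
act_space = {'action': ((2,), np.float32)}
agent = RandomAgent(obs_space, act_space, {'seed': 0})
carry = agent.init_policy(batch_size=4)
obs = {'reward': np.zeros(4, np.float32)}
act, carry = agent.policy(obs, carry)
```

Threading `carry` through every call keeps the agent object free of per-episode state, which is one plausible reason the interface returns it explicitly.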
Environment API
class Env:
  __len__() -> int
  @obs_space -> dict of spaces
  @act_space -> dict of spaces
  step(act) -> obs dict
  render() -> array
  close()
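As an illustration of the environment methods listed above, the following sketch implements a toy environment and drives it with random actions. It rests on assumptions that the listing does not confirm: spaces are plain dicts of `(shape, dtype)` tuples, the action dict carries `action` and `reset` keys, the observation dict carries `reward`, `is_first`, and `is_last` keys, and `__len__` returns 0 for a single unbatched environment. The class name `CountingEnv` is hypothetical.

```python
import numpy as np


class CountingEnv:
  """Toy environment: episodes last `length` steps, reward equals the action."""

  def __init__(self, length=5):
    self.length = length
    self.steps = 0
    self.done = True  # Forces a reset on the first call to step().

  def __len__(self):
    # Assumption: 0 marks a single environment rather than a batch.
    return 0

  @property
  def obs_space(self):
    return {
        'steps': ((), np.int32),
        'reward': ((), np.float32),
        'is_first': ((), bool),
        'is_last': ((), bool),
    }

  @property
  def act_space(self):
    return {'action': ((), np.int32), 'reset': ((), bool)}

  def step(self, act):
    # Resetting is folded into step(): a reset request or a finished
    # episode starts a new episode and ignores the action.
    if act['reset'] or self.done:
      self.steps = 0
      self.done = False
      return self._obs(0.0, is_first=True)
    self.steps += 1
    self.done = self.steps >= self.length
    return self._obs(float(act['action']), is_first=False)

  def _obs(self, reward, is_first):
    return {
        'steps': np.int32(self.steps),
        'reward': np.float32(reward),
        'is_first': is_first,
        'is_last': self.done,
    }

  def render(self):
    # One-row grayscale image with the current step lit up.
    image = np.zeros((1, self.length + 1), np.uint8)
    image[0, self.steps] = 255
    return image

  def close(self):
    pass


# Run one episode with random binary actions.
env = CountingEnv(length=3)
rng = np.random.default_rng(0)
act = {'action': 0, 'reset': True}
total = 0.0
while True:
  obs = env.step(act)
  total += float(obs['reward'])
  if obs['is_last']:
    break
  act = {'action': int(rng.integers(0, 2)), 'reset': False}
env.close()
```

Because the loop touches only `step` and `close`, any environment exposing the same methods could be substituted without changing it.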