Added support for checkpointing the policy. We can save and load the policy state dict, optimizers state, optimization step and interaction step

Added functions for converting the replay buffer from and to LeRobotDataset. When we want to save the replay buffer, we convert it first to LeRobotDataset format and save it locally and vice-versa.

Co-authored-by: Adil Zouitine <adilzouitinegm@gmail.com>
This commit is contained in:
Michel Aractingi
2025-01-30 17:39:41 +00:00
parent e856ffc91e
commit 367dfe51c6
7 changed files with 217 additions and 92 deletions
+1 -1
View File
@@ -106,7 +106,7 @@ def make_policy(
# Make a fresh policy.
# HACK: We pass *args and **kwargs to the policy constructor to allow for additional arguments
# for example device for the sac policy.
policy = policy_cls(*args, **kwargs, config=policy_cfg, dataset_stats=dataset_stats)
policy = policy_cls(config=policy_cfg, dataset_stats=dataset_stats)
else:
# Load a pretrained policy and override the config if needed (for example, if there are inference-time
# hyperparameters that we want to vary).