Add generic trajectory logger #122

@tyler-griggs

Description

It is very useful to dig into the complete generated trajectories (including model outputs and environment observations / feedback) to debug model behavior during training. We should support exporting trajectories in an easily-readable format.

At minimum, we should support dumping the chat history as a table to wandb.

A better solution is likely a generic TrajectoryLogger class that is given complete trajectories and can dump / export them in a user-customizable way. Some users simply want to read the prompts and responses (e.g., as a table in wandb); others may want to build a dataframe and do some data analysis in a notebook. A generic TrajectoryLogger class would support both.

To start, the class could expose a single method, log, that takes as input a trajectory in the ConversationType data type. However, it is possible (and likely) that users will want more flexibility in what they pass into the class, so the input signature is worth considering.
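
A minimal sketch of that interface, for discussion only. It assumes ConversationType is a list of `{"role": ..., "content": ...}` message dicts (the actual alias in the codebase may differ). `InMemoryTrajectoryLogger` and the `step` argument are hypothetical names introduced for illustration.

```python
from abc import ABC, abstractmethod
from typing import Dict, List

# Assumed shape: one trajectory is an ordered list of chat messages.
MessageType = Dict[str, str]
ConversationType = List[MessageType]


class TrajectoryLogger(ABC):
    """Receives complete trajectories and exports them in some format."""

    @abstractmethod
    def log(self, trajectory: ConversationType, step: int = 0) -> None:
        """Export a single trajectory; `step` is a hypothetical training-step tag."""


class InMemoryTrajectoryLogger(TrajectoryLogger):
    """Hypothetical subclass that flattens messages into rows for later analysis."""

    def __init__(self) -> None:
        self.rows: List[Dict[str, object]] = []

    def log(self, trajectory: ConversationType, step: int = 0) -> None:
        # One row per message, so model outputs and environment
        # observations stay in their original order.
        for turn, message in enumerate(trajectory):
            self.rows.append(
                {
                    "step": step,
                    "turn": turn,
                    "role": message["role"],
                    "content": message["content"],
                }
            )
```

In a notebook, the collected rows could be passed to `pandas.DataFrame(logger.rows)` for analysis. Other subclasses would override `log` to write to a different sink.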

TODOs

  • Create generic TrajectoryLogger class
  • Create a WandbTableTrajectoryLogger implementation of this class, which simply uploads a table of prompts and responses
  • Use the WandbTableTrajectoryLogger in SkyRLGymGenerator (behind a flag) to add trajectory logging to wandb for all training runs that use the skyrl gym generator. One tricky detail I anticipate is handling de-tokenization of trajectories so they are uploaded in plaintext.
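
A rough sketch of what the wandb table logger and the de-tokenization step might look like. This is not an existing implementation: the constructor arguments, `build_rows`, and the `key` name are hypothetical, and the tokenizer is assumed to expose a Hugging Face style `decode(token_ids, skip_special_tokens=...)` method. Only `wandb.Table` and `wandb.log` are real wandb calls, and `log` assumes `wandb.init` has already been called by the training run.

```python
from typing import List, Protocol, Sequence


class Decoder(Protocol):
    """Assumed tokenizer interface (matches Hugging Face tokenizers' decode)."""

    def decode(self, token_ids: Sequence[int], skip_special_tokens: bool = False) -> str:
        ...


class WandbTableTrajectoryLogger:
    """Hypothetical logger that uploads prompts and responses as a wandb table."""

    def __init__(self, tokenizer: Decoder, key: str = "trajectories") -> None:
        self.tokenizer = tokenizer
        self.key = key  # hypothetical wandb key under which the table is logged

    def build_rows(
        self, prompts: List[str], response_ids: List[List[int]]
    ) -> List[List[str]]:
        # De-tokenize each response so the table holds plaintext, not token ids.
        return [
            [prompt, self.tokenizer.decode(ids, skip_special_tokens=True)]
            for prompt, ids in zip(prompts, response_ids)
        ]

    def log(self, prompts: List[str], response_ids: List[List[int]], step: int) -> None:
        # Imported lazily so wandb is only required when logging is enabled.
        import wandb

        table = wandb.Table(
            columns=["prompt", "response"],
            data=self.build_rows(prompts, response_ids),
        )
        wandb.log({self.key: table}, step=step)
```

The generator would construct this logger only when the flag is set, then call `log` once per training step with the batch's prompts and generated token ids.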
