
Optimizer / Visitor / Mapper confusion, no documentation #1755

Open
benbaarber opened this issue May 11, 2024 · 4 comments
Labels
documentation Improvements or additions to documentation

Comments

@benbaarber (Contributor) commented May 11, 2024

I am working on a reinforcement learning crate (rl) and am using Burn to implement a deep Q-network example. I am trying to use the AdamW optimizer, but the documentation around using these optimizers directly is very unclear. I have read chapter 6 of the Burn book and looked through the docs, but I am still confused as to why AdamWConfig::init() returns impl Optimizer instead of the concrete AdamW struct. What is the purpose of these optimizer structs if the configs don't initialize them and you can't construct one manually?

Overall I'm really happy with the Burn experience and want to contribute at some point in the future, but I've noticed that when I get into the weeds of a very custom use case, the documentation is unclear or missing.

In this case, I am forced to add another generic on my SnakeDQN struct and all implementations even though I know exactly which optimizer it will be using. I am not aware of a better workaround if I want to keep my current paradigm of holding initialized modules in the SnakeDQN struct.

// Note: `B` is assumed to be a backend type alias defined elsewhere in the
// crate (e.g. an autodiff backend), and `DEVICE` a lazily initialized device.
pub struct SnakeDQN<'a, O: Optimizer<Model<B>, B>> {
    env: &'a mut GrassyField<FIELD_SIZE>,
    policy_net: Model<B>,
    target_net: Model<B>,
    memory: ReplayMemory<GrassyField<FIELD_SIZE>>,
    loss: HuberLoss<B>,
    optimizer: O,
    exploration: EpsilonGreedy,
    gamma: f32,
    tau: f32,
    lr: f32,
    episode: u32,
}

impl<'a, O> SnakeDQN<'a, O>
where 
    O: Optimizer<Model<B>, B>,
{
    pub fn new(
        env: &'a mut GrassyField<FIELD_SIZE>,
        model_config: ModelConfig,
        loss_config: HuberLossConfig,
        optim_config: AdamWConfig,
        exploration: EpsilonGreedy,
    ) -> Self {
        Self {
            env,
            policy_net: model_config.init(&*DEVICE),
            target_net: model_config.init(&*DEVICE),
            memory: ReplayMemory::new(50000),
            loss: loss_config.init(&*DEVICE),
            optimizer: optim_config.init(),
            exploration,
            gamma: 0.86,
            tau: 2.7e-2,
            lr: 3.58e-3,
            episode: 0,
        }
    }
}
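The constraint described above can be reproduced in a few lines of plain Rust with no Burn dependency (all names here are hypothetical): a method returning `impl Trait` hides the concrete type, so any struct that stores the returned value must itself take a generic parameter.

```rust
// Minimal illustration (not Burn code) of why `init() -> impl Trait`
// forces a generic parameter on the holding struct.
trait Optimizer {
    fn step(&mut self) -> u32;
}

struct CountingOptimizer {
    steps: u32,
}

impl Optimizer for CountingOptimizer {
    fn step(&mut self) -> u32 {
        self.steps += 1;
        self.steps
    }
}

struct Config;

impl Config {
    // Returning `impl Optimizer` hides `CountingOptimizer`,
    // so the caller cannot name the concrete type.
    fn init(&self) -> impl Optimizer {
        CountingOptimizer { steps: 0 }
    }
}

// The holder must be generic over O, mirroring SnakeDQN<'a, O>.
struct Holder<O: Optimizer> {
    optimizer: O,
}

fn main() {
    let mut h = Holder { optimizer: Config.init() };
    assert_eq!(h.optimizer.step(), 1);
    assert_eq!(h.optimizer.step(), 2);
}
```

If `init` instead returned the concrete struct, `Holder` could name that type directly and drop the generic, which is what the issue is asking for.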
@antimora added the documentation label May 12, 2024
@antimora (Collaborator) commented

CC @laggui @nathanielsimard

@benbaarber (Author) commented

I'm having another issue as well: I'm trying to implement the soft update of the target network in a deep Q-learning environment. See the PyTorch equivalent:

pnsd, tnsd = self.policy_net.state_dict(), self.target_net.state_dict()

for key in pnsd:
  tnsd[key] = pnsd[key] * self.hp["tau"] + tnsd[key] * (1 - self.hp["tau"])

self.target_net.load_state_dict(tnsd)
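The per-parameter update above is just a linear interpolation: each target parameter becomes tau * policy + (1 - tau) * target. A stdlib-only Rust sketch of that formula (hypothetical helper, not Burn API; in Burn the same arithmetic would run per-tensor inside a mapper):

```rust
// Soft update: interpolate each target parameter toward the
// corresponding policy parameter by factor tau.
fn soft_update(policy: &[f32], target: &mut [f32], tau: f32) {
    for (t, p) in target.iter_mut().zip(policy) {
        *t = tau * p + (1.0 - tau) * *t;
    }
}

fn main() {
    let policy = [1.0_f32, 1.0];
    let mut target = [0.0_f32, 2.0];
    soft_update(&policy, &mut target, 0.5);
    // Halfway between the two sets of parameters.
    assert_eq!(target, [0.5, 1.5]);
}
```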

I see that burn has the concepts of visitors and mappers, which seemed to be the best way to implement this. However, the documentation around visitors, mappers, and generally accessing the parameters of a model is either missing or very hard to find. Even this small section in the book is out of date with the trait definitions. There do not appear to be any examples of the intended way to use these traits either.

I feel like Burn has tons of potential to be THE go-to machine learning framework in Rust, but the lack of clear documentation and examples is holding it back. I really think documenting existing code should be a higher priority than adding new features at this point, and I would be happy to help if someone can answer my questions and clear this up for me.

@benbaarber changed the title from "Optimizer confusion, no documentation" to "Optimizer / Visitor / Mapper confusion, no documentation" May 12, 2024
@nathanielsimard (Member) commented

@benbaarber Thanks for the PR that made the optimizer concrete. Hope this solves your problem regarding holding an optimizer in your struct. Regarding the visitor and mapper, I think it's a very nice candidate for a new advanced section in the book. Adding more docs on the trait wouldn't be a bad idea either.

For your specific problem, I think you would need a mapper, similar to how our optimizers are actually implemented and update the model's parameters. See this mapper as a reference. Let me know if it helps, and don't hesitate to ask your questions on the Discord; we are a bit more responsive there!
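The mapper shape being described can be sketched in plain Rust (illustrative names only; Burn's actual trait operates on `Tensor<B, D>` values and is driven by the module's map method, with details that may differ by version):

```rust
// Simplified stand-in for the mapper idea, using Vec<f32> in place
// of tensors: a mapper receives each parameter and returns its
// replacement, the way an optimizer's update step rewrites parameters.
trait ParamMapper {
    fn map(&mut self, name: &str, param: Vec<f32>) -> Vec<f32>;
}

// A mapper that scales every parameter it visits.
struct Scale(f32);

impl ParamMapper for Scale {
    fn map(&mut self, _name: &str, param: Vec<f32>) -> Vec<f32> {
        param.into_iter().map(|p| p * self.0).collect()
    }
}

fn main() {
    let mut mapper = Scale(0.5);
    let updated = mapper.map("weight", vec![2.0, 4.0]);
    assert_eq!(updated, vec![1.0, 2.0]);
}
```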

Right now, we aren't really prioritizing more features, but rather improving performance. Feel free to open issues in areas where Burn would benefit from more documentation and examples; we may prioritize it!

@benbaarber (Author) commented May 14, 2024

@nathanielsimard Thanks for the response

> For your specific problem, I think you would need a mapper, similar to how our optimizers are actually implemented and update the model's parameters. See this mapper as a reference.

This helps a lot, thanks.

> Right now, we aren't really prioritizing more features, but rather improving performance. Feel free to open issues in areas where Burn would benefit from more documentation and examples; we may prioritize it!

Sounds good. I understand the Visitor/Mapper paradigm a bit better now after looking around the codebase, though I think some really basic examples would go a long way toward helping newcomers (like me) pick up these features quickly. At first I was confused about where to store the parameters after visiting them; then I was confused about using a Mapper to adjust the parameters of the target model as a function of both itself and the policy model. I will try to figure that out today, and I will definitely ask in the Discord if I get stuck.

Thank you again for the help
