Problematic derivative of Tensor::abs and Huber loss #1441
Comments
Okay, so the gradients from the loss get propagated to the backward of `Tensor::abs`. I suppose a sign primitive, and using that instead of the current computation, would fix this (burn/crates/burn-autodiff/src/ops/tensor.rs, lines 1906 to 1907 in 56f4602).
Linking Sign OP issue here: #522

Submitted sign tensor operator: #1446

Sign tensor op PR is merged.
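The motivation for a sign op can be illustrated on scalars: computing the gradient of `|x|` as `x / |x|` is undefined at zero (`0.0 / 0.0` is NaN), while a sign function that returns 0 at 0 stays finite everywhere. A minimal sketch in plain Rust; the `sign` helper here is a hypothetical scalar analogue, not burn's actual API:

```rust
// Hypothetical scalar analogue of a sign-based abs backward.
// Unlike f32::signum (which is 1.0 for +0.0), this returns 0 at zero.
fn sign(x: f32) -> f32 {
    if x == 0.0 { 0.0 } else { x.signum() }
}

fn main() {
    let x: f32 = 0.0;
    // Gradient of |x| computed as x / |x|: 0.0 / 0.0 = NaN at zero.
    let grad_div = x / x.abs();
    // Gradient via sign: well defined everywhere, 0 at zero.
    let grad_sign = sign(x);
    assert!(grad_div.is_nan());
    assert_eq!(grad_sign, 0.0);
}
```

Once such a NaN enters the backward pass, it propagates through every downstream gradient, which matches the training failure described below.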
Implementing the Huber loss requires comparing the absolute value of the residual against some small `kappa`, then behaving linearly with the absolute value outside this bound, but quadratically inside. An example implementation would be:

Using this implementation leads to `NaN` values showing up after a few steps of training. This is probably connected to the gradient computation in `Tensor::abs`, but since the problematic small values should be masked out in favor of the (perfectly fine) gradient computation from the quadratic term, I'm not sure how that is happening.
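The loss described above can be sketched on scalars. This is only an illustration of the piecewise definition, not the tensor implementation from the issue; the function name and signature are hypothetical:

```rust
// Scalar sketch of the Huber loss: quadratic for |r| <= kappa,
// linear outside, with matching value and slope at the boundary.
fn huber(residual: f32, kappa: f32) -> f32 {
    let abs_r = residual.abs();
    if abs_r <= kappa {
        // Quadratic branch: 0.5 * r^2 (no abs needed, so its
        // gradient is simply r and is safe near zero).
        0.5 * residual * residual
    } else {
        // Linear branch: kappa * (|r| - kappa / 2).
        kappa * (abs_r - 0.5 * kappa)
    }
}

fn main() {
    // Inside the bound: quadratic branch.
    assert_eq!(huber(0.5, 1.0), 0.125);
    // At the boundary both branches agree: 0.5 * kappa^2.
    assert_eq!(huber(1.0, 1.0), 0.5);
    // Outside the bound: linear branch.
    assert_eq!(huber(2.0, 1.0), 1.5);
}
```

In a tensor version the branch is expressed as a mask rather than an `if`, so even though the masked-out linear term does not contribute to the loss value, its `abs` can still be evaluated (and differentiated) on the small residuals, which is where a NaN-producing abs backward can leak in.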