
Add documentation to burn core nn #1746

Draft · wants to merge 42 commits into main

Conversation

@jwric (Contributor) commented May 7, 2024

Pull Request Template

Checklist

  • Confirmed that the `run-checks all` script has been executed.
  • Made sure the book is up to date with changes in this PR.

Related Issues/PRs

Related to #594

Changes

Revised the Rust documentation for the burn core neural network modules.

ThierryCantin-Demers and others added 30 commits May 7, 2024 13:34
Added links between the struct and the config. Added a link to the related burn_tensor function in the documentation for the forward function.
Moving the formula for relu from the module API to the functional API,
citing a paper relevant to relu,
and mentioning the functional API in the module API
Adding documentation to the Linear module,
mentioning that the LinearConfig struct
should be used when creating a Linear layer.

Also adding links to the documentation that point people toward
the right path.
Added links between the struct and the config. Added a link to the struct in the forward function for more info.
Adding documentation stating that RotaryEncoding should be created using a RotaryEncodingConfig
Adding documentation to the prelu module:
- Linking forward function documentation to the functional API
- Citing the first paper to mention prelu
- Adding documentation saying that the prelu layer should be created using PReluConfig
Added links for more info. Added shape info in some places.
Provide documentation for the Gru module, including its configuration and usage. Include a link to the paper that introduced the Gated Recurrent Unit (GRU) and specify that the module should be created using GruConfig. Also, mention that the forward function returns a state tensor with specific dimensions.
Adding documentation:
- Says to use the config to create the layers
- Adds the mathematical formula to the pwff forward pass
- Adds a citation in the pwff to the "Attention Is All You Need" paper
Provide documentation for the Lstm and BiLstm modules, including their configurations and usage. Include links to the papers that introduced Long Short-Term Memory (LSTM) and Bidirectional LSTM. Specify that the modules should be created using LstmConfig and BiLstmConfig respectively.
Adding documentation stating that the config should be used to create the layer
…odules

Added references to the burn_tensor associated functions. Added links between the struct and the config.
Added references to the burn_tensor associated functions. Added links between the struct and the config.
Added references to the burn_tensor associated functions. Added links between the struct and the config.
Added references to the burn_tensor associated functions. Added links between the struct and the config. Removed the backend generic from the config since it's not needed (might be a breaking change).
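
Most of the commits above document the same convention: a layer is created from its `*Config` struct via `init` rather than constructed directly. A minimal sketch of that pattern (illustrative only, not part of this PR), using `LinearConfig` and assuming the `ndarray` backend feature is enabled:

```rust
use burn::backend::NdArray;
use burn::nn::{Linear, LinearConfig};
use burn::tensor::Tensor;

fn main() {
    type B = NdArray;
    let device = Default::default();

    // The revised docs point users to the config's `init` function
    // instead of constructing the layer struct by hand.
    let linear: Linear<B> = LinearConfig::new(64, 128).init(&device);

    // Shape `[batch_size, seq_length, d_input]` in,
    // shape `[batch_size, seq_length, d_output]` out.
    let input = Tensor::<B, 3>::zeros([2, 10, 64], &device);
    let output = linear.forward(input);
    assert_eq!(output.dims(), [2, 10, 128]);
}
```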
@laggui (Member) left a comment

Thanks for the major clean-up 🧹 🙏

Some minor comments to address, mainly about formatting so the changes stay consistent.

Comment on lines +72 to +76
// Shape `[batch_size, seq_length_1, d_model]`
query: Tensor<B, 3>,
// Shape `[batch_size, seq_length_2, d_model]`
key: Tensor<B, 3>,
// Shape `[batch_size, seq_length_2, d_model]`

To document struct fields we use a line doc (three forward slashes ///).
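
For example, the fields above would read like this as line docs (a rough sketch; `AttentionInput` is a placeholder name, not the actual struct touched by the PR):

```rust
use burn::tensor::{backend::Backend, Tensor};

/// Hypothetical reduced struct mirroring the diff above.
pub struct AttentionInput<B: Backend> {
    /// Shape `[batch_size, seq_length_1, d_model]`
    pub query: Tensor<B, 3>,
    /// Shape `[batch_size, seq_length_2, d_model]`
    pub key: Tensor<B, 3>,
}
```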

use super::checks;

/// Configuration to create an [1D convolution](Conv1d) layer.
/// Configuration to create an [1D convolution](Conv1d) layer using the [init function](Conv1dConfig::init).

This was already there before, but if we're fixing the docs it should be "to create a 1D convolution".


/// Configuration to create an [2D convolution](Conv2d) layer.
/// Configuration to create an [2D convolution](Conv2d) layer, using the [init function](Conv2dConfig::init).

Same as with the conv1d: should be "to create a 2D convolution".

Also, there's no need for the comma in ", using the init function."

Comment on lines -10 to 16
#[derive(Module, Debug)]
pub struct LeakyRelu<B: Backend> {
///
/// Should be created with [LeakyReluConfig](LeakyReluConfig).
#[derive(Module, Clone, Debug)]
pub struct LeakyRelu {
/// The negative slope.
pub negative_slope: f64,
phantom: PhantomData<B>,
}

👌 👌

Comment on lines +28 to +32
/// - `X` is the input tensor
/// - `norm` is the normalization function
/// - `γ` is the learnable weight
/// - `β` is the learnable bias
/// - `Y` is the output tensor

For documentation purposes maybe have X and Y together, either at the end or top of the enumeration:

  • X is the input tensor
  • Y is the output tensor
  • norm is the normalization function
  • γ is the learnable weight

/// Configuration to create an [2D max pooling](MaxPool2d) layer.
use crate::tensor::module::max_pool2d;

/// Configuration to create an [2D max pooling](MaxPool2d) layer using the [init function](MaxPool2dConfig::init).

Was already there, but should be: "to create a 2D max pooling".


#[cfg(not(feature = "std"))]
use num_traits::Float;

/// Configuration to create an [PositionalEncoding](PositionalEncoding) layer.
/// Configuration to create an [PositionalEncoding](PositionalEncoding) layer using the [init function](PositionalEncodingConfig::init).

Same comment. Should be: "to create a PositionalEncoding".

@@ -186,6 +190,10 @@ pub struct BiLstmConfig {
}

/// The BiLstm module. This implementation is for Bidirectional LSTM.
///
/// Introduced in the paper: [Framewise phoneme classification with bidirectional LSTM and other neural network architectures](https://www.cs.toronto.edu/~graves/ijcnn_2005.pdf).

Nice literature review 😄 I wasn't aware of the origins for bidirectional LSTM

@@ -203,7 +205,7 @@ impl<B: Backend> TransformerDecoderLayer<B> {
norm_first: config.norm_first,
}
}

/// forward pass for the TransformerDecoder layer

Let's keep a similar capitalization scheme for the line docs.

Suggestion:
Applies the TransformerDecoder forward pass to the input tensor.

///
/// # Params
///
/// - linear inner: Linear layer with `d_model` input features and `d_ff` output features.
/// - linear outer: Linear layer with `d_ff` input features and `d_model` output features.
///
/// FFN(x) = max(0, xW1 + b1)W2 + b2

Keep the math in ticks?

FFN(x) = max(0, xW1 + b1)W2 + b2
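
Applied to the doc block above, that would give something like (sketch of the suggestion only):

```rust
/// `FFN(x) = max(0, xW1 + b1)W2 + b2`
```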
