Release v0.12.0 · tracel-ai/burn

This release highlights an optimized Wgpu Backend, clearer examples and documentation, and numerous bug fixes.
Notably, breaking changes in device management mandate explicit device specification to prevent potential bugs.
Additionally, the new PyTorch recorder simplifies model porting by enabling automatic import of PyTorch's weights.
We also put a lot of effort into improving our CI infrastructure for enhanced reliability, efficiency, and scalability.

Changes

Tensor & Module API

Added support for generic modules #1147 @nathanielsimard
Added support for tuple modules #1186 @varonroy
Enabled loading PyTorch .pt (weights/states) files directly into a module's record, currently available on Linux & macOS #1085 @antimora
Added mish and softplus activation functions #1071 @pacowong
Improved chunk performance in backends #1032 @Kelvinyu1117
[Breaking] Added the device as an argument for tensor operations that require it, replacing the previous optional device usage #1081 #518 #1110 @kpot
- Code update involves either using Default::default for the same behavior or specifying the desired device.
Allowed raw tensors to be serialized/deserialized directly with serde #1041 @jmacglashan
[Breaking] Forced the choice of the device for deserialization #1160 #1165 @nathanielsimard
Added element-wise pow operation #1133 @skewballfox
Refactored the tensor backend API names #1174 @skewballfox
[Breaking] Changed the default recorder to NamedMpkFileRecorder #1161 #1151 @laggui
- After a bit of exploration, we removed any type of compression because it adds too much overhead
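
The device-related breaking changes listed above follow one pattern: operations that need a device now take it as an argument, and callers either pass a default device to keep the old behavior or name the device they want. The sketch below illustrates that pattern with a toy model. `Device`, `Tensor`, and `Tensor::zeros` here are hypothetical stand-ins written for this example, not Burn's actual types or signatures; in Burn the device type comes from the chosen backend, so check the Burn documentation for the exact calls.

```rust
// Toy model of the explicit-device pattern. These types are hypothetical
// and are NOT Burn's API; they only mirror the shape of the change.

/// Hypothetical device type. `Default` selects the device that callers
/// previously got implicitly.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
enum Device {
    #[default]
    Cpu,
    Gpu(usize),
}

/// Hypothetical tensor that records which device it was created on.
#[derive(Debug, Clone, PartialEq)]
struct Tensor {
    data: Vec<f32>,
    device: Device,
}

impl Tensor {
    /// Creation takes the device explicitly instead of assuming one.
    fn zeros(len: usize, device: &Device) -> Self {
        Tensor {
            data: vec![0.0; len],
            device: *device,
        }
    }
}

fn main() {
    // Option 1: keep the previous behavior by requesting the default device.
    let a = Tensor::zeros(3, &Default::default());

    // Option 2: specify the desired device.
    let b = Tensor::zeros(3, &Device::Gpu(0));

    println!("{:?} {:?}", a.device, b.device);
}
```

Making the device a required argument means a tensor can no longer end up on an unintended device silently, which is the class of bug the breaking change is meant to prevent.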

Examples & Documentation

Updated the text-classification example #1044 @nathanielsimard
Fixed import and type redefinitions in mnist-web-inference #1100 @syl20bnr
Fixed documentation of Tensor::stack #1105 @PonasKovas
Fixed some typos in links in the burn-book #1127 @laggui
Added an example for a custom CSV dataset #1129 #1082 @laggui
Fixed missing ticks in Burn book and removed unused example dependency #1144 @laggui
Added a new example for regression problems #1150 #1148 @ashdtu
Added model saving and loading examples in the book #1164 #1156 @laggui
Added Rust concept notes and explanations to the Burn Book #1169 #1155 @laggui
Fixed jupyter notebook and ONNX IR example #1170 @unrenormalizable
Added a custom mnist dataset, removing the Python dependency for running the guide and the mnist example #1176 #1157 @laggui
Updated documentation and book sections on PyTorch import #1180 @antimora
Updated burn-book with improved tensor documentation #1183 #1103 @ashdtu
Updated burn-book with a new dataset transforms section #1183 #1154 @ashdtu
Updated CONTRIBUTING.md with code guidelines #1134 @syl20bnr
Fixed documentation of Multi Head Attention #1205 @ashdtu

Wgpu Backend

Optimized the repeat operation with a new kernel #1068 @louisfd
Improved reduce autotune by adding the stride to the autotune key #1070 @louisfd
Refactored binary operations to use the new JIT compiler IR #1078 @nathanielsimard
Added persistent cache for autotune #1087 @syl20bnr

Fusion

Refactored burn-fusion, making it possible to eventually save the JIT state #1104 @nathanielsimard
Improved fusion in the Wgpu backend with caching #1069 @nathanielsimard
Supported fusing int operations with burn-fusion #1093 @nathanielsimard
Supported automatic vectorization of operations fused with burn-fusion in WGPU #1123 #1111 @nathanielsimard
Supported automatically executing in-place operations fused with burn-fusion in WGPU #1128 #1124 @nathanielsimard
Heavily refactored burn-fusion to better reflect the stream optimization process #1135 @nathanielsimard
Heavily refactored burn-fusion to save all execution plans for any trigger #1143 @nathanielsimard
Supported multiple concurrent optimization streams #1149 #1117 @nathanielsimard
Supported overlapping optimization builders #1162 @nathanielsimard
Supported fusing ones, zeroes, and full operations #1159 @nathanielsimard
Supported autotuning fused element-wise kernels #1188 #1112 @nathanielsimard

Infra

Supported testing accelerate (macOS) on the burn-ndarray backend #1050 @dcvz
Improved CI output by introducing groups #1024 @dcvz
Updated scheduled CI tasks #1028 @Luni-4
Added support for Windows Pipeline #925 @Luni-4
Fixed CI for testing the wgpu backend by pinning versions #1120 @syl20bnr
Fixed burn-compute build command with no-std #1109 @syl20bnr
Temporarily disabled unnecessary steps on Windows runners to save CI time #1107 @syl20bnr
Refactored serialization of backend comparison benchmarks #1131 @syl20bnr
Fixed doc build on docs.rs #1168 @syl20bnr
Added cargo xtask commands for dependencies and vulnerabilities checks #1181 #965 @syl20bnr
Added cargo xtask command to manage books #1192 @syl20bnr

Chore

Shared some properties across the cargo workspace #1039 @dcvz
Formatted the codebase with nightly where the stable version falls short #1017 @AlexErrant
Improved panic messages on the web #1051 @dcvz
Used web-time in wasm #1060 @sigma-andex
Refactored some tensor tests #1089 @nathanielsimard
Made Embedding weights public #1094 @unrenormalizable
Updated candle version and added support for slice_assign #1095 @louisfd
Records no longer require Debug and Clone #1137 @nathanielsimard
Removed cargo warning #1108 @syl20bnr
Updated wgpu version to 0.19.0 #1166 @nathanielsimard
Added tests for Slice assign vs Cat in LSTM backward #1146 @louisfd
Updated xtask publish task #1189 @Luni-4
Enabled daily dependabot runs #1195 @Luni-4
Updated Ratatui version #1204 @nathanielsimard
Updated tch version #1206 @laggui

Bug Fixes

Fixed a slice issue in the LibTorch backend that could corrupt tensors' data #1064 #1055 @nathanielsimard
Fixed issues with tensor stack and reshape on ndarray #1053 #1058 @AuruTus
Fixed multithread progress aggregation in dataloader #1083 #1063 @louisfd
Resolved a numerical bug with tanh on MacOS with Wgpu #1086 #1090 @louisfd
Fixed burn-fusion, where only operations followed by a sync were fused #1093 @nathanielsimard
Removed the requirement for users to add serde as a dependency for Burn #1091 @nathanielsimard
Fixed transformer prenorm on the residual path #1054 @Philonoist
Fixed conv2d initialization by supporting fan_out #1138 @laggui
Resolved the problem of sigmoid gradient generating NaN #1140 #1139 @wcshds
Fixed FullPrecisionSettings type for integers #1163 @laggui
Fixed batchnorm not working properly when training on multiple devices #1167 @wcshds
Fixed powf function in WGPU, with new tests #1193 #1207 @skewballfox @louisfd
Fixed regex in PyTorch Recorder #1196 @antimora
