0.43.1: Improved CUDA setup/diagnostics + 8-bit serialization, CUDA 12.4 support, docs enhancements

@Titus-von-Koeller released this 11 Apr 18:36

Improvements:

  • Improved the serialization format for 8-bit weights; this change is fully backwards compatible. (#1164, thanks to @younesbelkada for the contributions and @akx for the review).
  • Added CUDA 12.4 support to the Linux x86-64 build workflow, extending the library's compatibility to CUDA 12.4. (#1171, kudos to @matthewdouglas for this addition).
  • Docs enhancement: Improved the instructions for installing the library from source. (#1149, special thanks to @stevhliu for the enhancements).

Bug Fixes:

  • Fixed 4-bit quantization with blocksize = 4096, which previously triggered an illegal memory access. (#1160, thanks to @matthewdouglas for the fix and @YLGH for the report).
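
For readers unfamiliar with what blocksize controls, here is a minimal pure-Python sketch of blockwise absmax quantization. It is only an illustration of the general idea, not the library's CUDA kernel: the function names `quantize_blockwise` and `dequantize_blockwise` are hypothetical, and the sketch assumes evenly spaced signed 4-bit levels (-7 to 7) rather than the codebooks the library actually uses.

```python
from typing import List, Sequence, Tuple


def quantize_blockwise(
    values: Sequence[float], blocksize: int = 4096
) -> Tuple[List[int], List[float]]:
    """Quantize values to integer codes in [-7, 7], one absmax scale per block.

    Hypothetical helper for illustration; not part of any library API.
    """
    if blocksize <= 0:
        raise ValueError("blocksize must be positive")
    codes: List[int] = []
    absmax: List[float] = []
    for start in range(0, len(values), blocksize):
        block = values[start:start + blocksize]
        # Each block stores its own scale: the largest magnitude it contains.
        scale = max(abs(v) for v in block)
        absmax.append(scale)
        if scale == 0.0:
            codes.extend([0] * len(block))
        else:
            # Map [-scale, scale] onto the 15 evenly spaced levels -7..7.
            codes.extend(round(v / scale * 7) for v in block)
    return codes, absmax


def dequantize_blockwise(
    codes: Sequence[int], absmax: Sequence[float], blocksize: int = 4096
) -> List[float]:
    """Invert quantize_blockwise using the per-block scales."""
    return [c / 7 * absmax[i // blocksize] for i, c in enumerate(codes)]


if __name__ == "__main__":
    data = [i / 100 - 50 for i in range(10000)]
    codes, scales = quantize_blockwise(data, blocksize=4096)
    restored = dequantize_blockwise(codes, scales, blocksize=4096)
    print(len(scales), "blocks;", len(restored), "values restored")
```

A larger blocksize means fewer stored scales per tensor, at the cost of each scale covering a wider range of values.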

Internal Improvements: