v0.5.0 - Enchanting Elderberry

@psalz psalz released this 21 Dec 14:36

Right on time for the holidays, we bring you a new major release with several new features, quality-of-life improvements, and debugging facilities.

Thanks to everybody who contributed to this release: @fknorr, @GagaLP, @PeterTh, @psalz!

HIGHLIGHTS

  • The distr_queue::fence and buffer_snapshot APIs introduced in Celerity 0.4.0 are now stable (#225).
  • In some situations it may be necessary to prevent kernels from being split in a certain way (for example to prevent overlapping writes); this can now be achieved using the new experimental::constrain_split API (#212).
  • Speaking of splits, the new experimental::hint API can be used to control how a kernel is split across worker nodes (#227).
  • Celerity now warns at runtime when a task declares reads from uninitialized buffers or writes with overlapping ranges between nodes (#224).
  • The accessor out-of-bounds detection first introduced in Celerity 0.4.0 now also supports host tasks (#211).
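
To illustrate what constraining a split means, here is a minimal standalone sketch in plain C++. It does not use Celerity and is not how the runtime implements experimental::constrain_split; the type chunk_1d and the function split_constrained are hypothetical names chosen for this example. It assumes a one-dimensional range that may only be cut at multiples of a given constraint, so that no two chunks share a block.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper type (not part of Celerity): a contiguous 1D chunk.
struct chunk_1d {
    std::size_t offset;
    std::size_t size;
};

// Conceptual model only: split `total` items into at most `num_chunks` chunks
// so that every chunk starts at a multiple of `constraint`.
std::vector<chunk_1d> split_constrained(std::size_t total, std::size_t num_chunks, std::size_t constraint) {
    assert(num_chunks > 0 && constraint > 0);
    // Number of indivisible blocks; the last block may be partial.
    const std::size_t blocks = (total + constraint - 1) / constraint;
    // There cannot be more chunks than blocks.
    const std::size_t chunks = std::min(num_chunks, blocks);
    std::vector<chunk_1d> result;
    std::size_t offset = 0;
    for(std::size_t i = 0; i < chunks; ++i) {
        // Distribute blocks as evenly as possible; earlier chunks take the remainder.
        const std::size_t n_blocks = blocks / chunks + (i < blocks % chunks ? 1 : 0);
        // Clip the final chunk to the end of the range.
        const std::size_t size = std::min(n_blocks * constraint, total - offset);
        result.push_back({offset, size});
        offset += size;
    }
    return result;
}
```

With this model, splitting 100 items for 3 workers under a constraint of 32 yields chunks starting at 0, 64 and 96, so a kernel that writes whole 32-item blocks never has a block shared between two chunks.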

Changelog

We recommend using the following SYCL versions with this release:

  • DPC++: 61e51015 or newer
  • hipSYCL: d2bd9fc7 or newer

Added

  • Add new environment variable CELERITY_PRINT_GRAPHS to control whether task and command graphs are logged (#197, #236)
  • Introduce new experimental for_each_item utility to iterate over a celerity range (#199)
  • Add new environment variables CELERITY_HORIZON_STEP and CELERITY_HORIZON_MAX_PARALLELISM to control Horizon generation (#199)
  • Add support for out-of-bounds checking for host accessors (also enabled via CELERITY_ACCESSOR_BOUNDARY_CHECK) (#211)
  • Add new debug::set_task_name utility for naming tasks to aid debugging (#213)
  • Add new experimental::constrain_split API to limit how a kernel can be split (#212)
  • Add GDB pretty-printers for common Celerity types (#207)
  • distr_queue::fence and buffer_snapshot are now stable, subsuming the experimental:: APIs of the same name (#225)
  • Celerity now warns at runtime when a task declares reads from uninitialized buffers or writes with overlapping ranges between nodes (#224)
  • Introduce new experimental::hint API for providing the runtime with additional information on how to execute a task (#227)
  • Introduce new experimental::hints::split_1d and experimental::hints::split_2d task hints for controlling how a task is split into chunks (#227)
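
The difference between a one-dimensional and a two-dimensional split can be pictured with the following standalone sketch. It is a conceptual model in plain C++, not Celerity code: box_2d, split_grid, model_split_1d and model_split_2d are hypothetical names, and the choice of a near-square grid for the 2D case is an assumption made for this illustration rather than a description of the runtime's actual strategy.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper type (not part of Celerity): a 2D box given by offset and size.
struct box_2d {
    std::size_t off0, off1, size0, size1;
};

// Cut a 2D range of ext0 x ext1 items into a parts0 x parts1 grid of boxes.
std::vector<box_2d> split_grid(std::size_t ext0, std::size_t ext1, std::size_t parts0, std::size_t parts1) {
    assert(parts0 > 0 && parts1 > 0);
    std::vector<box_2d> boxes;
    for(std::size_t i = 0; i < parts0; ++i) {
        for(std::size_t j = 0; j < parts1; ++j) {
            // Integer arithmetic spreads any remainder across the parts.
            const std::size_t b0 = ext0 * i / parts0, e0 = ext0 * (i + 1) / parts0;
            const std::size_t b1 = ext1 * j / parts1, e1 = ext1 * (j + 1) / parts1;
            boxes.push_back({b0, b1, e0 - b0, e1 - b1});
        }
    }
    return boxes;
}

// Model of a 1D split: cut along dimension 0 only, producing full-width slabs.
std::vector<box_2d> model_split_1d(std::size_t ext0, std::size_t ext1, std::size_t n) {
    return split_grid(ext0, ext1, n, 1);
}

// Model of a 2D split: arrange n chunks in the most square-like f x (n / f) grid.
std::vector<box_2d> model_split_2d(std::size_t ext0, std::size_t ext1, std::size_t n) {
    assert(n > 0);
    std::size_t f = 1;
    for(std::size_t k = 1; k * k <= n; ++k) {
        if(n % k == 0) f = k; // largest factor of n not exceeding sqrt(n)
    }
    return split_grid(ext0, ext1, f, n / f);
}
```

For an 8 x 8 range and four chunks, the 1D model produces four 2 x 8 slabs while the 2D model produces four 4 x 4 tiles. Tiles have a shorter boundary with their neighbours, which is one reason a 2D split may be preferable for kernels that read adjacent cells.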

Changed

  • Horizons can now also be triggered by graph breadth. This improves performance in some scenarios, and prevents programs with many independent tasks from running out of task queue space (#199)

Fixed

  • In edge cases, command graph generation would fail to generate await-push commands when re-distributing reduction results (#223)
  • Command graph generation was missing an anti-dependency between push-commands of partial reduction results and the final reduction command (#223)
  • Don't create multiple smaller push-commands instead of a single large one in some rare situations (#229)
  • Unit tests that inspect logs contained a race that would cause spurious failures (#234)

Internal

  • Improve command graph testing infrastructure (#198)
  • Overhaul internal grid region and box representation, remove AllScale dependency (#204)