
Version 23.1.0.0

@BobbyRBruce BobbyRBruce released this 28 Dec 21:12
· 9 commits to stable since this release
bae3487

gem5 Version 23.1 is our first release where the development has been on GitHub.
During this release, 362 pull requests were merged, comprising 416 commits from 51 unique contributors.

Significant API and user-facing changes

The gem5 build is now configured with kconfig

  • Most gem5 builds without customized options (excluding double dash options), e.g. build/X86/gem5.opt, are backwards compatible and require no changes to your current workflows.
  • All of the default builds in build_opts are unchanged and still available.
  • However, if you want to specialize your build (for example, to use a customized Ruby protocol), the command scons PROTOCOL=<PROTOCOL_NAME> build/ALL/gem5.opt will no longer work. You now have to use scons <kconfig command> to update the Ruby protocol, for example. The double dash options (--without-tcmalloc, --with-asan, and so on) continue to work as normal.
  • For more details refer to the documentation here: kconfig documentation

Standard library improvements

WorkloadResource added to resource specialization

  • The Workload and CustomWorkload classes are now deprecated. They have been transformed into wrappers for the obtain_resource function and the WorkloadResource class in resource.py, respectively.
  • Code utilizing the older API will continue to function as expected but will trigger a warning message. To update code using the Workload class, change the call from Workload(id='resource_id', resource_version='1.0.0') to obtain_resource(id='resource_id', resource_version='1.0.0'). Similarly, to update code using the CustomWorkload class, change the call from CustomWorkload(function=func, parameters=params) to WorkloadResource(function=func, parameters=params).
  • Workload resources in gem5 can now be directly acquired using the obtain_resource function, just like other resources.
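
The two renames above are mechanical, so they can be applied to existing scripts with a small text rewrite. The following is a minimal sketch, not a gem5-provided tool: the helper name migrate_calls is hypothetical, it only rewrites call sites, and it assumes the old names are used unqualified (import lines still need updating by hand).

```python
import re

# Old call name -> new call name, as listed in the notes above.
RENAMES = {
    "Workload": "obtain_resource",
    "CustomWorkload": "WorkloadResource",
}

# Match an old name only when it is a whole identifier followed by "(".
# The lookbehind skips longer identifiers and attribute accesses (foo.Workload).
_CALL = re.compile(r"(?<![\w.])(CustomWorkload|Workload)(?=\s*\()")


def migrate_calls(source: str) -> str:
    """Rewrite deprecated workload calls in the text of a script (hypothetical helper)."""
    return _CALL.sub(lambda m: RENAMES[m.group(1)], source)


old_line = "workload = Workload(id='resource_id', resource_version='1.0.0')"
print(migrate_calls(old_line))
```

Running the helper a second time leaves the text unchanged, because WorkloadResource( and obtain_resource( do not match the pattern.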

Introducing Suites

Suites are a new category of resource introduced in gem5. Documentation of suites can be found here: suite documentation.

Other API changes

  • All resource objects now have their own id and category. Each resource class has its own __str__() function, which returns its information in the form category(id, version), e.g. BinaryResource(id='riscv-hello', resource_version='1.0.0').
  • Users can use the GEM5_RESOURCE_JSON and GEM5_RESOURCE_JSON_APPEND environment variables to, respectively, overwrite all the data sources with the provided JSON or append a JSON file to all the data sources. More information can be found here.
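
As an illustration of the two environment variables, the sketch below builds an environment dictionary that could be handed to a process launching gem5. Only the variable names come from these notes. The helper gem5_resource_env, the file name my-resources.json, and the choice to pass an absolute path are assumptions made for the example.

```python
import os


def gem5_resource_env(json_path, append=False, base_env=None):
    """Return a copy of the environment with one gem5 resource variable set.

    append=False sets GEM5_RESOURCE_JSON (overwrite all data sources);
    append=True sets GEM5_RESOURCE_JSON_APPEND (append to the data sources).
    This helper is hypothetical and not part of gem5.
    """
    env = dict(os.environ if base_env is None else base_env)
    name = "GEM5_RESOURCE_JSON_APPEND" if append else "GEM5_RESOURCE_JSON"
    # Assumption: an absolute path avoids surprises if the working directory changes.
    env[name] = os.path.abspath(json_path)
    return env


# Hypothetical file name, for illustration only.
env = gem5_resource_env("my-resources.json", append=True)
print(env["GEM5_RESOURCE_JSON_APPEND"])
```

The resulting dictionary can be passed as the env argument of subprocess.run when starting a gem5 simulation from a wrapper script.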

Other user-facing changes

  • Added support for clang 15 and clang 16
  • gem5 no longer supports building on Ubuntu 18.04
  • GCC 7, GCC 9, and clang 6 are no longer supported
  • Two DRAMInterface stats, bytesRead and bytesWritten (for instance, board.memory.mem_ctrl.dram.bytesRead and board.memory.mem_ctrl.dram.bytesWritten), have been renamed to dramBytesRead and dramBytesWritten so that they no longer collide with the stats of the same name in AbstractMemory.
  • The corresponding NVMInterface stats (bytesRead and bytesWritten) have likewise been renamed to nvmBytesRead and nvmBytesWritten.
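
Post-processing scripts that read these counters need to follow the rename. The sketch below shows one way to look up the new name first and fall back to the old one, assuming stats are stored as plain "name value # description" lines. The helper names and the sample values are made up for illustration.

```python
def parse_stats(text):
    """Parse scalar `name value # description` lines into a dict of floats."""
    stats = {}
    for line in text.splitlines():
        fields = line.split("#", 1)[0].split()
        if len(fields) < 2:
            continue
        try:
            stats[fields[0]] = float(fields[1])
        except ValueError:
            continue  # banner lines and anything else that is not a scalar
    return stats


def interface_bytes(stats, interface_path, kind, old_name):
    """Look up a renamed byte counter, falling back to the pre-23.1 name.

    kind is "dram" or "nvm"; old_name is "bytesRead" or "bytesWritten".
    """
    new_name = kind + old_name[0].upper() + old_name[1:]
    for name in (new_name, old_name):
        key = f"{interface_path}.{name}"
        if key in stats:
            return stats[key]
    raise KeyError(f"no {new_name} or {old_name} under {interface_path}")


sample = """\
---------- Begin Simulation Statistics ----------
board.memory.mem_ctrl.dram.dramBytesRead         4096   # made-up value
board.memory.mem_ctrl.dram.dramBytesWritten      1024   # made-up value
---------- End Simulation Statistics   ----------
"""
stats = parse_stats(sample)
print(interface_bytes(stats, "board.memory.mem_ctrl.dram", "dram", "bytesRead"))
```

The fallback is only meant for reading output from older releases. There, the old name may refer to the colliding AbstractMemory stat, so values obtained through it should be treated with care.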

Full-system GPU model improvements

  • Support for ROCm versions up to the latest, 5.7.1.
  • Various changes to enable PyTorch/TensorFlow simulations.
  • New packer disk image script containing ROCm 5.4.2, PyTorch 2.0.1, and TensorFlow 2.11.
  • GPU instructions can now perform atomics on host addresses.
  • The provided config scripts can now run KVM on more restrictive setups.
  • Added support to checkpoint and restore between kernels in GPUFS, including adding various AQL, HSA Queue, VMID map, MQD attributes, GART translations, and PM4Queues to GPU checkpoints.
  • Moved GPU cache recorder code to RubyPort instead of Sequencer/GPUCoalescer to allow checkpointing to occur.
  • Added support for flushing GPU caches, as well as cache cooldown/warmup support, for checkpoints.
  • Updated vega10_kvm.py to add checkpointing instructions.

SE mode GPU model improvements

  • Started adding support for mmap'ing inputs for GPUSE tests, which reduces their runtime by 8-15% per run

GPU model improvements

  • Update GPU VIPER and Coalescer support to ensure correct replacement policy behavior when multiple requests from the same CU are concurrently accessing the same line
  • Fix bug with GPU VIPER to resolve a race conflict for loads that bypass the TCP (L1D$)
  • Fix bug with MRU replacement policy updates in GPU SQC (I$)
  • Update GPU and Ruby debug prints to resolve various small errors
  • Add configurable number of banks for the GPU L1 and L2, and configurable L2 latencies
  • Add decodings for new MI100 VOP2 insts
  • Add GPU GLC Atomic Resource Constraints to better model how atomic resources are shared at GPU TCC (L2$)
  • Update GPU tester to work with both requests that bypass all caches (SLC) and requests that bypass only the TCP (L1D$)
  • Fixes for how write mask works for GPU WB L2 caches
  • Added support for WB and WT GPU atomics
  • Added configurable support to better model the latency of GPU atomic requests
  • Fix the GPU's default number of HW barriers per CU to better model the amount of concurrency GPU CUs should have

RISC-V RVV 1.0 implemented

This was a huge undertaking by a large number of people!
Some of these people include Adrià Armejach, who pushed it over the finish line; Xuan Hu, who pushed the most recent version to Gerrit that Adrià picked up;
Jerin Joy, who did much of the initial work; and many others who contributed to the implementation, including Roger Chang and Hoa Nguyen, who put significant effort into testing and reviewing the code.

  • Most of the instructions in the 1.0 spec are implemented
  • Works with both FS and SE mode
  • Compatible with the Simple, O3, and Minor CPU models
  • Users can specify the width of the vector units
  • Future improvements
    • Widening/narrowing instructions are not implemented
    • The model for executing memory instructions is not very high performance
    • The statistics are not correct for counting vector instruction execution
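
For readers unfamiliar with what the vector unit width controls, the sketch below works through the relation from the RVV 1.0 specification between the vector register width (VLEN), the selected element width (SEW), the register grouping factor (LMUL), and the maximum number of elements one instruction can process (VLMAX). The function name and parameter names are chosen for this example and are not gem5 configuration parameters.

```python
from fractions import Fraction


def vlmax(vlen_bits, sew_bits, lmul=1):
    """VLMAX = LMUL * VLEN / SEW, per the RVV 1.0 specification.

    lmul may be fractional (e.g. Fraction(1, 2)) or an integer up to 8.
    """
    return int(Fraction(vlen_bits, sew_bits) * Fraction(lmul))


# With 256-bit vector registers and 32-bit elements, one register holds 8 elements.
print(vlmax(256, 32))
# Grouping 8 registers of 128 bits with 8-bit elements gives 128 elements.
print(vlmax(128, 8, lmul=8))
```

A wider vector unit therefore raises VLMAX for every element width, which changes how many loop iterations a vectorized workload needs.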

ArmISA changes/improvements

  • Architectural support for the following extensions:
    • FEAT_TLBIRANGE
    • FEAT_FGT
    • FEAT_TCR2
    • FEAT_SCTLR2

Other notable changes/improvements

  • Improvements to the CHI coherence protocol implementation
  • Far atomics implemented in CHI
  • Ruby now supports using the prefetchers from the classic caches, if the protocol supports it. CHI has been extended to support the classic prefetchers.
  • Fixed a bug in the RISC-V TLB so that misses and hits are counted correctly
  • Added new RISC-V Zcb instructions #399
  • RISC-V can now use a separate binary for the bootloader and kernel in FS mode
  • DRAMSys integration updated to latest DRAMSys version (5.0)
  • Improved support for RISC-V privilege modes
  • Fixed bug in switching CPUs with RISC-V
  • CPU branch predictor refactoring to prepare for decoupled front end support
  • Perf is now optional when using the KVM CPU model
  • Improvements to the gem5-SST bridge including updating to SST 13.0
  • Improved formatting of documentation in stdlib
  • The style checker now uses isort for Python imports by default
  • Many, many testing improvements during the migration to GitHub Actions
  • Fixed the elastic trace replaying logic (TraceCPU)

Known Bugs/Issues