2022.07.14 Meeting Notes

Philipp Grete edited this page Jul 14, 2022 · 3 revisions

Agenda

  • Individual/group updates
  • PR protocol
  • Developer meeting

Individual/group updates

JM

  • revisiting prolongation in one due to performance issues with using only a single team in a kernel
  • solution will also apply to restriction to reduce code duplication
  • enrolling custom prolongation/restriction operators will be beneficial for face-centered fields and high-order methods

LR

  • working on efficient buffer allocation for sparse variables across ranks and signaling
  • currently there's a performance issue with using IProbe to determine the size of an incoming message in order to decide whether sparse variables need to be allocated
  • tested three different approaches; can switch between optimal memory size and optimal communication
  • finished adding state to ParArray (and making them even more generic), e.g., to allow for a per-variable sparse threshold
  • open new PR with the new sparse packing machinery, including introducing a type to the packing
    • bug #668 still not fixed

GS

  • working on getting Parthenon/Phoebus/Riot into the RFP for a next-gen system
  • need to determine a good benchmarking case
    • potentially both Parthenon and downstream code separately

FG

  • has a mockup test problem for the MHD loop advection test (plain field loop advection)
  • using it to determine the interface changes required (e.g., many interfaces, such as packing, assume cell-centered fields)
  • idea: schedule mini hackathon to talk this through
  • will open PR with current ideas and afterwards send mail to find date for hackathon

PG

  • reviewed PR
  • should keep track in where we

AJ

  • investigating why some ranks take more time during load balancing than others
  • split step into multiple regions: flux correction comm, bound comm, load balance
  • load imbalance (different numbers of blocks per rank) explains a good chunk of the timing differences but not all
  • one more source is that not all ranks/blocks take part in flux correction
  • will present more in-depth data during next meeting and share upfront on Matrix
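
One way to quantify the load imbalance mentioned above is the ratio of the largest per-rank block count to the mean per-rank block count. The following is a minimal sketch only: `BlockImbalance` is a hypothetical helper (not a Parthenon function), and the block counts are assumed to have been gathered from all ranks beforehand.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <vector>

// Hypothetical helper (not part of Parthenon): ratio of the largest per-rank
// block count to the mean per-rank block count.
// A value of 1.0 means every rank holds the same number of blocks.
double BlockImbalance(const std::vector<int> &blocks_per_rank) {
  assert(!blocks_per_rank.empty());
  const int max_blocks =
      *std::max_element(blocks_per_rank.begin(), blocks_per_rank.end());
  const double total =
      std::accumulate(blocks_per_rank.begin(), blocks_per_rank.end(), 0.0);
  const double mean = total / static_cast<double>(blocks_per_rank.size());
  assert(mean > 0.0);  // at least one block must exist somewhere
  return static_cast<double>(max_blocks) / mean;
}
```

For example, four ranks holding 6, 4, 4, and 2 blocks have a mean of 4 blocks, so the ratio is 1.5. A metric like this only captures the block-count part of the timing differences; it says nothing about ranks that do not take part in flux correction.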

PR protocol

  • it currently takes a "long" time for PRs to go through
  • target: automate more of the review process, specifically include downstream codes in CI for interface and performance regression testing
  • PG will tackle the latter after reviewing sparse comm PR
  • also, small "administrative" PRs can go through without waiting for a separate downstream code developer review

Developer meeting

  • Added more people to list of people to invite.
    • JM will contact Charles and Ben
    • PG will contact people at MSU and Princeton
    • AJ will check with people at CMU
  • Foreign nationals need about two months of lead time for a visit. Everyone else needs about one.

Next meeting in two weeks, on 28 Jul
