2024.04.11 Meeting Notes

Tentative agenda

  • Individual/group updates
  • MPI/Slingshot issue
  • (Additional) ideas for Hackathon
  • Review non-WIP PRs

Updates

JM

  • I/O update for non-cell-centered variables is done!
    • includes cleanup/refactoring
  • Added support for coordinate fields (threaded through to XDMF)
    • supports arbitrary coordinate transformations in VisIt and ParaView
    • should be useful for GR and Lagrangian codes

BP

  • Discovered that our current single-threaded task list can cause issues because threads can currently die silently
  • Now investigating how to fix this so that the code can die gracefully -- more complex than initially anticipated (see the sketch after this list)
    • the current design potentially complicates things, as there is no main thread and threads queue their own tasks
    • similarly, tasks rely on returning (which makes a simple check for futures complicated)
    • a quick fix is to keep the check on void and TaskStatus futures (i.e., not fully supporting arbitrary return types)
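
A minimal sketch (a standalone example, not the actual task list code) of one way to make worker-thread failures visible: capture exceptions via std::exception_ptr and rethrow after join, so the code dies loudly instead of silently.

```cpp
#include <exception>
#include <iostream>
#include <mutex>
#include <stdexcept>
#include <thread>
#include <vector>

// Hypothetical illustration: each worker stores its first exception instead
// of dying silently; after joining, the caller rethrows so the whole code
// aborts visibly and cleanly.
int main() {
  std::vector<std::thread> workers;
  std::exception_ptr first_error = nullptr;
  std::mutex error_mutex;

  for (int t = 0; t < 4; ++t) {
    workers.emplace_back([&, t] {
      try {
        if (t == 2) throw std::runtime_error("task failed on worker 2");
        // ... execute queued tasks ...
      } catch (...) {
        std::lock_guard<std::mutex> lock(error_mutex);
        if (!first_error) first_error = std::current_exception();
      }
    });
  }
  for (auto &w : workers) w.join();

  // Rethrow only after all threads have joined -> graceful, visible failure.
  if (first_error) std::rethrow_exception(first_error);
  std::cout << "all tasks completed\n";
  return 0;
}
```

The wrinkle noted above still applies: with no main thread and threads queueing their own tasks, there is no single natural place to do the final rethrow.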

PM

  • SwarmPacks work and the interface seems reasonable
  • Basic design info right now:
    • particles only work with the base register
    • still some code duplication (so they could potentially be merged with SparsePacks in the long run)
    • pack caching is currently disabled/not working, as particle views can be resized internally
  • Discovered multi-d par_for inner and added functionality for reductions (see the sketch after this list)
    • will share with BP and then upstream
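
For context, a sketch in plain Kokkos (assumed here; this is not Parthenon's actual par_for_inner API) of a multi-dimensional inner loop with a reduction, flattening the 2D (j,i) inner index range over a single TeamThreadRange:

```cpp
#include <Kokkos_Core.hpp>

int main(int argc, char *argv[]) {
  Kokkos::initialize(argc, argv);
  {
    const int nb = 8, nj = 16, ni = 32; // blocks x inner 2D extent
    Kokkos::View<double ***> u("u", nb, nj, ni);
    Kokkos::deep_copy(u, 1.0);

    double total = 0.0;
    using team_policy = Kokkos::TeamPolicy<>;
    Kokkos::parallel_reduce(
        "multi-d inner reduction", team_policy(nb, Kokkos::AUTO),
        KOKKOS_LAMBDA(const team_policy::member_type &team, double &lsum) {
          const int b = team.league_rank();
          double block_sum = 0.0;
          // Flatten the 2D (j,i) inner loop into one TeamThreadRange index.
          Kokkos::parallel_reduce(
              Kokkos::TeamThreadRange(team, nj * ni),
              [&](const int idx, double &inner) {
                const int j = idx / ni;
                const int i = idx % ni;
                inner += u(b, j, i);
              },
              block_sum);
          // Only one contribution per team to the outer reduction.
          Kokkos::single(Kokkos::PerTeam(team), [&] { lsum += block_sum; });
        },
        total);
    // total == nb * nj * ni
  }
  Kokkos::finalize();
  return 0;
}
```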

LR

  • Forest got merged
  • Waiting for review on multiple PRs
  • Unified flux correction (fluxes are now their own variables)
  • Updating logical boundary communication to support multiple trees
  • Investigating logical/topological coordinate transformations between trees, as i,j,k indices do not need to be aligned (see the sketch after this list)
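
To illustrate the kind of transformation involved (a hypothetical helper, not the actual Forest implementation): a neighboring tree's i,j,k indices can be related to the local ones by an axis permutation plus a per-axis reflection.

```cpp
#include <array>
#include <cstdio>

// Hypothetical sketch: map a logical index from one tree to a neighboring
// tree whose axes may be permuted and/or reversed relative to the local tree.
struct LogicalTransform {
  std::array<int, 3> axis;    // which local axis feeds each neighbor axis
  std::array<bool, 3> flip;   // whether that axis runs in the opposite direction
  std::array<int, 3> extent;  // index extent of the neighbor tree per axis

  std::array<int, 3> operator()(const std::array<int, 3> &local) const {
    std::array<int, 3> out;
    for (int d = 0; d < 3; ++d) {
      const int v = local[axis[d]];
      out[d] = flip[d] ? extent[d] - 1 - v : v;
    }
    return out;
  }
};

int main() {
  // Example: neighbor's i-axis is our j-axis reversed; its j-axis is our i-axis.
  LogicalTransform t{{1, 0, 2}, {true, false, false}, {16, 16, 16}};
  auto n = t({3, 5, 7});
  std::printf("(3,5,7) -> (%d,%d,%d)\n", n[0], n[1], n[2]); // (10,3,7)
  return 0;
}
```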

PG

  • Fighting with OpenPMD restarts (attributes and metadata work; reading blocks is still missing)
  • Encountered an issue with h5py when trying to copy the Info group
    • the code just segfaults, unclear why; not going to investigate further, as copying the contents of that group (e.g., Attributes) works fine

MPI/Slingshot issue

  • GS found an issue on the Slingshot interconnect (likely related to the number of concurrent sends hitting an internal MPI library limit); one possible mitigation is sketched after this list
    • currently unclear whether it is a real issue or something in the library
  • Slightly related: JM is in contact with a local postdoc looking at an alternative MPI approach (for dense variables)
  • We'll wait for additional info from Cray/HPE
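
For reference, a minimal sketch (hypothetical code, not our actual communication layer; the cap value is made up) of capping the number of in-flight nonblocking sends so an internal library limit is never exceeded:

```cpp
#include <mpi.h>
#include <vector>

// Hypothetical sketch: issue at most max_inflight MPI_Isend's at a time,
// waiting for one to complete before posting the next.
void send_throttled(const std::vector<std::vector<double>> &messages, int dest,
                    MPI_Comm comm, int max_inflight) {
  std::vector<MPI_Request> reqs;
  for (std::size_t m = 0; m < messages.size(); ++m) {
    if (static_cast<int>(reqs.size()) == max_inflight) {
      int done = 0;
      MPI_Waitany(static_cast<int>(reqs.size()), reqs.data(), &done,
                  MPI_STATUS_IGNORE);
      reqs.erase(reqs.begin() + done);
    }
    reqs.emplace_back();
    MPI_Isend(messages[m].data(), static_cast<int>(messages[m].size()),
              MPI_DOUBLE, dest, static_cast<int>(m), comm, &reqs.back());
  }
  MPI_Waitall(static_cast<int>(reqs.size()), reqs.data(), MPI_STATUSES_IGNORE);
}
```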

(Additional) ideas for Hackathon

  • Best-practices performance doc
    • default pack sizes and loop patterns
  • Investigate performance of loop patterns on various platforms (and new par_for patterns, e.g., MDRange in hierarchical parallelism versus index splitting for vectorization)
  • Related to the MPI/Slingshot issue: combine messages between ranks (see the sketch after this list)
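
A sketch of the message-combining idea (hypothetical buffer layout; receiver-side unpacking omitted): concatenate all per-variable payloads destined for the same rank and issue a single send, so the number of concurrent messages scales with the number of neighbor ranks rather than the number of variables.

```cpp
#include <mpi.h>
#include <vector>

// Hypothetical sketch: instead of one MPI_Isend per variable, concatenate all
// payloads for a given destination rank and send them as a single message.
// The buffer must remain valid until the request completes, hence it is
// owned by the caller and passed in by reference.
void send_combined(const std::vector<std::vector<double>> &payloads, int dest,
                   int tag, MPI_Comm comm, MPI_Request *req,
                   std::vector<double> &buffer) {
  buffer.clear();
  for (const auto &p : payloads)
    buffer.insert(buffer.end(), p.begin(), p.end());
  // One send per destination rank, regardless of the number of variables;
  // the receiver unpacks using the (known) per-variable sizes.
  MPI_Isend(buffer.data(), static_cast<int>(buffer.size()), MPI_DOUBLE, dest,
            tag, comm, req);
}
```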

Tentative next meeting: 25 Apr
