2022.01.27 Meeting Notes

Philipp Grete edited this page Jan 27, 2022 · 2 revisions

Agenda

  • Individual/group updates
  • Paper
  • Kokkos 3.5
  • Load balancing/CMU collaboration
  • Review non-WIP PRs

Individual/group updates

JM:

  • Getting reducers finalized and merged.
  • Added content to paper

BR:

  • Added "sorted by particle" capability
  • Added MC figure to paper

JD:

  • Fixed sparse id map
  • Talked to Luke Roberts, who will take on work on Sparse

FG:

  • Downstream dev in AthenaPK trying to get AGN picture for paper

Jim:

  • Downstream app development

PG:

  • Reviewed PRs
  • Finalized paper draft, new section on perf port.
  • Failed with SYCL and GPUs
  • Got access to Spock to collect AMD GPU numbers
  • Got access to Ookami (a64fx) system to collect more numbers
  • Reviewed sparse and fixed final pieces. Now merged

Paper

  • Author list: rearrange by commits following PG
  • Add acknowledgement of Athena++ team, esp. Kengo and Kyle
  • Need: in design goal mention implicit knobs in abstractions for specific hw. JM will add.
  • Plan: have final numbers and text by end of next week. Share final draft for review until the next meeting. Then submit.

Kokkos 3.5

  • Need separate PR so that JD can test downstream.
  • Otherwise, no objections for bump.
  • Target new release (with updated Kokkos, extended build support, and sparse in place)

Load balancing

  • CMU Parallel Data Lab (George Amvrosiadis, Chuck Cranor, Ankush Jain), which has an existing collaboration with LANL

  • Recently got interested in AMR - specifically task placement on nodes.

  • Key questions:

    • What do typical application loads look like?
    • What are popular AMR decks to play with?
    • What bottlenecks have we seen in practice, and how can we reproduce them?
  • What is meant by load balancing? Both

    • different computational load per block/rank, as well as
    • load imbalance due to suboptimal comm. patterns
  • Talked about

    • Blast wave test, RT, or KH tests
    • MeshBlockPacking and buffer filling (async kernel launches with fences)
    • Particles (which may introduce imbalance in per block cost [and comm])
    • Influence of MeshBlock sizes (smaller are harder -- reasonable limit right now prob. at 16^3, but 8^3 are of interest, too)
  • Next steps:

    • We'll share input decks for typical problems
    • CMU will join future meetings
    • In between, use Matrix for questions/quick comm.

Review non-WIP PRs

  • PG now really takes care of https://github.com/lanl/parthenon/pull/613
