Although the general README mentions that certain benchmarks require GPU-aware MPI, it's not explicitly mentioned anywhere that pingpong-hip is one of those benchmarks. I have two suggestions:
- Add a README.md file to pingpong-hip (and probably pingpong-cuda too) explicitly mentioning that it requires GPU-aware MPI.
- Add a fallback implementation for main-mpi that doesn't require GPU-aware MPI. If GPU-aware MPI is available, we use it; if it's not, we explicitly copy the data between device and host and pass host pointers to the MPI calls instead of device pointers. If this is of interest, I have a patch for this that I can share.