CCE Offloading Models and Porting Applications to GPUs

Presenter: Alfio Lazzaro (HPE)

Archived materials on LUMI:

Slides: /appl/local/training/2p3day-20250303/files/LUMI-2p3day-20250303-304-Offload_CCE.pdf
Recording: /appl/local/training/2p3day-20250303/recordings/304-Offload_CCE.mp4

These materials can only be distributed to actual users of LUMI (active user account).

Q&A

If I have a complex struct in C++, how can I map it to the GPU with OpenMP?
```
struct Data {
  subStruct Vc;
  double **array_of_array;
  double max_mach;
  double ...
  double ...
};
```
How can I map Data?
- (Alfio) In OpenMP you can use mapper identifiers and mapper modifiers. See this example
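As a rough illustration of the mapper approach, the sketch below uses a hypothetical, reduced struct `Field` (one pointer member plus its length), not the full `Data` struct from the question. The function name `scale_and_sum` is likewise an assumption made for this example. It relies on `declare mapper` from OpenMP 5.0, so it needs a compiler that supports that directive. Without OpenMP enabled the pragmas are ignored and the loop runs on the host.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified stand-in for the struct in the question:
// a single pointer member, its length, and one scalar.
struct Field {
    double *values;   // points to n doubles owned by the caller
    int n;
    double max_mach;
};

// User-defined default mapper (OpenMP 5.0): whenever a Field appears in a
// map clause, map the struct itself and the n doubles behind 'values'.
#pragma omp declare mapper(Field f) map(f, f.values[0:f.n])

// Scales every value on the device (or on the host as a fallback) and
// returns the sum of the scaled values.
double scale_and_sum(Field field, double factor) {
    assert(field.values != nullptr && field.n > 0);
    double sum = 0.0;
    // map(tofrom: field) triggers the mapper above, so the pointed-to
    // array is copied in and out together with the struct.
    #pragma omp target teams distribute parallel for \
        map(tofrom: field) map(tofrom: sum) reduction(+ : sum)
    for (int i = 0; i < field.n; ++i) {
        field.values[i] *= factor;
        sum += field.values[i];
    }
    return sum;
}
```

A nested member such as `double **array_of_array` needs more than this one-line mapper, because each row is a separate allocation. That is one reason managed or unified memory is often considered for such structures.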
Can managed memory help in this case (without a deep copy)?
- Yes, indeed (note that managed memory is what C++ parallel frameworks use, e.g. SYCL)
It would be helpful if he could show an example.
- See this example
- This is an example that might need unified memory to work. And it will likely only be truly efficient on something like an MI300A.
Sorry for the repetition, but can you clarify the differences between managed and unified memory?
- (Alfio) Unified memory is the OpenMP-side term (also known as Unified Shared Memory, USM), which uses managed memory in HIP (see hipMallocManaged in "HIP Runtime API Reference: Managed Memory")
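A minimal sketch of the unified shared memory route, assuming a compiler and runtime that honour `requires unified_shared_memory`. The struct is a reduced, hypothetical version of the one in the question: `nrows`, `ncols` and the function `sum_all` are names introduced only for this illustration. No deep copy is written. The device dereferences the host pointers directly.

```cpp
#include <cassert>
#include <vector>

// Request unified shared memory: host pointers are then valid on the
// device, so the pointed-to data needs no map clauses.
#pragma omp requires unified_shared_memory

// Hypothetical reduced version of the struct from the question.
struct Data {
    double **array_of_array;  // nrows pointers, each to ncols doubles
    int nrows;                // assumed field, not in the original struct
    int ncols;                // assumed field, not in the original struct
    double max_mach;
};

// Sums all entries of the nested array inside a target region.
double sum_all(const Data &d) {
    assert(d.array_of_array != nullptr);
    double **rows = d.array_of_array;  // plain host pointer, used as-is
    const int nrows = d.nrows;
    const int ncols = d.ncols;
    double sum = 0.0;
    // Only the reduction scalar is mapped explicitly; the rows are
    // reached through unified memory.
    #pragma omp target teams distribute parallel for collapse(2) \
        map(tofrom: sum) reduction(+ : sum)
    for (int i = 0; i < nrows; ++i)
        for (int j = 0; j < ncols; ++j)
            sum += rows[i][j];
    return sum;
}
```

On MI250X GPUs such as those in LUMI, this kind of code typically also needs `HSA_XNACK=1` set in the environment at run time. Page migration can make it slower than explicit mapping.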