CCE Offloading Models and Porting Applications to GPUs
Presenter: Alfio Lazzaro (HPE)
Archived materials on LUMI:
- Slides: /appl/local/training/2p3day-20250303/files/LUMI-2p3day-20250303-304-Offload_CCE.pdf
- Recording: /appl/local/training/2p3day-20250303/recordings/304-Offload_CCE.mp4
These materials can only be distributed to actual users of LUMI (active user account).
Q&A
- If I have a complex struct in C++, how can I map it to the GPU with OpenMP? For example, how can I map Data?
struct Data { subStruct Vc; double **array_of_array; double max_mach; double ... double ... }
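- (Alfio) In OpenMP you can define a custom mapper with the declare mapper directive (which introduces a mapper identifier) and apply it through the mapper modifier of the map clause. See this example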
Can managed memory help in this case (without a deep copy)?
- Yes, indeed (note that managed memory is what C++ parallel frameworks such as SYCL use)
It would be helpful if he could show an example.
- This is an example that might need unified memory to work, and it will likely only be truly efficient on something like an MI300A.
- Sorry for the repetition, but can you clarify the differences between managed and unified memory?
- (Alfio) Unified memory is the OpenMP term (also known as Unified Shared Memory, USM); it uses managed memory in HIP (see hipMallocManaged in "HIP Runtime API Reference: Managed Memory").