LNCS sublibrary. SL 2, Programming and software engineering
Volume Designation
11381
GENERAL NOTES
Text of Note
Includes author index.
Text of Note
International conference proceedings.
CONTENTS NOTE
Text of Note
Intro; 2018: 5th Workshop on Accelerator Programming Using Directives (WACCPD) http://waccpd.org/; Organization; Contents; Applications; Heterogeneous Programming and Optimization of Gyrokinetic Toroidal Code Using Directives; Abstract; 1 Introduction; 2 Simulation Platforms: Titan, SummitDev, and Summit; 3 Scientific Methods of GTC; 4 Porting and Optimization Strategy; 5 GPU Porting Status; 6 Performance; 6.1 Solver Performance Improvement; 6.2 Scaling Performance; 6.3 Tests on SummitDev; 6.4 Performance and Scalability on Summit; 7 Conclusion; Acknowledgments; References
Text of Note
2 Background on Warp Specialization and Elision; 3 Fission of Multiple-Parallel-Region Target Regions; 4 Overlapping Data Transfer and Split Kernel Execution; 5 Pipelining Data Transfer and Parallel Loop Execution; 6 Custom Grid Geometry; 7 Estimating Potential Benefits of Transformations; 7.1 Combining Kernel Splitting with Elision Improves Performance; 7.2 Elision Amplifies Benefits of Custom Grid Geometry; 7.3 Pipelining Improves Performance for High Trip Counts; 8 Related Work; 9 Conclusion
Text of Note
6.1 Use of OpenACC for the Squared Distance Calculation: GPU; 6.2 Comparison to CUDA Kernel; 6.3 OpenACC on the CPU; 6.4 Comparison to a Purely BLAS-Based Algorithm: Lowest Programming Knowledge Required; 7 Programming Effort; 8 Conclusions; A Artifact Description Appendix: Using Compiler Directives for Performance Portability in Scientific Computing: Kernels from Molecular Simulation; A.1 Abstract; A.2 Description; References; Using OpenMP; OpenMP Code Offloading: Splitting GPU Kernels, Pipelining Communication and Computation, and Selecting Better Grid Geometries; 1 Introduction
Text of Note
A Artifact Description Appendix: OpenMP Target Offloading: Splitting GPU Kernels, Pipelining Communication and Computation, and Selecting Better Grid Geometries; A.1 Abstract; A.2 Description; A.3 Installation; A.4 Experiment Workflow; A.5 Evaluation and Expected Results; A.6 Experiment Customization; A.7 Notes; References; A Case Study for Performance Portability Using OpenMP 4.5; 1 Introduction; 2 The GPP Kernel and Its Baseline CPU Implementation; 2.1 GPP Kernel; 2.2 Baseline CPU Implementation; 3 GPU Implementations of the GPP Kernel; 3.1 Implementation Groundwork; 3.2 OpenMP 4.5
Text of Note
Using Compiler Directives for Performance Portability in Scientific Computing: Kernels from Molecular Simulation; 1 Introduction; 2 Background; 2.1 Performance Portability; 2.2 Molecular Dynamics; 3 Portability Goals: Timings and Architectures; 4 Designing the Kernels; 4.1 The Programming Model and Its Portable Subset; 4.2 Modular Format and Kernels; 5 Binning Module (Neighbor-List Updates): Bin-Assign, Bin-Count, and Bin Sorting; 5.1 Bin-Assign, Bin-Count; 5.2 Parallel Algorithm Design for Bin Count and Gather; 6 The Squared Pairwise Distance Calculation: Performance, Portability, and Effort
SUMMARY OR ABSTRACT
Text of Note
This book constitutes the refereed post-conference proceedings of the 5th International Workshop on Accelerator Programming Using Directives, WACCPD 2018, held in Dallas, TX, USA, in November 2018. The 6 full papers presented were carefully reviewed and selected from 12 submissions. The papers share knowledge and experience in programming emerging complex parallel computing systems. They are organized in three sections: applications; using OpenMP; and program evaluation.