Part 1: Parallel Algorithms and Data Structures -- Paulius Micikevicius, NVIDIA
1 Large-Scale GPU Search
2 Edge v. Node Parallelism for Graph Centrality Metrics
3 Optimizing parallel prefix operations for the Fermi architecture
4 Building an Efficient Hash Table on the GPU
5 An Efficient CUDA Algorithm for the Maximum Network Flow Problem
6 On Improved Memory Access Patterns for Cellular Automata Using CUDA
7 Fast Minimum Spanning Tree Computation on Large Graphs
8 Fast in-place sorting with CUDA based on bitonic sort

Part 2: Numerical Algorithms -- Frank Jargstorff, NVIDIA
9 Interval Arithmetic in CUDA
10 Approximating the erfinv Function
11 A Hybrid Method for Solving Tridiagonal Systems on the GPU
12 LU Decomposition in CULA
13 GPU Accelerated Derivative-free Optimization

Part 3: Engineering Simulation -- Peng Wang, NVIDIA
14 Large-scale gas turbine simulations on GPU clusters
15 GPU acceleration of rarefied gas dynamic simulations
16 Assembly of Finite Element Methods on Graphics Processors
17 CUDA implementation of Vertex-Centered, Finite Volume CFD methods on Unstructured Grids with Flow Control Applications
18 Solving Wave Equations on Unstructured Geometries
19 Fast electromagnetic integral equation solvers on graphics processing units (GPUs)

Part 4: Interactive Physics for Games and Engineering Simulation -- Richard Tonge, NVIDIA
20 Solving Large Multi-Body Dynamics Problems on the GPU
21 Implicit FEM Solver in CUDA
22 Real-time Adaptive GPU multi-agent path planning

Part 5: Computational Finance -- Thomas Bradley, NVIDIA
23 High performance finite difference PDE solvers on GPUs for financial option pricing
24 Identifying and Mitigating Credit Risk using Large-scale Economic Capital Simulations
25 Financial Market Value-at-Risk Estimation using the Monte Carlo Method

Part 6: Programming Tools and Techniques -- Cliff Woolley, NVIDIA
26 Thrust: A Productivity-Oriented Library for CUDA
27 GPU Scripting and Code Generation with PyCUDA
28 Jacket: GPU Powered MATLAB Acceleration
29 Accelerating Development and Execution Speed with Just In Time GPU Code Generation
30 GPU Application Development, Debugging, and Performance Tuning with GPU Ocelot
31 Abstraction for AoS and SoA Layout in C++
32 Processing Device Arrays with C++ Metaprogramming
33 GPU Metaprogramming: A Case Study in Biologically-Inspired Machine Vision
34 A Hybridization Methodology for High-Performance Linear Algebra Software for GPUs
35 Dynamic Load Balancing using Work-Stealing
36 Applying software-managed caching and CPU/GPU task scheduling for accelerating dynamic workloads
"Since the introduction of CUDA in 2007, more than 100 million computers with CUDA capable GPUs have been shipped to end users. GPU computing application developers can now expect their application to have a mass market. With the introduction of OpenCL in 2010, researchers can now expect to develop GPU applications that can run on hardware from multiple vendors"--