GPU-based simplified simulation prototypes

R&D task number: G4RD9

Initial simplified GPU-based particle transport simulation for electrons and gammas

The current Geant4 code base is large and uses many C++ features that make it unsuitable for efficient porting to accelerators in its present form. The project proposes creating a few prototypes running on such accelerators (GPUs in a first phase), based on simplified models that can be gradually evolved into more comprehensive simulation applications. This approach helps in understanding the design, implementation and portability aspects of running particle transport simulation on both host (CPU) and device (GPU), based on simple models with only a few components (easy to understand, change, debug and optimize).

A first prototype will implement simple ray tracing through a tracker detector geometry. It aims to run on both CPU and GPU using the same code base, and will be evolved to embed realistic features of detector simulation (propagation in a magnetic field, mock-up physics processes). This can teach valuable lessons about strategies for making detector simulation more device-agnostic in the future.

The code produces a ray-traced image of a realistic tracker detector geometry. The geometry is represented using our SIMD-accelerated geometry library VecGeom, which also handles ray-model collisions. For the time being the code is implemented as a utility in the VecGeom repository, but it will be moved to a separate repository when other components (field, physics models) are embedded.
The program reads the geometry from GDML (an XML-based format), and optionally transfers it to the GPU. The main program launches a kernel that traces each individual ray at (px, py) and returns the pixel color. The only package to be installed is VecGeom.
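
The per-pixel tracing described above can be illustrated with a minimal sketch that compiles unchanged for host and device. It is not the VecGeom implementation: the Sphere struct stands in for the detector geometry, and the names HOST_DEVICE, DistanceToSphere, TracePixel and RenderOnHost are hypothetical, introduced only for this example. The sketch assumes an orthographic view with rays starting at z = -10 and travelling along +z.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// With nvcc the functions below become callable from both host and device;
// with a plain C++ compiler the macro expands to nothing.
#ifdef __CUDACC__
#define HOST_DEVICE __host__ __device__
#else
#define HOST_DEVICE
#endif

// Hypothetical stand-in for the detector geometry: a single sphere.
struct Sphere {
  float cx, cy, cz, radius;
};

// Distance along a unit-direction ray to the first sphere hit, or -1 on a miss.
// Assumes the ray origin lies outside the sphere.
HOST_DEVICE inline float DistanceToSphere(const Sphere &s, float ox, float oy, float oz,
                                          float dx, float dy, float dz)
{
  const float lx = ox - s.cx, ly = oy - s.cy, lz = oz - s.cz;
  const float b    = lx * dx + ly * dy + lz * dz;
  const float c    = lx * lx + ly * ly + lz * lz - s.radius * s.radius;
  const float disc = b * b - c;
  if (disc < 0.f) return -1.f;
  const float t = -b - std::sqrt(disc);
  return t >= 0.f ? t : -1.f;
}

// Trace the ray belonging to pixel (px, py) and return a grey level.
HOST_DEVICE inline std::uint8_t TracePixel(const Sphere &s, int px, int py, int width, int height)
{
  // Map the pixel centre to [-1, 1] x [-1, 1] in the plane z = -10.
  const float x = 2.f * (px + 0.5f) / width - 1.f;
  const float y = 2.f * (py + 0.5f) / height - 1.f;
  const float t = DistanceToSphere(s, x, y, -10.f, 0.f, 0.f, 1.f);
  if (t < 0.f) return 0; // background
  // Shade by hit depth: points facing the viewer are brighter.
  float shade = (s.cz - (-10.f + t)) / s.radius;
  if (shade < 0.f) shade = 0.f;
  if (shade > 1.f) shade = 1.f;
  return static_cast<std::uint8_t>(255.f * shade);
}

// Host driver: one TracePixel call per pixel. In a CUDA build, a __global__
// kernel could derive (px, py) from blockIdx/threadIdx and call the same function.
std::vector<std::uint8_t> RenderOnHost(const Sphere &s, int width, int height)
{
  assert(width > 0 && height > 0);
  std::vector<std::uint8_t> image(static_cast<std::size_t>(width) * height);
  for (int py = 0; py < height; ++py)
    for (int px = 0; px < width; ++px)
      image[static_cast<std::size_t>(py) * width + px] = TracePixel(s, px, py, width, height);
  return image;
}
```

Calling RenderOnHost with a sphere of radius 0.5 centred at the origin and an 8 x 8 image gives a bright disc on a black background. The point of the sketch is that only the driver loop differs between CPU and GPU, while the per-ray function is shared.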

Goals

Demonstrate code portability using the VecGeom CUDA backend in a first phase, and also understand how to achieve this with portability libraries in the future
Learn how to deal with a realistic particle detector geometry in the context of accelerators
Deal with a variable number of rays (particles) across repeated kernel executions (in the future, both spawning and extinction will change the set of particles at each step)
Measure code efficiency and scaling on CPU and GPU. Try alternative scheduling models for rays and compare their relative efficiency.
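
The goal of handling a variable number of rays over repeated kernel executions can be sketched on the host as a loop that alternates a propagation pass with a compaction of the surviving rays. This is only one possible scheduling model, shown under simplifying assumptions: rays move in one dimension with a fixed positive step, and the names Ray, PropagateStep, CompactActiveRays and TraceAll are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical 1D ray: it advances by a fixed step on every pass.
struct Ray {
  float position;  // current position along the ray
  float step;      // distance travelled per pass (must be positive)
  int crossings;   // number of passes this ray has taken part in
};

// One "kernel" pass: advance every active ray by one step.
void PropagateStep(std::vector<Ray> &rays)
{
  for (Ray &r : rays) {
    r.position += r.step;
    ++r.crossings;
  }
}

// Move rays that left the world into `finished` and keep the active ones
// contiguous at the front. Returns the number of rays removed.
std::size_t CompactActiveRays(std::vector<Ray> &rays, float worldSize, std::vector<Ray> &finished)
{
  auto firstDone = std::partition(rays.begin(), rays.end(),
                                  [worldSize](const Ray &r) { return r.position < worldSize; });
  const auto nDone = static_cast<std::size_t>(std::distance(firstDone, rays.end()));
  finished.insert(finished.end(), firstDone, rays.end());
  rays.erase(firstDone, rays.end());
  return nDone;
}

// Repeat passes until no active rays remain; returns the number of passes.
int TraceAll(std::vector<Ray> rays, float worldSize, std::vector<Ray> &finished)
{
  for (const Ray &r : rays) assert(r.step > 0.f); // guarantees termination
  int passes = 0;
  while (!rays.empty()) {
    PropagateStep(rays);
    CompactActiveRays(rays, worldSize, finished);
    ++passes;
  }
  return passes;
}
```

Compacting after every pass keeps the active set dense, so each pass only touches live rays. Compacting less often, or not at all, would be alternative scheduling models to compare when measuring efficiency.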

Project lead: Andrei Gheata, Guilherme Amadio, John Apostolakis

Effort estimate

Repository link