Wednesday, March 2 • 3:50pm - 4:10pm
Algorithms & Accelerators I: Efficient Reverse Time Migration on APU Clusters



Reverse Time Migration (RTM) is a numerical method for seismic imaging that is widely used in the oil and gas industry. RTM uses the two-way travel time of propagating waves to place (or migrate) dipping temporal events at their true subsurface spatial locations. Processing these reflections produces a synthetic image of the subsurface geologic structure. The RTM workflow is very time-consuming and resource-demanding in terms of compute power, memory bandwidth and storage capacity, because terabytes of data (namely wavefield snapshots) are generated during the computation. Thanks to the advent of high performance computing facilities, RTM has been significantly accelerated on clusters of multi-core CPUs [1, 2, 3] and GPUs (Graphics Processing Units) [4, 5, 6, 7]. CPU clusters have shown performance gains by spreading large datasets over connected compute nodes, and GPU clusters can achieve even larger gains thanks to the massively parallel architecture of GPUs. GPU-based solutions nevertheless suffer from some limitations: small GPU memory capacities, overheads incurred by the PCI Express interconnect between CPU and GPU that may bottleneck seismic imaging applications, and high power consumption.

Recently, AMD released a new architecture, the Accelerated Processing Unit (APU), which combines CPU cores and GPU cores on the same silicon die. This hardware design benefits from the advantages of both CPU and GPU and removes the PCI Express interconnect between the two. The APU can thus be an attractive solution for efficient depth imaging. In addition, the APU can be considered a low-power HPC chip (between 60 and 95 watts of TDP). However, the integrated GPUs offer about one order of magnitude less compute power and less internal memory bandwidth than high-end discrete GPUs. Moreover, multiple compute nodes are required to process realistic RTM cases, so the impact of the APU architecture on communications also has to be considered.

We focus on the implementation and deployment of 3D acoustic RTM in isotropic media on an APU cluster using Fortran+OpenCL+MPI. We rely on a three-dimensional 8th-order finite difference approximation of the acoustic wave equation to simulate wave propagation during both the forward and backward sweeps of the RTM algorithm, and use the selective checkpointing method (with a data checkpointing frequency equal to 10) to reconstruct the source wavefield.
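To make the numerical scheme above more concrete, here is a minimal sketch of an 8th-order finite difference propagation step with periodic snapshots. It is not the authors' Fortran+OpenCL+MPI code: it is reduced to one dimension, written in plain Python, uses a constant velocity, and reads the checkpointing frequency of 10 as "save every 10th time step", which is an assumption. The names laplacian_1d, propagate and checkpoint_every are hypothetical.

```python
# Standard 8th-order central-difference weights for the second derivative:
# C[0] is the centre weight, C[k] applies to the neighbours at distance k.
C = [-205.0 / 72.0, 8.0 / 5.0, -1.0 / 5.0, 8.0 / 315.0, -1.0 / 560.0]


def laplacian_1d(u, dx):
    """8th-order second derivative of u on a uniform grid of spacing dx.

    The four cells at each end have no full stencil and are left at zero
    (a simplification; a real code would use halos or absorbing boundaries).
    """
    n = len(u)
    out = [0.0] * n
    for i in range(4, n - 4):
        acc = C[0] * u[i]
        for k in range(1, 5):
            acc += C[k] * (u[i - k] + u[i + k])
        out[i] = acc / (dx * dx)
    return out


def propagate(u_prev, u_curr, velocity, dx, dt, n_steps, checkpoint_every=10):
    """Leapfrog time stepping of the constant-velocity acoustic wave equation.

    Returns the final wavefield and a dict {step: snapshot} holding a copy of
    the wavefield every `checkpoint_every` steps (assumed meaning of the
    checkpointing frequency).
    """
    checkpoints = {}
    coeff = (velocity * dt) ** 2
    for step in range(1, n_steps + 1):
        lap = laplacian_1d(u_curr, dx)
        u_next = [
            2.0 * u_curr[i] - u_prev[i] + coeff * lap[i]
            for i in range(len(u_curr))
        ]
        u_prev, u_curr = u_curr, u_next
        if step % checkpoint_every == 0:
            checkpoints[step] = list(u_curr)
    return u_curr, checkpoints
```

In a 3D code the same nine-point stencil would be applied along each axis, so every grid point reads 25 values per time step, which is one reason memory bandwidth matters so much for this kernel.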

In this talk, we give an overview of the implementation details of the application, with particular emphasis on the impact of the APU's new unified memory model on the RTM workflow. Then, we present the OpenCL single-precision performance results and the power efficiency estimate of the RTM on a 16-node cluster, each node having an A10-7850 APU (code-named Kaveri). We show the relevance of APUs in a seismic imaging context by presenting the pros and cons of such HPC platforms for RTM, and compare them against traditional solutions by means of strong and weak scaling tests.
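For readers unfamiliar with the two scaling tests mentioned above, the usual efficiency definitions can be sketched as follows. These are the common textbook formulas, not necessarily the exact metrics used in the talk, and the function names are hypothetical.

```python
def strong_scaling_efficiency(t_one_node, t_n_nodes, n_nodes):
    """Fixed total problem size: ideal run time on n nodes is t_one_node / n."""
    if t_one_node <= 0 or t_n_nodes <= 0 or n_nodes <= 0:
        raise ValueError("timings and node count must be positive")
    return t_one_node / (n_nodes * t_n_nodes)


def weak_scaling_efficiency(t_one_node, t_n_nodes):
    """Fixed problem size per node: ideal run time stays equal to t_one_node."""
    if t_one_node <= 0 or t_n_nodes <= 0:
        raise ValueError("timings must be positive")
    return t_one_node / t_n_nodes
```

A value of 1.0 means ideal scaling in both cases. Lower values typically reflect communication or I/O overheads that grow with the node count.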


Henri Calandra

Henri Calandra obtained his M.Sc. in mathematics in 1984 and a Ph.D. in mathematics in 1987 from the Universite des Pays de l’Adour in Pau, France. He joined Cray Research France in 1987 and worked on seismic applications. In 1989 he joined the applied mathematics department of the French Atomic Agency. In 1990 he started working for Total SA. After 12 years of work in high performance computing and as project leader for Pre-stack Depth...

Issam Said

Computational Scientist, TOTAL/LIP6
My current research interests include high performance computing (HPC), numerical methods and applied geophysics. I specialize in adapting scientific applications to hardware accelerators, mainly Graphics Processing Units (GPUs). This involves improving the performance of numerical solvers, surveying cutting-edge hardware architectures and studying the viability of scientist-friendly programming models such as OpenACC. My current work involves...
