Wednesday, March 2 • 3:50pm - 4:10pm
Data Analytics Approaches & Tools: Reverse Time Migration via Resilient Distributed Datasets: Towards In-Memory Coherence of Seismic-Reflection Wavefields Using Thunder via Apache Spark



Cross-correlating two wavefields to apply the imaging condition of Reverse Time Migration (RTM) remains one of two fundamental challenges in using the method in practice (e.g., Liu et al., Computers & Geosciences 59, 17–23, 2013). In a significant departure from previous approaches, this computational challenge is addressed here by introducing Resilient Distributed Datasets (RDDs) for RTM’s precomputed source wavefields. RDDs are a relatively recent abstraction for in-memory computing that is well suited to distributed computing environments such as clusters (Zaharia et al., NSDI 2012, http://www.cs.berkeley.edu/~matei/papers/2012/nsdi_spark.pdf). Originally introduced for Big Data Analytics and popularized (e.g., Lumb, “8 Reasons Apache Spark is So Hot”, insideBIGDATA, http://insidebigdata.com/2015/03/06/8-reasons-apache-spark-hot/, 2015) through the open-source implementation known as Apache Spark (https://spark.apache.org/), RDDs also appear promising for recontextualizing RTM’s imaging condition.

Recent work has already indicated that seismic reflection data in accepted industry formats can be distributed in memory across a cluster using Apache Spark (Yan et al., “A Spark-based Seismic Data Analytics Cloud”, 2015 Rice Oil & Gas Workshop, Houston, TX, http://rice2015.og-hpc.org/technical-program/). Lumb (“RTM Using Hadoop Is There a Case for Migration?”, 2015 Rice Oil & Gas Workshop, Houston, TX, http://rice2015.og-hpc.org/technical-program/) has indicated that RDDs and Spark appear promising for RTM in a number of ways (e.g., in allowing imaging conditions to be implemented using alternatives to cross-correlation). Attention here, however, focuses on the use of RDDs to facilitate the assessment of coherence between seismic-reflection wavefields in memory. More specifically, an RDD-based algorithm is proposed that significantly reduces the impact of disk I/O in the wavefield manipulations required by RTM. The algorithm is then prototyped using open-source Thunder (http://thunder-project.org/) via Apache Spark.
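
To make the imaging condition concrete, the following is a minimal sketch of a zero-lag cross-correlation computed in a map/reduce style. It is not the algorithm proposed in the talk, whose details the abstract does not give, and it uses plain Python rather than Spark or Thunder. The function names, the toy wavefields, and the split into two partitions are all illustrative assumptions.

```python
from functools import reduce

def correlate_partition(partition):
    """Zero-lag cross-correlation over one partition of time steps.

    `partition` is a list of (source_snapshot, receiver_snapshot) pairs;
    each snapshot is a flat list of amplitudes, one per grid point.
    (Hypothetical layout, chosen only for this illustration.)
    """
    n_points = len(partition[0][0])
    image = [0.0] * n_points
    for source, receiver in partition:
        for i in range(n_points):
            image[i] += source[i] * receiver[i]
    return image

def add_images(a, b):
    """Combine two partial images point by point."""
    return [x + y for x, y in zip(a, b)]

def migrate(partitions):
    """Map each partition to a partial image, then reduce by summation."""
    return reduce(add_images, map(correlate_partition, partitions))

# Toy data: 4 time steps, 3 grid points (made-up values).
source_wavefield = [[1.0, 2.0, 0.0], [0.5, 0.0, 1.0],
                    [2.0, 1.0, 1.0], [0.0, 3.0, 2.0]]
receiver_wavefield = [[2.0, 1.0, 4.0], [2.0, 5.0, 1.0],
                      [1.0, 1.0, 0.0], [7.0, 1.0, 0.5]]

pairs = list(zip(source_wavefield, receiver_wavefield))
partitions = [pairs[:2], pairs[2:]]  # stand-in for in-memory partitions
image = migrate(partitions)
print(image)
```

Because the sum over time steps is associative, each partition can be correlated where its snapshots already reside in memory, and only the small partial images need to be combined. In this sketch the per-partition step and the summation play the roles that map and reduce operations on an RDD would play on a cluster.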


Ian Lumb

Solution Architect, Navops by Univa
As an HPC specialist, Ian Lumb has spent about two decades at the global intersection of IT and science. Ian received his B.Sc. from Montreal's McGill University, and then an M.Sc. from York University in Toronto. Although his undergraduate and graduate studies emphasized geophysics, Ian’s current interests include workload orchestration and container optimization for HPC to Big Data Analytics in clusters and clouds. Ian enjoys discussing...

Wednesday March 2, 2016 3:50pm - 4:10pm
BioScience Research Collaborative Building (BRC), Room 103
