2016 Rice Oil & Gas HPC Conference

8:30am PST

Tutorial: 'PETSc: The Portable, Extensible Toolkit for Scientific Computation', Matthew Knepley, Rice University

PETSc is a suite of data structures and routines for the scalable parallel solution of nonlinear equations, often arising from partial differential equations or boundary integral equations. PETSc has been used for years in the oil and gas industry, including development contributed back from WesternGeco and Shell. It supports MPI, shared memory pthreads, and GPUs, as well as hybrid MPI-shared memory pthreads or MPI-GPU parallelism. In this brief tutorial, we will highlight basic sparse parallel linear algebra, linear and nonlinear algebraic solvers, structured and unstructured meshes, and timestepping. We will show how optimal, hierarchical, multilevel solvers for complex, multiphysics problems can be dynamically assembled using the PETSc object system.

Speakers

Matthew Knepley

Wednesday March 2, 2016 8:30am - 10:00am PST
BioScience Research Collaborative Building (BRC), Room 282

10:00am PST

Break & Networking

Wednesday March 2, 2016 10:00am - 10:30am PST
BioScience Research Collaborative Building (BRC), Exhibit Hall

Break

10:30am PST

Tutorial: 'Introduction to OpenMP 4.0 and 4.5', Barbara Chapman, Stony Brook University

For over a decade, OpenMP has been the de facto standard for parallel programming on shared memory systems. It has continued to evolve in order to meet the programming needs of a diverse range of application developers and to handle the requirements of new generations of computer architecture. In this tutorial we give a brief overview of the basics of OpenMP and then introduce the new features in OpenMP 4.0 and 4.5, with short examples to illustrate their usage.

Speakers

Amit Amritkar

University of Houston

Barbara Chapman

Deepak Eachempati

Xinmin Tian

Intel

Wednesday March 2, 2016 10:30am - 12:00pm PST
BioScience Research Collaborative Building (BRC), Room 280

10:30am PST

Tutorial: 'Parallel Adaptive PDE Simulation with libMesh', Roy Stogner, University of Texas

Speakers

Roy Stogner

Wednesday March 2, 2016 10:30am - 12:00pm PST
BioScience Research Collaborative Building (BRC), Room 282

12:00pm PST

Conference Registration & Networking

Wednesday March 2, 2016 12:00pm - 1:00pm PST
BioScience Research Collaborative Building (BRC), Exhibit Hall

Registration

1:00pm PST

Opening Remarks

WATCH THE PRESENTATION

Speakers

Jan E. Odegard

Executive Director Ken Kennedy Institute/ Associate Vice President Research Computing, Rice University

Jan E. Odegard is Executive Director of the Ken Kennedy Institute for Information Technology and Associate Vice President, Research Computing & Cyberinfrastructure at Rice University. Dr. Odegard joined Rice University in 2002, and has over 15 years of experience supporting and enabling research...

PM 0 Odegard Rice OGHPC 2016 DAY1 1 pdf

Wednesday March 2, 2016 1:00pm - 1:15pm PST
BioScience Research Collaborative Building (BRC), Room 103

Welcome

1:15pm PST

Keynote: 'How is High Performance Computing reshaping O&G exploration and production', Francois Alabert, Total

PRESENTATION NOT AVAILABLE

For several decades, the Oil and Gas industry has been committed to producing more and more hydrocarbons in response to the growing world demand for energy. Always seeking deeper and farther, exploration and development have become economically challenging as a result of increased geological and above-ground complexity, stronger environmental constraints and pressure on costs. In this presentation, we will review Total’s experience with High Performance Computing, how it has dramatically improved the efficiency of exploration and reservoir management, and how it might be reshaping our ways of working.

Significantly enhanced computational algorithms and more powerful computers have provided a much better understanding of the distribution and description of complex geological structures, opening new frontiers to unexplored geological areas as well as helping limit the risks and overall costs of deep and ultra-deep offshore drilling.

The progress of multi-component seismic data acquisition and the fast evolution of new technologies in the rock physics labs provide the opportunity to develop new families of algorithms which include more complex physics. With an order of magnitude increase in computing capability, reaching exascale in a few years, the reduction of computing time combined with new generations of algorithms will offer new perspectives and require adapting science and engineering workflows. While next generation codes are developed to give access to new information of unrivaled quality, these codes will also present new challenges in taking advantage of the increased complexity of the new supercomputers and will require integrated teams, which mix geoscience researchers with computational scientists and engineers.

Due to a new generation of sensors for field and reservoir monitoring, part of the emergence of connected systems recording continuous streams of information, the amount of data resulting from simulations, lab measurements, and field measurements will grow exponentially. Analyzing these data to extract pertinent information will be critical to the prediction, anticipation, optimization and reduction of the costs and risks associated with a variety of processes in O&G exploration and production. This volume of data opens an era of data analytics and deep learning. By taking advantage of next generation high performance computing capabilities, data analytics and deep learning will become an important driver for the evolution of O&G technology.

Modeling with improved and new physics, multi-physics integration and data analytics, together with the possibility of improving uncertainty and risk assessment, will reshape the competency requirements of specialists and the way they work together, for safer, cheaper, faster and better Oil and Gas exploration and production.

Speakers

Francois Alabert

Vice-President of Geo-Technology Solutions, Total

François Alabert is currently leading the group of geoscience technology for exploration and reservoir within the Total Exploration & Production branch. The group of 450 engineers and technicians provides high technology solutions in the domains of geology, geophysics and geosciences...

Wednesday March 2, 2016 1:15pm - 2:00pm PST
BioScience Research Collaborative Building (BRC), Room 103

Keynote

2:00pm PST

Plenary: 'Computational Science and Engineering Applications at Exascale: Challenges and Opportunities', Doug Kothe, ORNL

WATCH THE PRESENTATION

The computer and computational science and engineering community in the public, private, and government sectors has arguably been thinking about exascale-class modeling and simulation technologies and capabilities for almost a decade. With exascale platforms becoming more certain and finally within sight, application developers and users must “get real” now to adequately take advantage of this opportunity. The hardware and software technologies currently envisioned in exascale platforms will present new challenges for application developers that could be disruptive relative to current approaches. New algorithms, for example, that communicate infrequently and store very little, may be critical for applications to move forward or even “hold pace”. Hybrid node architectures with hierarchical memory and compute technologies will likely be the norm, and applications may face comprehensive restructuring to exploit more appropriate task-based programming models and new data structures.

Given these challenges, tremendous opportunity nevertheless exists for science-based computational applications that can deliver, through effective exploitation of exascale HPC technology, breakthrough modeling and simulation solutions that yield high-confidence insights and answers to the nation’s most critical problems and challenges in scientific discovery, energy assurance, economic competitiveness, and national security. I will survey these application opportunities within the science/energy/national security mission space of the Department of Energy, where I will also touch upon challenges, decadal challenge problems, and prospective outcomes and impact.

Speakers

Doug Kothe

Oak Ridge National Laboratory

Douglas B. Kothe (Doug) has over three decades of experience in conducting and leading applied R&D in computational applications designed to simulate complex physical phenomena in the energy, defense, and manufacturing sectors. Doug is currently the Deputy Associate Laboratory Director...

Kothe Rice Oil&Gas HPC Conf final pdf

Wednesday March 2, 2016 2:00pm - 2:30pm PST
BioScience Research Collaborative Building (BRC), Room 103

2:30pm PST

Disruptive Technology 1: Orchestrating Containers within Production Oil and Gas HPC Workloads and Workflows

WATCH THE PRESENTATION

Docker presents as an open platform that allows developers and sysadmins to build, ship and execute distributed applications. This is particularly appealing in cases where lightweight, easy-to-use, well-contained technologies are well matched with rapidly evolving needs and fast-paced innovation. Not surprisingly then, numerous organizations are successfully evaluating Docker containers in proof-of-concept initiatives and/or pilot projects. The transition to production use, however, introduces additional requirements as Docker containers need to be incorporated into existing IT infrastructures and (ultimately) integrated into application workflows. Simply put, organizations need to be able to manage Docker containers in the same way they have become accustomed to managing other types of workloads and workflows. In other words, the requirements to launch, execute, control (including limit) and account for Docker containers in production environments are evident; complicating these requirements is the need to move data into and out of containers that may need to provide interactive-execution modalities. Although early adopters report “easier replication, faster deployment and lower configuration and operating costs” of workflows involving Docker containers, it is clear that fuller IT infrastructure integrations are called for. After reviewing selected use cases, attention shifts to ongoing and future efforts aimed at fully integrating Docker containers within on-premises and/or cloud-based IT infrastructures from a workload orchestration and container optimization perspective for the oil and gas industry.

Speakers

Ian Lumb

Solutions Architect, Univa Corporation

As an HPC specialist, Ian Lumb has spent about two decades at the global intersection of IT and science. Ian received his B.Sc. from Montreal's McGill University, and then an M.Sc. from York University in Toronto. Although his undergraduate and graduate studies emphasized geophysics...

1 Ian Lumb RiceU2016 DTT Containers pdf

Wednesday March 2, 2016 2:30pm - 3:00pm PST
BioScience Research Collaborative Building (BRC), Room 103

2 Reference Architecture Seismic Repositories v100102252016 pdf

2:30pm PST

Disruptive Technology 2: Bringing Insight to Seismic Storage Repositories for faster Time-To-Oil

WATCH THE PRESENTATION

Finding a needle in an ocean of seismic data can be a costly process. Over the last several years, IOCs and NOCs have amassed petabytes of seismic data and stored it on bespoke storage systems that make processing and prioritizing that data challenging. As Shared-Nothing Object-Oriented Storage architectures increase in use, opportunities will emerge to improve Oil and Gas workflows by leveraging idle compute for analytics on the data in the repository to gain early insights into the data stored there.

This presentation will talk about how to leverage scale-out x86 architectures (on-premise or in the Cloud) to deliver cost-effective storage and provide compute for initial analytics on Seismic Repositories without moving the data to large HPC systems, improving time to market and the discovery of other insights.

Speakers

Jason Jackson

Wednesday March 2, 2016 2:30pm - 3:00pm PST
BioScience Research Collaborative Building (BRC), Room 103

2:30pm PST

Disruptive Technology 3: BeeGFS - A Parallel File System to Solve I/O Problems

WATCH THE PRESENTATION

With the increasing size of parallel computers and the increasing speed of individual nodes (CPU+GPU), the challenges for parallel file systems with respect to I/O pattern, bandwidth, latency, robustness and scalability are becoming more obvious. When the first dual-core CPUs hit the market, we started to develop a parallel file system from scratch with full scalability for data and metadata, ease of use, robustness and high flexibility in mind. As the CPU roadmap was clearly pointing towards many-core CPUs, one central development requirement was to follow a strict multithreaded approach to keep the software overhead low and allow the software to run on dedicated servers and on the compute nodes, and to adapt to new architectures on the rise like ARM and its variants.

Our own test cases for the development of BeeGFS (formerly FhGFS) during the past 10 years have been a broad range of O&G codes, mostly developed next door.
The paper presents an architectural overview of BeeGFS with special focus on scalability, metadata performance and reliability in large installations. As the BeeGFS server components are efficient multithreaded user-space programs that work on every underlying POSIX file system, BeeGFS supports a variety of hardware and software solutions. As a special use case, the paper will explain BeeOND, the BeeGFS on-demand file system. SSDs (NVRAM) in every compute node deliver high-speed, low-latency I/O. With BeeOND we create a private parallel file system (../myscratch/) for every compute job on the corresponding nodes that fully utilizes the NVRAM capabilities and acts as a burst buffer for most of the temporal I/O behavior present in today's applications. The paper will report on BeeOND O&G use cases and present benchmarks.

As the amount of storage grows, data resilience and self-healing capabilities are essential requirements in a storage system. BeeGFS has its own approach to this topic based on software robustness and its built-in data mirroring capabilities. The paper briefly covers these HA aspects of BeeGFS and outlines the future BeeGFS roadmap, which includes erasure coding as well as a non -POSIXI. The last section of the talk is related to the BeeGFS approach to Exascale.

Speakers

Christian Mohrbacher

Fraunhofer ITWM

Christian Mohrbacher studied computer sciences and afterwards joined Fraunhofer's Competence Center for High Performance Computing in 2008. He is currently part of the parallel file system group, which drives the development of BeeGFS.

Franz-Josef Pfreundt

3 F. Pfreundt BeeGFS Rice OG pdf

Wednesday March 2, 2016 2:30pm - 3:00pm PST
BioScience Research Collaborative Building (BRC), Room 103

2:30pm PST

Disruptive Technology 4: PCIeArch for RTM

Seismic Imaging is a standard data processing technique used in creating an image of subsurface structures of the Earth from measurements recorded at the surface via seismic wave propagations captured from various sound energy sources.

Reverse Time Migration (RTM) is an advanced migration algorithm that solves wave equations both downward and upward through the earth model. The most popular ways to solve the resulting wave equations are stencil-based and FFT-based methods.
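
To make the stencil-based option concrete, here is a minimal sketch that is not taken from the talk: a one-dimensional, second-order leapfrog update for the constant-velocity wave equation with periodic boundaries. The function names wave_step and propagate, the grid size and the Courant number are assumptions chosen for illustration; production RTM codes work on 3D grids with higher-order stencils and absorbing boundaries.

```python
def wave_step(u_prev, u_curr, courant):
    """Advance the 1D wave equation by one time step with a 3-point stencil.

    courant = velocity * dt / dx. Periodic boundaries are assumed for brevity.
    """
    n = len(u_curr)
    c2 = courant * courant
    u_next = [0.0] * n
    for i in range(n):
        left = u_curr[(i - 1) % n]
        right = u_curr[(i + 1) % n]
        laplacian = left - 2.0 * u_curr[i] + right
        u_next[i] = 2.0 * u_curr[i] - u_prev[i] + c2 * laplacian
    return u_next


def propagate(u_prev, u_curr, courant, steps):
    """Apply wave_step repeatedly and return the final wavefield."""
    for _ in range(steps):
        u_prev, u_curr = u_curr, wave_step(u_prev, u_curr, courant)
    return u_curr


# Hypothetical right-moving pulse on a 16-point grid, Courant number 1.
n = 16
pulse = [0.0] * n
pulse[3], pulse[4] = 1.0, 0.5
u_prev = pulse                                   # wavefield at time step 0
u_curr = [pulse[(i - 1) % n] for i in range(n)]  # same pulse shifted by one cell
u_final = propagate(u_prev, u_curr, 1.0, 5)      # wavefield at time step 6
```

With a Courant number of exactly 1 this scheme transports the pulse by one cell per step, which makes the sketch easy to check by hand.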

Dell has a unique architecture in its PowerEdge C4130, employing a flexible PCIe architecture that allows different configurations between CPU & GPU. The architecture allows you to connect 2 GPUs per socket directly or 4 GPUs per socket via a PCIe switch. It also allows GPUDirect using InfiniBand adapters between different nodes.

A couple of highlights:
Flexible PCIe architecture to support various CPU:GPU configurations.
Allows PCIe switches to be inserted in the architecture, which enables low-latency peer-to-peer traffic.
Allows GPUDirect using InfiniBand adapters for multi-node scale-out.

Here are some of the different configurations that can be exploited for RTM & our paper will try to go in-depth on which configuration might be better suited for RTM.

The C4130 supports many different configurations, but the goal of this paper is to start with the topologies that we think will benefit RTM and build from there.

Speakers

Chris Fischler

Jeff Gillespey

Bhavesh A Patel

Sr. Principal System Engineer, Dell

I am part of the Server Advanced Engineering group at Dell and work on server architectures for HPC workloads. I am primarily looking at different PCIe architectures for GPU applications.

Wednesday March 2, 2016 2:30pm - 3:00pm PST
BioScience Research Collaborative Building (BRC), Room 103

2:30pm PST

Disruptive Technology 5: Why Wait? Unleash Your Compute Power with Intel® SSDs

Andrey Kudryavtsev, HPC Solution Architect for the Intel® Non-Volatile Memory Solutions Group (NSG), will discuss advancements in Intel SSD technology that are unleashing the power of the CPU and Moore’s Law. He will dive into the benefits of Intel® NVMe SSDs (NVMe being a standard interface specification for SSDs) and show how NVMe SSDs can greatly benefit HPC-specific performance and workloads in oil and gas exploration. He will also share the HPC performance benefits that Intel has already seen with its customers today, and how adoption of the current NVMe SSD technology sets the foundation for Intel’s next generation of memory using Intel® 3D XPoint™ technology, which will be incorporated into SSDs with Intel Optane™ Technology.

Speakers

Andrey Kudryavtsev

DAOS Product Manager, Intel

Andrey is in the DAOS Product Management role at Intel with more than 20 years of total server experience. Andrey has a unique mix of expertise in storage technology from the file system stack down to the device level. He graduated from Nizhny Novgorod State University in Russia...

Cyndi Peach

Wednesday March 2, 2016 2:30pm - 3:00pm PST
BioScience Research Collaborative Building (BRC), Room 103

2:30pm PST

Disruptive Technology 6: Solving the Oil & Gas Dilemma with Burst Buffer

WATCH THE PRESENTATION

The largest compute systems in Oil and Gas have become nearly 20 times faster over the past five years, which means there is a growing performance gap between compute and I/O that poses a threat to scaling performance to Exascale levels. Traditional architectures are further accelerating I/O bottlenecks. Increasingly, leading Oil and Gas companies are looking at Burst Buffer based architectures that eliminate the I/O bottlenecks. By buffering and aligning I/O, a Burst Buffer can drive a parallel file system at close to maximum hardware speeds and sustain peak workload performance requirements, making it a logical fit for extreme scale Oil & Gas applications. Most importantly, Burst Buffers have demonstrated unprecedented acceleration for key oil and gas codes like Reverse Time Migration, accelerating RTM by 300% without any code modifications.

In this presentation, we will discuss the motivations for using a Burst Buffer, provide an overview of recent results achieved on production Reverse Time Migration (RTM), and discuss how this relates to important Oil and Gas applications. Furthermore, we will explain why Oil & Gas is likely to be the first commercial sector to achieve Exascale computing by leveraging a Burst Buffer approach.

Speakers

Robert McMillen

McMillen is a senior system architect at DDN where he works out of the office of the CTO to analyze the data flow of key applications to identify opportunities to optimize the company's products. He has extensive experience in inventing (21 patents issued), developing, and managing...

6 Robert McMillen Minute IME talk R McMillen v2 pdf

Wednesday March 2, 2016 2:30pm - 3:00pm PST
BioScience Research Collaborative Building (BRC), Room 103

2:30pm PST

Disruptive Technology 7: IBM Accelerates Big Data Analytics

Current compute technologies and scientific methods for Big Data analysis are demanding more compute cycles per processor than ever before, with extreme I/O performance also required. The capabilities of Intel Hyper-Threading, which offers only 2 simultaneous threads per core, limit the results of these recent advances in Big Data analysis techniques.

Despite multiple prior Hadoop Genome analysis attempts on an existing Intel based LSU HPC cluster, a large metagenome dataset could not be analyzed in a reasonable period of time on existing LSU resources. Knowing the extraordinary capabilities for big data analysis offered by IBM Power Systems, the LSU Center for Computational Technologies staff and researchers turned to IBM for help.

This presentation will talk about how IBM helped LSU render a 3.2TB Metagenome dataset in Hadoop in just 6.25 hours, using only 40 compute nodes, whereas the same analysis took 20+ hours on 120+ x86 nodes. This technology has a direct parallel in Oil and Gas applications that can take advantage of simultaneous threading and a parallel file system.

Speakers

Jason Jackson

Terry Leatherland

Wednesday March 2, 2016 2:30pm - 3:00pm PST
BioScience Research Collaborative Building (BRC), Room 103

2:30pm PST

Disruptive Technology 8: Machine Learning Support for Full Waveform Inversion

WATCH THE PRESENTATION

In theory, Full waveform inversion (FWI) is a non-linear and global optimization algorithm that seeks to find a high-fidelity, high-resolution quantitative model of the subsurface by using all information in the recorded seismic waveforms. In practice, FWI is implemented as part of a workflow that contains iterative modeling and migration steps together with constrained parameter management.

Productivity enhancements to FWI can be obtained by applying deep neural nets to the initial and final stages of the workflow: dynamic recurrent neural nets for time series analysis and convolutional neural nets for the characterization of features in the highly dimensional pre/post stack image cubes. A few general-purpose Deep Learning frameworks exist and will be reviewed given the requirements for an industry-specific implementation.

Speakers

Geert Wenes

Sr. Practice Leader/Architect, Cray, Inc.

8 G. Wenes Rice Innovation Talk Cray Wenes 2016 pdf

Wednesday March 2, 2016 2:30pm - 3:00pm PST
BioScience Research Collaborative Building (BRC), Room 103

2:30pm PST

Disruptive Technology 9: Disruptive HPC with the elastic AWS Cloud

Speakers

Tim DiLauro

Solutions Architect, Amazon Web Services

Timothy DiLauro is an AWS Solutions Architect responsible for providing architectural assistance and technical guidance for customers running enterprise solutions in the AWS cloud. Timothy supports AWS customers across Texas, including many in the Oil and Gas industry. Timothy has...

Wednesday March 2, 2016 2:30pm - 3:00pm PST
BioScience Research Collaborative Building (BRC), Exhibit Hall

3:00pm PST

Break & Networking

Wednesday March 2, 2016 3:00pm - 3:30pm PST
BioScience Research Collaborative Building (BRC)

Break

3:30pm PST

Algorithms & Accelerators I: Performance of DGTD Finite Element Methods for the RTM Procedure on GPU Clusters

WATCH THE PRESENTATION

Nodal discontinuous Galerkin time-domain (DGTD) methods exhibit attractive features for the large scale simulation of seismic waves in complex media. First, such methods provide accurate wavefield solutions for complicated geological structures thanks to the use of unstructured meshes and high-degree discontinuous basis functions. Additionally, the dense algebraic operations required per element and the weak element-to-element coupling of DGTD methods make them suitable schemes for efficient computations on modern clusters with massively parallelized many-core devices, such as GPUs. Both these aspects, accuracy and computational performance, are very important for seismic imaging in the Oil and Gas industry.

In collaboration with Shell, we have conceived a high-performance tool for seismic migration that can be run on clusters of GPUs [*]. This tool, named RiDG, includes reverse time migration (RTM) capabilities and multiple wave models. The model solver is based on a high-order DGTD method for first-order systems which uses unstructured meshes and multi-rate local time-stepping to efficiently deal with multi-scale solutions. Imaging conditions based on vertical characteristics provide improved RTM images.

We adopted the MPI+X approach for distributed programming together with OCCA, a unified framework to make use of major multi-threading languages (e.g. OpenMP, OpenCL and CUDA), offering a flexible approach to handling the multi-threading X. While the RTM procedure generally has extensive data storage requirements with slow I/O, the low storage requirements for DGTD boundary data allow halo trace data to be stored in memory rather than relying on disk-based checkpointing. The load balancing of our implementation reduces both device-host data movement and MPI node-to-node communication.

In this talk, we present the main features of our RTM implementation and recent results for GPU computing. In particular, the computational performance of the DGTD solver is analysed using the roofline model and compared with alternative strategies. The strong scalability of the implementation is tested using a three-dimensional RTM synthetic case on a GPU cluster. These results confirm the quality of the RiDG implementation and the relevance of the programming strategies.
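
As a reminder of what the roofline model states, the short sketch below bounds attainable performance by the smaller of peak compute throughput and memory bandwidth times arithmetic intensity. The function names and the hardware numbers are hypothetical and are not measurements from the talk.

```python
def roofline_gflops(peak_gflops, bandwidth_gb_s, flops_per_byte):
    """Upper bound on attainable GFLOP/s for a kernel under the roofline model."""
    return min(peak_gflops, bandwidth_gb_s * flops_per_byte)


def ridge_point(peak_gflops, bandwidth_gb_s):
    """Arithmetic intensity (FLOP/byte) above which a kernel is compute-bound."""
    return peak_gflops / bandwidth_gb_s


# Hypothetical accelerator: 4000 GFLOP/s peak, 200 GB/s memory bandwidth.
PEAK, BANDWIDTH = 4000.0, 200.0

low_intensity_bound = roofline_gflops(PEAK, BANDWIDTH, 2.0)    # memory-bound kernel
high_intensity_bound = roofline_gflops(PEAK, BANDWIDTH, 50.0)  # compute-bound kernel
ridge = ridge_point(PEAK, BANDWIDTH)
```

Comparing a kernel's measured throughput with this bound at its own arithmetic intensity shows how much headroom remains on a given device.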

[*] A. Modave, A. St-Cyr, W.A. Mulder, and T. Warburton. A nodal discontinuous Galerkin method for reverse-time migration on GPU clusters. Geophysical Journal International, 203(2):1419–1435, 2015.

Speakers

Axel Modave

Postdoctoral Associate, Virginia Tech

Amik St-Cyr

Tim Warburton

John K. Costain Chair & Professor of Mathematics, Virginia Tech

Wednesday March 2, 2016 3:30pm - 3:50pm PST
BioScience Research Collaborative Building (BRC), Room 280 & 282

3:30pm PST

Data Analytics Approaches & Tools: Handling Clusters with a Task-based Runtime System: Application to Geophysics

WATCH THE PRESENTATION

Many paradigms of parallelism have been derived from MPI to form MPI+X combinations in order to address, for instance, load-imbalance issues. Unfortunately, these solutions are difficult to develop, port and optimize since they involve different programming levels and because they generally use a static mapping in the MPI layer. We propose to use a single task-based paradigm which offers dynamism through work-stealing and which can tackle distributed heterogeneous machines using advanced runtime systems. The ease of portability comes from the powerful DAG description, which hides the hardware and prevents the use of explicit communications. We compared MPI-based and task-based versions on Geophysics simulations, especially on the DIVA code of Total. Our previous studies demonstrated the superiority of the task-based paradigm on shared memory architectures (CPU or MIC). We are now working on distributed and heterogeneous architectures (CPUs+MICs) and, according to our preliminary results, the performance is still better than that of the MPI version.

Speakers

Emmanuel Agullo

Helene Barucq

senior research scientist, Inria

Lionel Boillot

Expert Engineer, Inria

George Bosilca

Henri Calandra

Total

Henri Calandra obtained his M.Sc. in mathematics in 1984 and a Ph.D. in mathematics in 1987 from the Universite des Pays de l’Adour in Pau, France. He joined Cray Research France in 1987 and worked on seismic applications. In 1989 he joined the applied mathematics department of...

Julien Diaz

Corentin Rossignon

1 Rice2 pdf

Wednesday March 2, 2016 3:30pm - 3:50pm PST
BioScience Research Collaborative Building (BRC), Room 103

3:50pm PST

Algorithms & Accelerators I: Efficient Reverse Time Migration on APU Clusters

WATCH THE PRESENTATION

Reverse Time Migration (RTM) is a numerical method for seismic imaging that is widely used in the Oil & Gas industry. RTM uses the two-way travel time of wave propagation to place (or migrate) dipping temporal events in their true subsurface spatial locations. Processing these reflections produces a synthetic image of the subsurface geologic structure. The RTM workflow is very time-consuming and resource-demanding in terms of compute power, memory bandwidth and storage capacity, as prodigious amounts of data (terabytes) are generated during computations (namely wavefield snapshots). Thanks to the advent of high performance computing facilities, RTM is seeing significant acceleration using clusters of multi-core CPUs [1, 2, 3] and GPUs (Graphics Processing Units) [4, 5, 6, 7]. Although CPU clusters have shown performance gains by spreading large datasets over connected compute nodes, and even larger performance enhancements can be achieved by GPU clusters thanks to the massively parallel architecture of GPUs, GPU-based solutions suffer from some limitations, namely small GPU memory capacities, overheads incurred by the PCI interconnection between CPU and GPU that may bottleneck seismic imaging applications, and high power consumption.

Recently, AMD has released a new architecture, the Accelerated Processing Unit (APU), that combines CPU cores and GPU cores on the same silicon die. This hardware design benefits from both CPU and GPU advantages and eliminates the PCI Express interconnect between the CPU and the GPU. The APU hardware design can thus be an attractive solution for efficient depth imaging. Besides, the APU can be considered a low-power HPC chip (between 60 and 95 watts of TDP). However, the integrated GPUs are about one order of magnitude less powerful in compute and have less internal memory bandwidth than high-end discrete GPUs. Moreover, multiple compute nodes are required in order to process realistic RTM cases: the impact of the APU architecture on communications thus also has to be considered.

We focus our interest on the implementation and deployment of the 3D acoustic RTM in isotropic media on an APU cluster using Fortran+OpenCL+MPI. We rely on a three-dimensional 8th order finite difference approximation of the acoustic wave equation to simulate the wave propagation during both the forward and backward sweeps of the RTM algorithm, and use the selective checkpointing method (with a data checkpointing frequency equal to 10) to reconstruct the source wavefield.
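
For readers unfamiliar with high-order stencils, the following sketch applies the standard 8th-order central-difference weights for a second derivative to a periodic one-dimensional field and compares the error with a 2nd-order stencil. It is only an illustration in plain Python under simplifying assumptions (1D, periodic grid, sine test function); the names and grid size are hypothetical, and it is not the authors' Fortran+OpenCL+MPI implementation.

```python
import math

# Standard central-difference weights for a second derivative,
# listed for offsets 0, +/-1, +/-2, ... grid points.
COEFFS_8TH = [-205.0 / 72.0, 8.0 / 5.0, -1.0 / 5.0, 8.0 / 315.0, -1.0 / 560.0]
COEFFS_2ND = [-2.0, 1.0]


def second_derivative(u, dx, coeffs):
    """Apply a symmetric central stencil to a periodic 1D field."""
    n = len(u)
    out = []
    for i in range(n):
        acc = coeffs[0] * u[i]
        for k in range(1, len(coeffs)):
            acc += coeffs[k] * (u[(i - k) % n] + u[(i + k) % n])
        out.append(acc / (dx * dx))
    return out


def max_error_on_sine(n, coeffs):
    """Max error against the exact result d2/dx2 sin(x) = -sin(x) on [0, 2*pi)."""
    dx = 2.0 * math.pi / n
    u = [math.sin(i * dx) for i in range(n)]
    d2u = second_derivative(u, dx, coeffs)
    return max(abs(d2u[i] + u[i]) for i in range(n))


err_8th = max_error_on_sine(64, COEFFS_8TH)
err_2nd = max_error_on_sine(64, COEFFS_2ND)
```

On the same grid the 8th-order stencil is far more accurate than the 2nd-order one, which is why high-order stencils allow coarser grids for a given accuracy at the cost of a wider memory footprint per point.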

In this talk, we give an overview of the application implementation details with a particular emphasis on the impact of the APU's new unified memory model on the RTM workflow. Then, we present the OpenCL single-precision performance results and the power efficiency estimation of the RTM on a 16-node cluster, each node having an A10-7850 APU (code-named Kaveri). We show the relevance of APUs in a seismic imaging context by presenting the pros and cons of such HPC platforms for RTM, and compare them against traditional solutions by means of strong and weak scaling tests.

Speakers

Henri Calandra

Total

Pierre Fortin

Jean-Luc Lamotte

Issam Said

Computational Scientist, TOTAL/LIP6

My current research interests include high performance computing (HPC), numerical methods and applied geophysics. I specialize in adapting scientific applications to hardware accelerators, mainly Graphics Processing Units (GPUs). It implies improving the performance of numerical solvers...

2 isaid oghpc2016 pdf

Wednesday March 2, 2016 3:50pm - 4:10pm PST
BioScience Research Collaborative Building (BRC), Room 280 & 282

3:50pm PST

Data Analytics Approaches & Tools: Reverse Time Migration via Resilient Distributed Datasets: Towards In-Memory Coherence of Seismic-Reflection Wavefields Using Thunder via Apache Spark

WATCH THE PRESENTATION

The need to cross-correlate two wavefields in the application of Reverse Time Migration’s imaging condition remains one of two fundamental challenges with use of the method in practice (e.g., Liu et al., Computers & Geosciences 59, 17–23, 2013). In a significant departure from previous approaches, this computational challenge is addressed here through the introduction of Resilient Distributed Datasets (RDDs) for RTM’s precomputed source wavefields. RDDs are a relatively recent abstraction for in-memory computing ideally suited to distributed computing environments like clusters (Zaharia et al., NSDI 2012, http://www.cs.berkeley.edu/~matei/papers/2012/nsdi_spark.pdf). Originally introduced for Big Data Analytics and popularized (e.g., Lumb, “8 Reasons Apache Spark is So Hot”, insideBIGDATA, http://insidebigdata.com/2015/03/06/8-reasons-apache-spark-hot/, 2015) through the open-source implementation known as Apache Spark (https://spark.apache.org/), RDDs also appear promising in recontextualizing RTM’s imaging condition.

Recent work has already indicated that seismic reflection data in accepted industry formats can be distributed in memory across a cluster using Apache Spark (Yan et al., “A Spark-based Seismic Data Analytics Cloud”, 2015 Rice Oil & Gas Workshop, Houston, TX, http://rice2015.og-hpc.org/technical-program/). And although Lumb (“RTM Using Hadoop Is There a Case for Migration?”, 2015 Rice Oil & Gas Workshop, Houston, TX, http://rice2015.og-hpc.org/technical-program/) has indicated that RDDs and Spark appear promising for impacting RTM in a number of ways (e.g., in allowing for the implementation of imaging conditions using alternatives to cross-correlation), attention here focuses on the use of RDDs for facilitating the assessment of coherence between seismic-reflection wavefields in memory. More specifically, an algorithm that significantly reduces the impact of disk I/O in the wavefield manipulations required by RTM is proposed based on RDDs and subsequently prototyped using open-source Thunder (http://thunder-project.org/) via Apache Spark.
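To make the cross-correlation step concrete, here is a minimal sketch of the zero-lag cross-correlation imaging condition, in which the image at each grid point is the sum over time of the source wavefield times the receiver wavefield. It is plain single-process Python standing in for the distributed case; it does not use Spark, RDDs, or Thunder, and the function name and tiny wavefields are invented for illustration.

```python
def crosscorrelation_image(source_wavefield, receiver_wavefield):
    """Zero-lag cross-correlation imaging condition.

    Both inputs are lists of time snapshots; each snapshot is a flat list
    of samples on the same spatial grid. Returns one image value per grid
    point: the sum over time of source * receiver.
    """
    n_points = len(source_wavefield[0])
    image = [0.0] * n_points
    for src_snap, rcv_snap in zip(source_wavefield, receiver_wavefield):
        for ix in range(n_points):
            image[ix] += src_snap[ix] * rcv_snap[ix]
    return image


# Tiny made-up wavefields: 3 time steps on a 4-point grid.
source = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
receiver = [[0.0, 0.0, 0.0, 0.0], [0.0, 2.0, 0.0, 0.0], [0.0, 0.0, 0.5, 0.0]]
image = crosscorrelation_image(source, receiver)
```

The source snapshots must be available when the receiver wavefield is processed, which is why keeping precomputed source wavefields in memory rather than on disk matters for this step.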

Speakers

Ian Lumb

Solutions Architect, Univa Corporation

2 Lumb RevTimeMig 2016RiceOGC Spark'd RTM Mar2 2016 pdf

Wednesday March 2, 2016 3:50pm - 4:10pm PST
BioScience Research Collaborative Building (BRC), Room 103

4:10pm PST

Algorithms & Accelerators I: GPU-accelerated Discontinuous Galerkin Methods on Hybrid Meshes

WATCH THE PRESENTATION

Time-domain discontinuous Galerkin (DG) solvers combine high order accurate approximations with unstructured meshes, and are effective for the simulation of seismic wave propagation. DG solutions are allowed to be discontinuous across elements, and neighboring elements are coupled weakly through a numerical flux. The use of unstructured meshes in tandem with discontinuous approximations makes it possible to accurately capture sharp interfaces and geological features. Additionally, due to this weak coupling, DG solvers exhibit high parallel scalability, and benefit greatly from acceleration using Graphics Processing Units (GPUs). Finally, GPU-accelerated DG solvers have shown promise as efficient propagators for reverse time migration.

It is well known that high order DG on hexahedral meshes yields very efficient computational kernels due to the tensor product structure of hexahedra. However, producing high quality hexahedral meshes for complex domains is presently a difficult and non-robust procedure. Hybrid meshes, which consist of wedge and pyramidal elements in addition to hexahedra and tetrahedra, have been proposed to leverage the efficiency of hexahedral elements for more general geometries. We extend efficient DG solvers to hybrid meshes containing multiple types of elements, which have the potential to produce propagation models with improved accuracy at reduced computational cost. We propose efficient, low-storage implementations of DG on GPUs for each type of element and discuss the extension of multi-rate time stepping strategies for acoustic wave propagation to hybrid meshes.

Speakers

Jesse Chan

Zheng Wang

Tim Warburton

John K. Costain Chair & Professor of Mathematics, Virginia Tech

3 og2016 pdf

Wednesday March 2, 2016 4:10pm - 4:30pm PST
BioScience Research Collaborative Building (BRC), Room 280 & 282

4:10pm PST

Data Analytics Approaches & Tools: Applying Big Data Analytics to Seismic Interpretation

WATCH THE PRESENTATION

Machine learning, including deep learning, is the core technology in big data analytics. The petroleum industry is one of the big data domains facing the challenges of rapidly increasing data volume and velocity. In this paper, we attempt to demonstrate the applicability of machine learning technology in identifying geological features from seismic data volumes. We compare traditional methods and machine learning methods in our test cases. We also present our seismic data analytics platform, built on top of Hadoop and Spark, which provides a productive and scalable environment for tackling these big data challenges.

Speakers

Chao Chen

Ted Clee

President, TEC Applications Analysis

TEC Applications Analysis

Lei Huang

Assistant Professor, Prairie View A&M University

Yuzhong Yan

3 Rice OnG HPC talk pdf

Wednesday March 2, 2016 4:10pm - 4:30pm PST
BioScience Research Collaborative Building (BRC), Room 103

4:30pm PST

Algorithms & Accelerators I: Parallelizing Seismic One-Way Based Migration for GPUs Using OpenACC

WATCH THE PRESENTATION

One-Way Migration is a popular algorithm used in the industry for migrating seismic data. A parallel version of this algorithm using MPI is widely implemented. GPUs have become popular in the past few years in HPC due to their high computational throughput at reasonable power consumption, and hybrid CPU-GPU architectures are seen as a stepping stone towards next generation supercomputing. In this talk, we describe our experience in using OpenACC to parallelize One-Way Migration on NVIDIA GPUs.

In seismic applications, input data is typically made up of 'shots' which are processed independently using MPI tasks. One-Way Migration uses Fourier Finite Differencing. 'Phase-Shift' and 'Wide-Angle Correction' form the bulk of the computation for every task, taking as much as 80% of the computation time. These components form good candidates for computation on GPUs and thereby reduce application runtime by a significant amount.

Traditionally, low-level programming languages and extensions such as CUDA are used for programming on GPUs. However, this is non-trivial and the resulting code is not portable between different GPU architectures. As different accelerator technologies are being evaluated as the path forward towards exascale computing, code portability is highly desired. OpenACC is an emerging directive-based programming model for accelerators, similar to OpenMP. Application users can annotate their code using pragmas that instruct the compiler to generate appropriate device code.

We use OpenACC to parallelize One-Way Migration, which involves optimizing kernels that involve FFT operations and solving systems of tridiagonal sparse matrices. The process of optimizing applications for GPUs is discussed along with challenges and potential pitfalls for application users. We discuss our experience with different compilers and the evolution of OpenACC as a standard. Using acoustic isotropic data, we are able to improve the performance of the application by a factor of 3 using the NVIDIA K20X GPU on the Titan supercomputer at Oak Ridge National Lab, as compared with the CPU-only version of the application run on an 8-core SandyBridge CPU. However, the performance of an application that uses OpenACC does not match that written in CUDA, and it would be beneficial for future versions of compilers to focus on reducing this gap.
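For readers unfamiliar with the tridiagonal solves mentioned above, the following is a minimal serial sketch of the textbook Thomas algorithm. It is not the authors' OpenACC kernel and says nothing about how they mapped the solve to the GPU; the function name and argument layout are assumptions for illustration.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Solve a tridiagonal system with the Thomas algorithm.

    lower[i] multiplies x[i-1] in row i (lower[0] is ignored),
    upper[i] multiplies x[i+1] in row i (upper[-1] does not affect x).
    No pivoting is done, so the matrix is assumed diagonally dominant.
    """
    n = len(diag)
    c_prime = [0.0] * n
    d_prime = [0.0] * n
    c_prime[0] = upper[0] / diag[0]
    d_prime[0] = rhs[0] / diag[0]
    # Forward sweep: eliminate the sub-diagonal.
    for i in range(1, n):
        denom = diag[i] - lower[i] * c_prime[i - 1]
        c_prime[i] = upper[i] / denom
        d_prime[i] = (rhs[i] - lower[i] * d_prime[i - 1]) / denom
    # Back substitution.
    x = [0.0] * n
    x[-1] = d_prime[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d_prime[i] - c_prime[i] * x[i + 1]
    return x


# Small made-up system with known solution x = [1, 2, 3].
x = solve_tridiagonal(
    lower=[0.0, -1.0, -1.0],
    diag=[2.0, 2.0, 2.0],
    upper=[-1.0, -1.0, 0.0],
    rhs=[0.0, 0.0, 4.0],
)
```

Each sweep is inherently sequential along one system, so accelerator versions typically parallelize across the many independent systems rather than within one.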

Speakers

Oscar Hernandez

Maxime Hugues

Kshitij Mehta

HPC R&D Scientist, Total E&P

4 Rice2016 Kshitij pdf

Wednesday March 2, 2016 4:30pm - 4:50pm PST
BioScience Research Collaborative Building (BRC), Room 280 & 282

4:30pm PST

Data Analytics Approaches & Tools: SweetSpot Identification Using Machine Learning for Unconventionals

PRESENTATION NOT AVAILABLE

Reducing cost in well drilling and completion while improving the productivity of unconventional (UNC) reservoirs is vitally important. The physics model based simulation technologies that achieved great success in exploring conventional reservoirs have not been as effective when applied to UNC plays. The best methodology to determine where to drill and how to complete remains elusive. It is challenging to accurately and rapidly characterize the high EUR regions of a UNC play with early exploration data to provide guidance on where to drill new wells.

Machine learning (ML) techniques are data driven and can incorporate pertinent information from input data sources and learn the underlying complex and hidden inter-relationships and patterns. ML provides a promising means to tackle the complex exploration and production problems arising from UNC plays, where the underlying physics is not well known, or where the physical models are highly uncertain.

This contribution describes ML methodologies to tackle a particular challenge: based on available exploration and production data (often scarce) from a play, can we accurately predict the emerging top productive areas, the so-called sweetspots? The workflow has two stages: (i) data integration and preprocessing, which generates a set of feature variables (or predictors) from the original data; (ii) predictive modeling, where a predictive model is built from the predictors and production data using machine learning algorithms. The workflow has been applied to unconventional datasets for sweetspot identification. The results show that the methodology is promising.

Speakers

Detlef Hohl

Chief Scientist Computation and Data Science, Shell International E&P, Inc.

Ligang Lu

Shell International Exploration and Production Inc.

Xiao Wang

Mingqi Wu

Statistical Consultant, Shell

Statistical Consultant / Data Scientist

Wednesday March 2, 2016 4:30pm - 4:50pm PST
BioScience Research Collaborative Building (BRC), Room 103

4:50pm PST

Algorithms & Accelerators I: GpuWrapper: A Portable API for Heterogeneous Programming at CGG

WATCH THE PRESENTATION

To increase the portability of our GPU-accelerated applications, we designed an API for heterogeneous programming that abstracts CUDA and OpenCL. Applications targeting the GpuWrapper API can thus run on NVIDIA GPUs and on all devices supporting OpenCL. Moreover, this common API should future-proof our applications against uncertainty in the evolution of hardware and programming models.

Speakers

Victor Arslan

High Performance Computing Research Engineer, CGG

High Performance Computing Research Engineer. I graduated with a Master's in Applied Mathematics in 2009 and specialize in software programming on massively parallel architectures. I work at CGG on technology forecasting for computational accelerators, including the Intel MIC architecture...

Jean-Yves Blanc

Chief IT Architect, CGG

Gina Sitaraman

Marc Tchiboukdjian

IT Architect, CGG

Guillaume Thomas-Collignon

5 gpuwrapper slides pdf

Wednesday March 2, 2016 4:50pm - 5:10pm PST
BioScience Research Collaborative Building (BRC), Room 280 & 282

4:50pm PST

Data Analytics Approaches & Tools: Scalable data-driven predictive model application for real-time operations monitoring

PRESENTATION NOT AVAILABLE

Surveillance of mission-critical oil and gas operations has transitioned from mere real-time monitoring to a more proactive scheme. Increased use of data-driven models and machine learning techniques has enabled early notification and lead-time prediction of undesirable events.

The conventional model learning, verification, and testing process is inherently biased due to subjective sampling of the population space or restrictive computational resource constraints. From a practical perspective, scaling model operability beyond a sub-space of the global operations footprint is a desired business objective that stretches the limits of training such models and maintaining their validity in a continuously evolving environment with growing big data feeds.

We present some of the lessons learned, challenges, and practical implications for designing scalable data-driven models and visualizations that leverage an integrated big data streaming infrastructure in a distributed setting. The heterogeneous infrastructure supports a more active learning approach, adapting to the environment and process dynamics in a continuously evolving system.

Speakers

Mohamed Sidahmed

Data Analytics Scientist, BP

Pradeep Vaswani

Wednesday March 2, 2016 4:50pm - 5:10pm PST
BioScience Research Collaborative Building (BRC), Room 103

5:10pm PST

Networking Reception

Wednesday March 2, 2016 5:10pm - 7:00pm PST
BioScience Research Collaborative Building (BRC)

Networking Reception

7:30am PST

Conference Registration, Breakfast, & Networking

Thursday March 3, 2016 7:30am - 8:30am PST
BioScience Research Collaborative Building (BRC)

Registration

8:30am PST

Message from Organizer

WATCH THE PRESENTATION

Speakers

Jan E. Odegard

Executive Director Ken Kennedy Institute/ Associate Vice President Research Computing, Rice University

AM 0 Odegard Rice OGHPC 2016 DAY2 1 pdf

Thursday March 3, 2016 8:30am - 8:45am PST
BioScience Research Collaborative Building (BRC)

Welcome

8:45am PST

Keynote: 'Exploration seismology and the return of the supercomputer', Sverre Brandsberg-Dahl, PGS

WATCH THE PRESENTATION

Speakers

Sverre Brandsberg-Dahl

Sverre Brandsberg-Dahl is the Global Chief Geophysicist for the Imaging and Engineering Division at PGS. This division is responsible for delivering data processing and imaging services to customers around the world. It is also the home of PGS’ R&D organization where all aspects...

Thursday March 3, 2016 8:45am - 9:30am PST
BioScience Research Collaborative Building (BRC), Room 103

Keynote

9:30am PST

Plenary: 'HPC I/O today and the Road Ahead', Brent Gorda, Intel

PRESENTATION NOT AVAILABLE

The world of HPC I/O & storage is active and changing for the better. On both evolutionary and revolutionary paths, storage is evolving due to changing needs and the introduction of disruptive hardware. Lustre, the popular open source scale-out parallel file system, is advancing in response to modern workloads and will continue to be the choice for network-based high performance/capacity storage. However, new solid state storage is approaching to disrupt the memory/storage hierarchy. Storage hardware and software have always been important to HPC but at this point in time, the community is making great advances that will benefit HPC for a long time to come.

Speakers

Brent Gorda

General Manager, HPC Storage, Intel

Brent Gorda is the General Manager of HPC Storage at Intel. In 2010, Brent started Whamcloud to focus on the longevity of Lustre and sold the company to Intel in 2012. An industry veteran, Brent has several decades of experience in HPC, leading projects such as the BlueGene architecture...

Thursday March 3, 2016 9:30am - 10:00am PST
BioScience Research Collaborative Building (BRC), Room 103

10:00am PST

Break & Networking

Thursday March 3, 2016 10:00am - 10:30am PST
BioScience Research Collaborative Building (BRC)

Break

10:30am PST

Algorithms & Accelerators II: A High Performance Reservoir Simulator on GPU

PRESENTATION NOT AVAILABLE

The resolution and complexity of reservoir simulation models are increasing continuously in order to capture the geologic heterogeneity and multiphase physics in a reservoir with more fidelity. In addition, scenarios designed to investigate uncertainty quantification, history matching, and production optimization can take considerable time and computational power on these large scale models. The excessive run times not only limit the ability to simulate multiple realizations, but also lead to longer project timelines and can cause costly delays. Thus, an ultra-fast reservoir simulator offers significant value for efficient workflows that enable rapid, high-precision decision making.

Stone Ridge Technology and Marathon Oil Company have developed ECHELON, a state-of-the-art GPU-based reservoir simulator. The GPU implementation provides an extremely dense and efficient computational platform that can reduce the required hardware footprint and power usage. ECHELON executes all major computational tasks on the GPU, including property evaluation, construction and assembly of the Jacobian, and the CPR-AMG linear solver. GPUs allow faster numerical simulation for multi-million cell models that can run at least an order of magnitude faster than current parallel CPU codes.

In this talk, we give field example models for assessing both performance and accuracy of ECHELON. We address several potential bottlenecks on performance and our strategies to resolve them. We will also discuss our attempt to efficiently scale reservoir simulation to the GPU cluster using a combination of CUDA and MPI. Furthermore, we discuss the workflow challenges and requirements which need to be addressed to make the best use of this high performance simulator along with other existing tools.

Speakers

Rajesh Gandham

Reza Ghasemi

Reservoir Engineer, Stone Ridge Technology

Carlos G. Miranda

Karthik Mukundakrishnan

Edward Yang

Rice HPC Conference 2016 pdf

Thursday March 3, 2016 10:30am - 10:50am PST
BioScience Research Collaborative Building (BRC), Room 280 & 282

10:30am PST

Facilities, Infrastructure & Visualization: Experiences with Oil Immersion Cooling in a Seismic Processing Datacenter

WATCH THE PRESENTATION

CGG installed its first rack of oil immersion cooled compute systems in June 2011. A variety of lessons have been learned since then. This presentation covers some of those lessons, including actual cost savings (CapEx, OpEx), equipment failure rates, thermal performance, and operational issues. The presentation begins with an outline of the specific business scenario that led to considering oil immersion cooling. It closes with an outline of possible next steps and remaining hurdles. Before going into those details, our current thinking on the feasibility of oil immersion cooling can be summarized as follows:

• Oil immersion cooling provides significant ROI on a case by case basis depending on the specific business scenario.
• CapEx savings are realized as ‘deferring’ expenditures for a number of years.
• OpEx power savings are approximately 30% for standard high density air cooled servers.
• Our current oil immersion datacenter has an ‘Equivalent PUE’ of 1.05.
• There are specific equipment failure modes and drawbacks to oil immersion but these can be dealt with successfully.
• Significant additional savings remain to be exploited.

Business Scenario
The business scenario involved upgrading a PUE ~ 2 legacy datacenter. An upgrade to high efficiency air cooling and conversion to oil cooling were evaluated. Oil was chosen because of lower OpEx and the ability to defer CapEx.

Cost Savings
The OpEx part of cost savings comes to approximately 30% of the power consumed by air cooled equipment in a high-efficiency (PUE = 1.35) datacenter. The CapEx savings are significant but, because they involve NPV calculations and other business considerations, are not evaluated explicitly here.
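As a small worked example of what the PUE figures quoted in this abstract imply, the sketch below uses the usual definition of PUE (total facility power divided by IT equipment power). The 1 MW IT load is a hypothetical number chosen only to make the arithmetic readable; it is not from the talk.

```python
def facility_power_kw(it_load_kw, pue):
    """Total facility power implied by a PUE value (PUE = total / IT power)."""
    return it_load_kw * pue


IT_LOAD_KW = 1000.0  # hypothetical IT load, not a figure from the talk

legacy = facility_power_kw(IT_LOAD_KW, 2.0)   # legacy datacenter, PUE ~ 2
air = facility_power_kw(IT_LOAD_KW, 1.35)     # high-efficiency air cooling
oil = facility_power_kw(IT_LOAD_KW, 1.05)     # oil immersion, 'Equivalent PUE'

# Fractional reduction in total power from the PUE ratio alone.
saving_vs_air = 1.0 - oil / air
saving_vs_legacy = 1.0 - oil / legacy
```

The PUE ratio alone gives a reduction of roughly 22% against the PUE = 1.35 case and roughly 47% against the legacy case. The approximately 30% figure quoted above is the authors' own and is not derived by this simple ratio.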

Oil Immersion Challenges
There are unique challenges raised by oil immersion cooling. These can be dealt with successfully. They include material degradation (plastics & silicones degrade in one way or another in oil), lower equipment density, and an ‘oily’ work environment. Component failure rates are similar in oil and air.

Oil Immersion Bonuses
There are several new benefits provided by oil. Decreased sound levels and greater thermal inertia provide operational benefits. An increase in thermal headroom of 20 °C is observed for the hottest components.

Next Steps
There are significant opportunities arising from increased thermal headroom. A simple thermal model shows how server density can be increased and the feasibility of warm water cooling.

Speakers

Randy Anderson

Ted Barragy

Cemil Ozyalcin

Data Center Infrastructure Engineer, CGG

Experiences with Oil Immersion Cooling copy pdf

Thursday March 3, 2016 10:30am - 10:50am PST
BioScience Research Collaborative Building (BRC), Room 103

10:50am PST

Algorithms & Accelerators II: Experience with Two-stage Constraint Pressure Residual Preconditioning in Production Reservoir Simulation

PRESENTATION NOT AVAILABLE

Numerical reservoir simulation is an important tool for the oil and gas industry. It is used to aid the development and forecasting of hydrocarbon reservoirs. With recent advances in parallel reservoir simulation, we have the capability to simulate larger and more realistic problems in a much shorter time frame than ever before. In a fully-implicit simulator, most of the computational time is spent on solving a sequence of large-scale and ill-conditioned Jacobian systems resulting from the discretization of the material balance equations. Consequently, an efficient preconditioning strategy is required.

In this talk, we share our experience on using CPR (Constrained Pressure Residual) and BILU (Block Incomplete LU) preconditioners for large scale linear systems resulting from black-oil and compositional reservoir simulations on massively parallel computing architectures. Our numerical findings demonstrate that the CPR preconditioner is more efficient than BILU when running on a moderate core count, while the superior scalability of BILU makes it the preferred choice on larger core counts. This illustrates the importance of having multiple preconditioning options.

Speakers

Bret Beckner

Vadim Dyadechko

Jizhou Li

Ilya Mishev

Thursday March 3, 2016 10:50am - 11:10am PST
BioScience Research Collaborative Building (BRC), Room 280 & 282

10:50am PST

Facilities, Infrastructure & Visualization: Visualization of Massive Seismic Data in HPC

PRESENTATION NOT AVAILABLE

Seismic data processing is a very important tool for revealing geological structures, lithology properties, and fluid contents from seismic field data. It is also a computationally intensive operation when applied to the multiple terabytes of data acquired in a modern 3D survey. In many cases, the processing steps consist of applying relatively simple algorithms to these massive datasets. To obtain results in a timely fashion, the data is commonly processed in a High Performance Computing (HPC) center. The processing results need to be visualized and verified. In this abstract, we describe how the graphics server concept has been adapted to provide interactive 2D and 3D visualization for these massive datasets in HPC. Researchers transparently use the same SSH security as the HPC environment for visualization. A web portal lets users easily obtain a remote desktop that runs on an HPC graphics server. Multiple users can share a graphics server. As a result, researchers can interactively visualize hundreds of gigabytes of two-dimensional datasets that could not be displayed previously. With graphics hardware in the graphics server, 3D visualization performance is much better than on a local workstation.

Speakers

Jim Ching-Rong Lin

Thursday March 3, 2016 10:50am - 11:10am PST
BioScience Research Collaborative Building (BRC), Room 103

11:10am PST

Algorithms & Accelerators II: An Efficient High Accuracy Discretization and Direct Solution Technique for Variable Coefficient Partial Differential Equations

WATCH THE PRESENTATION

The ability to efficiently and accurately solve variable coefficient partial differential equations (PDEs) is critical for numerical simulations in seismic imaging. In this talk, we present a high-order accurate discretization technique for these challenging problems that comes complete with an efficient and robust direct solver. The method utilizes a local high order discretization, gluing neighboring regions with continuum operators. The resulting sparse linear system is inherently amenable to a direct solver similar to nested dissection, whose precomputation scales asymptotically no worse than O(N^{3/2}), where N is the number of discretization points. The cost of applying the solver scales at worst as O(N log(N)) with a tiny constant. The result is a method that is ideally suited for the ill-conditioned problems with many right-hand sides that consistently arise in the seismic imaging community. For applications where the coefficients of the PDE change locally in the geometry, such as in many inversion algorithms, the proposed method is naturally able to re-use information from the static regions, making local updates extremely inexpensive. Numerical results will illustrate the performance of the proposed method.
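To see why a one-time precomputation pays off when there are many right-hand sides, the sketch below evaluates a toy cost model built from the asymptotic bounds quoted above. All constants are set to 1, which is an assumption made purely for illustration; the function name and problem sizes are hypothetical and say nothing about the actual solver's timings.

```python
import math


def direct_solver_cost(n, num_rhs):
    """Asymptotic cost model with all constants set to 1 (an assumption).

    Precomputation is taken at the stated upper bound N^(3/2); each
    right-hand side then costs at most N log N to apply the solver.
    """
    precompute = n ** 1.5
    per_solve = n * math.log(n)
    return precompute + num_rhs * per_solve


n = 1_000_000
one_rhs = direct_solver_cost(n, 1)
many_rhs = direct_solver_cost(n, 1000)
cost_per_rhs = many_rhs / 1000
```

Under these assumptions the amortized cost per right-hand side falls by well over an order of magnitude once the precomputation is shared across a thousand solves.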

Speakers

Adrianna Gillman

Thursday March 3, 2016 11:10am - 11:30am PST
BioScience Research Collaborative Building (BRC), Room 280 & 282

11:10am PST

Facilities, Infrastructure & Visualization: Big Seismic Data: Increase Performance for HPC and Interpretation, and Reduce Infrastructure Cost

WATCH THE PRESENTATION

Oil companies and service companies amass seismic data at the rate of hundreds of terabytes or petabytes per year. Driven by an increase in resolution and new acquisition methods, seismic data sets are larger than ever before. As a consequence, data storage systems are the fastest growing sales segment according to official numbers.
Another consequence is that internal networks quickly become a bottleneck. As users require access to increasingly large datasets, whether pre-stack or post-stack, networks are constantly overloaded. We will present a new software-based approach to significantly improve storage capacity and simultaneously increase effective network bandwidth between central storage and consumers of seismic data. The described approach does not require upgrades or modifications to hardware, and thus enables the oil company or service company to leverage previous investments. Compared to commonly available commercial software, Hue’s implementation is approximately 25 times faster on compression and more than 200 times faster on decompression. This speed ensures that the overhead of compression and decompression is minimized. In fact, it becomes much faster to access compressed data than the original data, providing a significant I/O boost to any application using the compressed data. As will be described, the proposed approach is generally transparent to end users.
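The claim that compressed data can be faster to access than the original can be checked with a simple timing model, sketched below. It assumes transfer and decompression happen one after the other with no overlap; every number in it (dataset size, network bandwidth, compression ratio, decompression rate) is made up for illustration and is not one of Hue's measurements.

```python
def read_time_raw(size_gb, network_gb_per_s):
    """Seconds to move uncompressed data over the network."""
    return size_gb / network_gb_per_s


def read_time_compressed(size_gb, network_gb_per_s, ratio, decompress_gb_per_s):
    """Seconds to move compressed data, then decompress it (no overlap).

    ratio is uncompressed size / compressed size; decompress_gb_per_s is
    the rate at which uncompressed output is produced.
    """
    transfer = (size_gb / ratio) / network_gb_per_s
    decompress = size_gb / decompress_gb_per_s
    return transfer + decompress


# Hypothetical scenario: 100 GB volume, 1 GB/s network,
# 5:1 compression, 10 GB/s decompression.
raw_seconds = read_time_raw(100.0, 1.0)
compressed_seconds = read_time_compressed(100.0, 1.0, 5.0, 10.0)
speedup = raw_seconds / compressed_seconds
```

In this model the compressed path wins whenever the decompression rate exceeds the network bandwidth times ratio / (ratio - 1), which is why decompression speed is the deciding factor.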

Speakers

Michele Isernia

VP Strategy & Alliances, Hue Technology N.A

Ideas and Innovation grounded to global business development, mostly in "enterprise" type businesses.

3 Hue FAST for Rice HPC event 2016 copy pdf

Thursday March 3, 2016 11:10am - 11:30am PST
BioScience Research Collaborative Building (BRC), Room 103

11:30am PST

Algorithms & Accelerators II: Hybrid Parallel Implementation of the DG Method

WATCH THE PRESENTATION

The majority of supercomputers today are built using the so-called hybrid architecture, where the central processing units are complemented by one or more accelerator units. In order to achieve optimal performance on these hybrid architectures, many parallel applications must be modified to support more than one level of parallelism. This requirement can be met by combining multiple parallel programming models and their implementations, each designed to perform best on its targeted architecture. In this work, we evaluate the parallel scalability of the Discontinuous Galerkin (DG) method using a hybrid parallel programming model. We combine three levels of parallelism to achieve the best performance on a GPU-enabled supercomputer. We also compare the performance of a purely distributed implementation with that of our hybrid implementation.

Speakers

Nabil Chaabane

Postdoc fellow, Rice University

Numerical methods for PDEs, High performance computing, MPI, OpenMP, GPU

Samy Hamlaoui

Beatrice Riviere

Noah Harding Chair and Professor, Rice University

Mikhail Serkachev

4 Nabil Chaabane pptx

Thursday March 3, 2016 11:30am - 11:50am PST
BioScience Research Collaborative Building (BRC), Room 280 & 282

11:30am PST

Facilities, Infrastructure & Visualization: Data Centric Optimizations of Seismic Natural Migration Algorithm at Scale on Parallel File Systems and Burst Buffer

WATCH THE PRESENTATION

Parallel I/O is an integral component of modern high performance computing, especially for storing and processing very large datasets, as in seismic imaging applications. The storage hierarchy nowadays includes additional layers, the latest being SSD-based storage used as a burst buffer for I/O acceleration.
We analyze here the performance of an I/O-intensive seismic application, a natural migration algorithm at scale, on a large installation of the Lustre parallel file system and an SSD-based burst buffer. Our results show a significant performance improvement from tuning the Lustre stripe count and its counterpart in burst buffer technology for various node counts. The advantage of the burst buffer is demonstrated with up to 34% performance improvement when compared to Lustre.

Speakers

Abdullah Altheyab

King Abdullah University of Science and Technology

Saber Feki

Computational Scientist, KAUST Supercomputing Laboratory

Saber Feki received his M.S. and PhD in computer science at the University of Houston in 2008 and 2010, respectively. In 2011, he joined the oil and gas industry with TOTAL as an HPC Research Scientist working on seismic imaging applications using different programming models including...

Gerard Schuster

4 SeismicNaturalMigration IO BB Opt copy pdf

Thursday March 3, 2016 11:30am - 11:50am PST
BioScience Research Collaborative Building (BRC), Room 103

11:50am PST

Algorithms & Accelerators II: A Survey of Sparse Matrix-Vector Multiple Performance on Large Matrices

PRESENTATION NOT AVAILABLE

Iterative linear solvers are popular in large-scale computing as they consume less memory than direct solvers. Contrary to direct linear solvers, iterative solvers approach the solution gradually, requiring the computation of sparse matrix-vector (SpMV) products. The evaluation of SpMV products can emerge as a bottleneck for computational performance within the context of the simulation of large problems. In this work, we focus on a linear system arising from the discretization of the Cahn-Hilliard equation, which is a fourth order nonlinear parabolic partial differential equation that governs the separation of a two-component mixture into phases [3]. The underlying spatial discretization is performed using the discontinuous Galerkin method and Newton's method. A number of parallel algorithms and strategies have been evaluated in this work to accelerate the evaluation of SpMV products.
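For reference, a minimal serial SpMV kernel is sketched below. The abstract does not say which sparse storage format was studied; compressed sparse row (CSR) is assumed here only because it is a common baseline, and the function name and the small example matrix are made up for illustration.

```python
def spmv_csr(values, col_indices, row_ptr, x):
    """Compute y = A @ x for a matrix A stored in CSR format."""
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for row in range(n_rows):
        acc = 0.0
        # Nonzeros of this row occupy values[row_ptr[row]:row_ptr[row + 1]].
        for k in range(row_ptr[row], row_ptr[row + 1]):
            acc += values[k] * x[col_indices[k]]
        y[row] = acc
    return y


# A = [[4, 0, 1],
#      [0, 3, 0],
#      [2, 0, 5]]
values = [4.0, 1.0, 3.0, 2.0, 5.0]
col_indices = [0, 2, 1, 0, 2]
row_ptr = [0, 2, 3, 5]
y = spmv_csr(values, col_indices, row_ptr, [1.0, 2.0, 3.0])
```

Rows are independent of one another, so the outer loop is the natural place to parallelize; the indirect access to x through col_indices is what tends to make the kernel memory-bound.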

Speakers

Mauricio Araya

Senior Researcher Computer Science, Shell Intl. E&P Inc.

Florian Frank

Rice University

Max Grossman

Christopher Thiel

Shell

Thursday March 3, 2016 11:50am - 12:10pm PST
BioScience Research Collaborative Building (BRC), Room 280 & 282

11:50am PST

Facilities, Infrastructure & Visualization: Advanced Parallel IO Libraries Study for Seismic Depth Imaging Applications

WATCH THE PRESENTATION

Seismic applications such as the Reverse Time Migration (RTM) are very demanding on HPC resources, being compute, memory or storage. Using those resources efficiently is critical on an industrial production system where a full processing campaign can take up to several months of intensive computations. Henceforth, extracting a maximum of performances from every part of a seismic processing application is a necessity.

IO operations are a critical part of a HPC seismic application and the nature of IO in this domain is of a great diversity in terms of access pattern: serial, parallel on a shared file or independently. Optimizing IO can become complex when considering the multiple level of storage in parallel at the local or system level.

Parallel IO in HPC environments is typically achieved through a mix of MPI-IO \cite{thakur1997users} for shared-file IO and POSIX IO for file-per-process data access. However, extracting good performance from the underlying file system at scale is difficult and requires a lot of optimization, tuning, and boilerplate code. For this reason, advanced parallel IO libraries have attracted increasing interest from the oil and gas industry, owing to the advanced data management semantics they offer, their simple implementation, and their high performance.

We present a study of the performance of two of those libraries, parallel HDF5 and ADIOS, for checkpointing and parallel shared-file access. The study was carried out on several HPC systems in an industrial environment. We show that advanced parallel IO libraries provide a good trade-off in terms of performance for seismic software.
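
The two access patterns contrasted in this abstract, one shared file versus one file per process, can be sketched with plain POSIX calls from Python's standard library. This is only an illustration of the patterns: it does not use MPI-IO, HDF5, or ADIOS, the "ranks" are simulated by a loop rather than real processes, and the names `write_shared`, `write_file_per_process`, `BLOCK`, and the `ckpt.*` file names are hypothetical. `os.pwrite` is assumed to be available (Unix-like systems).

```python
import os
import tempfile

BLOCK = 8  # bytes owned by each hypothetical rank


def write_shared(path, n_ranks):
    """Shared-file pattern: each rank writes its block at a disjoint offset."""
    fd = os.open(path, os.O_CREAT | os.O_WRONLY, 0o644)
    try:
        # The loop stands in for ranks that would run concurrently.
        for rank in range(n_ranks):
            payload = bytes([rank]) * BLOCK
            os.pwrite(fd, payload, rank * BLOCK)  # positional write, no seek
    finally:
        os.close(fd)


def write_file_per_process(directory, n_ranks):
    """File-per-process pattern: each rank writes its own file."""
    paths = []
    for rank in range(n_ranks):
        path = os.path.join(directory, f"ckpt.{rank}")
        with open(path, "wb") as f:
            f.write(bytes([rank]) * BLOCK)
        paths.append(path)
    return paths


with tempfile.TemporaryDirectory() as tmp:
    shared = os.path.join(tmp, "ckpt.shared")
    write_shared(shared, 4)
    shared_size = os.path.getsize(shared)

    # Read back only the block that rank 2 wrote.
    with open(shared, "rb") as f:
        f.seek(2 * BLOCK)
        rank2_block = f.read(BLOCK)

    n_files = len(write_file_per_process(tmp, 4))

print(shared_size, rank2_block, n_files)
```

The shared-file variant leaves a single checkpoint to manage but requires every writer to agree on offsets. The file-per-process variant needs no coordination but produces one file per rank. Libraries such as parallel HDF5 and ADIOS aim to hide this bookkeeping behind a higher-level data model.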

Speakers

Pierre-Yves Aquilanti

Jyothi Mangala Bhaskar

Thursday March 3, 2016 11:50am - 12:10pm PST
BioScience Research Collaborative Building (BRC), Room 103

12:10pm PST

Lunch & Networking

Thursday March 3, 2016 12:10pm - 1:30pm PST
BioScience Research Collaborative Building (BRC), Exhibit Hall

Break

1:30pm PST

Plenary: 'The Evolution of a Comprehensive Computation and Data Infrastructure at the Texas Advanced Computing Center', Dan Stanzione, TACC

Speakers

Dan Stanzione

Executive Director, Texas Advanced Computing Center, The University of Texas at Austin

Dr. Stanzione is the Executive Director of the Texas Advanced Computing Center (TACC) at The University of Texas at Austin. A nationally recognized leader in high-performance computing, Stanzione has served as Deputy Director since June 2009 and assumed the Executive Director post...

Thursday March 3, 2016 1:30pm - 2:00pm PST
BioScience Research Collaborative Building (BRC), Room 103

2:00pm PST

Plenary: 'HPC Workforce Challenges', Barbara Chapman, Stony Brook University & University of Houston

Simulation and computing are essential to a significant fraction of today’s research in academia, government laboratories, and industry. They are also the basis for the design and development of many products. With the increasing reliance of US industry on computing for its business, demand for HPC-related skills in a range of disciplines, including Computer Science, Applied Mathematics, Statistics, and domain sciences, is expected to grow. In this presentation, we discuss the findings of a report from the US Department of Energy on HPC workforce challenges.

Speakers

Barbara Chapman

Thursday March 3, 2016 2:00pm - 2:30pm PST
BioScience Research Collaborative Building (BRC), Room 103

2:30pm PST

Keynote: 'The Path to Capable Exascale Computing', Paul Messina, ANL

Exascale computing has been the subject of study and analysis for almost ten years. Dozens of voluminous reports have been published. R&D related to the many issues and challenges involved in achieving usable and affordable exascale computers has resulted in thousands of papers and presentations. The time has come to mount a focused effort that applies the insights gained from those studies to build exascale systems. President Obama’s Executive Order of July 2015 established the National Strategic Computing Initiative, a key objective of which is to accelerate delivery of a capable exascale computing system. The Executive Order assigns the lead role for pursuing that objective to the US Department of Energy Office of Science (DOE-SC) and the DOE National Nuclear Security Administration (NNSA).

This talk will present an overview of the efforts that are underway to put in place a joint DOE-SC and NNSA project that will result in a capable exascale ecosystem and prepare mission critical scientific and engineering applications to take advantage of that ecosystem.

Speakers

Paul Messina

Argonne National Laboratory (ANL)

Dr. Paul Messina is a senior strategic advisor and Argonne Distinguished Fellow at Argonne National Laboratory. During 2008-2015 he served as Director of Science for the Argonne Leadership Computing Facility and in 2002-2004 as Distinguished Senior Computer Scientist at Argonne and...

Thursday March 3, 2016 2:30pm - 3:15pm PST
BioScience Research Collaborative Building (BRC), Room 103

Keynote

3:15pm PST

Student Poster Session & Closing Reception

Thursday March 3, 2016 3:15pm - 5:00pm PST
BioScience Research Collaborative Building (BRC), Exhibit Hall

Poster Reception

3:15pm PST

Poster: 3D Seismic Observations of the Peridotite Ridge at the Deep Galicia Magma-Poor Rifted Margin

Speakers

Gary Gray

Speakers

Faruk O. Alpak

Shell

Florian Frank

Rice University

Chen Lu

Beatrice Riviere

Noah Harding Chair and Professor, Rice University

Thursday March 3, 2016 3:15pm - 5:15pm PST
BioScience Research Collaborative Building (BRC), Exhibit Hall

3:15pm PST

Poster: Discontinuous Galerkin Geometric Multigrid Methods

Speakers

Maurice Fabien

Matthew Knepley

Beatrice Riviere

Noah Harding Chair and Professor, Rice University

Thursday March 3, 2016 3:15pm - 5:15pm PST
BioScience Research Collaborative Building (BRC), Exhibit Hall

3:15pm PST

Poster: Efficient Seismic Modeling Using Poroelastic Approach

Speakers

Jan Hesthaven

Khemraj Shukla

Thursday March 3, 2016 3:15pm - 5:15pm PST
BioScience Research Collaborative Building (BRC), Exhibit Hall

3:15pm PST

Poster: GPU Accelerated Discontinuous Galerkin Method on Hybrid Meshes: Applications in Seismic Imaging

Speakers

Jesse Chan

Axel Modave

Postdoctoral Associate, Virginia Tech

Zheng Wang

Tim Warburton

John K. Costain Chair & Professor of Mathematics, Virginia Tech

Thursday March 3, 2016 3:15pm - 5:15pm PST
BioScience Research Collaborative Building (BRC), Exhibit Hall

3:15pm PST

Poster: GPU Accelerated Hermite Methods for the Simulation of Waves

Speakers

Jesse Chan

Arturo Vargas

Graduate Student, Rice University

Graduate student working in numerical methods for hyperbolic equations.

Thursday March 3, 2016 3:15pm - 5:15pm PST
BioScience Research Collaborative Building (BRC), Exhibit Hall

3:15pm PST

Poster: High-resolution seismic imaging based on non-convex regularization for inverting the Radon Transform

Dilipkumar Asthagiri

Walter Chapman

Jinlu Liu

Thursday March 3, 2016 3:15pm - 5:15pm PST
BioScience Research Collaborative Building (BRC), Exhibit Hall

3:15pm PST

Poster: Numerical Solution of Risk-Averse Optimization Problems Governed by Partial Differential Equations with Uncertain Coefficients

Speakers

Timur Takhtaganov

Thursday March 3, 2016 3:15pm - 5:15pm PST
BioScience Research Collaborative Building (BRC), Exhibit Hall

3:15pm PST

Poster: Optimizations of Explicitly Parallel Programs Using Polyhedral Techniques

Speakers

Prasanth Chatarasi

Graduate Student, Rice University

Parallel computing; optimizing compilers; performance; OpenMP

Jun Shirako

Vivek Sarkar

Thursday March 3, 2016 3:15pm - 5:15pm PST
BioScience Research Collaborative Building (BRC), Exhibit Hall

3:15pm PST

Poster: Overcoming Distributed Debugging Challenges in the MPI+OpenMP Programming Model

Speakers

Lai Wei

Thursday March 3, 2016 3:15pm - 5:15pm PST
BioScience Research Collaborative Building (BRC), Exhibit Hall

3:15pm PST

Poster: Parallel-in-Time Gradient-Type Method for Optimal Control Problems

Speakers

Xiaodi Deng

Rice University

Matthias Heinkenschloss

Thursday March 3, 2016 3:15pm - 5:15pm PST
BioScience Research Collaborative Building (BRC)

3:15pm PST

Poster: Paralleling Mesoscopic Phase-Field Model by PETSc and HYPRE

Speakers

Liang Hong

Ming Tang

Thursday March 3, 2016 3:15pm - 5:15pm PST
BioScience Research Collaborative Building (BRC), Exhibit Hall

3:15pm PST

Poster: Pfimbi: Accelerating Big Data Jobs Through Flow-Controlled Data Replication

Speakers

Florin Dinu

EPFL

Simbarashe Dzinamarira

Eugene Ng

Rice University

Thursday March 3, 2016 3:15pm - 5:15pm PST
BioScience Research Collaborative Building (BRC), Exhibit Hall

3:15pm PST

Poster: Stable Composite Staggered Grid Finite Difference Scheme for Anisotropic Elastic Wave Simulations

Speakers

William Symes

Rice University

Muhong Zhou

Thursday March 3, 2016 3:15pm - 5:15pm PST
BioScience Research Collaborative Building (BRC), Exhibit Hall

3:15pm PST

Poster: Towards PETSc-based OPM Upscaling of Relative Permeability as a Cloud Service

Speakers

Anne Elster

Professor/ Visiting Scholar, NTNU Trondheim, Norway and UT Austin

Founder and Director – HPC-Lab, Dept. of Computer & Info. Science NTNU, which started as my research group when I joined NTNU in 2001, and was founded as a lab in 2008. We are now an established research lab in Heterogeneous and Parallel Computing, consisting of myself, several...

Thursday March 3, 2016 3:15pm - 5:15pm PST
BioScience Research Collaborative Building (BRC), Exhibit Hall