The number of hardware threads per processor is growing rapidly on multicore and manycore systems. Fully exploiting emerging scalable parallel systems will require programs to use threaded programming models at the node level. OpenMP is the leading model for multithreaded programming. This tutorial will give a hands-on introduction to using Rice University's open-source HPCToolkit performance tools to analyze the performance of programs that employ MPI + OpenMP to harness the power of scalable parallel systems. See http://hpctoolkit.org for more information about HPCToolkit.