Parallel Computing: Accelerating Computational Science and Engineering (CSE) (Advances in Parallel Computing)

Publication series: Advances in Parallel Computing

Author: Bader M.; Bode A.; Bungartz H.-J.

Publisher: IOS Press

Publication year: 2014

E-ISBN: 9781614993810

P-ISBN(Hardback): 9781614993803

Subject: TP301 Theory and methods

Keyword: Automation technology, computer technology

Language: ENG



Description

Parallel computing has been the enabling technology of high-end machines for many years. Now, it has finally become the ubiquitous key to the efficient use of any kind of multi-processor computer architecture, from smart phones, tablets, embedded systems and cloud computing up to exascale computers.

This book presents the proceedings of ParCo2013 – the latest edition of the biennial International Conference on Parallel Computing – held from 10 to 13 September 2013, in Garching, Germany. The conference focused on several key parallel computing areas. Themes included parallel programming models for multi- and manycore CPUs, GPUs, FPGAs and heterogeneous platforms, the performance engineering processes that must be adapted to efficiently use these new and innovative platforms, novel numerical algorithms and approaches to large-scale simulations of problems in science and engineering.

The conference programme also included twelve mini-symposia (including an industry session and a special PhD Symposium), which comprehensively represented and intensified the discussion of current hot topics in high performance and parallel computing. These special sessions covered large-scale supercomputing, novel challenges arising from parallel architectures (multi-/manycore, heterogeneous platforms, FPGAs), multi-level algorithms as well as multi-scale, multi-physics and multi-dimensional problems.

It is clear that parallel computing – including the processing of large dat

Chapter

Title Page

Preface

Conference Organisation

Contents

Invited Talks

Extreme Data Science at the National Energy Research Scientific Computing (NERSC) Center

Performance Analysis Techniques for the Exascale Co-Design Process

Parallel Programming Models

XMP-IO Function and Its Application to MapReduce on the K Computer

POLCA - A Programming Model for Large Scale, Strongly Heterogeneous Infrastructures

Exploitation of Quality/Throughput Tradeoffs in Image Processing Through Invasive Computing

An Efficient Thread Mapping Strategy for Multiprogramming on Manycore Processors

A Scalable Farm Skeleton for Heterogeneous Parallel Programming

Towards Truly Boolean Arrays in Data-Parallel Array Processing

Deep Packet Inspection on Commodity Hardware Using FastFlow

Performance Analysis and Tools

Formalizing Bottlenecks in Task-Based OpenMP Applications

Characterizing Performance of Applications on Blue Gene/Q

Specification of Periscope Tuning Framework Plugins

Parallel Numerical Linear Algebra

On Using Speculative Computations for Parallel Reduction to Tridiagonal Form

Fast Approximate Solution of the Non-Symmetric Generalized Eigenvalue Problem on Multicore Architectures

Locality Optimization on a NUMA Architecture for Hybrid LU Factorization

Variable Block Algebraic Recursive Multilevel Solver (VBARMS) for Sparse Linear Systems

A Proposal of a Single-Synchronized Solver Suited to Large Scale Linear Systems on Parallel Computers with Distributed Memory

Approximate Inverse Preconditioners for Krylov Methods on Heterogeneous Parallel Computers

Cache and Energy Efficiency of Sparse Matrix-Vector Multiplication for Different BLAS Numerical Types with the RSB Format

Heterogeneous Sparse Matrix Computations on Hybrid GPU/CPU Platforms

Parallel Algorithms

MapReduce Streaming Algorithms for Laplace Relaxation on the Cloud

Space Exploration Using Parallel Orbits: A Study in Parallel Symbolic Computing

SFC-Based Communication Metadata Encoding for Adaptive Mesh Refinement

Graph Repartitioning with Both Dynamic Load and Dynamic Processor Allocation

ForestClaw: Hybrid Forest-of-Octrees AMR for Hyperbolic Conservation Laws

A Space-Time Parallel Solver for the Three-Dimensional Heat Equation

An Efficient Pipelined Implementation of Space-Time Parallel Applications

GPU Computing and Applications

Efficient GPU-Based Optimization of Volume Meshes

Fast Uniform Grid Construction on GPGPUs Using Atomic Operations

Porting Large HPC Applications to GPU Clusters: The Codes GENE and VERTEX

Numerical Simulation of the Low Compressible Viscous Gas Flows on GPU-Based Hybrid Supercomputers

Simulation of Multiphase Flows in the Subsurface on GPU-Based Supercomputers

Atomic Computing - A Different Perspective on Massively Parallel Problems

Parallelisation and Optimisation of Large-Scale Applications

Accelerating SeisSol by Generating Vectorized Code for Sparse Matrix Operators

Experience with the MPI/STARSS Programming Model on a Large Production Code

Exploiting Data- and Task-Parallelism in the Solution of Riccati Equations on Multicore Servers and GPUs

Testing and Implementing Some New Algorithms Using the FFTW Library on Massively Parallel Supercomputers

Performance Measurements of MHD Simulation for Planetary Magnetosphere on Peta-Scale Computer FX10

Parallel Simulations of Self-Propelled Microorganisms

Improving Communication Performance of Sparse Linear Algebra for an Atomistic Simulation Application

NEMORB's Fourier Filter and Distributed Matrix Transposition on Petaflop Systems

Parallel Computing Design for Exact Diagonalization Scheme on Multi-Band Hubbard Cluster Models

ParCo PhD Symposium

ParCo 2013 PhD Symposium

Numerical Experiments with New Algorithms for Parallel Decomposition of Large Computational Meshes

A Distributed Algorithm for the Permutation Flow Shop Problem - An Empirical Analysis

GPI2 for GPUs: A PGAS Framework for Efficient Communication in Hybrid Clusters

A Fault Tolerant Implementation of Multi-Level Monte Carlo Methods

High Performance CPU/GPU Multiresolution Poisson Solver

Mini-Symposium "Parallel Computing with FPGAs (ParaFPGA2013)"

ParaFPGA 2013: Harnessing Programs, Power and Performance in Parallel FPGA Applications

High-Level Synthesis Revised: Generation of FPGA Accelerators from a Domain-Specific Language Using the Polyhedron Model

Compiling a Dataflow-Based Language Abstraction onto an FPGA

Timing Driven C-Slow Retiming on RTL for MultiCores on FPGAs

Performance and Resource Modeling for FPGAs Using High-Level Synthesis Tools

Interactive Graph Cuts Using FPGA

An Image Filter System Based on Dynamic Partial Reconfiguration on FPGA

Investigating Energy Consumption of an SRAM-Based FPGA for Duty-Cycle Applications

Mini-Symposium "High-Dimensional Meets Parallel - Algorithms and Applications"

High-Dimensional Meets Parallel: Algorithms and Applications

Global Communication Schemes for the Sparse Grid Combination Technique

Load Balancing for Massively Parallel Computations with the Sparse Grid Combination Technique

A Parallel Fault Tolerant Combination Technique

Managing Complexity in the Parallel Sparse Grid Combination Technique

Scalability and Fault Tolerance of the Alternating Direction Method of Multipliers for Sparse Grids

Mini-Symposium "Application Autotuning for HPC (Architectures)"

Mini-Symposium on Application Autotuning for HPC

Investigating Performance Benefits from OpenACC Kernel Directives

Application-Independent Autotuning for GPUs

Autotuning of Pattern Runtimes for Accelerated Parallel Systems

Empirical Performance Modeling of GPU Kernels Using Active Learning

Crowdtuning: Systematizing Auto-Tuning Using Predictive Modeling and Crowdsourcing

Autotuning the Energy Consumption

Potentials and Limitations for Energy Efficiency Auto-Tuning

Mini-Symposium "Extreme Scaling on SuperMUC"

Extreme Scaling Workshop at the LRZ

Extreme Scaling of Lattice Quantum Chromodynamics

End-to-End Parallel Simulations with APES

Towards Petaflops Capability of the VERTEX Supernova Code

Scaling of the GROMACS 4.6 Molecular Dynamics Code on SuperMUC

Mini-Symposium "Parallel Programming for Heterogeneous Architectures"

Parallel Programming for Heterogeneous Architectures

Execution Schemes for the NPB-MZ Benchmarks on Hybrid Architectures: A Comparative Study

Scilab on a Hybrid Platform

Divide and Conquer Parallelization of Finite Element Method Assembly

Cudagrind: A Valgrind Extension for CUDA

Profiling Hybrid HMPP Applications with Score-P on Heterogeneous Hardware

Binary Instrumentation for Scalable Performance Measurement of OpenMP Applications

A Case Study: Holistic Performance Analysis on Heterogeneous Architectures Using the Vampir Toolchain

Further Mini-Symposium Contributions

PRACE DECI (Distributed European Computing Initiative) Minisymposium

A Generic Prototype to Benchmark Algorithms and Data Structures for Hierarchical Hybrid Grids

Towards a Performance Engineering Workflow for OpenMP 4.0

Theoretical Measures of Cache Efficiency for Tetrahedral Adaptive Meshes. A Case Study with a Quasi Space-Filling Curve Order

Author Index
