Chapter
Efficiency and Scalability of the Parallel Barnes-Hut Tree Code PEPC
Combining Numerical Iterative Solvers
A Comparative Study of Some Distributed Linear Solvers on Systems Arising from Fluid Dynamics Simulations
Gradient Projection Methods for Image Deblurring and Denoising on Graphics Processors
Parallel Simulations of Seismic Wave Propagation on NUMA Architectures
Aitken-Schwarz and Schur Complement Methods for Time Domain Decomposition
Performance Modeling Tools for Parallel Sparse Linear Algebra Computations
Narrow-Band Reduction Approach of a DRSM Eigensolver on a Multicore-Based Cluster System
Parallel Multistage Preconditioners by Extended Hierarchical Interface Decomposition for Ill-Conditioned Problems
A Comparison of Different Communication Structures for Scalable Parallel Three Dimensional FFTs in First Principles Codes
Parallelization Strategies for ODE Solvers on Multicore Cluster Systems
Evaluation of Parallel Sparse Matrix Partitioning Software for Parallel Multilevel ILU Preconditioning on Shared-Memory Multiprocessors
A Parallel Implementation of the Davidson Method for Generalized Eigenproblems
A Data-Flow Modification of the MUSCLE Algorithm for Multiprocessors and a Web Interface for It
A Parallel Algorithm for the Fixed-Length Approximate String Matching Problem for High Throughput Sequencing Technologies
Computing Alignment Plots Efficiently
Image Processing & Visualisation
Parallelizing the LM OSEM Image Reconstruction on Multi-Core Clusters
Hierarchical Visualization System for High Performance Computing
Real Time Ultrasound Image Sequence Segmentation on Multicores
Processing Applications Composed of Web/Grid Services by Distributed Autonomic and Self-Organizing Workflow Engines
LPT Scheduling Algorithms with Unavailability Constraints Under Uncertainties
Parallel Genetic Algorithm Implementation for BOINC
RPC/MPI Hybrid Implementation of OpenFMO – All Electron Calculations of a Ribosome
When Clouds Become Green: The Green Open Cloud Architecture
A Versatile System for Asynchronous Iterations: From Multithreaded Simulations to Grid Experiments
Exploiting Object-Oriented Abstractions to Parallelize Sparse Linear Algebra Codes
Handling Massive Parallelism Efficiently: Introducing Batches of Threads
Skeletons for Multi/Many-Core Systems
Efficient Streaming Applications on Multi-Core with FastFlow: The Biosequence Alignment Test-Bed
A Framework for Detailed Multiphase Cloud Modeling on HPC Systems
Extending Task Parallelism for Frequent Pattern Mining
Exploring the GPU for Enhancing Parallelism on Color and Texture Analysis
Generalized GEMM Kernels on GPGPUs: Experiments and Applications
Comparison of Modular Arithmetic Algorithms on GPUs
Fast Multipole Method on the Cell Broadband Engine: The Near Field Part
The GPU on the Matrix-Matrix Multiply: Performance Study and Contributions
Performance Measurement of Applications with GPU Acceleration Using CUDA
Conflict Analysis for Heap-Based Data Dependence Detection
Adaptive Parallel Matrix Computing Through Compiler and Run-Time Support
High-Throughput Parallel-I/O Using SIONlib for Mesoscopic Particle Dynamics Simulations on Massively Parallel Computers
Tracing Performance of MPI-I/O with PVFS2: A Case Study of Optimization
A Historic Knowledge Based Approach for Dynamic Optimization
Evaluation of Task Mapping Strategies for Regular Network Topologies
Benchmark & Performance Tuning
Automatic Performance Tuning of Parallel Mathematical Libraries
Automatic Performance Tuning Approach for Parallel Applications Based on Sparse Linear Solvers
A Flexible, Application- and Platform-Independent Environment for Benchmarking
Optimized Checkpointing Protocols for Data Parallel Programs
Constructing Resilient Communication Infrastructure for Runtime Environments
Optimizing Performance and Energy of High Performance Computing Applications
Mini-Symposium "Adaptive Parallel Computing: Latency Toleration, Non-Determinism as a Form of Adaptation, Adaptive Mapping"
An Operational Semantics for S-Net
Mini-Symposium "DEISA: Extreme Computing in an Advanced Supercomputing Environment"
DEISA Mini-Symposium on Extreme Computing in an Advanced Supercomputing Environment
DEISA Extreme Computing Initiative (DECI) and Science Community Support
Application Oriented DEISA Infrastructure Services
Chemical Characterization of Super-Heavy Elements by Relativistic Four-Component DFT
Direct Numerical Simulation of the Turbulent Development of a Round Jet at Reynolds Number 11,000
EUFORIA: Exploring E-Science for Fusion
Mini-Symposium "EuroGPU 2009"
Parallel Computing with GPUs
Porous Rock Simulations and Lattice Boltzmann on GPUs
An Efficient Multi-Algorithms Sparse Linear Solver for GPUs
Abstraction of Programming Models Across Multi-Core and GPGPU Architectures
Modelling Multi-GPU Systems
Throughput Computing on Future GPUs
Mini-Symposium "ParaFPGA-2009: Parallel Computing with FPGA’s"
ParaFPGA: Parallel Computing with Flexible Hardware
Software vs. Hardware Message Passing Implementations for FPGA Clusters
RAPTOR – A Scalable Platform for Rapid Prototyping and FPGA-Based Cluster Computing
Speeding up Combinational Synthesis in an FPGA Cluster
A Highly Parallel FPGA-Based Evolvable Hardware Architecture
Applying Parameterizable Dynamic Configurations to Sequence Alignment
Towards a More Efficient Run-Time FPGA Configuration Generation
ACCFS – Virtual File System Support for Host Coupled Run-Time Reconfigurable FPGAs
Mini-Symposium "Parallel Programming Tools for Multi-Core Architectures"
Parallel Programming Tools for Multi-Core Architectures
Parallel Programming for Multi-Core Architectures
An Approach to Application Performance Tuning
How to Accelerate an Application: A Practical Case Study in Combustion Modelling
From OpenMP to MPI: First Experiments of the STEP Source-to-Source Transformation Tool
Using Multi-Core Architectures to Execute High Performance-Oriented Real-Time Applications
Performance Tool Integration in a GPU Programming Environment: Experiences with TAU and HMPP
An Interface for Integrated MPI Correctness Checking
Enhanced Performance Analysis of Multi-Core Applications with an Integrated Tool-Chain – Using Scalasca and Vampir to Optimise the Metal Forming Simulation FE Software INDEED
Mini-Symposium "Programming Heterogeneous Architectures"
Mini-Symposium on Programming Heterogeneous Architectures
Parallelization Exploration of Wireless Applications Using MPA
Prototyping and Programming Tightly Coupled Accelerators
Simplifying Heterogeneous Embedded Systems Programming Based on OpenMP