

Author: Yang, Laurence Tianruo; Brent, Richard P.
Publisher: Springer Publishing Company
ISSN: 0920-8542
Source: The Journal of Supercomputing, Vol. 29, Iss. 2, 2004-08, pp. 145-156
Abstract
This paper studies the parallelization of the CGLS method, a basic iterative method for large, sparse least squares problems in which the conjugate gradient method is applied to the normal equations. On modern parallel architectures its parallel performance is limited by the global communication required for inner products, which is the main bottleneck. We describe a modified CGLS (MCGLS) method that improves parallel performance by assembling the results of several inner products collectively and by creating situations where communication can be overlapped with computation. More importantly, we also propose an improved CGLS (ICGLS) method that halves the number of global synchronization points for inner products and thereby significantly improves parallel performance compared with both the standard CGLS method and the MCGLS method.
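For orientation, the following is a minimal serial sketch of the standard (textbook) CGLS iteration, written with NumPy. It is not the MCGLS or ICGLS variant proposed in the paper, whose details are not given in this abstract. The function name `cgls` and the parameters `iters` and `tol` are assumptions made for illustration. The comments mark the two inner products per iteration; in a distributed setting each of these would typically require a global reduction, which is the synchronization cost the abstract refers to.

```python
import numpy as np

def cgls(A, b, iters=50, tol=1e-10):
    """Textbook CGLS for min ||A x - b||_2 (illustrative sketch only;
    not the paper's MCGLS/ICGLS variants)."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    x = np.zeros(A.shape[1])
    r = b.copy()              # residual b - A x (x starts at zero)
    s = A.T @ r               # residual of the normal equations, A^T r
    p = s.copy()              # initial search direction
    gamma = s @ s             # inner product: ||A^T r||^2
    gamma0 = gamma
    for _ in range(iters):
        # Stop once the normal-equations residual is small relative to its start.
        if gamma <= (tol ** 2) * gamma0:
            break
        q = A @ p
        alpha = gamma / (q @ q)   # inner product 1: global reduction in parallel
        x += alpha * p
        r -= alpha * q
        s = A.T @ r
        gamma_new = s @ s         # inner product 2: second global reduction
        beta = gamma_new / gamma
        p = s + beta * p
        gamma = gamma_new
    return x
```

As a usage example, `cgls([[1, 0], [0, 1], [1, 1]], [1, 2, 3])` should return approximately `[1, 2]`, since that vector satisfies the system exactly. Note that the second inner product depends on a vector updated using the first, so the two reductions cannot simply be merged in this form; removing that dependency is the kind of restructuring the abstract attributes to the modified methods.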