Author: Vazhkudai Sudharshan
Publisher: Springer Publishing Company
ISSN: 1570-7873
Source: Journal of Grid Computing, Vol. 2, Iss. 1, 2004-03, pp. 31-42
Abstract
Data-sharing scientific communities use storage systems as distributed data stores by replicating content. In such highly replicated environments, a particular dataset can reside at multiple locations and can thus be downloaded from any one of them. Since datasets of interest are often very large, improving download speeds either by server selection or by co-allocation can offer substantial benefits. In this paper, we present an architecture for co-allocating Grid data transfers across multiple connections, enabling the parallel download of datasets from multiple servers. We have developed several co-allocation strategies, comprising simple brute-force, predictive, and dynamic load-balancing techniques, as a means both to exploit rate differences among the various client–server links and to address dynamic rate fluctuations. We evaluate our approaches using the GridFTP data movement protocol in a wide-area testbed and present our results.
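As a rough illustration of the difference between two of the strategy families named in the abstract, the sketch below splits a download across replica servers in two ways: equal shares (a simple brute-force reading) and shares proportional to a predicted transfer rate (a predictive reading). This is not the paper's implementation; the function names, the server labels, and the rate values are hypothetical, and dynamic load balancing, which would reassign work while transfers are in progress, is not shown.

```python
from typing import Dict, List


def brute_force_split(total_bytes: int, servers: List[str]) -> Dict[str, int]:
    """Assign each server an equal share of the dataset (assumed reading of 'brute-force')."""
    base, extra = divmod(total_bytes, len(servers))
    # The first `extra` servers each take one additional byte so shares sum to total_bytes.
    return {s: base + (1 if i < extra else 0) for i, s in enumerate(servers)}


def predictive_split(total_bytes: int, predicted_rates: Dict[str, float]) -> Dict[str, int]:
    """Assign shares proportional to each server's predicted rate (hypothetical values)."""
    total_rate = sum(predicted_rates.values())
    shares = {s: int(total_bytes * r / total_rate) for s, r in predicted_rates.items()}
    # Any bytes lost to rounding down go to the server with the highest predicted rate.
    fastest = max(predicted_rates, key=predicted_rates.get)
    shares[fastest] += total_bytes - sum(shares.values())
    return shares


if __name__ == "__main__":
    print(brute_force_split(10, ["a", "b", "c"]))
    print(predictive_split(1000, {"a": 3.0, "b": 1.0}))
```

With equal shares, a slow link holds up the whole download; the proportional split is meant to let all transfers finish at roughly the same time, provided the predictions hold.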