Speaker: Yves Caniou (JFLI)
Date: 24th September 2012
Place: room 102, Faculty of Science Bldg. 7, Hongo Campus, The University of Tokyo
In September 2011, the Open Grid Forum standardized the document “Data Management API within the GridRPC”, which describes an optional API that extends the GridRPC standard. Used in a GridRPC middleware, it provides a minimal set of functions to handle a wide range of data operations, including movement, replication, migration, prefetching, and persistence.
We will present a library implementing the API that has been integrated into two different middleware systems, DIET and NINF. Our experiments show substantial benefits that a Grid user can expect: 1) better resource usage than in the current GridRPC context, since useless transfers are avoided; 2) shorter application completion time, since data can be prefetched and replicated, which lets computations be submitted very early in a workflow analysis, in addition to the possible overlap between computations and communications; 3) code portability, since these examples show that, at last, the same GridRPC code can be compiled and executed within two different GridRPC middleware systems that implement the GridRPC data management API; 4) middleware interoperability without the explicit glue that is generally required. As a proof of concept, we show that resources spread across different administrative domains can be used together without the underlying distributed data management systems having any knowledge of the workflow or the computing resources: computational servers of DIET and NINF transparently collaborate on the same computation by sharing GridRPC data.
The latest results have been published in [1, 2], in which we explain the API and describe the improvements obtained with our implementation of this standard. Future work will address the transparent management of protocols such as GridFTP, iRODS, and torrent, the use of catalogs for data, and security.
[1] Yves Caniou, Eddy Caron, Gaël Le Mahec, and Hidemoto Nakada. Standardized Data Management in GridRPC Environments. In 6th International Conference on Computer Sciences and Convergence Information Technology, Jeju Island, Korea, Nov. 29 – Dec. 1, 2011.
[2] Yves Caniou, Eddy Caron, Gaël Le Mahec, and Hidemoto Nakada. Transparent Collaboration of GridRPC Middleware using the OGF Standardized GridRPC Data Management API. In The International Symposium on Grids and Clouds (ISGC), February 26 – March 2, 2012. Proceedings of Science.