WP2 Optimization Task




Grid data access optimization

Jobs submitted to the EU DataGrid require three types of resources: computing, storage and network. The Grid must make scheduling and replication decisions based on the current state of these resources. Any file on the Grid may have several replicas at different locations, and the aim of the Replica Optimization Service (ROS) is to optimize access to these files for all jobs that run on the Grid. It does this in two ways. First, the ROS provides an estimate of the access cost for a job to run at a particular site, i.e. the time it would take to access all the files the job requires. A Resource Broker uses this information to schedule the job to the optimal Grid site.
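
The access cost estimate described above can be illustrated with a minimal sketch. It is not the ROS implementation: the site names, file sizes, bandwidths and function names are all hypothetical, and it assumes that the cost of a file is simply its size divided by the bandwidth from the cheapest replica, with local files costing nothing.

```python
from typing import Dict, List, Tuple

# Hypothetical illustration data (not taken from the EU DataGrid).
FILE_SIZE_MB: Dict[str, float] = {"f1": 1000.0, "f2": 500.0}
REPLICAS: Dict[str, List[str]] = {"f1": ["siteA", "siteB"], "f2": ["siteB"]}
# Assumed bandwidth in MB/s, keyed by (source site, destination site).
BANDWIDTH: Dict[Tuple[str, str], float] = {
    ("siteA", "siteB"): 10.0,
    ("siteB", "siteA"): 5.0,
}


def transfer_time(file: str, src: str, dst: str) -> float:
    """Seconds needed to move one file from src to dst (0 if already local)."""
    if src == dst:
        return 0.0
    return FILE_SIZE_MB[file] / BANDWIDTH[(src, dst)]


def access_cost(files: List[str], site: str) -> float:
    """Estimated time to access all files of a job running at the given site.

    For each file the cheapest available replica is assumed to be used.
    """
    return sum(
        min(transfer_time(f, src, site) for src in REPLICAS[f])
        for f in files
    )


def best_site(files: List[str], sites: List[str]) -> str:
    """What a broker might do with the estimates: pick the cheapest site."""
    return min(sites, key=lambda s: access_cost(files, s))


job_files = ["f1", "f2"]
for site in ["siteA", "siteB"]:
    print(site, access_cost(job_files, site))
print("schedule at:", best_site(job_files, ["siteA", "siteB"]))
```

In this toy setup the job would be scheduled at siteB, because both of its files already have replicas there. A real broker would also weigh computing and storage availability, which this sketch ignores.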

Second, the ROS optimizes file access while jobs are running. This is carried out in two stages:

  • Short-term optimization: When a job requests a file, the ROS finds the best replica on the Grid, i.e. the one with the lowest cost of transfer to the site where the job is running.
  • Long-term optimization: Using long-term data access patterns, the ROS can create and delete replicas anywhere on the Grid according to its predictions of file usage. We plan to use an economy-based algorithm for this.
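
The two stages can be sketched as follows. This is a toy illustration only: the text above does not specify the economy-based algorithm, so the sketch assumes that a file's predicted value is simply its access count in recent history, that local storage is full, and that all files have the same size. All function names are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Optional


def best_replica(replica_costs: Dict[str, float]) -> str:
    """Short-term stage: return the replica site with the lowest transfer cost."""
    return min(replica_costs, key=replica_costs.get)


def file_values(access_history: List[str]) -> Counter:
    """Assumed predictor: value of a file = number of recent accesses."""
    return Counter(access_history)


def choose_eviction(
    new_file: str, stored: List[str], access_history: List[str]
) -> Optional[str]:
    """Long-term stage: decide whether to replicate new_file locally.

    Returns the stored file to delete in order to make room, or None if
    new_file is not predicted to be worth more than the least valuable
    file already stored (so no replica is created).
    """
    values = file_values(access_history)
    least_valuable = min(stored, key=lambda f: values[f])
    if values[new_file] > values[least_valuable]:
        return least_valuable
    return None


# Hypothetical example values.
print(best_replica({"siteA": 120.0, "siteB": 45.0}))
history = ["f1", "f2", "f1", "f3", "f1", "f3"]
print(choose_eviction("f3", ["f2", "f4"], history))
print(choose_eviction("f5", ["f1"], history))
```

The design point is that replication is treated as an investment: a new replica is created only when its predicted value exceeds that of the file it would displace. A real predictor would likely look at more than raw access counts.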
The European Organization for Nuclear Research
Feedback and questions concerning this site should be directed to hep-project-grid-optorsim@listbox.cern.ch. Last updated February 2, 2004.