FAQ Register Files




This FAQ describes the options a user has for registering files in EDG Release 2.0.

Overview


What does 'register a file' mean?

In order to register a file in the 'grid', the file needs to reside in a grid-aware store, i.e. in EDG terms on a Storage Element (SE). Every file on an SE has a well-defined Storage URL (SURL) that looks like

 srm://sehost.domain:port/filepath

The sehost.domain is the fully qualified name of the SE host on which the file resides, e.g. adc0013.cern.ch. The port is optional, but may be given if the SE is to be contacted on a nonstandard port. Finally, the filepath is the full path and name of the file in question.
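
As an illustration of the SURL layout described above, the following minimal Python sketch splits a SURL into its host, optional port and file path. The function name parse_surl, the port 8443 and the path /data/run42.dat are made up for this example; only the srm:// scheme and the host adc0013.cern.ch come from the text above.

```python
from urllib.parse import urlsplit


def parse_surl(surl):
    """Split a SURL into (host, port, filepath); port is None when omitted."""
    parts = urlsplit(surl)
    if parts.scheme != "srm":
        raise ValueError("not a SURL (expected the srm:// scheme): %r" % surl)
    if not parts.hostname:
        raise ValueError("SURL has no host part: %r" % surl)
    return parts.hostname, parts.port, parts.path


# Hypothetical SURLs: the port and the file path are invented for illustration.
with_port = parse_surl("srm://adc0013.cern.ch:8443/data/run42.dat")
without_port = parse_surl("srm://adc0013.cern.ch/data/run42.dat")
print(with_port)
print(without_port)
```

The port comes back as None when the SURL does not name one, which matches the "port is optional" rule.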

A file that is registered for the first time acquires a Grid Unique ID (GUID), which is then used to uniquely identify the file. If the file is replicated to other SEs, its replicas can be found using that GUID. GUIDs are based on UUIDs, which are defined in an ISO standard*. An example of a GUID is

 guid:73e16e74-26b0-11d7-b1e0-c5c68d88236a
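
Because a GUID is a UUID behind a 'guid:' prefix, its syntax can be checked with any UUID library. The sketch below uses Python's standard uuid module; the helper names parse_guid and new_guid are invented for this example. The example GUID above happens to be a version 1 (time-based) UUID. Whether EDG always generates version 1 UUIDs is not stated here, so new_guid is only an assumption about how such an identifier could be produced, not a description of what edg-rm does.

```python
import uuid

GUID_PREFIX = "guid:"


def parse_guid(guid):
    """Return the uuid.UUID behind a 'guid:'-prefixed string, or raise ValueError."""
    if not guid.startswith(GUID_PREFIX):
        raise ValueError("GUID must start with %r: %r" % (GUID_PREFIX, guid))
    return uuid.UUID(guid[len(GUID_PREFIX):])


def new_guid():
    """Build a GUID-shaped string from a fresh time-based UUID (assumed scheme)."""
    return GUID_PREFIX + str(uuid.uuid1())


example = parse_guid("guid:73e16e74-26b0-11d7-b1e0-c5c68d88236a")
print(example.version)  # the example GUID from the text is a version 1 UUID
print(new_guid())
```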

You can also register a logical name for the file in a more human-readable form. The only convention for logical names is that they start with 'lfn:'. An example of a valid logical file name is

 lfn:nobel-prize-winning-data

The mapping between GUIDs and SURLs (1:N) is stored in the Replica Location Service (RLS), which in Release 2.0 of the EDG testbed consists of a single central Local Replica Catalog (LRC) instance. A GUID may have many SURLs; each SURL represents a replica of the same file.

The mapping between LFNs and GUIDs (M:1) is stored in the Replica Metadata Catalog (RMC), which is also deployed as a single central catalog instance.
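
To make the two mappings concrete, here is a toy in-memory model in Python. It is not the LRC or RMC API. The class ToyCatalogs, its method names, the second SE host and the second logical name are all invented for illustration. It only shows how an LFN resolves to one GUID (M:1) and how that GUID resolves to all of its replicas (1:N).

```python
from collections import defaultdict


class ToyCatalogs:
    """Toy stand-in for the two catalogs; not the real LRC/RMC interface."""

    def __init__(self):
        self.lrc = defaultdict(set)  # GUID -> set of SURLs (1:N)
        self.rmc = {}                # LFN  -> GUID         (M:1)

    def add_replica(self, guid, surl):
        """Record that surl holds a replica of the file identified by guid."""
        self.lrc[guid].add(surl)

    def add_alias(self, lfn, guid):
        """Attach a logical name to a GUID; an LFN may point to one GUID only."""
        if not lfn.startswith("lfn:"):
            raise ValueError("logical file names must start with 'lfn:'")
        if self.rmc.get(lfn, guid) != guid:
            raise ValueError("%s is already mapped to another GUID" % lfn)
        self.rmc[lfn] = guid

    def replicas(self, lfn):
        """Resolve LFN -> GUID -> sorted list of SURLs."""
        guid = self.rmc[lfn]
        return sorted(self.lrc.get(guid, ()))


catalogs = ToyCatalogs()
guid = "guid:73e16e74-26b0-11d7-b1e0-c5c68d88236a"
catalogs.add_replica(guid, "srm://adc0013.cern.ch/data/run42.dat")
catalogs.add_replica(guid, "srm://se.example.org/data/run42.dat")  # hypothetical SE
catalogs.add_alias("lfn:nobel-prize-winning-data", guid)
catalogs.add_alias("lfn:run42", guid)  # a second, hypothetical alias
print(catalogs.replicas("lfn:run42"))
```

Both logical names resolve to the same two SURLs, because they share one GUID.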

[* Trivia: The UUIDs/GUIDs I-D by Paul Leach, normatively referenced by the -08 specification, was not approved by the IESG because nearly identical UUIDs already exist in Annex A of ISO-11578, the ISO Remote Procedure Call specification. [If you click the link, you won't get the document but a nice page stating that it costs 340 Swiss francs to download; so much for 'open' standards.] The IESG prefers that the ISO RPC specification be used instead of the UUID/GUID I-D.]


Registering using the replica manager command-line tool, edg-rm

The replica manager command-line tool edg-rm has the following methods available to register a file:

  • edg-rm copyAndRegisterFile

    The file to be registered has not yet been copied to grid-aware storage (say it is on local disk). This command takes the source file, copies it to an SE and registers it in the LRC and, if a logical file name has been given, in the RMC. See the usage of this command for details. This is a time-consuming operation, as the data needs to be copied as well. The GUID will be auto-generated.
  • edg-rm registerFile

    The file has already been copied to an SE and its SURL is known. The GUID will be auto-generated.
  • edg-rm registerGuid

    The file has already been copied to an SE and already has a known GUID. Both the SURL and the GUID are given to this command. Use this with caution: it is easy to corrupt the consistency of the catalog with this command. It is supplied only to meet the so-called 'truckFTP' use case, where a file is not shipped over the net using FTP or any other protocol but physically, on tape via FedEx. The target SE has to store the received file, and the registration needs to be done using the existing known GUID.
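
The choice between the three commands above depends on two facts: whether the file is already on an SE, and whether it already has a GUID. The Python sketch below encodes that choice. The function choose_edg_rm_command is hypothetical and returns only the subcommand name. It does not build a command line, since the options of each command are not given here and should be taken from the command's own usage text. The case of a file with a known GUID that is not yet on an SE is not covered by the list above, so the sketch rejects it.

```python
def choose_edg_rm_command(on_storage_element, known_guid=None):
    """Return the edg-rm subcommand name that fits the file's current state."""
    if not on_storage_element:
        if known_guid is not None:
            # Not described in this FAQ; refuse rather than guess.
            raise ValueError("copy the file to an SE before using a known GUID")
        return "copyAndRegisterFile"  # copy to an SE, GUID auto-generated
    if known_guid is None:
        return "registerFile"         # already on an SE, GUID auto-generated
    return "registerGuid"             # already on an SE, GUID supplied by caller


print(choose_edg_rm_command(on_storage_element=False))
print(choose_edg_rm_command(on_storage_element=True))
print(choose_edg_rm_command(True, "guid:73e16e74-26b0-11d7-b1e0-c5c68d88236a"))
```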

NOTE: All of these operations are relatively slow (compared with the Java API and C++ API) because the edg-rm command starts up a Java VM and executes Java code on every invocation. Calling these commands in a loop for thousands of files is very inefficient.


Registering using the replica manager Java API

 


Registering using the replica manager C++ API

 


Registering using the LRC/RMC command line tools, edg-lrc and edg-rmc

 


Registering using the LRC/RMC Java API

 


Registering using the LRC/RMC C++ API

 

 

The European Organization for Nuclear Research
Feedback and questions concerning this site should be directed to EDG-WP2@cern.ch
Last updated September 9, 2003