Globus Toolkit Programming Model


 

The Globus Toolkit client libraries provide high-level functions that implement a protocol without requiring the developer to know the protocol in depth. The toolkit mainly provides remote file system commands and data movement commands, along with helper functions that assist in executing these operations.

 

Consider, for example, GridFTP, the de facto standard for data access on the Grid. The GridFTP client library uses an asynchronous programming model. The steps required to use the GridFTP client library in an application are described below.

 

1.    Include headers

2.    Module Activation/Initialization

3.    Handle Setup

4.    Check Features

5.    Set Operation Attributes

6.    Execute the Operation

7.    Module Deactivation and Cleanup

1.    Include headers

All code that makes calls to functions in the GridFTP client library must have the following include:

#include "globus_ftp_client.h"

 

2.    Module Activation/Initialization:

In any Globus Toolkit C code:

§  If you call a module's functions directly, you must activate the module before the first call and deactivate it when you are done.

o   Example: to use XIO for file I/O, you need to call activate and deactivate on XIO.

§  A module activates and deactivates its own dependencies.

o   For instance, Globus XIO and Globus GSI are used by the Globus GridFTP client library, so they are activated automatically when you call activate on the GridFTP client library.

 

result = globus_module_activate(GLOBUS_FTP_CLIENT_MODULE);
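
The pairing rule and the automatic handling of dependencies can be pictured with a small toy model. This is only a sketch under the assumption that activations are reference counted; module_t, module_activate, and module_deactivate are hypothetical names for illustration and are not the Globus implementation or API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a toolkit module; not a Globus type. */
typedef struct module {
    const char     *name;
    struct module **deps;       /* modules this one depends on */
    size_t          n_deps;
    int             ref_count;  /* outstanding activations */
} module_t;

/* Activate a module; on first activation, activate its dependencies too. */
void module_activate(module_t *m)
{
    if (m->ref_count++ > 0) {
        return;                 /* already active: just count the reference */
    }
    for (size_t i = 0; i < m->n_deps; i++) {
        module_activate(m->deps[i]);
    }
}

/* Deactivate a module; the last reference also releases its dependencies. */
void module_deactivate(module_t *m)
{
    assert(m->ref_count > 0);   /* every deactivate must match an activate */
    if (--m->ref_count > 0) {
        return;
    }
    for (size_t i = 0; i < m->n_deps; i++) {
        module_deactivate(m->deps[i]);
    }
}
```

In this model, activating a module that depends on two others brings all three up, and a module you activated yourself stays active until your own matching deactivate call.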

 

3.    Handle Setup

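§  By default, every function call is a completely encapsulated GridFTP session:

o   a control connection is formed,

o   authentication is done,

o   if necessary a data channel is established,

o   the necessary data is transferred,

o   and then everything is torn down.

§  The handle holds the state shared by those calls. Setting its cache_all attribute keeps connections open between operations, so they are not set up again for every call.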

globus_ftp_client_handleattr_init(&handle_attr);

globus_ftp_client_handleattr_set_cache_all(&handle_attr, GLOBUS_TRUE);

globus_ftp_client_handle_init(&handle, &handle_attr);
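
To see why the cache_all attribute is worth setting, here is a toy model that counts how many control connections get opened with and without caching. It is a sketch only: toy_handle_t, toy_handle_init, and toy_operation are hypothetical names, the model remembers a single host for simplicity, and none of it reflects the internals of the Globus handle.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical model of a client handle; not the Globus handle type. */
typedef struct {
    bool cache_all;          /* keep the control connection after each call */
    char cached_host[64];    /* host of the cached connection, "" if none */
    int  connects;           /* control connections opened so far */
} toy_handle_t;

void toy_handle_init(toy_handle_t *h, bool cache_all)
{
    h->cache_all = cache_all;
    h->cached_host[0] = '\0';
    h->connects = 0;
}

/* Run one operation against host, reusing a cached connection if possible. */
void toy_operation(toy_handle_t *h, const char *host)
{
    assert(host[0] != '\0' && strlen(host) < sizeof h->cached_host);
    if (h->cache_all && strcmp(h->cached_host, host) == 0) {
        return;              /* cached connection reused: no new setup */
    }
    h->connects++;           /* connect and authenticate from scratch */
    if (h->cache_all) {
        strcpy(h->cached_host, host);
    }
}
```

Three operations against the same host cost three connection setups on an uncached handle and one on a cached handle, which is the saving the attribute is after.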

 


 

4.    Check Features

§  A good practice is to always verify that the server supports any features you intend to exploit.

§  Checking for features comprises four steps.

§  First, you call globus_ftp_client_features_init.

§  Second, you call globus_ftp_client_feat, which sends the FTP FEAT command to the server, parses the response, and loads the features structure with the results.

§  Third, to access the results, you call globus_ftp_client_is_feature_supported, naming the feature you are interested in; the features are an enumerated type.

§  Fourth, when you are finished, you call globus_ftp_client_features_destroy to free the memory for the structure.

§  Checking features encapsulates an entire GridFTP session.

o   Therefore you have to specify a URL (protocol, host, and port) and a callback function.

§  In code it looks like this:

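/* Assumes handle, operation_attr, features and answer are plain variables,
   so their addresses are passed. */
result = globus_ftp_client_features_init(&features);

result = globus_ftp_client_feat(&handle, url, &operation_attr, &features,
                                complete_cb, callback_args);

/* Wait until complete_cb sets done and signals cond. */
globus_mutex_lock(&lock);
while (!done)
{
    globus_cond_wait(&cond, &lock);
}
globus_mutex_unlock(&lock);

result = globus_ftp_client_is_feature_supported(&features, &answer,
                                GLOBUS_FTP_CLIENT_FEATURE_PARALLELISM);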

 

§  where the complete callback would simply set done to GLOBUS_TRUE and signal cond.

§  Note that answer can be GLOBUS_FTP_CLIENT_TRUE, GLOBUS_FTP_CLIENT_FALSE, or GLOBUS_FTP_CLIENT_MAYBE. MAYBE means that the particular feature was not probed; it does not necessarily indicate an error.
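
The wait loop above is the usual "start, then block until the callback fires" pattern. The sketch below reproduces it with plain POSIX threads so it can be run without the toolkit. The names monitor_t, worker, and start_and_wait are made up for this illustration, and the worker thread merely stands in for whatever thread the library would use to invoke the callback.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* State shared between the caller and the completion callback. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    bool            done;
} monitor_t;

/* Plays the role of complete_cb: mark completion and wake the waiter. */
void complete_cb(monitor_t *m)
{
    assert(m != NULL);
    pthread_mutex_lock(&m->lock);
    m->done = true;
    pthread_cond_signal(&m->cond);
    pthread_mutex_unlock(&m->lock);
}

/* Stand-in for the library thread that finishes the operation. */
void *worker(void *arg)
{
    complete_cb((monitor_t *)arg);
    return NULL;
}

/* Start the "operation", then block until the callback reports completion. */
bool start_and_wait(void)
{
    monitor_t m;
    pthread_t tid;
    bool      finished;

    pthread_mutex_init(&m.lock, NULL);
    pthread_cond_init(&m.cond, NULL);
    m.done = false;

    if (pthread_create(&tid, NULL, worker, &m) != 0) {
        pthread_cond_destroy(&m.cond);
        pthread_mutex_destroy(&m.lock);
        return false;
    }

    pthread_mutex_lock(&m.lock);
    while (!m.done) {               /* loop guards against spurious wakeups */
        pthread_cond_wait(&m.cond, &m.lock);
    }
    finished = m.done;
    pthread_mutex_unlock(&m.lock);

    pthread_join(tid, NULL);
    pthread_cond_destroy(&m.cond);
    pthread_mutex_destroy(&m.lock);
    return finished;
}
```

The flag is always read and written with the lock held, and the condition is rechecked in a loop, so it does not matter whether the callback runs before or after the caller starts waiting. On older systems this may need to be compiled with -pthread.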

 

5.    Set Operation Attributes

 

§  Once you know what features are supported by the servers you are using, you can configure any necessary attributes for the various operations.

§  The operation attributes can be separated into two categories:

o   data movement options and

o   security options.

§  All the function calls have the same form:

globus_ftp_client_operationattr_[set|get]_<attribute>

 

§  The set variant changes the value of the attribute, and the get variant returns the current value of the attribute.

§  The data movement attributes are as follows:

o   type: This sets the file type to either ASCII or Image (binary).

o   mode: GridFTP supports two modes:

§  stream (MODE S)

·      The default mode. The file is sent over the wire as a single ordered sequence of bytes.

§  extended block (MODE E)

·      A GridFTP extension. This mode sends the data in blocks, each with eight bits of flags and a 64-bit offset and length prepended.

o   parallelism: This specifies the number of TCP streams that should be opened between each pair of network endpoints.

o   tcp_buffer: This is another important performance attribute. TCP keeps at most one buffer's worth of unacknowledged data in flight, so a buffer that is too small caps your throughput regardless of link speed. For a fuller explanation, please see last month's "On the Grid" article on GridFTP performance tuning.
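
To make the MODE E description concrete, here is a sketch of how such a block header could be laid out in memory. The field order (flags, then length, then offset) and the big-endian byte order are assumptions taken from my reading of the GridFTP protocol specification, not from anything stated above, and eb_header_encode and the helper names are hypothetical. The client library builds these headers for you; you would never write this yourself in an application.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EB_HEADER_LEN 17  /* 1 flag byte + 8-byte length + 8-byte offset */

/* Store a 64-bit value in big-endian (network) byte order. */
void put_u64_be(uint8_t *out, uint64_t v)
{
    for (int i = 7; i >= 0; i--) {
        out[i] = (uint8_t)(v & 0xff);
        v >>= 8;
    }
}

/* Read back a 64-bit big-endian value. */
uint64_t get_u64_be(const uint8_t *in)
{
    uint64_t v = 0;
    for (int i = 0; i < 8; i++) {
        v = (v << 8) | in[i];
    }
    return v;
}

/* Write the header that precedes one data block. */
void eb_header_encode(uint8_t out[EB_HEADER_LEN],
                      uint8_t flags, uint64_t length, uint64_t offset)
{
    assert(out != NULL);
    out[0] = flags;
    put_u64_be(out + 1, length);   /* number of data bytes that follow */
    put_u64_be(out + 9, offset);   /* position of those bytes in the file */
}
```

Because every block carries its own offset, blocks can arrive in any order over any stream and still be written to the right place, which is why parallelism depends on this mode.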

 

§  Security is a critical aspect of any Grid application.

§  GridFTP provides a wide range of security options.

o   Both the authentication and the protection level may be set on the control channel and the data channel.

§  GridFTP offers three modes of authentication through GSS:

o   NONE: no authentication is performed.

o   SELF: the server should be running under your credentials.

o   SUBJECT: you specify the subject name that you expect the server to authenticate with.

 

§  Protection determines how the data is safeguarded while it is in transit.

§  Standard security measures define four levels of protection:

o   CLEAR indicates no protection of any kind.

o   SAFE indicates that the data is integrity checked (using a checksum).

o   CONFIDENTIAL means that the data is encrypted.

o   PRIVATE means that the data is both encrypted and integrity checked.

 

For example, if you wish to set MODE E (required for parallelism), the parallelism to four streams, and your TCP buffer size to 2 MB per stream, the code would look like this:

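globus_ftp_client_operationattr_init(&attr);

/* Four parallel streams */
parallelism.mode = GLOBUS_FTP_CONTROL_PARALLELISM_FIXED;
parallelism.fixed.size = 4;

/* 2 MB TCP buffer per stream (tcp_buffer is a globus_ftp_control_tcpbuffer_t) */
tcp_buffer.mode = GLOBUS_FTP_CONTROL_TCPBUFFER_FIXED;
tcp_buffer.fixed.size = 2 * 1024 * 1024;

globus_ftp_client_operationattr_set_mode(&attr,
         GLOBUS_FTP_CONTROL_MODE_EXTENDED_BLOCK);
globus_ftp_client_operationattr_set_parallelism(&attr, &parallelism);
globus_ftp_client_operationattr_set_tcp_buffer(&attr, &tcp_buffer);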
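
Where might a figure like 2 MB per stream come from? A common starting point is the bandwidth-delay product (BDP): the amount of data that must be in flight to keep the link full. The sketch below works that out. Splitting the BDP evenly across streams is a rough rule of thumb I am assuming here, not something the text above prescribes, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Bandwidth-delay product in bytes: the data in flight on a full pipe. */
uint64_t bdp_bytes(uint64_t bandwidth_bits_per_sec, uint64_t rtt_ms)
{
    return bandwidth_bits_per_sec / 8 * rtt_ms / 1000;
}

/* Buffer each stream needs if the pipe is shared evenly by the streams. */
uint64_t per_stream_buffer(uint64_t bdp, unsigned streams)
{
    assert(streams > 0);
    return (bdp + streams - 1) / streams;   /* round up */
}
```

For example, a 1 Gbit/s path with a 64 ms round trip gives bdp_bytes(1000000000, 64) = 8,000,000 bytes, and per_stream_buffer(8000000, 4) = 2,000,000 bytes, or roughly 2 MB for each of four streams. Treat the result as a first guess to be tuned by measurement.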

 


 

6.    Execute the Operation

§  We are now ready to move some data.

§  The asynchronous nature of the Globus Toolkit comes into play here again.

§  To start, initiate the control channel protocol exchange by calling globus_ftp_client_put.

§  This call sends all of the necessary commands to the server, which then waits for data to arrive on the data channel.

§  Next, read the data off the disk and send it to the waiting server.

§  This is handled by globus_ftp_client_register_read or, as in this case, globus_ftp_client_register_write.

§  When register_write finishes writing a block, its callback function checks whether the file is at EOF. If not, it reads another block and calls register_write again.

§  The process continues until the entire file has been moved.

result = globus_ftp_client_put(
     &handle,        /* the handle we initialized above */
     dst,            /* the URL of the destination */
     &attr,          /* the operation attribute structure */
     GLOBUS_NULL,    /* restart markers, if any */
     done_cb,        /* callback when transfer is complete */
     GLOBUS_NULL);   /* an optional argument to the callback */

 

globus_ftp_client_register_write(
     &handle,        /* the handle to our session */
     buffer,         /* the data to send to the server */
     length,         /* length of the data */
     global_offset,  /* offset in file where this should be written */
     feof(fd),       /* are we at EOF? */
     data_cb,        /* function to call when write is complete */
     GLOBUS_NULL);   /* argument to the callback */
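
The read-a-block, write-a-block cycle is easier to follow when written as a plain loop. The sketch below does that with standard C I/O only. The sink callback is a hypothetical stand-in for register_write (it does not have the Globus signature), the tiny BLOCK_SIZE is chosen just to make the behavior visible, and the loop runs synchronously where the real library would chain callbacks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

#define BLOCK_SIZE 4

/* Hypothetical sink standing in for register_write. */
typedef void (*sink_fn)(const unsigned char *buf, size_t len,
                        long offset, bool eof, void *arg);

/* Example sink: tallies what it was given. */
typedef struct {
    size_t total;        /* bytes received so far */
    long   last_offset;  /* file offset of the most recent block */
    bool   saw_eof;      /* whether the last block was flagged EOF */
} tally_t;

void tally_write(const unsigned char *buf, size_t len,
                 long offset, bool eof, void *arg)
{
    tally_t *t = arg;
    (void)buf;
    t->total += len;
    t->last_offset = offset;
    t->saw_eof = eof;
}

/* Read fp block by block, handing each block and its offset to sink.
 * Returns the number of blocks handed over. */
int send_file(FILE *fp, sink_fn sink, void *arg)
{
    unsigned char buf[BLOCK_SIZE];
    long offset = 0;
    int  blocks = 0;
    bool eof = false;

    assert(fp != NULL && sink != NULL);
    while (!eof) {
        size_t n = fread(buf, 1, sizeof buf, fp);
        /* A short read (end of file or read error) ends the transfer. */
        eof = feof(fp) != 0 || n < sizeof buf;
        sink(buf, n, offset, eof, arg);
        offset += (long)n;
        blocks++;
    }
    return blocks;
}
```

Each block is tagged with the offset at which it belongs, and the final block carries the EOF flag, mirroring the global_offset and feof(fd) arguments in the call above.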

 

7.    Module Deactivation and Cleanup

After the work is done, cleanup remains.

1.    Free any buffers that you allocated or that were allocated for you by calls such as globus_error_get() and globus_error_print_friendly().

2.    Destroy anything that you initialized, and then deactivate the module:

globus_ftp_client_operationattr_destroy(&attr);   /* initialized in step 5 */
globus_ftp_client_handleattr_destroy(&handle_attr);
globus_ftp_client_handle_destroy(&handle);
globus_module_deactivate(GLOBUS_FTP_CLIENT_MODULE);

 

3.    A shortcut that deactivates every module that is still active in a single call:

globus_module_deactivate_all();

 

 

 


Anu