Globus Toolkit Programming Model
The Globus Toolkit client libraries provide high-level commands that implement a protocol without requiring the developer to have in-depth knowledge of that protocol. The toolkit mainly provides remote file system commands and data movement commands, along with helper functions that assist in executing these operations.

Consider, for example, GridFTP, the de facto standard for data access in Grid environments. The GridFTP client library uses an asynchronous programming model: operations return immediately and report completion through a callback. The steps required to use the GridFTP client library in an application are described below.
1. Include headers
2. Module Activation/Initialization
3. Handle Setup
4. Check Features
5. Set Operation Attributes
6. Execute the Operation
7. Module Deactivation and Cleanup

1. Include headers

All code that calls functions in the GridFTP client library must have the following include:

    #include "globus_ftp_client.h"

2. Module Activation/Initialization

In any Globus Toolkit C code:
- If you make a direct call to a module's functions, you must activate that module before use and deactivate it afterwards.
  - Example: to use XIO directly for file I/O, you need to call activate and deactivate on XIO.
- A module activates and deactivates its own dependencies.
  - For instance, Globus XIO and Globus GSI are used by the Globus GridFTP client library. They are activated automatically when you call activate on the GridFTP client library.

    result = globus_module_activate(GLOBUS_FTP_CLIENT_MODULE);
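
To make the dependency rule concrete, here is a minimal sketch in plain C of one way a module system can activate dependencies automatically, using reference counts. The module_t type and the module_activate/module_deactivate functions are hypothetical stand-ins written for this post; they are not Globus types and are not meant to describe how the toolkit is implemented internally.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a toolkit module; not a Globus type. */
typedef struct module {
    const char     *name;
    struct module **deps;       /* modules this one relies on */
    size_t          dep_count;
    int             ref_count;  /* number of outstanding activations */
} module_t;

/* Activate a module. On the first activation, bring up its dependencies. */
static int module_activate(module_t *m)
{
    assert(m != NULL);
    if (m->ref_count == 0) {
        for (size_t i = 0; i < m->dep_count; i++) {
            module_activate(m->deps[i]);
        }
    }
    m->ref_count++;
    return 0;
}

/* Deactivate a module. On the last deactivation, release its dependencies
 * in reverse order. Returns -1 if the module was not active. */
static int module_deactivate(module_t *m)
{
    assert(m != NULL);
    if (m->ref_count == 0) {
        return -1;
    }
    m->ref_count--;
    if (m->ref_count == 0) {
        for (size_t i = m->dep_count; i > 0; i--) {
            module_deactivate(m->deps[i - 1]);
        }
    }
    return 0;
}
```

With this scheme, code that calls a dependency directly still activates it explicitly; that simply adds one more reference, so the dependency stays up until both users have deactivated it.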

3. Handle Setup

- By default, every function call is a completely encapsulated GridFTP session:
  - a control connection is formed,
  - authentication is done,
  - if necessary, a data channel is established,
  - the necessary data is transferred,
  - and then everything is torn down.
- Enabling cache_all on the handle attributes, as below, asks the library to keep control connections open so that later operations on the same handle can reuse them.

    globus_ftp_client_handleattr_init(&handle_attr);
    globus_ftp_client_handleattr_set_cache_all(&handle_attr, GLOBUS_TRUE);
    globus_ftp_client_handle_init(&handle, &handle_attr);

4. Check Features

- It is good practice to verify that the server supports any feature you intend to use.
- Checking for features takes four steps:
  1. Call init to set up the features structure.
  2. Call the features function, which sends the FTP FEAT command to the server, parses the response, and loads the structure with the results.
  3. To access the results, call is_feature_supported, naming the feature you are interested in; the features are an enumerated type.
  4. When you are finished, call features_destroy to free the memory for the structure.
- Checking features encapsulates an entire GridFTP session.
  - Therefore you have to specify a URL (protocol, host, and port) and a callback function.
- In code it looks like this:
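
    /* Assumes handle, operation_attr, features, and answer are declared
       as objects (not pointers), so their addresses are passed. */
    result = globus_ftp_client_features_init(&features);
    result = globus_ftp_client_feat(&handle, url, &operation_attr, &features,
                                    complete_cb, callback_args);

    /* Block until complete_cb sets done; hold the lock around the wait. */
    globus_mutex_lock(&lock);
    while (!done)
    {
        globus_cond_wait(&cond, &lock);
    }
    globus_mutex_unlock(&lock);

    result = globus_ftp_client_is_feature_supported(&features, &answer,
                 GLOBUS_FTP_CLIENT_FEATURE_PARALLELISM);
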
- Here the complete callback simply sets done to TRUE and signals cond.
- Note that answer can be GLOBUS_FTP_CLIENT_TRUE, GLOBUS_FTP_CLIENT_FALSE, or GLOBUS_FTP_CLIENT_MAYBE. MAYBE means that the particular feature was not probed; it does not necessarily indicate an error.
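
The wait-for-callback pattern used in this step can be sketched with plain POSIX threads, which offer the same mutex and condition variable model. This is only an illustration under stated assumptions: monitor_t, monitor_init, monitor_wait, and this simplified complete_cb signature are hypothetical names chosen for the sketch, not part of the Globus API (a real Globus completion callback receives the handle and an error object as well).

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: the shared state between the caller and the callback. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    bool            done;
} monitor_t;

static void monitor_init(monitor_t *m)
{
    assert(m != NULL);
    pthread_mutex_init(&m->lock, NULL);
    pthread_cond_init(&m->cond, NULL);
    m->done = false;
}

/* Stand-in for the completion callback: mark the operation as finished
 * and wake up whoever is waiting. The flag is changed under the lock. */
static void complete_cb(void *user_arg)
{
    monitor_t *m = user_arg;
    assert(m != NULL);
    pthread_mutex_lock(&m->lock);
    m->done = true;
    pthread_cond_signal(&m->cond);
    pthread_mutex_unlock(&m->lock);
}

/* Block the caller until the callback has run. The loop guards against
 * spurious wakeups and against the callback firing before we start waiting. */
static void monitor_wait(monitor_t *m)
{
    assert(m != NULL);
    pthread_mutex_lock(&m->lock);
    while (!m->done) {
        pthread_cond_wait(&m->cond, &m->lock);
    }
    pthread_mutex_unlock(&m->lock);
}
```

The important design point is that done is only read and written while the lock is held. If the callback runs before the caller reaches monitor_wait, the flag is already set and the wait returns immediately instead of missing the signal.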

5. Set Operation Attributes

- Once you know which features the servers you are using support, you can configure the attributes for each operation.
- The operation attributes fall into two categories:
  - data movement options, and
  - security options.
- All the function calls have the same form: globus_ftp_client_operationattr_[set|get]_<attribute>.
- The set variant changes the value of the attribute, and the get variant returns its current value.
- The data movement attributes are as follows:
  - type: sets the file type to either ASCII or Image (binary).
  - mode: GridFTP supports two modes:
    - stream (MODE S): the default mode. The file is sent over the wire as an ordered sequence of bytes.
    - extended block (MODE E): a GridFTP extension. This mode sends the data in blocks, each with eight bits of flags, a 64-bit offset, and a 64-bit length prepended.
  - parallelism: specifies the number of TCP streams to open between each pair of network endpoints.
  - tcp_buffer: another important performance attribute. If the TCP buffer is not large enough, it will limit your throughput. For an explanation of why that is so, see the earlier "On the Grid" article on GridFTP performance tuning.
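
As a rough guide to choosing tcp_buffer, a common rule of thumb (not taken from this post) is the bandwidth-delay product: a TCP connection can have at most one buffer's worth of data in flight per round trip, so the buffer should be about bandwidth times round-trip time. The sketch below computes that value; bdp_bytes and per_stream_buffer are hypothetical helper names, and splitting the total evenly across parallel streams is a simplifying assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Bandwidth-delay product in bytes:
 * (bits per second / 8) * (round-trip time in seconds). */
static uint64_t bdp_bytes(uint64_t bandwidth_bits_per_sec, uint32_t rtt_ms)
{
    return bandwidth_bits_per_sec / 8 * rtt_ms / 1000;
}

/* Share of the total buffer that each parallel stream needs,
 * rounded up so the streams together cover the whole product. */
static uint64_t per_stream_buffer(uint64_t total_bytes, unsigned streams)
{
    assert(streams > 0);
    return (total_bytes + streams - 1) / streams;
}
```

For example, a 1 Gbit/s path with a 64 ms round-trip time gives bdp_bytes(1000000000, 64) = 8,000,000 bytes, and per_stream_buffer(8000000, 4) = 2,000,000 bytes, which is roughly 2 MB per stream with four streams.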
- Security is a critical aspect of any Grid application, and GridFTP provides a wide range of security options.
  - Both the authentication and the protection level may be set on the control channel and the data channel.
- GridFTP offers three modes of authentication through GSS:
  - NONE: no authentication is performed.
  - SELF: the server is expected to be running under your credentials.
  - SUBJECT: lets you specify the subject name that you expect the server to authenticate with.
- Protection relates to verification and confidentiality of the data. Standard security measures define four levels of protection:
  - CLEAR: no protection of any kind.
  - SAFE: the data is integrity checked (using a checksum).
  - CONFIDENTIAL: the data is encrypted.
  - PRIVATE: the data is both encrypted and integrity checked.

For example, if you wish to set MODE E (required for
parallelism), the parallelism to four streams, and your TCP buffer size to 2 MB
per stream, the code would look like this:
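
    globus_ftp_client_operationattr_init(&attr);
    parallelism.mode = GLOBUS_FTP_CONTROL_PARALLELISM_FIXED;
    parallelism.fixed.size = 4;               /* four TCP streams */
    tcp_buffer.mode = GLOBUS_FTP_CONTROL_TCPBUFFER_FIXED;
    tcp_buffer.fixed.size = 2 * 1024 * 1024;  /* 2 MB per stream */
    globus_ftp_client_operationattr_set_mode(&attr,
        GLOBUS_FTP_CONTROL_MODE_EXTENDED_BLOCK);
    globus_ftp_client_operationattr_set_parallelism(&attr, &parallelism);
    globus_ftp_client_operationattr_set_tcp_buffer(&attr, &tcp_buffer);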
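
MODE E is required for parallelism because each block carries its own offset, so blocks arriving out of order on different streams can still be written to the right place in the file. The sketch below packs and unpacks a block header with the fields described earlier (eight bits of flags, a 64-bit length, a 64-bit offset). The field order, the big-endian encoding, and all names here (block_header_t, block_header_pack, block_header_unpack) are assumptions made for illustration; consult the GridFTP protocol specification for the authoritative wire format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 1 byte of flags + 8 bytes of length + 8 bytes of offset (assumed layout). */
#define BLOCK_HEADER_SIZE 17

/* Hypothetical in-memory view of one extended-block header. */
typedef struct {
    uint8_t  flags;   /* descriptor bits; their meaning is not modeled here */
    uint64_t length;  /* number of payload bytes that follow the header */
    uint64_t offset;  /* position of this block within the file */
} block_header_t;

/* Write a 64-bit value in big-endian (network) byte order. */
static void put_u64_be(uint8_t *out, uint64_t v)
{
    for (int i = 7; i >= 0; i--) {
        out[i] = (uint8_t)(v & 0xff);
        v >>= 8;
    }
}

/* Read a 64-bit big-endian value. */
static uint64_t get_u64_be(const uint8_t *in)
{
    uint64_t v = 0;
    for (int i = 0; i < 8; i++) {
        v = (v << 8) | in[i];
    }
    return v;
}

static void block_header_pack(const block_header_t *h,
                              uint8_t out[BLOCK_HEADER_SIZE])
{
    assert(h != NULL && out != NULL);
    out[0] = h->flags;
    put_u64_be(out + 1, h->length);
    put_u64_be(out + 9, h->offset);
}

static void block_header_unpack(const uint8_t in[BLOCK_HEADER_SIZE],
                                block_header_t *h)
{
    assert(h != NULL && in != NULL);
    h->flags  = in[0];
    h->length = get_u64_be(in + 1);
    h->offset = get_u64_be(in + 9);
}
```

A receiver built along these lines would read 17 bytes, unpack them, then read length bytes of payload and write them at offset, regardless of which stream delivered the block.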

6. Execute the Operation

- We are now ready to move some data, and the asynchronous nature of the Globus Toolkit comes into play again.
- First, initiate the control channel protocol exchange by calling globus_ftp_client_put. This call sends all of the necessary commands to the server, and the server then waits for data to arrive on the data channel.
- Next, read the data off the disk and send it to the waiting server. This is handled by register_read for a download or, in this case, register_write.
- When register_write finishes writing a block, its callback checks whether the file is at EOF; if not, it reads another block and calls register_write again.
- The process continues until the entire file has been moved.

    result = globus_ftp_client_put(
        &handle,       /* the handle we initialized above */
        dst,           /* the URL of the destination */
        &attr,         /* the operation attribute structure */
        GLOBUS_NULL,   /* restart markers, if any */
        done_cb,       /* callback when the transfer is complete */
        0);            /* an optional argument to the callback */

    globus_ftp_client_register_write(
        &handle,       /* the handle to our session */
        buffer,        /* the data to send to the server */
        length,        /* length of the data */
        global_offset, /* offset in the file where this should be written */
        feof(fd),      /* are we at EOF? */
        data_cb,       /* function to call when the write is complete */
        GLOBUS_NULL);  /* argument to the callback */
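
The callback-driven loop described in this step can be sketched without the toolkit. In the stand-in below, fake_register_write plays the role of register_write: it "sends" a block by copying it into a destination buffer and then invokes the callback, which queues the next block until EOF. Everything here (transfer_t, fake_register_write, send_next_block, BLOCK_SIZE, and the simplified data_cb signature) is hypothetical and written for illustration; the real library performs the write asynchronously and calls back later, possibly on another thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define BLOCK_SIZE 4  /* tiny block size so the loop is easy to follow */

typedef struct transfer transfer_t;
typedef void (*write_cb_t)(transfer_t *t, size_t length, bool eof);

/* Hypothetical transfer state: a source "file" and a destination buffer
 * standing in for the remote server. */
struct transfer {
    const char *src;
    size_t      src_len;
    size_t      offset;    /* next source offset to read */
    char       *dst;
    size_t      dst_cap;
    int         writes;    /* number of blocks sent so far */
    bool        finished;
};

/* Stand-in for register_write: deliver one block, then report completion. */
static void fake_register_write(transfer_t *t, const char *buf, size_t len,
                                size_t offset, bool eof, write_cb_t cb)
{
    assert(offset + len <= t->dst_cap);
    memcpy(t->dst + offset, buf, len);
    t->writes++;
    cb(t, len, eof);
}

static void send_next_block(transfer_t *t);

/* Write-complete callback: stop at EOF, otherwise queue the next block. */
static void data_cb(transfer_t *t, size_t length, bool eof)
{
    (void)length;
    if (eof) {
        t->finished = true;
        return;
    }
    send_next_block(t);
}

/* Read the next block from the source and register it for writing. */
static void send_next_block(transfer_t *t)
{
    size_t remaining = t->src_len - t->offset;
    size_t len = remaining < BLOCK_SIZE ? remaining : BLOCK_SIZE;
    size_t offset = t->offset;

    t->offset += len;
    fake_register_write(t, t->src + offset, len, offset,
                        t->offset == t->src_len, data_cb);
}
```

Calling send_next_block once starts the chain; each completed write triggers the next one, so no explicit loop appears in the caller. Because this stand-in calls back synchronously, the chain is recursive, which is fine for a sketch but is one reason a real asynchronous library defers its callbacks.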

7. Module Deactivation and Cleanup

After the work is done, cleanup remains.
1. Free any buffers that you allocated or that were allocated for you by calls such as globus_error_get() and globus_error_print_friendly().
2. Destroy anything that you initialized, and then deactivate the module:

    globus_ftp_client_operationattr_destroy(&operation_attr);
    globus_ftp_client_handleattr_destroy(&handle_attr);
    globus_ftp_client_handle_destroy(&handle);
    globus_module_deactivate(GLOBUS_FTP_CLIENT_MODULE);

3. As a shortcut, you can deactivate every active module, and release the resources those modules hold, with a single call:

    globus_module_deactivate_all();

Anu