Unit 4 GT4 Architecture


UNIT IV

PROGRAMMING MODEL

 

 

7.4.1 Open Source Grid Middleware Packages

7.4.1.1 Grid Standards and APIs

7.4.1.2 Software Support and Middleware

7.4.2 The Globus Toolkit Architecture (GT4)

7.4.2.1 The GT4 Library

7.4.2.2 Globus Job Workflow

7.4.2.3 Client-Globus Interactions

Globus Toolkit4 Architecture

Globus Toolkit Programming Model

 


 

7.4.1 Open Source Grid Middleware Packages

 

§  Many software packages, middleware libraries, and programming environments have been developed for grid computing over the past 15 years.

§  Popular grid middleware packages.

o   BOINC – Berkeley Open Infrastructure for Network Computing.

o   UNICORE – Middleware developed by the German grid computing community.

o   Globus (GT4) – A middleware library jointly developed by Argonne National Laboratory, the University of Chicago, and the USC Information Sciences Institute, funded by DARPA, NSF, and NIH.

o   CGSP – ChinaGrid Support Platform is a middleware library developed by 20 top universities in China.

o   Condor-G – Condor was originally developed at the Univ. of Wisconsin for general distributed computing and was later extended to Condor-G for grid job management.

o   Sun Grid Engine (SGE) – Developed by Sun Microsystems for business grid applications; applied to private grids and local clusters within enterprises or campuses.

7.4.1.1 Grid Standards and APIs

 

§  Grid standards have been developed over the years.

§  Important organizations involved

o   The Open Grid Forum (formerly the Global Grid Forum) and

o   Object Management Group

§  Important standards

o   OGSA (Open Grid Services Architecture),

o   GLUE for resource representation,

o   SAGA (Simple API for Grid Applications),

o   GSI (Grid Security Infrastructure),

o   OGSI (Open Grid Services Infrastructure), and

o   WSRF (Web Services Resource Framework).

 

§  The grid standards have guided the development of several middleware libraries and API tools for grid computing.

§  They are applied in both research grids and production grids today.

o   Research grids tested include the EGEE, France Grilles, D-Grid (German), CNGrid (China), TeraGrid (USA), etc.

o   Production grids built with the standards include the EGEE, INFN grid (Italian), NorduGrid, Sun Grid, Techila, and Xgrid.

 

 


 

7.4.1.2 Software Support and Middleware

 

§  Grid middleware is specifically designed as a layer between the hardware and the software.

§  The middleware products enable

o   the sharing of heterogeneous resources and

o   the management of virtual organizations created around the grid.

§  Middleware glues the allocated resources with specific user applications.

 

§  Popular grid middleware tools include the Globus Toolkit (USA), gLite, UNICORE (Germany), BOINC (Berkeley), CGSP (China), Condor-G, and Sun Grid Engine.

 

 

 

 


 

7.4.2 The Globus Toolkit Architecture (GT4)

 

§  The Globus Toolkit, started in 1995 with funding from DARPA, is an open middleware library for the grid computing communities.

§  These open source software libraries support many operational grids and their applications on an international basis.

§  The toolkit addresses common problems and issues related to grid resource discovery, management, communication, security, fault detection, and portability.

§  The software itself provides a variety of components and capabilities.

§  The library includes a rich set of service implementations.

 

§  The implemented software

o   supports grid infrastructure management,

o   provides tools for building new web services in Java, C, and Python,

o   builds a powerful standard-based security infrastructure and client APIs (in different languages), and

o   offers comprehensive command-line programs for accessing various grid services.

 

§  The Globus Toolkit was initially motivated by a desire to remove obstacles that prevent seamless collaboration, and thus sharing of resources and services, in scientific and engineering applications.

§  The shared resources can be computers, storage, data, services, networks, science instruments (e.g., sensors), and so on.


 

§  The GT4 version of the Globus library is shown conceptually in Figure 7.18.

 

 

 

7.4.2.1 The GT4 Library

 

§  GT4 offers the middle-level core services in grid applications.

§  The high-level services and tools (such as MPI, Condor-G, and Nimrod/G) are developed by third parties for general-purpose distributed computing applications.

§  The local services (such as LSF, TCP, Linux, and Condor) are at the bottom level and are fundamental tools supplied by other developers.

§  As a de facto standard in grid middleware, GT4 is based on industry-standard web service technologies.

 


 

§  Table 7.7 summarizes GT4’s core grid services by module name.

§  Essentially, these functional modules help users to discover available resources, move data between sites, manage user credentials, and so on.

 

 

§  HTTP-based GRAM – Globus Resource Allocation Manager – to locate, submit, monitor, and cancel jobs on Grid computing resources. It provides reliable operation, stateful monitoring, credential management, and file staging.

§  MDS modules – Monitoring and Discovery Services – Distributed access to structure and state information.

§  Nexus is used for collective communications (unicast and multicast).

§  HBM – HeartBeat Monitoring – Monitoring system components of resource nodes.

§  GridFTP – for internode fast file transfers.

§  GASS – Global Access to Secondary Storage – provides a uniform name space (via URLs) and access mechanisms for files that are accessed via different protocols and stored in diverse storage system types (HTTP, FTP, HPSS, DPSS, etc.).

§  GSI – Grid Security Infrastructure – specification for secure communication between software in a grid computing environment.
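The idea behind GASS, one URL name space in front of several access protocols, can be sketched as a dispatch on the URL scheme. This is only an illustration and not the GASS API: the handler functions, the host names, and the function open_remote are hypothetical, and the handlers return a description instead of opening a real connection.

```python
from urllib.parse import urlparse

# Hypothetical per-protocol readers. Real ones would open a connection;
# these only describe what they would fetch so the sketch stays runnable.
def _via_http(url):
    return f"HTTP fetch of {url.netloc}{url.path}"

def _via_ftp(url):
    return f"FTP fetch of {url.netloc}{url.path}"

def _via_local(url):
    return f"local read of {url.path}"

# One table maps each URL scheme to the reader that understands it.
HANDLERS = {"http": _via_http, "ftp": _via_ftp, "file": _via_local}

def open_remote(url_text):
    """Pick a reader from the URL scheme and hand it the parsed URL."""
    url = urlparse(url_text)
    handler = HANDLERS.get(url.scheme)
    if handler is None:
        raise ValueError(f"no handler registered for scheme {url.scheme!r}")
    return handler(url)

print(open_remote("http://site-a.example.org/data/input.dat"))
print(open_remote("file:///scratch/input.dat"))
```

The caller names a file the same way regardless of where it is stored; supporting another storage system would mean registering one more entry in the table.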


 

7.4.2.2 Globus Job Workflow

 

§  Figure 7.19 shows the typical job workflow when using the Globus tools.

 

 

§  A typical job execution sequence proceeds as follows:

1.    The user delegates his credentials to a delegation service.

2.    The user submits a job request to GRAM with the delegation identifier as a parameter.

3.    GRAM parses the request, retrieves the user proxy certificate from the delegation service, and then acts on behalf of the user.

4.    GRAM sends a transfer request to the RFT (Reliable File Transfer), which applies GridFTP to bring in the necessary files.

5.    GRAM invokes a local scheduler via a GRAM adapter.

6.    The SEG (Scheduler Event Generator) initiates a set of user jobs.

7.    The local scheduler reports the job state to the SEG.

8.    Once the job is complete, GRAM uses RFT and GridFTP to stage out the resultant files.

9.    The grid monitors the progress of these operations and sends the user a notification when they succeed, fail, or are delayed.
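The sequence above can be traced with a small toy model. It is a sketch for following the order of the steps, not GT4 code: the class DelegationService, the function run_job, the job dictionary layout, and the user and file names are all hypothetical, and every service is reduced to a log entry.

```python
from dataclasses import dataclass, field

@dataclass
class DelegationService:
    """Toy stand-in: stores a proxy credential under an identifier."""
    _store: dict = field(default_factory=dict)

    def delegate(self, user, proxy_cert):
        # Step 1: the user delegates credentials and gets an identifier back.
        delegation_id = f"deleg-{len(self._store) + 1}"
        self._store[delegation_id] = (user, proxy_cert)
        return delegation_id

    def retrieve(self, delegation_id):
        return self._store[delegation_id]

def run_job(delegation, delegation_id, job):
    """Step 2 onward: walk one job request through the sequence, logging each stage."""
    log = []
    user, _proxy = delegation.retrieve(delegation_id)    # step 3
    log.append(f"GRAM acts on behalf of {user}")
    for name in job["stage_in"]:                         # step 4
        log.append(f"RFT/GridFTP stage-in {name}")
    log.append("GRAM adapter invokes local scheduler")   # step 5
    log.append("SEG: job started")                       # step 6
    log.append("SEG: scheduler reports job done")        # step 7
    for name in job["stage_out"]:                        # step 8
        log.append(f"RFT/GridFTP stage-out {name}")
    log.append(f"notify {user}: succeeded")              # step 9
    return log

delegation = DelegationService()
did = delegation.delegate("alice", "proxy-certificate-bytes")
job = {"stage_in": ["input.dat"], "stage_out": ["result.dat"]}
for entry in run_job(delegation, did, job):
    print(entry)
```

Note that the job request carries only the delegation identifier; the credential itself is fetched by GRAM in step 3, which is what lets GRAM act for the user during staging and execution.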

7.4.2.3 Client-Globus Interactions

 

§  GT4 service programs are designed to support user applications as illustrated in Figure 7.20.

 

 

§  There are strong interactions between provider programs and user code.

§  GT4 makes heavy use of industry-standard web service protocols and mechanisms in service description, discovery, access, authentication, authorization, and the like.

§  User code for GT4 is written mainly in Java, C, and Python.

§  Web service mechanisms define specific interfaces for grid computing.

§  Web services provide flexible, extensible, and widely adopted XML-based interfaces.

§  GT4 provides a set of infrastructure services for accessing, monitoring, managing, and controlling access to infrastructure elements.

 


 

§  The server code in the vertical boxes in Figure 7.20 corresponds to 15 grid services that are in heavy use in the GT4 library.

§  These demand computational, communication, data, and storage resources and a range of end-user tools that provide the higher-level capabilities needed in specific user applications.

§  Wherever possible, GT4 implements standards to facilitate construction of operable and reusable user code.

§  Developers can use these services and libraries to build simple and complex systems quickly.

§  A high-security subsystem addresses message protection, authentication, delegation, and authorization.

 

 

§  GT4 has a set of service implementations and associated client libraries, and it supports both web service and non-WS applications.

§  The horizontal boxes in the client domain denote custom applications and/or third-party tools that access GT4 services.

§  The toolkit programs provide a set of useful infrastructure services.

 

§  Three containers are used to host user-developed services written in Java, Python, and C, respectively.

§  These containers provide implementations of security, management, discovery, state management, and other mechanisms frequently required when building services.

§  They extend open source service hosting environments with support for a range of useful web service specifications, including WSRF, WS-Notification, and WS-Security.

 

§  A set of client libraries allows client programs in Java, C, and Python to invoke operations on both GT4 and user-developed services.

§  In many cases, multiple interfaces provide different levels of control:

§  For example, GridFTP includes

o   a simple command-line client (globus-url-copy),

o   control and data channel libraries for use in programs

o   XIO library for the integration of alternative transports.
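As an illustration of the simplest of these interfaces, the sketch below assembles a globus-url-copy command line from Python without running it. The helper build_copy_command, the host name, and the file paths are hypothetical; -p (number of parallel data streams) and -vb (report transfer performance) are options of globus-url-copy, but check the documentation of the installed version before relying on them.

```python
import shlex

def build_copy_command(source_url, dest_url, parallel_streams=None, verbose=False):
    """Return the argv list for a globus-url-copy transfer (nothing is executed)."""
    argv = ["globus-url-copy"]
    if verbose:
        argv.append("-vb")                     # print transfer performance
    if parallel_streams is not None:
        argv += ["-p", str(parallel_streams)]  # parallel data streams
    argv += [source_url, dest_url]
    return argv

cmd = build_copy_command(
    "gsiftp://gridftp.example.org/data/run1.dat",  # hypothetical GridFTP server
    "file:///tmp/run1.dat",
    parallel_streams=4,
)
print(shlex.join(cmd))
```

On a machine with the client installed and a valid proxy credential, the list could be passed to subprocess.run; programs that need finer control would use the control and data channel libraries instead.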

 

§  The use of uniform abstractions and mechanisms means clients can interact with different services in similar ways, which facilitates construction of complex, interoperable systems and encourages code reuse.

 


 

 

Globus Toolkit4 Architecture:

The Globus Toolkit4 is a collection of many software components which are divided into the following five categories.


 

1.    Security – connections are secured using the Grid Security Infrastructure (GSI).

2.    Information services – also called Monitoring and Discovery Services (MDS), these comprise a collection of components to discover and monitor resources in a virtual organization.

3.    Execution management – deals with the initiation, monitoring, coordination, and management of executable programs in a grid.

4.    Data management – these components allow users to manage large sets of data in a virtual organization.

5.    Common runtime – the Common Runtime components offer a set of fundamental libraries along with tools which are necessary to construct both web services and non-web services.
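The five categories can be summarized as a small lookup table. This is a study aid and not part of the toolkit: the names GT4_CATEGORIES and category_of are hypothetical, and placing GRAM under execution management and GridFTP and RFT under data management is inferred from the descriptions of those components earlier in these notes.

```python
# Components named in these notes, grouped by GT4 category.
GT4_CATEGORIES = {
    "Security": ["GSI"],
    "Information services": ["MDS"],
    "Execution management": ["GRAM"],             # inferred placement
    "Data management": ["GridFTP", "RFT"],        # inferred placement
    "Common runtime": ["Java container", "C container", "Python container"],
}

def category_of(component):
    """Return the category a component belongs to, or None if it is not listed."""
    for category, components in GT4_CATEGORIES.items():
        if component in components:
            return category
    return None

for name in ("GRAM", "GridFTP", "GSI"):
    print(name, "->", category_of(name))
```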