Technology Programme

Ari-Pekka Hameri

Owing to the increased momentum around Grid computing technologies, the DataGrid project became the sole project of the Technology Programme in 2002. The Programme seeks synergy by combining all research efforts to build one core technology base, which then serves the Programme's four focus areas. The first focus area is the contribution to the European DataGrid (EDG) project in the form of Grid security solutions: the research team has built a flexible authentication and authorization framework for various Grid services. The second focus area concerns cluster activities, where the Programme has given users outside physics the possibility to test emerging Grid technologies. These computing resources have also been successfully connected to the NorduGrid pool of clusters around Scandinavia. In the third focus area, on the physics side, the Programme is actively working with the LHC Computing Grid (LCG) and NorduGrid to provide high-level user interfaces to lower-level Grid resources. The fourth focus area involves industrial collaboration and technology transfer. The Programme has succeeded in forming the first industrial Grid research collaboration with academic and industrial partners from Finland. In addition to Grid development, the Programme is playing a crucial role in executing a large-scale survey on the learning benefits that CERN's supplier companies gain from interacting with the Big Science centre. A sample covering suppliers from six European countries has been identified and surveyed; the analysis is underway, and the first results are expected in spring 2003.

Data security

The main HIP contribution to the European Union DataGrid (EDG) project has been participating in and leading the Security Task under the Data Management work package. This task provides security solutions for the whole EDG consortium. Its main achievement was the completion of the Grid Security Infrastructure (GSI) authentication module prototype for the CERN Advanced Storage System (CASTOR), which incorporates high-capacity tape storage. The GSI is a Public Key Infrastructure (PKI) implementation used to verify the identity of users in a globally distributed Grid. Following the completion of this module, support and further development of the software were handed over to the Mass Storage Management work package of the EDG project.

Other vital applications have also been developed based on the GSI library. One of these is the TrustManager, which handles access, user and data authentication on both the client and the server side. The application is nearing completion, and a new certificate delegation mechanism is under development in order to improve both performance and security. The TrustManager is being integrated into the Data Management software produced in the work package. Later on, this application will be further integrated with the other work packages of the EDG project, notably with the information and monitoring software. In this manner, the EDG project will deliver the overall DataGrid prototype by the end of 2003.

At the same time, the authorization framework has been developed based on a modularized architecture. This was done in co-operation with the Swedish Research Council. The modularization allows great flexibility in interfacing to several kinds of storage for authorization information by changing the access module. It also makes it possible to use the same authorization mechanism in different environments by simply changing the thin interface module. This authorization package is also being integrated into the Data Management software.
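The modular split described above can be illustrated with a minimal sketch. It is not the EDG authorization package itself: all class names, function names, the text policy format and the example subjects are hypothetical, chosen only to show how an exchangeable access module and an exchangeable thin interface module can surround one unchanged decision mechanism.

```python
from abc import ABC, abstractmethod
from typing import Dict, Iterable, List, Set


class PolicyStore(ABC):
    """Access module (hypothetical): hides where authorization data is kept."""

    @abstractmethod
    def permissions_for(self, subject: str) -> Set[str]:
        """Return the set of actions the subject may perform."""


class InMemoryPolicyStore(PolicyStore):
    """Keeps authorization information in a plain dictionary."""

    def __init__(self, table: Dict[str, Iterable[str]]) -> None:
        self._table = {subject: set(perms) for subject, perms in table.items()}

    def permissions_for(self, subject: str) -> Set[str]:
        return set(self._table.get(subject, ()))


class TextPolicyStore(PolicyStore):
    """Parses lines of the assumed form 'subject: action1, action2'."""

    def __init__(self, text: str) -> None:
        self._table: Dict[str, Set[str]] = {}
        for line in text.splitlines():
            line = line.strip()
            if not line or line.startswith("#"):
                continue  # skip blank lines and comments
            subject, _, perms = line.partition(":")
            self._table[subject.strip()] = {
                p.strip() for p in perms.split(",") if p.strip()
            }

    def permissions_for(self, subject: str) -> Set[str]:
        return set(self._table.get(subject, ()))


class Authorizer:
    """Core mechanism: identical regardless of storage or environment."""

    def __init__(self, store: PolicyStore) -> None:
        self._store = store

    def is_allowed(self, subject: str, action: str) -> bool:
        return action in self._store.permissions_for(subject)


def http_style_interface(authorizer: Authorizer, request: Dict[str, str]) -> int:
    """Thin interface module for a web-like environment (status code result)."""
    allowed = authorizer.is_allowed(request["user"], request["action"])
    return 200 if allowed else 403


def cli_style_interface(authorizer: Authorizer, argv: List[str]) -> str:
    """Thin interface module for a command-line environment (text result)."""
    subject, action = argv
    return "granted" if authorizer.is_allowed(subject, action) else "denied"
```

In this sketch, replacing InMemoryPolicyStore with TextPolicyStore changes only where the authorization information comes from, and switching between the two interface functions changes only how a request is presented; the Authorizer class is untouched in both cases.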

As the leader of the security task, the Technology Programme has actively participated in security-related meetings, workshops and conferences throughout the year. HIP has also joined the Liberty Alliance consortium, which is developing open standards for single sign-on and federated network identity.

Cluster activities and NorduGrid

The Technology Programme continues to operate and maintain a small Linux cluster in Otaniemi, Finland. The cluster is funded by the Magnus Ehrnrooth Foundation through a three-year project that started in 2001. The focus has been on developing the cluster to evaluate different Grid technologies, training younger scientists and giving researchers outside physics the opportunity to exploit Grid technologies. The cluster currently comprises 10 computing nodes with a fully automatic installation (FAI) system that simultaneously supports two mainstream Linux distributions. As for applications in sciences other than physics, the cluster has been used for growing virtual cattle by a research group from the University of Zaragoza. In a similar manner, the Finnish Environment Institute (SYKE) has been preparing to test the cluster for the analysis of satellite remote sensing data and for studying the error characteristics of real measurements concerning the health of Finnish lakes.

At the Scandinavian level, the cluster has been linked to the NorduGrid resource network. NorduGrid users can now see the computing resources of the Technology Programme, and in this way the first steps towards Scandinavia-wide Grid resources have been taken.

The cluster has also been used as a forum to train young students. Altogether eight students have been trained to be fully competent in maintaining Grid technologies and supporting users who want to exploit them. This work is vital, as in the near future there will be a growing demand for skilled people to operate clusters, not only for the needs of the LHC project but also for the apparently growing demand from other sciences.

CMS software, Physics and LHC Grid

Several CMS software packages (e.g. CMSIM, ORCA, OSCAR) and other physics software (e.g. ROOT, LHCXX) have been made available through the nodes of the cluster. This was done in collaboration with researchers in the LHC Programme at HIP. At the same time the NorduGrid tools have been used to test the submission of physics analysis jobs. The results produced with simulations, like the beam-beam effect in the LHC, were much appreciated by the physicists at CERN.

The Programme has successfully demonstrated a NorduGrid-compliant Grid portal prototype, which can be used as a user-friendly interface to Grid resources. Past experience from the development of web-based user interfaces and portals will be exploited when developing the new Grid portal further with GridBlocks components. GridBlocks, the Grid building blocks, form a coherent and easy-to-use package containing EDG Grid security solutions and several value-added Grid application components. It has been disseminated and managed as Open Source software through the SourceForge service (http://gridblocks.sourceforge.net) since November 2002.

The Mobile Analyzer concept was generalized during the year to allow the operation of applications other than those used in physics. The application has already been used for the distributed aggregation calculations of OLAP databases and for neural network based distribution optimizations. The software has since been moved into the GridBlocks framework, where the work continues to provide Grid users with a flexible web service based on a distributed agent tool. Interoperation between the Grid portal and the Mobile Analyzer has also been demonstrated during the year.

Industrial collaboration

The increased interest in Grid technologies, combined with the emerging consensus about the basic structure of Grid solutions, has created a suitable environment for the exchange and testing of ideas between academia and industry. Following this trend, the Technology Programme has been active in disseminating information about Grid technologies and their development to industrial parties outside the scientific world. A contact network with industry has been established through public presentations and numerous face-to-face visits to companies in Finland and in Switzerland.

The main result of these discussions has been the formation of a domestic project consortium with three academic and six industrial participants. The aim of the project is to share know-how on Grid technologies with industry and to give industrial companies the possibility to test the technologies in practice. Special focus is being placed on reviewing the impact of the Grid on the current and near-future solutions provided by companies focusing on information and communications technologies (ICT). In addition to the technological solutions, the study also aims to gain an understanding of the possible effect of the Grid on the business models of various players in the ICT cluster. Other activities that have strengthened technology transfer around the Grid are the following:

As always, the summer student programme of HIP was well exploited by the group. Eight new university students from Finland and Denmark, together with four students from the previous year, took part in a collaboration with the initiative for nanotechnologies and contributed to Grid software development work. The results of this collaboration are now being used as an example in planning CERN's OpenLab student programme. One of the results of this collaboration is the OpenLogbook, an e-science application to keep a record of synchronous multimedia and metadata management for easy retrieval and storage. The Technology Programme continues to work together with CERN OpenLab to facilitate similar technology spin-offs on a larger scale.