OSCAR: Monthly Chairman Bulletin

We had many summer projects this year:

* '''NFS Mountpoints in OSCAR, by Paul Greidanus'''. OSCAR currently has no option for non-“standard” NFS mount points. This project would extend OSCAR’s native NFS mount points to include all NFS servers mounted on the head node and clone that configuration onto the node image. It would also investigate automounters and clone the automount configuration as well.
* '''Implementation and Integration of a Universal Monitoring Framework, by Okoye Chuka'''. Currently, OSCAR can install a cluster, perform management tasks such as adding and deleting nodes, and monitor the status of the cluster with Ganglia or Nagios. HA-OSCAR, an extension of OSCAR, introduces redundancy at the head-node level by duplicating the primary head node and, based on predefined policies, carries out specific actions to guarantee the availability of this head node. However, OSCAR cannot monitor the state of services running concurrently on all compute nodes, such as lam and pbs_mom, and take predefined actions in case of failure. I propose the design, implementation, and integration of a universal framework that gathers and stores information about the health of the various clustering services.
* '''OSCAR V2M Extension, by Panyong Zhang'''. Among the 2007 SoC projects, Kulathep Charoenpornwattana implemented “Cluster Virtualization with Xen”[1], which supports Xen. Since Lguest[3] and KVM[4] are now integrated into the Linux kernel and VMware is widely used, it would be very beneficial to extend V2M[2] accordingly. The purpose of this project is to extend OSCAR V2M so that it can work with Lguest and KVM.
* '''XOSCAR - A New Graphical User Interface That Enables Virtualization and System Partitioning, by Robert Barbilon'''. OSCAR is a software management tool initially developed to support only standard Beowulf clusters. However, Beowulf clusters are no longer the typical configuration for computation; current systems are often disk-less or even composed of virtual machines. To address this issue, the OSCAR-V project was initiated; it aims to provide a single system management tool for all types of configurations (typically disk-full, disk-less, or composed of virtual machines). One strong limitation of the current OSCAR-V project is that it does not provide a Graphical User Interface (GUI), although an effort was started last year to lay the groundwork for such a GUI (using C++ and Qt4 for portability across major platforms). Therefore, this project aims to design a new GUI for OSCAR-V that supports both the deployment and the management of virtual machines and of disk-less and disk-full compute nodes. The design of the GUI will have to fit the underlying OSCAR-V tools used for system management. The GUI also needs to support remote management, allowing one to “drive” the OSCAR-V cluster from a laptop.
* '''Implementation of a Benchmarking Framework for OSCAR, by Tara McQueen'''. Performance evaluation is mandatory for high-performance computing (HPC). The diversity of parallel applications and of their target platforms has also increased considerably. This heterogeneity of applications and platforms creates many challenges for performance evaluation, to the point where almost all performance evaluation tasks are still carried out manually, from the configuration of the systems and the execution of the benchmarks to the generation of final performance reports. Furthermore, system management tools for HPC platforms typically lack benchmarking mechanisms for platform “validation”, i.e., a way to validate the software configuration and therefore the overall performance available to applications. While benchmarking tools address the challenge of evaluating the performance of a given configuration, system management and monitoring tools address the actual configuration of the benchmarks and monitor their execution. On the benchmarking side, only a few projects, such as Cbench, address those issues by automating the execution of benchmarks and the generation of final reports. However, only a few benchmarks are included, and those projects still lack extensive documentation. Meanwhile, the gap between system management/monitoring and performance evaluation is growing. In this work, we take a step forward in automating the run-time infrastructure needed for performance evaluation by extending Cbench and integrating it into OSCAR, a software suite for the management of HPC clusters.
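To make the idea behind the NFS mount point project more concrete, here is a minimal sketch of how NFS mounts on the head node could be discovered and turned into fstab entries for a node image. It is only an illustration and is not taken from OSCAR or from the project's code: the function names `parse_nfs_mounts` and `to_fstab_lines`, the sample host `fileserver`, and the choice to reuse the mount options unchanged are all assumptions. The input is assumed to follow the layout of `/proc/mounts` on Linux.

```python
# Hypothetical sketch, not OSCAR code: all names below are illustrative.

NFS_TYPES = {"nfs", "nfs4"}


def parse_nfs_mounts(mounts_text):
    """Return (source, mount_point, fs_type, options) for each NFS entry.

    `mounts_text` is assumed to use the /proc/mounts layout:
    source mount_point fs_type options dump pass
    """
    entries = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        source, mount_point, fs_type, options = fields[:4]
        if fs_type in NFS_TYPES:
            entries.append((source, mount_point, fs_type, options))
    return entries


def to_fstab_lines(entries):
    """Format NFS entries as fstab lines, with dump and pass set to 0."""
    return [
        f"{source} {mount_point} {fs_type} {options} 0 0"
        for source, mount_point, fs_type, options in entries
    ]


if __name__ == "__main__":
    # Sample data standing in for the head node's mount table.
    sample = (
        "/dev/sda1 / ext3 rw 0 0\n"
        "fileserver:/export/home /home nfs rw,hard,intr 0 0\n"
        "fileserver:/export/apps /opt/apps nfs4 rw 0 0\n"
    )
    for line in to_fstab_lines(parse_nfs_mounts(sample)):
        print(line)
```

On a real head node the text would come from reading `/proc/mounts`, and the resulting lines would be appended to the fstab inside the node image; how the image is located and updated is left out here because it depends on OSCAR internals not described in this bulletin.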

All of these projects were quite successful and should be integrated into OSCAR soon.