CLEO: USER & ADMIN-FRIENDLY CLUSTER BATCH SYSTEM

One of the most important components of a modern cluster is its batch system. Developed at RCC MSU, the Cleo system is intended for task control on clusters with different configurations and different requirements on the use of computational resources. All major parallel programming environments are supported, such as MPICH, MVAPICH, Intel MPI and others.

The system is portable across most UNIX platforms, so it can be used on almost any UNIX system. It can work with several clusters simultaneously, as well as with individual partitions of a cluster. Flexible tuning of system parameters, configurable resource usage policies and an independent scheduler make Cleo convenient for administering and managing clusters in different modes. Automatic and manual blocking of nodes and tasks simplifies cluster management, since there is no need to stop all users’ activities. The priority system and load prediction help to manage cluster usage effectively with minimal effort.

Cleo is easily extensible. The module interface is documented and comes with examples, which helps to quickly create new schedulers or extend the system with new functionality. All Cleo modules run in a protected environment, which increases the security of the entire system.

The system state is available in XML format and can be used by any external application. Statistics-gathering tools provide both complex and detailed reports on users’ activities on a cluster.
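As a rough illustration of how an external application might consume an XML state export, here is a minimal Python sketch. The element and attribute names (`state`, `queue`, `task`, `id`, `user`, `cpus`, `status`) are invented for this example and are not taken from Cleo's actual format, which should be checked against the Cleo documentation. The helper `busy_cpus_by_user` is likewise hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical state document. The schema below is an assumption made
# for illustration only; it is NOT Cleo's real XML layout.
SAMPLE_STATE = """\
<state>
  <queue name="main">
    <task id="101" user="alice" cpus="64" status="running"/>
    <task id="102" user="bob" cpus="128" status="queued"/>
    <task id="103" user="alice" cpus="32" status="running"/>
  </queue>
</state>
"""


def busy_cpus_by_user(xml_text):
    """Sum the CPUs of running tasks per user in the assumed schema."""
    root = ET.fromstring(xml_text)
    usage = {}
    # Walk every <task> element regardless of which queue contains it.
    for task in root.iter("task"):
        if task.get("status") == "running":
            user = task.get("user")
            usage[user] = usage.get(user, 0) + int(task.get("cpus"))
    return usage


if __name__ == "__main__":
    print(busy_cpus_by_user(SAMPLE_STATE))
```

A monitoring page or accounting script could apply the same pattern to the real state document once the element names are adjusted to the actual format.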

Cleo is used on the SKIF MSU “CHEBYSHEV” (5000 cores) and “GraphIT!” (192 CPU cores + 48 GPUs) supercomputers of Moscow State University, as well as in several other organizations.

Cleo website: http://sf.net/projects/cleo-bs.html



"Chebyshev" load visualization system

User login