Management tools

Managing a cluster can seem like a daunting task, but it can be quite easy with Advanced Clustering's management software packages, which are free with any Apex HPC cluster purchase, and some of the hardware devices described below.

IPMI

IPMI (Intelligent Platform Management Interface) is an open-standard management system designed for remote monitoring and control of servers. IPMI is available as an option on most of Advanced Clustering's Pinnacle Servers.

IPMI works by embedding a small service processor, the Baseboard Management Controller (BMC), in the system. The BMC is powered on and operational as long as the system is plugged into main electrical power. It operates even when the system is turned off, when the operating system has crashed, and during most hardware failures.

The BMC can be controlled either in-band, via the operating system running on the server, or out-of-band, via a TCP/IP network connection. In a cluster environment the out-of-band management functionality is especially helpful, because it allows your system administrator to control all nodes in the cluster from a central point. The administrator can check fan, temperature, and power supply voltage sensor data, power a system on or off, or even connect to the console of the machine.
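As a rough illustration of out-of-band control from a central point, the sketch below wraps the open-source ipmitool command-line client from Python. It is a minimal example under stated assumptions, not part of Advanced Clustering's software: the helper names, BMC hostnames, and user name are hypothetical, and it assumes ipmitool is installed on the management host and that the BMC password is supplied in the IPMI_PASSWORD environment variable.

```python
import subprocess


def ipmitool_command(host, user, *args):
    """Build an ipmitool argument list for out-of-band (LAN) access to one BMC."""
    # -I lanplus selects the IPMI v2.0 LAN interface.
    # -E makes ipmitool read the password from the IPMI_PASSWORD environment
    # variable, so it does not appear on the command line.
    return ["ipmitool", "-I", "lanplus", "-H", host, "-U", user, "-E", *args]


def run_on_nodes(bmc_hosts, user, *args):
    """Send the same IPMI request to every BMC and collect the output per host."""
    results = {}
    for host in bmc_hosts:
        proc = subprocess.run(
            ipmitool_command(host, user, *args),
            capture_output=True,
            text=True,
            check=False,  # keep going if one node is unreachable
        )
        # Keep stdout on success, otherwise the error text from ipmitool.
        results[host] = (proc.stdout if proc.returncode == 0 else proc.stderr).strip()
    return results
```

With hypothetical BMC hostnames such as node01-bmc and node02-bmc, a call like run_on_nodes(["node01-bmc", "node02-bmc"], "admin", "chassis", "power", "status") would report the power state of each node. Replacing the trailing arguments with "sdr", "type", "Fan" or "sdr", "type", "Temperature" would read sensor data instead.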

Serial Console

Serial consoles are another type of hardware device that can be used for out-of-band management. With these devices you can connect to the console of each node from anywhere you allow access.

This kind of access can be a real time-saver, whether your cluster is located in the data center down the hall or across the world. Most of Advanced Clustering's compute nodes allow for serial BIOS redirection, so you can monitor or change any board-level setting without being in front of the machine. When a serial console is purchased as part of your cluster, Advanced Clustering will set up and enable serial redirection of the BIOS, boot loader, and operating system, giving you complete remote control over your entire system.

KVM Switch

A KVM (Keyboard, Video, Mouse) switch is a hardware device that allows a user to control multiple systems from a single keyboard, video monitor and mouse.

Each system is connected to the KVM device via a dedicated cable. You switch control between systems by pressing buttons on the KVM device or by using hotkeys on the keyboard (often combinations involving CTRL or SCROLL LOCK). KVM switches are available for as few as 2 computers and as many as hundreds, making them suitable for almost any size cluster.

While KVMs are useful management devices, they do have some limitations. Without additional KVM-over-IP capabilities, access is limited to within a few feet of the cluster, and they typically allow only one console to be accessed at a time. Some enterprise models do offer multiple-console access, but they can be quite expensive.

Network controlled PDU

To give an administrator complete management of their cluster, we recommend a remote power control device. These stand-alone devices combine a power controller and a power distribution unit. Through extensive testing, we've found that APC's line of Masterswitch devices provides the best feature set, and they are available in multiple configurations to meet the needs of any data center.

Since most clusters larger than a few nodes require more than one power control device, Advanced Clustering's Beo Utils package is included with all cluster purchases to make using these devices easier. Instead of having to remember which outlet on which device a particular node is plugged into, you can use a simple command-line tool to turn on, turn off, or hardware reboot a node just by knowing its hostname.
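The idea of addressing outlets by hostname can be sketched in a few lines of Python. This is only an illustration of the lookup, not the Beo Utils implementation or its command syntax: the outlet map, the PDU names, and the plan_power_action helper are all hypothetical, and the step that actually talks to the power device is left out.

```python
# Hypothetical outlet map: hostname -> (PDU name, outlet number).
OUTLET_MAP = {
    "node01": ("pdu-a", 1),
    "node02": ("pdu-a", 2),
    "node03": ("pdu-b", 1),
}

VALID_ACTIONS = {"on", "off", "reboot"}


def plan_power_action(hostname, action, outlet_map=OUTLET_MAP):
    """Translate (hostname, action) into the PDU and outlet that must be switched."""
    if action not in VALID_ACTIONS:
        raise ValueError(f"unknown action: {action!r}")
    try:
        pdu, outlet = outlet_map[hostname]
    except KeyError:
        raise KeyError(f"no outlet recorded for host {hostname!r}") from None
    # A real tool would now send this request to the PDU over the network.
    return {"pdu": pdu, "outlet": outlet, "action": action}
```

Here plan_power_action("node03", "reboot") resolves to outlet 1 on pdu-b, so the administrator only needs to know the hostname; the mapping is recorded once when the cluster is cabled.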

 


 

All of these management tools are options in our Apex HPC cluster systems.  Find out more information or request a quote for your next cluster.

Contact Info
Toll-free: 866-802-8222 International: 913-643-0300 Email: info, sales, support