Expand your knowledge of hardware, software and supercomputing

Repairing a corrupted SGE database

If your filesystem ever fills up or the system crashes at the wrong time, your SGE database may become corrupted. The following steps can usually repair the database so that SGE runs properly again:

    cd $SGE_ROOT/default/spool
    cp -a spooldb spooldb.bak
    cd spooldb
    db_verify sge
    db_recover
    db_dump -f sge.out sge
    mv sge sge.old
    db_load -f […]

Taking Compute Nodes Down for Maintenance

When taking a compute node down for any reason, it’s good practice to first remove that node from any job queues it belongs to. Otherwise, a node that comes up temporarily may start new jobs, only to be shut down again, killing those users’ jobs. Here’s how to safely pull a node out of service […]

Creating Groups of Nodes in TORQUE

Despite being a simple first in/first out (FIFO) scheduler, pbs_sched can use node properties to emulate host groups. This can be useful if you have different types of nodes that provide different types of resources. The nodes available in TORQUE are controlled by the file /var/spool/torque/server_priv/nodes. The most basic configuration simply lists the nodes and […]
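As a rough illustration of the idea, the sketch below writes a sample nodes file to a temporary location and lists the hosts that carry a given property. The host names (node01 through node03), the property labels (bigmem, gpu), the np counts, and the helper nodes_with_property are all hypothetical and chosen only for this example; a real nodes file lives at the path named above and would list your own hosts.

```shell
#!/bin/sh
# Hypothetical sample: host names, np counts and property labels are
# illustrative assumptions, not taken from a real cluster.
nodes_file="$(mktemp)"
cat > "$nodes_file" <<'EOF'
node01 np=8 bigmem
node02 np=8 bigmem
node03 np=16 gpu
EOF

# Print every host whose entry carries the given property.
# Fields after the host name are either np=N or free-form property labels.
nodes_with_property() {
    awk -v prop="$1" '{ for (i = 2; i <= NF; i++) if ($i == prop) print $1 }' "$2"
}

nodes_with_property bigmem "$nodes_file"
```

With properties like these in place, a job could ask for nodes from one group with a request such as qsub -l nodes=2:bigmem, assuming the bigmem label exists in your own nodes file.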

