
InfiniBand Port States

The status of your InfiniBand Host Channel Adapter (HCA) can be checked with the ‘ibstat’ command.

# ibstat
CA 'mlx4_0'
    CA type: MT4099
    Number of ports: 1
    Firmware version: 2.10.0
    Hardware version: 0
    Node GUID: 0x0002c9030031fdc0
    System image GUID: 0x0002c9030031fdc3
    Port 1:
        State: Active
        Physical state: LinkUp
        Rate: 40
        Base lid: 1
        LMC: 0
        SM lid: 1
        Capability mask: 0x0251486a
        Port GUID: 0x0002c9030031fdc1
        Link layer: InfiniBand

For proper operation, you are looking for ‘State: Active’ and ‘Physical state: LinkUp’.
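The check described above can be scripted. The following is a minimal sketch in Python, assuming the ‘ibstat’ output has the same layout as the sample shown earlier. The function names parse_ibstat and port_is_healthy are made up for this illustration and are not part of any InfiniBand tool.

```python
import re

# Sample text copied from the ibstat output shown in this article.
SAMPLE = """\
CA 'mlx4_0'
    CA type: MT4099
    Number of ports: 1
    Firmware version: 2.10.0
    Hardware version: 0
    Node GUID: 0x0002c9030031fdc0
    System image GUID: 0x0002c9030031fdc3
    Port 1:
        State: Active
        Physical state: LinkUp
        Rate: 40
        Base lid: 1
        LMC: 0
        SM lid: 1
        Capability mask: 0x0251486a
        Port GUID: 0x0002c9030031fdc1
        Link layer: InfiniBand
"""


def parse_ibstat(text):
    """Return {ca_name: {port_number: {field: value}}} (hypothetical helper)."""
    cas = {}
    ca = None
    port = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        # A line such as "CA 'mlx4_0'" starts a new adapter.
        m = re.match(r"CA '(.+)'$", line)
        if m:
            ca = cas.setdefault(m.group(1), {})
            port = None
            continue
        # A line such as "Port 1:" starts a new port on the current adapter.
        m = re.match(r"Port (\d+):$", line)
        if m and ca is not None:
            port = ca.setdefault(int(m.group(1)), {})
            continue
        # Any other "key: value" line after a port header belongs to that port.
        if port is not None and ":" in line:
            key, _, value = line.partition(":")
            port[key.strip()] = value.strip()
    return cas


def port_is_healthy(fields):
    """True when the port shows State: Active and Physical state: LinkUp."""
    return (fields.get("State") == "Active"
            and fields.get("Physical state") == "LinkUp")


ports = parse_ibstat(SAMPLE)
print(port_is_healthy(ports["mlx4_0"][1]))  # True
```

On a live node, the text could instead come from subprocess.run(["ibstat"], capture_output=True, text=True).stdout, provided ‘ibstat’ is installed and the script has permission to run it.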

Physical State

The physical state field indicates the state of the physical link over the cable. This is very similar to the link state on Ethernet. The values you’ll most commonly see in this field are as follows:

Polling

There is no connection from this card to another card or switch. Check that the cable is installed and that the device on the other end of the cable is powered on and working properly.

LinkUp

There is a physical link between this node and the device at the other end of the cable. This doesn’t mean the port is configured and ready to send data, just that the physical connection is up.

State

The state shows whether the HCA port is up and whether it has been discovered by the subnet manager.

Down

There is no physical connection between the HCA in this node and the device at the other end of the cable. This is almost always seen when ‘Physical state’ shows the value ‘Polling’.

Initializing

A physical connection has been made between the HCA in this node and the device at the other end of the cable, but the port hasn’t been discovered by the subnet manager. Make sure a subnet manager is running on the fabric: either on a managed switch or, more likely, as the ‘opensm’ process on a node in your cluster.

Active

The physical connection is up and working, and the port has been discovered by the subnet manager.  The port is in a normal operational state.
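The state descriptions above amount to a small decision table. Here is a rough sketch of it in Python. The diagnose function and its message strings are invented for this example, and it only covers the values discussed in this article, so any other combination is reported as unrecognized.

```python
def diagnose(state, physical_state):
    """Map the ibstat 'State' and 'Physical state' values to a next step.

    Hypothetical helper covering only the values described in this article.
    """
    if physical_state == "Polling" or state == "Down":
        # No link: the problem is the cable or the device at the far end.
        return "no link: check the cable and the device at the other end"
    if physical_state == "LinkUp" and state == "Initializing":
        # Link is up, but no subnet manager has discovered the port yet.
        return "no subnet manager: check opensm or the managed switch"
    if physical_state == "LinkUp" and state == "Active":
        return "ok"
    return "unrecognized combination: inspect manually"


print(diagnose("Initializing", "LinkUp"))
```

Checking the physical state first mirrors the order you would troubleshoot in: there is no point looking for a subnet manager until the cable shows a link.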

Rate

The rate is the speed at which the port is operating, in Gb/s (the 40 in the example above is a 4x QDR link). The link runs at the highest speed supported by both the node’s HCA and the device at the other end of the cable, so the rate will match the slower of the two. For example, if you have a QDR card and a DDR switch, the rate will be DDR (20) and not QDR (40).
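The rule above can be written as a quick sanity check. This sketch assumes 4x links, for which the signalling rates are 10, 20 and 40 Gb/s for SDR, DDR and QDR. The table and function names are illustrative only and would need extending for other link widths or newer speeds.

```python
# Signalling rate in Gb/s for a 4x link at each speed (assumption: 4x links only).
RATE_4X_GBPS = {"SDR": 10, "DDR": 20, "QDR": 40}


def expected_rate(hca_speed, peer_speed):
    """Rate the link should run at: the slower of the two ends."""
    return min(RATE_4X_GBPS[hca_speed], RATE_4X_GBPS[peer_speed])


def rate_is_degraded(reported_rate, hca_speed, peer_speed):
    """True when the reported 'Rate' is below what both ends support."""
    return reported_rate < expected_rate(hca_speed, peer_speed)


# A QDR card plugged into a DDR switch should report 20, not 40.
print(expected_rate("QDR", "DDR"))         # 20
print(rate_is_degraded(20, "QDR", "DDR"))  # False
print(rate_is_degraded(10, "QDR", "QDR"))  # True
```

A port that reports a lower rate than both ends support is worth a closer look, for example at the cable.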
