RHEL-RT Infiniband

Infiniband Notes

This page contains some email excerpts regarding infiniband support in RHEL.

  • Mellanox MT25204 hardware issue
    • Hardware testing for the Mellanox MT25204 has revealed that an internal error occurs under certain high-load conditions. When the ib_mthca driver reports a catastrophic error on this hardware, it is usually related to an insufficient completion queue depth relative to the number of outstanding work requests generated by the user application. Although the driver will reset the hardware and recover from such an event, all existing connections are lost at the time of the error. This generally results in a segmentation fault in the user application. Further, if opensm is running at the time the error occurs, it will have to be manually restarted in order to resume proper operation. This same issue exists in RHEL5.1 (bugzilla 251934). (The OFED 1.2 utilities are directly inherited from the RHEL5.1 baseline.)
  • question How does the infiniband support included in MRG Realtime differ from what is in standard RHEL5?
    • answer The OFED user space utilities are identical. The kernel portion of infiniband is basically the same; the only difference is that it has been ported to the different MRG Realtime kernel. In other words, we haven't done any realtime specific infiniband work.
  • question As far as I am aware, with OFED 1.1 we have Socket Direct Protocol (SDP) support along with RDMA and IPoIB.
    • answer Correct.
  • question With OFED 1.2 in RHEL5.1 (and RHEL 4.6?) we are supporting the multipath features using dm-mpio rather than OFED 1.2 itself. (too invasive, kABI changes)
    • answer I'm not aware of any attempts to support multipath via dm-mpio. As I understand it, dm-mpio is just a block device multipath. Commonly, when people refer to multipath in the IB space, they are really talking about IB physical links that are transparently either bonded or set up for failover, which implies that the *network* stack is multipath, not the block devices hooked on top of the network stack. For example, you wouldn't be able to use dm-mpio to multipath an iSCSI device over two gigabit ethernet links if the ethernet links are already bonded together into bond0. Instead, you would just reach the block device over bond0 as a single path, and the bond device does the multipathing for you. The same applies here. The people I've talked to all want multipath TCP/IP and RDMA sockets, not block devices, so the multipathing to satisfy their requirements *must* be at a lower level than dm-mpio. I'm hoping, for either 5.2/4.7 or 5.3/4.8, to put in the patches that allow the IPoIB interfaces to be used with our existing bonding driver so that you can have bonded ib? interfaces. That might not be possible for kABI reasons, but since it isn't done yet, I can't say for sure.
  • question SDP needs the LD_PRELOAD shim in order to work. Is that affected by -RT?
    • answer No.
  • question What are the limitations of support for SDP, IP over Infiniband (IPoIB) and SCSI RDMA?
    • answer These three items are independent of the hardware. They are upper layer protocols and work over any supported hardware.
  • question Specifically what are the HW choices around HW supported on HP c-class blades?
    • answer Right now we support Mellanox and QLogic Infinipath hardware. There is additional hardware support present, but it's in tech preview state and is mostly centered around 10GigE/iWARP hardware.
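
The Mellanox MT25204 item above ties the catastrophic error to a completion queue that is too shallow for the number of outstanding work requests. As a rough illustration only, the sketch below shows one conservative way an application could check its completion queue depth before posting work. It assumes that every work request is signaled and that all listed queue pairs share one completion queue. The type `QueuePairCaps` and both functions are hypothetical names invented for this example; the field names simply mirror `max_send_wr` and `max_recv_wr` from the libibverbs `ibv_qp_cap` structure. This is not the fix tracked in bugzilla 251934, just a sizing rule of thumb.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class QueuePairCaps:
    """Work-request limits of one queue pair (hypothetical helper type)."""
    max_send_wr: int  # outstanding send work requests the QP may hold
    max_recv_wr: int  # outstanding receive work requests the QP may hold


def required_cq_depth(qps: Iterable[QueuePairCaps]) -> int:
    """Worst-case number of completions that can be pending at once.

    Assumption: every work request is signaled and all listed queue
    pairs report their completions to the same completion queue.
    """
    return sum(qp.max_send_wr + qp.max_recv_wr for qp in qps)


def cq_is_deep_enough(cq_depth: int, qps: Iterable[QueuePairCaps]) -> bool:
    """True if a CQ with cq_depth entries cannot overflow under the
    assumptions stated in required_cq_depth()."""
    return cq_depth >= required_cq_depth(qps)


if __name__ == "__main__":
    # Four queue pairs, each allowing 128 sends and 128 receives in flight.
    qps = [QueuePairCaps(max_send_wr=128, max_recv_wr=128) for _ in range(4)]
    print(required_cq_depth(qps))        # 4 * (128 + 128) = 1024
    print(cq_is_deep_enough(256, qps))   # False
    print(cq_is_deep_enough(1024, qps))  # True
```

With libibverbs, the resulting depth would be the kind of value passed as the `cqe` argument of `ibv_create_cq`. An application that does not signal every send, or that drains completions aggressively, may get away with less, so treat the result as an upper bound rather than a requirement.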
