SYSFS FILES

  For each InfiniBand device, the InfiniBand drivers create the
  following files under /sys/class/infiniband/<device name>:

    node_type      - Node type (CA, switch or router)
    node_guid      - Node GUID
    sys_image_guid - System image GUID

  In addition, there is a "ports" subdirectory, with one subdirectory
  for each port.  For example, if mthca0 is a 2-port HCA, there will
  be two directories:

    /sys/class/infiniband/mthca0/ports/1
    /sys/class/infiniband/mthca0/ports/2

  (A switch will only have a single "0" subdirectory for switch port
  0; no subdirectory is created for normal switch ports)

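  The directory layout above can be walked with a short shell function.
  This is a minimal sketch, not part of the drivers: the function name
  list_ib_ports and its optional root argument are assumptions made for
  illustration (the argument lets the function run against a test tree).

```shell
#!/bin/sh
# List every "<device> <port>" pair found under an InfiniBand sysfs root.
# The root defaults to /sys/class/infiniband; accepting another directory
# is a convenience of this sketch, not something the kernel provides.
list_ib_ports() {
    root=${1:-/sys/class/infiniband}
    for port_dir in "$root"/*/ports/*; do
        # Skip the unexpanded glob when no device or port exists.
        [ -d "$port_dir" ] || continue
        dev_dir=${port_dir%/ports/*}
        printf '%s %s\n' "${dev_dir##*/}" "${port_dir##*/}"
    done
}
```

  Calling list_ib_ports with no argument on a host with the 2-port
  mthca0 HCA from the example would print one line per port directory.
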
  In each port subdirectory, the following files are created:

    cap_mask       - Port capability mask
    lid            - Port LID
    lid_mask_count - Port LID mask count
    rate           - Port data rate (active width * active speed)
    sm_lid         - Subnet manager LID for port's subnet
    sm_sl          - Subnet manager SL for port's subnet
    state          - Port state (DOWN, INIT, ARMED, ACTIVE or ACTIVE_DEFER)
    phys_state     - Port physical state (Sleep, Polling, LinkUp, etc.)

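  As an illustration, the per-port files listed above can be read in one
  pass.  The sketch below assumes only the file names given in this
  section; the function name show_port_info is hypothetical, and the
  exact text each file contains is whatever the driver reports.

```shell
#!/bin/sh
# Print "name: value" for each per-port attribute file.
# $1 is a port directory such as /sys/class/infiniband/<device name>/ports/1.
show_port_info() {
    port_dir=$1
    for name in cap_mask lid lid_mask_count rate sm_lid sm_sl state phys_state; do
        file=$port_dir/$name
        # A file may be missing or unreadable; report that instead of failing.
        if [ -r "$file" ]; then
            printf '%s: %s\n' "$name" "$(cat "$file")"
        else
            printf '%s: (unavailable)\n' "$name"
        fi
    done
}
```
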
  There is also a "counters" subdirectory, with the files:

    VL15_dropped
    excessive_buffer_overrun_errors
    link_downed
    link_error_recovery
    local_link_integrity_errors
    port_rcv_constraint_errors
    port_rcv_data
    port_rcv_errors
    port_rcv_packets
    port_rcv_remote_physical_errors
    port_rcv_switch_relay_errors
    port_xmit_constraint_errors
    port_xmit_data
    port_xmit_discards
    port_xmit_packets
    symbol_error

  Each of these files contains the corresponding value from the port's
  Performance Management PortCounters attribute, as described in
  section 16.1.3.5 of the InfiniBand Architecture Specification.

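  One way to use these files is to check a port for non-zero error
  counters.  The sketch below makes an assumption that is not stated
  in this document: it treats the four data and packet counters as
  plain traffic counters and everything else as an error or event
  counter.  The function name report_nonzero_errors is hypothetical.

```shell
#!/bin/sh
# Print "<counter> <value>" for every non-zero counter in a port's
# "counters" directory, skipping the traffic volume counters.
# $1 is e.g. /sys/class/infiniband/<device name>/ports/1/counters.
report_nonzero_errors() {
    counters_dir=$1
    for file in "$counters_dir"/*; do
        [ -f "$file" ] && [ -r "$file" ] || continue
        name=${file##*/}
        case $name in
            # Traffic volume counters; treated as non-errors in this sketch.
            port_rcv_data|port_xmit_data|port_rcv_packets|port_xmit_packets)
                continue ;;
        esac
        value=$(cat "$file")
        if [ "$value" != "0" ]; then
            printf '%s %s\n' "$name" "$value"
        fi
    done
}
```
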
  The "pkeys" and "gids" subdirectories contain one file for each
  entry in the port's P_Key or GID table respectively.  For example,
  ports/1/pkeys/10 contains the value at index 10 in port 1's P_Key
  table.

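  Because the file names in these directories are table indices, a
  plain glob lists them in lexical order (0, 1, 10, 2, ...).  The
  following sketch prints a table in index order instead; the function
  name list_table is hypothetical, and it works for either "pkeys" or
  "gids" since it only relies on the one-file-per-index layout.

```shell
#!/bin/sh
# Print "<index> <value>" for each entry of a P_Key or GID table.
# $1 is e.g. /sys/class/infiniband/<device name>/ports/1/pkeys.
list_table() {
    table_dir=$1
    # File names are numeric indices; sort them numerically so 10 follows 9.
    for index in $(ls "$table_dir" 2>/dev/null | sort -n); do
        printf '%s %s\n' "$index" "$(cat "$table_dir/$index")"
    done
}
```
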
  There is an optional "hw_counters" subdirectory that may be under either
  the parent device or the port subdirectories or both.  If present,
  it contains a list of counters provided by the hardware.  They may match
  some of the counters in the counters directory, but they often include
  many other counters.  In addition to the various counters, there will
  be a file named "lifespan" that configures how frequently the core
  should update the counters when they are being accessed (counters are
  not updated if they are not being accessed).  The lifespan is in
  milliseconds and defaults to 10 unless set to something else by the
  driver.  Users may echo a value between 0 and 10000 to the lifespan
  file to set the length of time between updates in milliseconds.

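  A small wrapper can check the documented range before writing to the
  lifespan file.  This is a sketch only: the function name set_lifespan
  is hypothetical, and the path to the file is passed in by the caller
  because hw_counters may sit under the device, the port, or both.

```shell
#!/bin/sh
# Write a lifespan (in milliseconds) after checking it is an integer
# in the documented range 0..10000.
# $1 is the path to a "lifespan" file, $2 is the new value.
set_lifespan() {
    lifespan_file=$1
    ms=$2
    case $ms in
        ''|*[!0-9]*)
            echo "lifespan must be a non-negative integer" >&2
            return 1 ;;
    esac
    if [ "$ms" -gt 10000 ]; then
        echo "lifespan must be between 0 and 10000 ms" >&2
        return 1
    fi
    printf '%s\n' "$ms" > "$lifespan_file"
}
```

  Writing to the real file normally requires sufficient privileges,
  and the kernel performs its own validation regardless of this check.
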
MTHCA

  The Mellanox HCA driver also creates the files:

    hw_rev   - Hardware revision number
    fw_ver   - Firmware version
    hca_type - HCA type: "MT23108", "MT25208 (MT23108 compat mode)",
               or "MT25208"

HFI1

  The hfi1 driver also creates these additional files:

   hw_rev - hardware revision
   board_id - manufacturing board id
   tempsense - thermal sense information
   serial - board serial number
   nfreectxts - number of free user contexts
   nctxts - number of allowed contexts (PSM2)
   chip_reset - diagnostic (root only)
   boardversion - board version

   sdma<N>/ - one directory per sdma engine (0 - 15)
	sdma<N>/cpu_list - read-write, list of cpus for user-process to sdma
			   engine assignment.
	sdma<N>/vl - read-only, vl the sdma engine maps to.

	This interface gives the user control over the affinity settings
	for the hfi1 device.
	As an example, to set the irq affinity of an sdma engine and the
	thread affinity of the user processes so that they use an sdma
	engine that is "near" in terms of NUMA configuration or physical
	cpu location, the user will do:

	echo "3" > /proc/irq/<N>/smp_affinity_list
	echo "4-7" > /sys/devices/.../sdma3/cpu_list
	cat /sys/devices/.../sdma3/vl
	0
	echo "8" > /proc/irq/<M>/smp_affinity_list
	echo "9-12" > /sys/devices/.../sdma4/cpu_list
	cat /sys/devices/.../sdma4/vl
	1

	This makes sure that when a process runs on cpus 4, 5, 6, or 7
	and uses vl=0, sdma engine 3 is selected by the driver and the
	interrupt of sdma engine 3 is steered to cpu 3.
	Similarly, when a process runs on cpus 9, 10, 11, or 12 and sets
	vl=1, sdma engine 4 is selected and the irq of sdma engine 4 is
	steered to cpu 8.
	This assumes that in the above N is the irq number of "sdma3",
	and M is the irq number of "sdma4" in the /proc/interrupts file.

   ports/1/
          CCMgtA/
               cc_settings_bin - CCA tables used by PSM2
               cc_table_bin
               cc_prescan - enable prescanning for faster BECN response
          sc2vl/ - 32 files (0 - 31) used to translate sc->vl
          sl2sc/ - 32 files (0 - 31) used to translate sl->sc
          vl2mtu/ - 16 files (0 - 15) used to determine MTU for vl