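Current situation
Connecting a Mellanox CX4 to the DX010 with a 100 Gbps cable does not bring the link up at 100 Gbps.
However, the same CX4 does link up with a different switch, an HPE FlexFabric (40 Gbps), although only at 40 Gbps.
Looking at the mlxlink output, the Troubleshooting Info section reports an error in the PHY FW group, and changing the speed or the FEC setting did not bring the link up.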
# mlxlink -d /dev/mst/mt4115_pciconf0.1
Operational Info
----------------
State : Polling
Physical state : ETH_AN_FSM_ABILITY_DETECT
Speed : N/A
Width : N/A
FEC : N/A
Loopback Mode : No Loopback
Auto Negotiation : FORCE - 100G
Supported Info
--------------
Enabled Link Speed : 0x00f00000 (100G)
Supported Cable Speed : 0x48101065 (100G,50G,40G,25G,20G,10G,1G)
Troubleshooting Info
--------------------
Status Opcode : 36
Group Opcode : PHY FW
Recommendation : Other issues
Tool Information
----------------
Firmware Version : 12.28.2006
MFT Version : mft 4.30.1-113
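While trying different speed and FEC settings, it helps to reduce each mlxlink run to the few fields that matter. The following is a minimal sketch, assuming the output keeps the "Key : Value" layout shown above. The names parse_mlxlink and is_link_up are hypothetical helpers written for this note and are not part of MFT.

```python
# Hypothetical helpers: reduce mlxlink text output to a dict of fields.
# Assumes every field is printed as "Key : Value" on its own line.

SAMPLE = """\
Operational Info
----------------
State : Polling
Physical state : ETH_AN_FSM_ABILITY_DETECT
Speed : N/A
FEC : N/A
Auto Negotiation : FORCE - 100G
Troubleshooting Info
--------------------
Status Opcode : 36
Group Opcode : PHY FW
Recommendation : Other issues
"""


def parse_mlxlink(text):
    """Collect 'Key : Value' lines into a dict; headings and rules are skipped."""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only, so values such as
        # "Standard RS-FEC - RS(528,514)" are kept whole.
        key, sep, value = line.partition(":")
        if sep and key.strip():
            fields[key.strip()] = value.strip()
    return fields


def is_link_up(fields):
    """Treat the link as up only when both state fields say so."""
    return (fields.get("State") == "Active"
            and fields.get("Physical state") == "LinkUp")


if __name__ == "__main__":
    info = parse_mlxlink(SAMPLE)
    print("link up:", is_link_up(info))
    for key in ("State", "Physical state", "FEC", "Status Opcode", "Group Opcode"):
        print(f"{key}: {info.get(key, 'missing')}")
```

Saving the output of each attempt and passing it through the same function makes it easy to see whether a setting change moved the port out of the Polling state.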
Solution
I found issue reports saying that this can be resolved by switching the FEC on the DX010 to RS mode:
https://github.com/sonic-net/sonic-buildimage/issues/6476
https://github.com/sonic-net/SONiC/issues/384
sudo config interface fec Ethernet96 rs
$ show interfaces status
  Interface            Lanes    Speed    MTU    FEC    Alias    Vlan    Oper    Admin             Type    Asym PFC
-----------  ---------------  -------  -----  -----  -------  ------  ------  -------  ---------------  ----------
  Ethernet0      65,66,67,68      40G   9100    N/A     Eth1   trunk      up       up  QSFP28 or later         N/A
  Ethernet4      69,70,71,72     100G   9100    N/A     Eth2   trunk    down       up              N/A         N/A
  Ethernet8      73,74,75,76     100G   9100    N/A     Eth3   trunk    down       up              N/A         N/A
 Ethernet12      77,78,79,80     100G   9100    N/A     Eth4   trunk    down       up              N/A         N/A
 Ethernet16      33,34,35,36     100G   9100    N/A     Eth5   trunk    down       up              N/A         N/A
 Ethernet20      37,38,39,40     100G   9100    N/A     Eth6   trunk    down       up              N/A         N/A
 Ethernet24      41,42,43,44     100G   9100    N/A     Eth7   trunk    down       up              N/A         N/A
 Ethernet28      45,46,47,48     100G   9100    N/A     Eth8   trunk    down       up              N/A         N/A
 Ethernet32      49,50,51,52     100G   9100    N/A     Eth9   trunk    down       up              N/A         N/A
 Ethernet36      53,54,55,56     100G   9100    N/A    Eth10   trunk    down       up              N/A         N/A
 Ethernet40      57,58,59,60     100G   9100    N/A    Eth11   trunk    down       up              N/A         N/A
 Ethernet44      61,62,63,64     100G   9100    N/A    Eth12   trunk    down       up              N/A         N/A
 Ethernet48      81,82,83,84     100G   9100    N/A    Eth13   trunk    down       up              N/A         N/A
 Ethernet52      85,86,87,88     100G   9100    N/A    Eth14   trunk    down       up              N/A         N/A
 Ethernet56      89,90,91,92     100G   9100    N/A    Eth15   trunk    down       up              N/A         N/A
 Ethernet60      93,94,95,96     100G   9100    N/A    Eth16   trunk    down       up              N/A         N/A
 Ethernet64     97,98,99,100     100G   9100    N/A    Eth17   trunk    down       up  QSFP28 or later         N/A
 Ethernet68  101,102,103,104     100G   9100    N/A    Eth18   trunk    down       up              N/A         N/A
 Ethernet72  105,106,107,108     100G   9100    N/A    Eth19   trunk    down       up              N/A         N/A
 Ethernet76  109,110,111,112     100G   9100    N/A    Eth20   trunk    down       up              N/A         N/A
 Ethernet80          1,2,3,4     100G   9100    N/A    Eth21   trunk    down       up              N/A         N/A
 Ethernet84          5,6,7,8     100G   9100    N/A    Eth22   trunk    down       up              N/A         N/A
 Ethernet88       9,10,11,12     100G   9100    N/A    Eth23   trunk    down       up              N/A         N/A
 Ethernet92      13,14,15,16     100G   9100    N/A    Eth24   trunk    down       up              N/A         N/A
 Ethernet96      17,18,19,20     100G   9100     rs    Eth25   trunk      up       up              N/A         N/A
Ethernet100      21,22,23,24     100G   9100    N/A    Eth26   trunk    down       up              N/A         N/A
Ethernet104      25,26,27,28     100G   9100    N/A    Eth27   trunk    down       up              N/A         N/A
Ethernet108      29,30,31,32     100G   9100    N/A    Eth28   trunk    down       up              N/A         N/A
Ethernet112  113,114,115,116     100G   9100    N/A    Eth29   trunk      up       up              N/A         N/A
Ethernet116  117,118,119,120     100G   9100    N/A    Eth30   trunk    down       up              N/A         N/A
Ethernet120  121,122,123,124     100G   9100    N/A    Eth31   trunk    down       up              N/A         N/A
Ethernet124  125,126,127,128     100G   9100    N/A    Eth32   trunk    down       up              N/A         N/A
sudo config save
With the FEC change applied and saved, the problem was solved.
mlxlink now shows the link up as well, which was a thrill to see.
# mlxlink -d /dev/mst/mt4115_pciconf0.1
Operational Info
----------------
State : Active
Physical state : LinkUp
Speed : 100GbE
Width : 4x
FEC : Standard RS-FEC - RS(528,514)
Loopback Mode : No Loopback
Auto Negotiation : ON
Supported Info
--------------
Enabled Link Speed : 0xf8f1f1d3 (100G,56G,50G,40G,25G,10G,1G)
Supported Cable Speed : 0x48101065 (100G,50G,40G,25G,20G,10G,1G)
Troubleshooting Info
--------------------
Status Opcode : 0
Group Opcode : N/A
Recommendation : No issue was observed
Tool Information
----------------
Firmware Version : 12.28.2006
MFT Version : mft 4.30.1-113
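What made this hard
When checking the device status from the SONiC console on the DX010, the string "QSFP28 or later" shows up in the Type column on the right for only some of the ports. In the output above it appears on Ethernet0 and Ethernet64, while Ethernet96, the port that actually came up at 100G, shows N/A. That made it confusing which port the cable was really plugged into, and I ended up changing settings on completely unrelated ports, which delayed finding the cause.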
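
Since the Type column turned out to be an unreliable hint, one way to avoid touching the wrong port is to list ports by their Oper state and FEC setting instead. The following is a minimal sketch, assuming the column order of the show interfaces status output pasted earlier in this note (Interface, Lanes, Speed, MTU, FEC, Alias, Vlan, Oper, Admin, Type, Asym PFC). The names parse_status and oper_up_ports are hypothetical helpers, not part of SONiC.

```python
# Hypothetical helpers: parse saved `show interfaces status` text.
# Assumes the column order seen in this note; Type may contain spaces
# ("QSFP28 or later"), so it is taken as everything between Admin and
# the final Asym PFC column.

SAMPLE = """\
Interface Lanes Speed MTU FEC Alias Vlan Oper Admin Type Asym PFC
Ethernet0 65,66,67,68 40G 9100 N/A Eth1 trunk up up QSFP28 or later N/A
Ethernet64 97,98,99,100 100G 9100 N/A Eth17 trunk down up QSFP28 or later N/A
Ethernet96 17,18,19,20 100G 9100 rs Eth25 trunk up up N/A N/A
Ethernet100 21,22,23,24 100G 9100 N/A Eth26 trunk down up N/A N/A
"""


def parse_status(text):
    """Return one dict per interface row; header and rule lines are skipped."""
    rows = []
    for line in text.splitlines():
        tokens = line.split()
        # Data rows start with an interface name and have at least 11 tokens.
        if len(tokens) < 11 or not tokens[0].startswith("Ethernet"):
            continue
        rows.append({
            "interface": tokens[0],
            "lanes": tokens[1],
            "speed": tokens[2],
            "mtu": tokens[3],
            "fec": tokens[4],
            "alias": tokens[5],
            "vlan": tokens[6],
            "oper": tokens[7],
            "admin": tokens[8],
            "type": " ".join(tokens[9:-1]),
            "asym_pfc": tokens[-1],
        })
    return rows


def oper_up_ports(rows):
    """Names of ports whose Oper column is 'up', regardless of Type."""
    return [row["interface"] for row in rows if row["oper"] == "up"]


if __name__ == "__main__":
    rows = parse_status(SAMPLE)
    for row in rows:
        if row["oper"] == "up":
            print(row["interface"], row["speed"], "fec=" + row["fec"], "type=" + row["type"])
```

Because the parser splits on whitespace, it works whether the table is pasted with aligned columns or with single spaces. Comparing the list before and after plugging in the cable shows which port actually changed.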