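1. Why this is useful
Storage performance data is usually graphed with the vendor's monitoring tool or with a free tool such as Harvest. Reading the manual, installing, and configuring those tools takes effort, though, and during troubleshooting they sometimes do not provide the fine-grained values you need.
This article describes how to create a shell script on a Linux machine that lists ONTAP commands, runs them against ONTAP from Linux, and saves the output as logs on the Linux side. Because you can adjust the command arguments yourself, you can log exactly the information you need.
This setup requires an SSH key exchange between Linux and ONTAP beforehand. For the procedure, see the following articles.
Linux: https://qiita.com/kan_itani/items/43529ddf3894a9dee6e8
Windows: https://qiita.com/s_yosh1d/items/7ed8af8d7e8f16ddf040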
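As a minimal sketch of the pattern described above, assuming key-based login is already in place, the function below runs one ONTAP command over SSH and appends its output to a local log file. The names `ontap_run`, `KEY_FILE`, and `SSH_BIN`, as well as the example address and user, are illustrative and not part of the original script; `SSH_BIN` exists only so the function can be dry-run with `echo` instead of `ssh`.

```shell
#!/bin/bash
# Hypothetical values; replace them with your own environment.
CMGMT_IP=${CMGMT_IP:-192.0.2.10}            # cluster management IP (documentation address)
ONTAP_ADMIN_NAME=${ONTAP_ADMIN_NAME:-admin} # ONTAP user that holds the public key
KEY_FILE=${KEY_FILE:-$HOME/.ssh/id_rsa}     # private key on the Linux side
SSH_BIN=${SSH_BIN:-ssh}                     # set SSH_BIN=echo for a dry run

# Run one ONTAP CLI command line and append its output to a log file.
# $1 = log file, $2 = ONTAP command line
ontap_run() {
    local log_file=$1 ontap_cmd=$2
    # BatchMode=yes makes ssh fail instead of prompting for a password,
    # so a missing key exchange shows up as an error rather than a hang.
    "$SSH_BIN" -o BatchMode=yes -i "$KEY_FILE" \
        "$ONTAP_ADMIN_NAME@$CMGMT_IP" "$ontap_cmd" >> "$log_file"
}
```

For example, `ontap_run /tmp/ontap_version.txt "version"` is a quick way to confirm that passwordless login works before starting long-running collectors.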
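2. What information can be collected from the ONTAP CLI
This article covers the items below.
(Other data can be collected as well, so feel free to customize the script.)
2.1. Latency at each layer (network, volume, disk, and so on)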
::> set -confirmations off ; set advanced ; qos statistics latency show -iterations 14400
Policy Group Latency Network Cluster Data Disk QoS Max QoS Min NVRAM Cloud FlexCache SM Sync VA AVSCAN
-------------------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ----------
-total- 0ms 0ms 0ms 0ms 0ms 0ms 0ms 0ms 0ms 0ms 0ms 0ms 0ms
-total- 0ms 0ms 0ms 0ms 0ms 0ms 0ms 0ms 0ms 0ms 0ms 0ms 0ms
-total- 0ms 0ms 0ms 0ms 0ms 0ms 0ms 0ms 0ms 0ms 0ms 0ms 0ms
-total- 3.22ms 2.17ms 53.00us 1000.00us 0ms 0ms 0ms 0ms 0ms 0ms 0ms 0ms 0ms
User-Best-Effort 3.22ms 2.17ms 53.00us 1000.00us 0ms 0ms 0ms 0ms 0ms 0ms 0ms 0ms 0ms
-total- 287.00us 242.00us 0ms 21.00us 24.00us 0ms 0ms 0ms 0ms 0ms 0ms 0ms 0ms
User-Best-Effort 287.00us 242.00us 0ms 21.00us 24.00us 0ms 0ms 0ms 0ms 0ms 0ms 0ms 0ms
-total- 126.00us 84.00us 0ms 20.00us 22.00us 0ms 0ms 0ms 0ms 0ms 0ms 0ms 0ms
User-Best-Effort 126.00us 84.00us 0ms 20.00us 22.00us 0ms 0ms 0ms 0ms 0ms 0ms 0ms 0ms
-total- 148.00us 101.00us 0ms 21.00us 26.00us 0ms 0ms 0ms 0ms 0ms 0ms 0ms 0ms
User-Best-Effort 148.00us 101.00us 0ms 21.00us 26.00us 0ms 0ms 0ms 0ms 0ms 0ms 0ms 0ms
Policy Group Latency Network Cluster Data Disk QoS Max QoS Min NVRAM Cloud FlexCache SM Sync VA AVSCAN
-------------------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ----------
-total- 108.00us 62.00us 0ms 21.00us 25.00us 0ms 0ms 0ms 0ms 0ms 0ms 0ms 0ms
User-Best-Effort 108.00us 62.00us 0ms 21.00us 25.00us 0ms 0ms 0ms 0ms 0ms 0ms 0ms 0ms
-total- 183.00us 136.00us 0ms 21.00us 26.00us 0ms 0ms 0ms 0ms 0ms 0ms 0ms 0ms
User-Best-Effort 183.00us 136.00us 0ms 21.00us 26.00us 0ms 0ms 0ms 0ms 0ms 0ms 0ms 0ms
-total- 276.00us 230.00us 0ms 21.00us 25.00us 0ms 0ms 0ms 0ms 0ms 0ms 0ms 0ms
User-Best-Effort 276.00us 230.00us 0ms 21.00us 25.00us 0ms 0ms 0ms 0ms 0ms 0ms 0ms 0ms
-total- 280.00us 234.00us 0ms 21.00us 25.00us 0ms 0ms 0ms 0ms 0ms 0ms 0ms 0ms
User-Best-Effort 280.00us 234.00us 0ms 21.00us 25.00us 0ms 0ms 0ms 0ms 0ms 0ms 0ms 0ms
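Once this output is in a log file, slow samples can be picked out with a small filter. The sketch below is not part of the original script: `slow_totals` is a hypothetical helper that converts the `ms`/`us` suffixes to microseconds and prints the `-total-` rows whose overall latency exceeds a threshold. It assumes the column order shown above (Latency, Network, Cluster, Data, Disk, ...).

```shell
#!/bin/bash
# Print "-total-" rows whose overall latency (2nd column) exceeds a threshold.
# $1 = threshold in microseconds; the log is read from stdin.
slow_totals() {
    awk -v limit="$1" '
        # Convert "3.22ms" or "287.00us" to microseconds; -1 for anything else.
        function to_us(s) {
            if (s ~ /ms$/) { sub(/ms$/, "", s); return s * 1000 }
            if (s ~ /us$/) { sub(/us$/, "", s); return s + 0 }
            return -1
        }
        $1 == "-total-" && to_us($2) > limit {
            print $2, "network=" $3, "data=" $5, "disk=" $6
        }
    '
}
```

A call such as `slow_totals 1000 < qos_latency.txt` (file name made up here) lists only the samples above 1 ms, together with the network, data, and disk components.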
2.2. Traffic per IP address (= per LIF)
::> set -confirmations off ; set advanced ; statistics lif show -interval 5 -iterations 0 -max 40 -sort-key instance_name -vserver $SVM_NAME
aff-a90 : 11/11/2024 02:27:08
Recv Sent
Recv Data Recv Sent Data Sent Current
*LIF Vserver Packet (Bps) Errors Packet (Bps) Errors Port
------------- ------- ------ ------- ------ ------ -------- ------ -------
n2_e6a_seg2_4 a90svm1 15055 3094290 0 17190 72267811 0 e6a
n2_e6a_seg2_3 a90svm1 13897 2837565 0 15764 66271856 0 e6a
n2_e6a_seg1_2 a90svm1 14138 3062160 0 17012 71519499 0 e6a
n2_e6a_seg1_1 a90svm1 14297 3087225 0 17151 72104906 0 e6a
n2_e11a_seg2_4
a90svm1 14917 3064140 0 17023 71564692 0 e11a
n2_e11a_seg2_3
a90svm1 13710 2976300 0 16535 69513140 0 e11a
n2_e11a_seg1_2
a90svm1 14436 3077595 0 17097 71876839 0 e11a
n2_e11a_seg1_1
a90svm1 15078 3134565 0 17414 73211609 0 e11a
n1_e6a_seg2_4 a90svm1 8841 1650735 0 9170 38553833 0 e6a
n1_e6a_seg2_3 a90svm1 8331 1563435 0 8685 36514893 0 e6a
n1_e6a_seg1_2 a90svm1 8915 1646055 0 9144 38444529 0 e6a
n1_e6a_seg1_1 a90svm1 8654 1617030 0 8983 37765583 0 e6a
n1_e11a_seg2_4
a90svm1 8136 1509480 0 8385 35253693 0 e11a
n1_e11a_seg2_3
a90svm1 8438 1579905 0 8777 36899559 0 e11a
n1_e11a_seg1_2
a90svm1 8962 1657980 0 9211 38723044 0 e11a
n1_e11a_seg1_1
a90svm1 8718 1618785 0 8993 37807623 0 e11a
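In the LIF output above, long LIF names are printed on a line of their own and the counters follow on the next line, which makes the log awkward to process. The sketch below joins such pairs back into one row. `unwrap_lif_rows` is a hypothetical helper, and it assumes that a line with exactly one field is always a wrapped LIF name.

```shell
#!/bin/bash
# Join rows that the ONTAP CLI wrapped because the LIF name was too long.
# Reads the log on stdin and writes the unwrapped rows to stdout.
unwrap_lif_rows() {
    awk '
        NF == 1 { held = $1; next }     # LIF name alone on its line: remember it
        held != "" {
            $0 = held " " $0            # prepend the name to the counters
            held = ""
            $1 = $1                     # rebuild the row with single spaces
        }
        { print }
    '
}
```

After unwrapping, every data row has the LIF name in column 1 and the current port in column 9, so follow-up filters can rely on fixed column positions.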
2.3. Operations per second, throughput, latency, and read/write mix per volume
::> set -confirmations off ; set advanced ; statistics volume show -vserver $SVM_NAME -interval 5 -iterations 0 -max 100
aff-a90 : 11/11/2024 02:26:20
*Total Read Write Other Read Write Latency
Volume Vserver Aggregate Ops Ops Ops Ops (Bps) (Bps) (us)
---------------- ------- --------- ------ ----- ----- ----- --------- ----- -------
flexgroup1__0001 a90svm1 aggr1 30561 30557 0 0 125161472 0 7
flexgroup1__0049 a90svm1 aggr1 29334 29333 0 0 120151040 0 7
flexgroup1__0041 a90svm1 aggr1 29231 29230 0 0 119728128 0 7
flexgroup1__0080 a90svm1 aggr2 20942 20933 0 4 85744640 0 8
flexgroup1__0066 a90svm1 aggr2 20782 20781 0 0 85121024 0 7
flexgroup1__0034 a90svm1 aggr2 20644 20643 0 0 84555776 0 7
flexgroup1__0062 a90svm1 aggr2 19795 19795 0 0 81081344 0 7
flexgroup1__0018 a90svm1 aggr2 19495 19495 0 0 79851520 0 7
flexgroup1__0012 a90svm1 aggr2 19429 19429 0 0 79583232 0 7
flexgroup1__0028 a90svm1 aggr2 19208 19207 0 0 78674944 0 7
flexgroup1__0079 a90svm1 aggr1 0 0 0 0 0 0 0
flexgroup1__0078 a90svm1 aggr2 0 0 0 0 0 0 0
flexgroup1__0077 a90svm1 aggr1 0 0 0 0 0 0 0
flexgroup1__0076 a90svm1 aggr2 0 0 0 0 0 0 0
flexgroup1__0075 a90svm1 aggr1 0 0 0 0 0 0 0
flexgroup1__0074 a90svm1 aggr2 0 0 0 0 0 0 0
<rest omitted>
2.4. Load on the whole cluster / load per controller
::> set -confirmations off ; set advanced ; statistics show-periodic -interval 2 -iterations 7200
::> set -confirmations off ; set advanced ; statistics show-periodic -interval 2 -iterations 7200 -node $NODE_A
::> set -confirmations off ; set advanced ; statistics show-periodic -interval 2 -iterations 7200 -node $NODE_B
aff-a90-01: node.node: 11/11/2024 02:26:07
cpu cpu cpu total fcache pkts pkts total total data data data cluster cluster cluster disk disk
avg busy total ops nfs-ops cifs-ops ops recv sent recv sent busy recv sent busy recv sent read write
---- ---- ----- -------- -------- -------- -------- -------- -------- -------- -------- ---- -------- -------- ------- -------- -------- -------- --------
1% 2% 82% 0 0 0 0 213 187 988KB 486KB 0% 630B 0B 0% 973KB 467KB 0B 17.9KB
1% 2% 101% 0 0 0 0 219 208 378KB 435KB 0% 150B 90B 0% 377KB 433KB 6.00KB 12.0KB
1% 2% 83% 0 0 0 0 187 133 364KB 419KB 0% 59B 0B 0% 362KB 411KB 13.9KB 17.9KB
2% 3% 123% 0 0 0 0 179 126 377KB 394KB 0% 289B 229B 0% 376KB 392KB 7.96KB 11.9KB
5% 8% 362% 44389 44389 0 0 39252 37514 9.56MB 149MB 0% 8.69MB 147MB 0% 888KB 1.98MB 2.59MB 2.42MB
10% 14% 706% 118477 118477 0 0 130224 115034 28.8MB 450MB 1% 27.1MB 449MB 0% 1.64MB 452KB 5.97KB 17.9KB
14% 18% 911% 152519 152519 0 0 187550 162319 38.6MB 627MB 2% 38.2MB 627MB 0% 370KB 414KB 0B 12.0KB
14% 20% 942% 143756 143756 0 0 171089 151202 35.7MB 583MB 2% 35.3MB 583MB 0% 427KB 517KB 11.9KB 17.9KB
13% 18% 880% 144332 144332 0 0 171789 151868 35.8MB 585MB 2% 35.5MB 585MB 0% 341KB 391KB 5.97KB 11.9KB
14% 19% 900% 144187 144187 0 0 172404 152331 38.4MB 589MB 2% 35.5MB 586MB 0% 2.86MB 3.17MB 4.93MB 4.76MB
13% 18% 877% 144032 144032 0 0 171580 151471 36.4MB 583MB 2% 35.4MB 583MB 0% 1.01MB 442KB 9.95KB 17.9KB
13% 18% 871% 144844 144844 0 0 183381 162228 38.2MB 624MB 2% 37.8MB 624MB 0% 380KB 375KB 6.00KB 12.0KB
13% 18% 893% 144343 144343 0 0 182622 161285 38.0MB 622MB 2% 37.7MB 621MB 0% 361KB 379KB 5.97KB 17.9KB
13% 18% 875% 143018 143018 0 0 170215 150507 35.5MB 580MB 2% 35.1MB 579MB 0% 378KB 579KB 602KB 11.9KB
2.5. Per-core CPU utilization on each controller
::> set -confirmations off ; set advanced ; node run -node $NODE_A -command sysstat -m -c 7200 2
::> set -confirmations off ; set advanced ; node run -node $NODE_B -command sysstat -m -c 7200 2
ANY AVG CPU0 CPU1 CPU2 CPU3 CPU4 CPU5 CPU6 CPU7 CPU8 CPU9 CPU10 CPU11 CPU12 CPU13 CPU14 CPU15 CPU16 CPU17 CPU18 CPU19 CPU20 CPU21 CPU22 CPU23 CPU24 CPU25 CPU26 CPU27 CPU28 CPU29 CPU30 CPU31 CPU32 CPU33 CPU34 CPU35 CPU36 CPU37 CPU38 CPU39 CPU40 CPU41 CPU42 CPU43 CPU44 CPU45 CPU46 CPU47 CPU48 CPU49 CPU50 CPU51 CPU52 CPU53 CPU54 CPU55 CPU56 CPU57 CPU58 CPU59 CPU60 CPU61 CPU62 CPU63 CPU64 CPU65 CPU66 CPU67 CPU68 CPU69 CPU70 CPU71 CPU72 CPU73 CPU74 CPU75 CPU76 CPU77 CPU78 CPU79 CPU80 CPU81 CPU82 CPU83 CPU84 CPU85 CPU86 CPU87 CPU88 CPU89 CPU90 CPU91 CPU92 CPU93 CPU94 CPU95 CPU96 CPU97 CPU98 CPU99 CPU100 CPU101 CPU102 CPU103 CPU104 CPU105 CPU106 CPU107 CPU108 CPU109 CPU110 CPU111 CPU112 CPU113 CPU114 CPU115 CPU116 CPU117 CPU118 CPU119 CPU120 CPU121 CPU122 CPU123 CPU124 CPU125 CPU126 CPU127
100% 21% 36% 100% 36% 100% 28% 100% 27% 100% 28% 100% 27% 100% 28% 100% 27% 100% 28% 100% 29% 100% 27% 100% 29% 100% 40% 100% 39% 100% 42% 100% 41% 100% 31% 100% 29% 100% 30% 100% 29% 100% 27% 100% 28% 100% 28% 100% 30% 100% 31% 100% 29% 100% 29% 100% 32% 100% 28% 100% 28% 100% 27% 100% 30% 100% 9% 100% 7% 100% 8% 100% 8% 100% 11% 100% 8% 100% 8% 100% 9% 100% 12% 100% 10% 100% 11% 100% 10% 100% 11% 100% 10% 100% 12% 100% 10% 100% 10% 100% 11% 100% 10% 100% 14% 100% 19% 100% 18% 100% 11% 100% 10% 100% 10% 100% 10% 100% 10% 100% 11% 100% 11% 100% 13% 100% 16% 100% 13% 100%
100% 21% 38% 100% 36% 100% 28% 100% 28% 100% 28% 100% 28% 100% 29% 100% 28% 100% 28% 100% 29% 100% 29% 100% 29% 100% 43% 100% 42% 100% 43% 100% 42% 100% 30% 100% 31% 100% 31% 100% 30% 100% 28% 100% 28% 100% 33% 100% 29% 100% 34% 100% 29% 100% 30% 100% 32% 100% 32% 100% 29% 100% 29% 100% 30% 100% 9% 100% 6% 100% 7% 100% 9% 100% 9% 100% 9% 100% 7% 100% 8% 100% 11% 100% 8% 100% 10% 100% 9% 100% 11% 100% 8% 100% 11% 100% 9% 100% 8% 100% 11% 100% 11% 100% 14% 100% 9% 100% 10% 100% 10% 100% 11% 100% 8% 100% 9% 100% 11% 100% 11% 100% 11% 100% 11% 100% 20% 100% 22% 100%
100% 22% 37% 100% 36% 100% 30% 100% 29% 100% 28% 100% 28% 100% 28% 100% 30% 100% 31% 100% 31% 100% 29% 100% 32% 100% 44% 100% 43% 100% 43% 100% 44% 100% 32% 100% 30% 100% 31% 100% 30% 100% 28% 100% 29% 100% 32% 100% 31% 100% 32% 100% 30% 100% 32% 100% 32% 100% 30% 100% 30% 100% 30% 100% 31% 100% 10% 100% 8% 100% 9% 100% 10% 100% 12% 100% 10% 100% 10% 100% 10% 100% 12% 100% 10% 100% 11% 100% 9% 100% 12% 100% 10% 100% 10% 100% 10% 100% 10% 100% 13% 100% 12% 100% 13% 100% 11% 100% 10% 100% 10% 100% 10% 100% 11% 100% 11% 100% 11% 100% 11% 100% 11% 100% 14% 100% 21% 100% 20% 100%
100% 21% 37% 100% 37% 100% 27% 100% 27% 100% 29% 100% 29% 100% 28% 100% 29% 100% 29% 100% 28% 100% 29% 100% 29% 100% 44% 100% 43% 100% 45% 100% 44% 100% 30% 100% 28% 100% 31% 100% 29% 100% 27% 100% 27% 100% 30% 100% 30% 100% 31% 100% 29% 100% 29% 100% 31% 100% 28% 100% 30% 100% 29% 100% 31% 100% 9% 100% 7% 100% 8% 100% 8% 100% 9% 100% 9% 100% 7% 100% 8% 100% 11% 100% 8% 100% 10% 100% 8% 100% 11% 100% 9% 100% 9% 100% 9% 100% 12% 100% 11% 100% 11% 100% 14% 100% 9% 100% 10% 100% 10% 100% 10% 100% 9% 100% 11% 100% 9% 100% 10% 100% 11% 100% 12% 100% 22% 100% 22% 100%
100% 21% 32% 100% 36% 100% 29% 100% 27% 100% 29% 100% 27% 100% 28% 100% 28% 100% 29% 100% 28% 100% 28% 100% 28% 100% 41% 100% 41% 100% 43% 100% 41% 100% 30% 100% 29% 100% 31% 100% 30% 100% 26% 100% 27% 100% 30% 100% 29% 100% 31% 100% 29% 100% 29% 100% 30% 100% 29% 100% 30% 100% 28% 100% 30% 100% 9% 100% 8% 100% 10% 100% 10% 100% 10% 100% 9% 100% 9% 100% 10% 100% 11% 100% 9% 100% 10% 100% 9% 100% 11% 100% 10% 100% 10% 100% 11% 100% 9% 100% 13% 100% 12% 100% 15% 100% 12% 100% 11% 100% 11% 100% 15% 100% 10% 100% 10% 100% 10% 100% 13% 100% 10% 100% 14% 100% 22% 100% 21% 100%
100% 20% 27% 100% 36% 100% 28% 100% 26% 100% 28% 100% 26% 100% 28% 100% 27% 100% 28% 100% 28% 100% 28% 100% 27% 100% 42% 100% 42% 100% 43% 100% 43% 100% 29% 100% 28% 100% 30% 100% 30% 100% 26% 100% 26% 100% 29% 100% 29% 100% 28% 100% 28% 100% 29% 100% 30% 100% 27% 100% 29% 100% 29% 100% 30% 100% 11% 100% 7% 100% 8% 100% 8% 100% 10% 100% 9% 100% 8% 100% 9% 100% 10% 100% 10% 100% 9% 100% 8% 100% 11% 100% 10% 100% 10% 100% 8% 100% 9% 100% 10% 100% 9% 100% 13% 100% 10% 100% 9% 100% 9% 100% 21% 100% 9% 100% 10% 100% 8% 100% 10% 100% 11% 100% 12% 100% 23% 100% 22% 100%
100% 21% 28% 100% 36% 100% 27% 100% 27% 100% 30% 100% 27% 100% 32% 100% 28% 100% 28% 100% 29% 100% 28% 100% 28% 100% 40% 100% 40% 100% 41% 100% 40% 100% 30% 100% 30% 100% 30% 100% 33% 100% 26% 100% 27% 100% 28% 100% 29% 100% 29% 100% 29% 100% 29% 100% 29% 100% 27% 100% 29% 100% 28% 100% 29% 100% 10% 100% 8% 100% 8% 100% 9% 100% 11% 100% 8% 100% 8% 100% 10% 100% 10% 100% 8% 100% 11% 100% 9% 100% 11% 100% 10% 100% 11% 100% 9% 100% 10% 100% 12% 100% 12% 100% 15% 100% 11% 100% 10% 100% 11% 100% 22% 100% 10% 100% 11% 100% 10% 100% 10% 100% 12% 100% 13% 100% 22% 100% 20% 100%
100% 20% 27% 100% 35% 100% 27% 100% 25% 100% 28% 100% 27% 100% 26% 100% 27% 100% 27% 100% 28% 100% 27% 100% 26% 100% 41% 100% 40% 100% 40% 100% 42% 100% 28% 100% 29% 100% 30% 100% 29% 100% 26% 100% 26% 100% 30% 100% 29% 100% 29% 100% 27% 100% 29% 100% 30% 100% 28% 100% 28% 100% 28% 100% 29% 100% 8% 100% 7% 100% 8% 100% 8% 100% 8% 100% 8% 100% 8% 100% 8% 100% 10% 100% 7% 100% 10% 100% 8% 100% 9% 100% 9% 100% 9% 100% 9% 100% 9% 100% 12% 100% 10% 100% 13% 100% 10% 100% 8% 100% 11% 100% 20% 100% 10% 100% 9% 100% 11% 100% 10% 100% 9% 100% 12% 100% 21% 100% 21% 100%
2.6. Disk read/write throughput, disk load, cache hit ratio, and NFS/CIFS/FCP/iSCSI/NVMe traffic
::> set -confirmations off ; set advanced ; node run -node $NODE_A -command sysstat -x -c 7200 2
::> set -confirmations off ; set advanced ; node run -node $NODE_B -command sysstat -x -c 7200 2
CPU NFS CIFS HTTP Total Net kB/s Disk kB/s Tape kB/s Cache Cache CP CP_Ty CP_Ph Disk OTHER FCP iSCSI FCP kB/s iSCSI kB/s NVMeF kB/s kB/s NVMeT kB/s kB/s
in out read write read write age hit time [T--H--F--N--B--O--#--:] [n--v--p--f] util in out in out in out in out
19% 12755 0 0 13320 40995 15866 72530 2820 0 0 43s 97% 1% 0--0--0--0--0--0--0--0 0--0--0--0 0% 565 0 0 0 0 0 0 0 0 0 0 0 0
27% 15362 0 0 15481 49788 17512 280828 136870 0 0 44 98% 9% 0--0--0--0--0--0--0--0 0--0--0--0 1% 119 0 0 0 0 0 0 0 0 0 0 0 0
35% 17676 0 0 17778 65531 19806 417760 16 0 0 2s 99% 0% 0--0--0--0--0--0--0--0 0--0--0--0 1% 102 0 0 0 0 0 0 0 0 0 0 0 0
24% 15634 0 0 15847 51274 17444 379186 18 0 0 58s 100% 0% 0--0--0--0--0--0--0--0 0--0--0--0 1% 213 0 0 0 0 0 0 0 0 0 0 0 0
34% 77736 0 0 77826 190442 106375 731256 12 0 0 1 99% 0% 0--0--0--0--0--0--0--0 0--0--0--0 3% 90 0 0 0 0 0 0 0 0 0 0 0 0
53% 184053 0 0 184176 529551 291610 980842 1241945 0 0 1 98% 42% 0--0--0--0--0--0--0--0 0--0--0--0 5% 123 0 0 0 0 0 0 0 0 0 0 0 0
54% 185127 0 0 185315 565173 278311 986281 217669 0 0 9s 98% 16% 0--0--0--0--0--0--0--0 0--0--0--0 4% 188 0 0 0 0 0 0 0 0 0 0 0 0
56% 186824 0 0 186893 565524 283546 1006761 177677 0 0 10s 98% 16% 0--0--0--0--0--0--0--0 0--0--0--0 5% 69 0 0 0 0 0 0 0 0 0 0 0 0
56% 187718 0 0 187863 566591 287472 1045319 243672 0 0 37s 98% 20% 0--0--0--1--0--0--0--0 1--0--0--0 5% 145 0 0 0 0 0 0 0 0 0 0 0 0
55% 185092 0 0 185170 556597 285699 1043116 259112 0 0 44 98% 19% 0--0--0--1--0--0--0--0 1--0--0--0 5% 78 0 0 0 0 0 0 0 0 0 0 0 0
56% 189335 0 0 189453 563549 295565 876418 336696 0 0 14s 98% 24% 0--0--0--1--0--0--0--0 0--0--1--0 4% 118 0 0 0 0 0 0 0 0 0 0 0 0
56% 191676 0 0 191764 568528 300703 892750 332534 0 0 42s 98% 19% 0--0--0--1--0--0--0--0 0--0--0--1 3% 88 0 0 0 0 0 0 0 0 0 0 0 0
56% 192704 0 0 192800 568371 309251 867271 251772 0 0 44s 98% 16% 0--0--0--0--0--0--0--0 0--0--0--0 4% 96 0 0 0 0 0 0 0 0 0 0 0 0
56% 191397 0 0 191548 566231 305027 861804 245088 0 0 59 98% 14% 0--0--0--0--0--0--0--0 0--0--0--0 4% 151 0 0 0 0 0 0 0 0 0 0 0 0
56% 191777 0 0 191939 563491 307131 858103 244474 0 0 59 98% 15% 0--0--0--0--0--0--0--0 0--0--0--0 3% 162 0 0 0 0 0 0 0 0 0 0 0 0
55% 191504 0 0 191604 561257 309747 862525 246275 0 0 21s 98% 15% 0--0--0--0--0--0--0--0 0--0--0--0 3% 100 0 0 0 0 0 0 0 0 0 0 0 0
56% 191758 0 0 192000 563913 309376 859622 243652 0 0 23s 98% 15% 0--0--0--0--0--0--0--0 0--0--0--0 4% 242 0 0 0 0 0 0 0 0 0 0 0 0
57% 191156 0 0 191822 558556 313726 872332 244476 0 0 25s 98% 24% 0--0--0--0--0--0--0--0 0--0--0--0 4% 666 0 0 0 0 0 0 0 0 0 0 0 0
56% 193043 0 0 193615 564880 308678 890209 244372 0 0 60 98% 18% 0--0--0--0--0--0--0--0 0--0--0--0 4% 572 0 0 0 0 0 0 0 0 0 0 0 0
56% 192572 0 0 192900 564840 311437 880102 243188 0 0 58s 98% 15% 0--0--0--0--0--0--0--0 0--0--0--0 3% 328 0 0 0 0 0 0 0 0 0 0 0 0
CPU NFS CIFS HTTP Total Net kB/s Disk kB/s Tape kB/s Cache Cache CP CP_Ty CP_Ph Disk OTHER FCP iSCSI FCP kB/s iSCSI kB/s NVMeF kB/s kB/s NVMeT kB/s kB/s
in out read write read write age hit time [T--H--F--N--B--O--#--:] [n--v--p--f] util in out in out in out in out
56% 194852 0 0 194924 575562 313198 899710 241911 0 0 30s 98% 15% 0--0--0--0--0--0--0--0 0--0--0--0 3% 72 0 0 0 0 0 0 0 0 0 0 0 0
58% 193158 0 0 193522 568958 308988 884344 246489 0 0 30s 98% 16% 0--0--0--0--0--0--0--0 0--0--0--0 4% 364 0 0 0 0 0 0 0 0 0 0 0 0
57% 194852 0 0 194982 572497 311824 893665 244780 0 0 29s 98% 14% 0--0--0--0--0--0--0--0 0--0--0--0 3% 130 0 0 0 0 0 0 0 0 0 0 0 0
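Because `sysstat -x` repeats its two header lines periodically, it helps to strip them before graphing. The following sketch is an illustration rather than part of the original script: `sysstat_x_to_csv` is a hypothetical helper that keeps only data rows and emits a few columns as CSV. The column positions are taken from the sample output above (field 17 is the disk utilization) and may differ on other ONTAP versions.

```shell
#!/bin/bash
# Convert "sysstat -x" log lines on stdin to CSV with a handful of columns.
sysstat_x_to_csv() {
    awk '
        BEGIN {
            OFS = ","
            print "cpu_pct,nfs_ops,total_ops,disk_read_kBps,disk_write_kBps,disk_util_pct"
        }
        # Data rows start with the CPU percentage, e.g. "56%"; headers do not.
        $1 ~ /^[0-9]+%$/ {
            cpu = $1;   sub(/%/, "", cpu)
            util = $17; sub(/%/, "", util)
            print cpu, $2, $5, $8, $9, util
        }
    '
}
```

The resulting CSV can be loaded into a spreadsheet or plotting tool as is.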
2.7. Checking whether each network interface (LIF) is running on its home node and home port
::> set -confirmations off ; set advanced ; net int show -vserver $SVM_NAME -fields home-node,home-port,curr-node,curr-port,status-oper,is-home,status-admin,status-ope
(network interface show)
vserver lif home-node home-port curr-node curr-port status-oper is-home status-admin
------- -------------- ---------- --------- ---------- --------- ----------- ------- ------------
a90svm1 n1_e11a_seg1_1 aff-a90-01 e11a aff-a90-01 e11a up true up
a90svm1 n1_e11a_seg1_2 aff-a90-01 e11a aff-a90-01 e11a up true up
a90svm1 n1_e6a_seg1_1 aff-a90-01 e6a aff-a90-01 e6a up true up
a90svm1 n1_e6a_seg1_2 aff-a90-01 e6a aff-a90-01 e6a up true up
a90svm1 n2_e11a_seg1_1 aff-a90-02 e11a aff-a90-02 e11a up true up
a90svm1 n2_e11a_seg1_2 aff-a90-02 e11a aff-a90-02 e11a up true up
a90svm1 n2_e6a_seg1_1 aff-a90-02 e6a aff-a90-02 e6a up true up
a90svm1 n2_e6a_seg1_2 aff-a90-02 e6a aff-a90-02 e6a up true up
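With many LIFs it is easier to print only the ones that have moved away from home. The sketch below assumes the column order shown above, where is-home is the eighth field; `not_home_lifs` is a hypothetical helper name.

```shell
#!/bin/bash
# List LIFs whose is-home column (8th field) is "false".
# Reads "net int show -fields ..." output on stdin.
not_home_lifs() {
    awk '$8 == "false" {
        print $2, "home=" $3 ":" $4, "current=" $5 ":" $6
    }'
}
```

For the sample output above the function prints nothing, because every LIF reports `true`.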
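3. Iterations and interval
The default and minimum values for the iteration count and the interval differ from command to command.
For example, the minimum interval is 5 seconds for statistics lif show and 2 seconds for statistics show-periodic.
To check the arguments a command accepts, append "?" to it in the CLI.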
aff-a90::> statistics lif show ?
[ -lif <text> ] LIF
[ -vserver <vserver name> ] Vserver
[ -sort-key <text> ] Column to Sort By
[[-interval] {5..300}] Interval (default: 5)
[[-iterations] {0..50}] Iterations (default: 1)
[[-max] {1..100}] Maximum Number of Instances (default: 25)
aff-a90::> statistics lif show
aff-a90::> node run -node aff-a90-01 -command sysstat -x -c ?
sysstat: -c value must be a positive, non-zero integer
usage: sysstat [-c count] [-s] [-u | -x | -m | -f | -i | -b | -n] [-d] [interval]
-c count - the number of iterations to execute
-s - print out summary statistics when done
-u - print out utilization format instead
-x - print out all fields (overrides -u)
-m - print out multiprocessor statistics
-f - print out FCP target statistics
-i - print out iSCSI target statistics
-n - print out NVMe-oF target statistics
-b - print out SAN statistics
-d - print without HDD and SSD stats
for default, -u and -x formats
interval - the interval between iterations in seconds, default is 1.5 seconds
aff-a90::> node run -node aff-a90-01 -command sysstat -x -c
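When choosing these values it helps to work out how long a collector will run, which is simply interval × iterations. The helpers below are a small sketch (`run_seconds` and `run_hm` are made-up names). With the values used in the sample script, both the 2-second × 7200 collectors and the 14400-iteration latency collector come out at four hours, assuming the latter samples once per second as its log header states.

```shell
#!/bin/bash
# Total run time in seconds for a given interval and iteration count.
run_seconds() {
    echo $(( $1 * $2 ))
}

# Same value formatted as hours and minutes.
run_hm() {
    local total
    total=$(run_seconds "$1" "$2")
    printf '%dh%02dm\n' $(( total / 3600 )) $(( total % 3600 / 60 ))
}

run_hm 2 7200     # sysstat and statistics show-periodic in the sample script
run_hm 1 14400    # qos statistics latency show, assuming a 1-second interval
```

Both calls print `4h00m`.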
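4. The script
4.1. Prerequisites
- Exchange SSH keys between the Linux user and ONTAP so that login works without a password.
- Place the script in the home directory of the user that performed the key exchange, and change the directory paths inside the script.
- Edit the environment variables (target IP address, SVM name, node names, and so on).
4.2. Environment variables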
CMGMT_IP=
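The ONTAP cluster management IP address that SSH connects to.
ONTAP_ADMIN_NAME=
The ONTAP login user name used for the SSH key exchange.
DATE=`date +%m%d_%H%M%S`
This timestamp is used in the log directory and log file names.
mkdir /root/scripts/logs/$DATE
Creates a directory named after the date and time on every run.
Change the path to suit the user that runs the script.
chmod 766 /root/scripts/logs/$DATE
Change the path to suit the user that runs the script.
LOG_DIR=/root/scripts/logs/$DATE
Change the path to suit the user that runs the script.
NODE_A=aff-a90-01
NODE_B=aff-a90-02
The names of the nodes in the ONTAP cluster.
SVM_NAME=a90svm1
The name of the data SVM that clients access.
You can check the SVM name with: vserver show -type data
ADMIN_SVM=aff-a90
The name of the ONTAP admin SVM.
You can check the admin SVM name with: vserver show -type admin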
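4.3. Excuses / caveats
- Many commands and arguments are hard-coded.
Replace strings such as "ssh -i /root/.ssh/id_rsa" to match the Linux user that runs the script.
The numeric arguments (iteration counts, intervals, and so on) are also written inline, so changing them takes some editing.
- As the number of nodes in an ONTAP cluster grows, rewriting the script becomes tedious.
Note: the commands are listed one by one so that it is easy to see what is being done. (That is the excuse.)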
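One way to ease both caveats is to keep the node names and the numeric arguments in variables and loop over them. The following is only a sketch under that assumption, not the author's script: `NODES`, `INTERVAL`, `ITERATIONS`, `SSH_BIN`, and `start_sysstat` are hypothetical names, and the key path uses `$HOME` instead of the hard-coded `/root`.

```shell
#!/bin/bash
# Hypothetical settings; the defaults mirror the values in the sample script.
CMGMT_IP=${CMGMT_IP:-10.128.217.20}
ONTAP_ADMIN_NAME=${ONTAP_ADMIN_NAME:-rocky9admin}
NODES=${NODES:-"aff-a90-01 aff-a90-02"}   # add node names here as the cluster grows
INTERVAL=${INTERVAL:-2}
ITERATIONS=${ITERATIONS:-7200}
SSH_BIN=${SSH_BIN:-ssh}                   # SSH_BIN=echo gives a dry run

# Start "sysstat -m" and "sysstat -x" in the background for every node.
# $1 = log directory, $2 = timestamp used in the file names
start_sysstat() {
    local log_dir=$1 stamp=$2 node mode log_file
    for node in $NODES; do
        for mode in m x; do
            log_file=$log_dir/$stamp-sysstat_${mode}_$node.txt
            echo "Start $stamp (interval $INTERVAL, iterations $ITERATIONS)" >> "$log_file"
            "$SSH_BIN" -i "$HOME/.ssh/id_rsa" "$ONTAP_ADMIN_NAME@$CMGMT_IP" \
                "set -confirmations off ; set advanced ; node run -node $node -command sysstat -$mode -c $ITERATIONS $INTERVAL" \
                >> "$log_file" &
        done
    done
}
```

Calling `start_sysstat "$LOG_DIR" "$DATE"` would then replace the four hand-written sysstat blocks, at the cost of making the script a little less obvious to read.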
4.4. Sample script
# https://kb.netapp.com/on-prem/ontap/Ontap_OS/OS-KBs/Unable_to_access_the_cluster_via_SSH_due_to_client_s_excessive_incoming_connections
# ONTAP supports a maximum of 64 concurrent SSH sessions per node. If the cluster management LIF resides on the node, it shares this limit with the node management LIF.
# If the rate of incoming connections is higher than 10 per second, the service is temporarily disabled for 60 seconds.
CMGMT_IP=10.128.217.20
ONTAP_ADMIN_NAME=rocky9admin
DATE=`date +%m%d_%H%M%S`
mkdir /root/scripts/logs/$DATE
chmod 766 /root/scripts/logs/$DATE
LOG_DIR=/root/scripts/logs/$DATE
NODE_A=aff-a90-01
NODE_B=aff-a90-02
# SVM_NAME: vserver show -type data
SVM_NAME=a90svm1
# ADMIN_SVM: vserver show -type admin
ADMIN_SVM=aff-a90
# LIF STATUS
LOG_FILE=$LOG_DIR/$DATE-net_int_show_is-home_$SVM_NAME.txt
echo Start $DATE \(Check if LIFs are migrated or not.\) >> $LOG_FILE
ssh -i /root/.ssh/id_rsa $ONTAP_ADMIN_NAME@$CMGMT_IP "set -confirmations off ; set advanced ; net int show -vserver $SVM_NAME -fields home-node,home-port,curr-node,curr-port,status-oper,is-home,status-admin,status-ope" >> $LOG_FILE &
# SYSSTAT -M
LOG_FILE=$LOG_DIR/$DATE-sysstat_m_$NODE_A.txt
echo Start $DATE \(interval 2, iterations 7200\) >> $LOG_FILE
ssh -i /root/.ssh/id_rsa $ONTAP_ADMIN_NAME@$CMGMT_IP "set -confirmations off ; set advanced ; node run -node $NODE_A -command sysstat -m -c 7200 2" >> $LOG_FILE &
LOG_FILE=$LOG_DIR/$DATE-sysstat_m_$NODE_B.txt
echo Start $DATE \(interval 2, iterations 7200\) >> $LOG_FILE
ssh -i /root/.ssh/id_rsa $ONTAP_ADMIN_NAME@$CMGMT_IP "set -confirmations off ; set advanced ; node run -node $NODE_B -command sysstat -m -c 7200 2" >> $LOG_FILE &
# STATISTICS LIF SHOW
LOG_FILE=$LOG_DIR/$DATE-statistics_lif_$ADMIN_SVM.txt
echo Start $DATE \(interval 5, iterations 0, max 40\) >> $LOG_FILE
ssh -i /root/.ssh/id_rsa $ONTAP_ADMIN_NAME@$CMGMT_IP "set -confirmations off ; set advanced ; statistics lif show -interval 5 -iterations 0 -max 40 -sort-key instance_name -vserver $SVM_NAME" >> $LOG_FILE &
# STATISTICS VOLUME SHOW
LOG_FILE=$LOG_DIR/$DATE-statistics_vol_$ADMIN_SVM.txt
echo Start $DATE \(interval 5, iterations 0, max 100\) >> $LOG_FILE
ssh -i /root/.ssh/id_rsa $ONTAP_ADMIN_NAME@$CMGMT_IP "set -confirmations off ; set advanced ; statistics volume show -vserver $SVM_NAME -interval 5 -iterations 0 -max 100" >> $LOG_FILE &
# STATISTICS SHOW-PERIODIC
LOG_FILE=$LOG_DIR/$DATE-statistics_show-periodic_all.txt
echo Start $DATE \(interval 2, iterations 7200\) >> $LOG_FILE
ssh -i /root/.ssh/id_rsa $ONTAP_ADMIN_NAME@$CMGMT_IP "set -confirmations off ; set advanced ; statistics show-periodic -interval 2 -iterations 7200" >> $LOG_FILE &
LOG_FILE=$LOG_DIR/$DATE-statistics_show-periodic_$NODE_A.txt
echo Start $DATE \(interval 2, iterations 7200\) >> $LOG_FILE
ssh -i /root/.ssh/id_rsa $ONTAP_ADMIN_NAME@$CMGMT_IP "set -confirmations off ; set advanced ; statistics show-periodic -interval 2 -iterations 7200 -node $NODE_A" >> $LOG_FILE &
LOG_FILE=$LOG_DIR/$DATE-statistics_show-periodic_$NODE_B.txt
echo Start $DATE \(interval 2, iterations 7200\) >> $LOG_FILE
ssh -i /root/.ssh/id_rsa $ONTAP_ADMIN_NAME@$CMGMT_IP "set -confirmations off ; set advanced ; statistics show-periodic -interval 2 -iterations 7200 -node $NODE_B" >> $LOG_FILE &
# SYSSTAT -X
LOG_FILE=$LOG_DIR/$DATE-sysstat_x_$NODE_A.txt
echo Start $DATE \(interval 2, iterations 7200\) >> $LOG_FILE
ssh -i /root/.ssh/id_rsa $ONTAP_ADMIN_NAME@$CMGMT_IP "set -confirmations off ; set advanced ; node run -node $NODE_A -command sysstat -x -c 7200 2" >> $LOG_FILE &
LOG_FILE=$LOG_DIR/$DATE-sysstat_x_$NODE_B.txt
echo Start $DATE \(interval 2, iterations 7200\) >> $LOG_FILE
ssh -i /root/.ssh/id_rsa $ONTAP_ADMIN_NAME@$CMGMT_IP "set -confirmations off ; set advanced ; node run -node $NODE_B -command sysstat -x -c 7200 2" >> $LOG_FILE &
# DO NOT DELETE THIS SLEEP
sleep 1
# QOS STATISTICS LATENCY SHOW
LOG_FILE=$LOG_DIR/$DATE-qos_statistics_latency_show_$ADMIN_SVM.txt
echo Start $DATE \(interval 1, iterations 14400\) >> $LOG_FILE
ssh -i /root/.ssh/id_rsa $ONTAP_ADMIN_NAME@$CMGMT_IP "set -confirmations off ; set advanced ; qos statistics latency show -iterations 14400" >> $LOG_FILE &
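The `sleep 1` in the script relates to the connection-rate limit quoted in the comments at the top: ten SSH sessions are started before the pause, and the eleventh follows after it. If more collectors are added, the pause can be made automatic. The helper below is a sketch with hypothetical names (`BURST`, `launched`, `throttle`); the default of 10 simply mirrors the sample script and is not a verified safe value.

```shell
#!/bin/bash
# Sleep for one second after every BURST launches so that new SSH
# connections are opened at roughly BURST per second at most.
BURST=${BURST:-10}
launched=0

throttle() {
    launched=$(( launched + 1 ))
    if [ $(( launched % BURST )) -eq 0 ]; then
        sleep 1
    fi
}

# Usage: call throttle right after each backgrounded ssh, for example
#   ssh ... >> "$LOG_FILE" &
#   throttle
```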
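5. Miscellaneous
Anyone who writes a better script and shares it is very welcome.
Beyond the above, a lot more should be possible: collecting physical port statistics or dropped-packet counts with the ONTAP ifstat command, or issuing configuration-change commands when a particular storage condition is detected.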