When you inspect data with pandas, typing df on its own dumps a long stretch of output that is not pleasant to look at.
import numpy as np; import pandas as pd
df = pd.DataFrame(np.random.randn(100, 4)); df
| | 0 | 1 | 2 | 3 |
---|---|---|---|---|
0 | 0.374859 | 0.327898 | 2.215511 | 1.165490 |
1 | -0.939833 | -0.531873 | -1.717368 | -0.584834 |
2 | 0.759525 | 1.992222 | -0.352082 | -1.500736 |
3 | -0.279484 | -0.278289 | 0.625053 | 0.362855 |
4 | 1.151177 | -1.398746 | 0.391291 | 0.673220 |
5 | -0.392235 | -0.973586 | 0.243700 | -2.899188 |
6 | 0.837239 | 0.670279 | -0.692629 | -1.126292 |
7 | -0.921781 | -2.438753 | -0.519993 | 0.482150 |
8 | 2.459798 | 1.219577 | 0.770672 | -1.390487 |
9 | -1.093845 | 0.343168 | -0.229751 | 1.172888 |
10 | -0.437252 | -0.824611 | -0.346145 | 0.785992 |
11 | 1.193672 | -0.193474 | 0.676684 | -1.468454 |
12 | 1.039551 | 0.234592 | -0.192957 | -1.210177 |
13 | 1.081615 | -0.988146 | -0.021931 | 1.137428 |
14 | 0.470213 | 1.239319 | -0.346861 | -0.288200 |
15 | -0.339914 | -1.580660 | -0.432387 | -0.202277 |
16 | 1.141389 | 0.236465 | -1.477666 | -0.264886 |
17 | -0.686339 | 0.971620 | -0.733747 | -0.110410 |
18 | 0.266442 | -0.168084 | -2.021432 | -1.337447 |
19 | 0.698942 | 1.409780 | -0.506928 | 0.999617 |
20 | 0.432697 | -0.629534 | -0.605271 | -2.336144 |
21 | 1.377673 | 0.761185 | 1.023692 | -1.472238 |
22 | 0.152084 | -0.725003 | 1.553365 | -0.544019 |
23 | 0.156944 | 0.505415 | -1.222674 | 0.423808 |
24 | 0.479288 | 0.201019 | -0.091332 | 0.254680 |
25 | -1.184456 | -0.095066 | -0.885104 | 0.421549 |
26 | 1.040014 | -1.381022 | 1.869261 | 1.437337 |
27 | 0.478984 | -0.944046 | 0.352453 | 2.569114 |
28 | -0.439603 | -1.298592 | 0.913691 | -0.622890 |
29 | -0.545850 | -0.872281 | 0.213367 | -0.681539 |
... | ... | ... | ... | ... |
70 | -1.469679 | -0.456337 | -0.329848 | 0.286484 |
71 | -1.219977 | -2.282746 | 0.506492 | -0.200502 |
72 | 0.365171 | 0.926229 | 0.935084 | -1.001133 |
73 | -0.842211 | -0.040298 | -0.728098 | 2.851352 |
74 | -0.897560 | 0.861064 | 1.990610 | 0.267552 |
75 | -0.703071 | 0.784476 | -1.002520 | -0.450417 |
76 | 0.033787 | 0.073530 | 0.214343 | 1.787105 |
77 | -1.258251 | -0.030197 | 1.320570 | 0.393222 |
78 | -1.766407 | -0.996086 | -0.192385 | -0.513102 |
79 | -0.629811 | -0.487538 | 0.923048 | -1.497247 |
80 | 0.083737 | 0.317975 | 0.325503 | 0.372319 |
81 | -0.391032 | -1.192947 | 0.312277 | -0.249235 |
82 | -1.525711 | -0.994144 | -1.411683 | -0.297697 |
83 | -0.794180 | 0.776143 | 0.057774 | 1.659901 |
84 | -0.270637 | -1.165053 | 0.508089 | -0.445596 |
85 | -1.961543 | 1.973141 | 0.533462 | -1.327931 |
86 | -0.100805 | -0.162729 | 1.448156 | -0.224008 |
87 | -0.514309 | 0.323078 | -0.233127 | 1.384196 |
88 | -2.516856 | 0.374363 | 1.129207 | 1.069754 |
89 | 0.577997 | -0.767833 | 0.923292 | -0.311372 |
90 | 0.758016 | -0.920520 | 0.109853 | 0.021920 |
91 | 0.406649 | -0.239311 | 1.024492 | 1.009525 |
92 | -1.666999 | 1.912280 | -1.959626 | -1.008634 |
93 | 0.222210 | -1.378929 | -0.609868 | 0.749869 |
94 | -1.622319 | 0.035508 | -1.547729 | -0.480135 |
95 | 0.895873 | -0.045676 | 0.180615 | -1.252418 |
96 | -0.630758 | -0.285296 | 0.160133 | 1.106705 |
97 | -1.909217 | 0.841634 | -0.011388 | 0.348177 |
98 | -1.271435 | 1.725388 | 1.075685 | -0.164461 |
99 | 0.379877 | -2.547350 | 0.899402 | -1.615333 |
100 rows × 4 columns
Changing pandas options
The maximum number of rows displayed can be changed with the set_option function.
Reference: pandas 0.21.1 documentation
pd.set_option("display.max_rows", 10)
df
| | 0 | 1 | 2 | 3 |
---|---|---|---|---|
0 | 0.044887 | 1.109229 | -1.404712 | 0.014551 |
1 | -2.032691 | -0.435130 | -0.428953 | -0.537191 |
2 | -0.777178 | -0.435460 | 0.848413 | -1.635667 |
3 | -0.213586 | -1.509976 | -0.635302 | -0.138209 |
4 | 0.639734 | 0.446097 | -1.312515 | 0.783796 |
... | ... | ... | ... | ... |
95 | -0.288322 | 0.132045 | -0.405144 | -0.542612 |
96 | -0.631888 | -0.487683 | 0.684156 | -0.114015 |
97 | -0.405025 | 1.719596 | 0.451788 | 0.930674 |
98 | 0.146326 | 1.032291 | -0.474848 | 0.847260 |
99 | 0.473070 | 0.408938 | -0.504452 | -0.214760 |
100 rows × 4 columns
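To confirm that the setting actually took effect, the value can be read back with get_option. This is a minimal sketch; the variable names previous and current are illustrative and do not come from the original post.

```python
import pandas as pd

# Remember what the limit was before touching it.
previous = pd.get_option("display.max_rows")

# Limit DataFrame output to 10 rows, as in the example above.
pd.set_option("display.max_rows", 10)

# Read the option back to check the new value.
current = pd.get_option("display.max_rows")
print(previous, "->", current)
```

The option name is a plain string, so the same pattern should work for other display settings such as display.max_columns.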
To go back to the default, change set to reset. reset_option takes only the option name, so the value 10 does not need to be passed again.
pd.reset_option("display.max_rows")
df
| | 0 | 1 | 2 | 3 |
---|---|---|---|---|
0 | 0.044887 | 1.109229 | -1.404712 | 0.014551 |
1 | -2.032691 | -0.435130 | -0.428953 | -0.537191 |
2 | -0.777178 | -0.435460 | 0.848413 | -1.635667 |
3 | -0.213586 | -1.509976 | -0.635302 | -0.138209 |
4 | 0.639734 | 0.446097 | -1.312515 | 0.783796 |
5 | -0.156756 | 0.521311 | 0.060626 | 0.206347 |
6 | 0.591887 | 1.441567 | 0.587750 | -0.240194 |
7 | -0.098514 | 1.053005 | 0.072088 | -0.891726 |
8 | 1.484554 | -0.360987 | -1.724210 | -1.516901 |
9 | -0.918722 | 0.344975 | -0.439208 | -1.284894 |
10 | -0.223029 | -0.107058 | 1.234283 | -1.055316 |
11 | -0.806544 | 0.744367 | 0.594333 | -0.993136 |
12 | -0.680134 | 1.570801 | 1.204924 | -0.859910 |
13 | -0.639150 | -0.004267 | -0.691408 | -0.214076 |
14 | -0.219878 | -0.514751 | -1.332166 | 0.570380 |
15 | 1.990532 | -1.174292 | -0.118421 | -0.113356 |
16 | 0.653598 | 0.100153 | 1.636529 | 0.052311 |
17 | -1.045861 | -0.064809 | -0.433254 | 0.964098 |
18 | 0.231979 | 0.067611 | -1.253458 | -1.037114 |
19 | 1.344306 | -0.936234 | 0.594781 | 0.105511 |
20 | -0.413086 | -0.486708 | -0.911816 | 1.050004 |
21 | 0.973888 | -1.365909 | -3.741730 | -1.507470 |
22 | -0.821778 | 0.355387 | -1.467101 | 0.362862 |
23 | 0.005682 | -0.254259 | 1.408601 | 0.772690 |
24 | 0.761357 | -0.431552 | 1.230341 | 1.244104 |
25 | 0.433293 | -0.185350 | 0.937934 | -0.913643 |
26 | -0.202213 | 0.528685 | -0.745797 | -2.023442 |
27 | 1.285016 | 0.756849 | 0.636789 | 0.035517 |
28 | -2.249611 | 1.001626 | -0.071847 | -0.490456 |
29 | -0.827190 | -0.157449 | -1.775256 | 0.680590 |
... | ... | ... | ... | ... |
70 | -0.283343 | 1.396622 | -1.375399 | -0.297667 |
71 | -1.792001 | 1.488617 | -0.047619 | 0.584341 |
72 | -1.472638 | -3.259683 | -0.706456 | 0.508512 |
73 | 0.743772 | -0.313420 | 1.423694 | -1.095836 |
74 | -0.923600 | -0.320489 | -0.920354 | -0.676194 |
75 | -0.382733 | 0.748782 | 0.510318 | 0.190481 |
76 | -0.834656 | -0.927456 | 1.718550 | 0.244518 |
77 | -0.537161 | 0.315323 | 0.243676 | -1.853278 |
78 | -0.549080 | 0.659434 | -0.627163 | 1.142092 |
79 | 1.497259 | -0.183383 | -0.931365 | -0.712263 |
80 | 0.809390 | 0.696450 | -0.949674 | 0.333511 |
81 | 0.107922 | 0.323430 | 1.218619 | -0.486692 |
82 | -0.969837 | 0.585856 | 1.138128 | 0.399262 |
83 | -0.423241 | 0.855566 | -1.322747 | -0.313059 |
84 | -0.708709 | -1.031457 | -0.361363 | 1.389282 |
85 | -1.155997 | -0.054445 | -1.037225 | -2.020944 |
86 | -0.509943 | 1.279200 | -1.473619 | 1.070197 |
87 | 0.593176 | 0.660035 | 0.809127 | -1.455174 |
88 | 1.867072 | -0.697688 | -1.144857 | -1.740410 |
89 | 0.500170 | -0.266405 | 0.226681 | -0.800579 |
90 | -1.746962 | 1.414762 | 0.789651 | -1.362200 |
91 | -0.289363 | 1.300986 | 0.210491 | 1.529958 |
92 | -0.852068 | -0.048329 | -0.269035 | 0.250980 |
93 | -0.388686 | -1.312654 | -1.036473 | -1.297159 |
94 | -1.035741 | 0.097650 | 0.454851 | 2.067922 |
95 | -0.288322 | 0.132045 | -0.405144 | -0.542612 |
96 | -0.631888 | -0.487683 | 0.684156 | -0.114015 |
97 | -0.405025 | 1.719596 | 0.451788 | 0.930674 |
98 | 0.146326 | 1.032291 | -0.474848 | 0.847260 |
99 | 0.473070 | 0.408938 | -0.504452 | -0.214760 |
100 rows × 4 columns
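If the change is only needed for a moment, setting and resetting by hand can be avoided. The sketch below first checks that reset_option works with just the option name, then uses pd.option_context, a pandas context manager that the original post does not mention, to apply the limit only inside a with block. The variable names are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(100, 4))

# Set, then reset: only the option name is passed to reset_option.
pd.set_option("display.max_rows", 10)
pd.reset_option("display.max_rows")
restored = pd.get_option("display.max_rows")

# Temporary change: the limit applies only inside the with block.
with pd.option_context("display.max_rows", 10):
    inside = pd.get_option("display.max_rows")
    short_text = repr(df)  # text rendering produced under the 10-row limit

# After the block the earlier value is back without calling reset_option.
outside = pd.get_option("display.max_rows")
print(inside, outside)
```

In a notebook, the same with block can wrap a display(df) call, so the rest of the session keeps its usual settings.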
The head/tail methods
Calling df.head() or df.tail() cuts the output down to a few rows.
df.head()
| | 0 | 1 | 2 | 3 |
---|---|---|---|---|
0 | 0.044887 | 1.109229 | -1.404712 | 0.014551 |
1 | -2.032691 | -0.435130 | -0.428953 | -0.537191 |
2 | -0.777178 | -0.435460 | 0.848413 | -1.635667 |
3 | -0.213586 | -1.509976 | -0.635302 | -0.138209 |
4 | 0.639734 | 0.446097 | -1.312515 | 0.783796 |
df.tail()
| | 0 | 1 | 2 | 3 |
---|---|---|---|---|
95 | -0.288322 | 0.132045 | -0.405144 | -0.542612 |
96 | -0.631888 | -0.487683 | 0.684156 | -0.114015 |
97 | -0.405025 | 1.719596 | 0.451788 | 0.930674 |
98 | 0.146326 | 1.032291 | -0.474848 | 0.847260 |
99 | 0.473070 | 0.408938 | -0.504452 | -0.214760 |
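Both methods return 5 rows when called without arguments, and both accept a row count. A small sketch, assuming the same 100-row frame of random numbers as above:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(100, 4))

# Pass a row count to override the default of 5.
first3 = df.head(3)
last2 = df.tail(2)

print(first3.index.tolist())  # → [0, 1, 2]
print(last2.index.tolist())   # → [98, 99]
```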
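Joining head and tail
That alone is not enough, though: I often want to check the top and the bottom of the data at the same time. It would be handy if head and tail could be printed together...
Someone had already done it:
Stack Overflow - python pandas select both head and tail
pd.concat([df.head(), df.tail()])
joins head and tail for you. (The original form, df.head().append(df.tail()), no longer works because DataFrame.append was removed in pandas 2.0.)
pd.concat([df.head(), df.tail()])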
| | 0 | 1 | 2 | 3 |
---|---|---|---|---|
0 | 0.044887 | 1.109229 | -1.404712 | 0.014551 |
1 | -2.032691 | -0.435130 | -0.428953 | -0.537191 |
2 | -0.777178 | -0.435460 | 0.848413 | -1.635667 |
3 | -0.213586 | -1.509976 | -0.635302 | -0.138209 |
4 | 0.639734 | 0.446097 | -1.312515 | 0.783796 |
95 | -0.288322 | 0.132045 | -0.405144 | -0.542612 |
96 | -0.631888 | -0.487683 | 0.684156 | -0.114015 |
97 | -0.405025 | 1.719596 | 0.451788 | 0.930674 |
98 | 0.146326 | 1.032291 | -0.474848 | 0.847260 |
99 | 0.473070 | 0.408938 | -0.504452 | -0.214760 |
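In the joined output the jump from row 4 to row 95 is easy to miss. One possible variation, not part of the original post, is to pass the keys argument of pd.concat so each row is labeled with the part it came from. The labels "head" and "tail" are arbitrary choices.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(100, 4))

# keys adds an outer index level that names the source of each row.
both = pd.concat([df.head(), df.tail()], keys=["head", "tail"])

# The outer level holds the labels, the inner level the original row numbers.
print(both.index.get_level_values(0).tolist())
print(both.index.get_level_values(1).tolist())
```

The price is a two-level index, so for a quick look the plain concat is probably enough.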
Shortening what you type for the joined head/tail
Register it as a method using a lambda expression.
I named the method less, but pick any name you like, then save the script under ~/.ipython/profile_default/startup so that it is always available.
pd.DataFrame.less = lambda self, n=10: pd.concat([self.head(n // 2), self.tail(n // 2)])
df.less()
| | 0 | 1 | 2 | 3 |
---|---|---|---|---|
0 | 0.044887 | 1.109229 | -1.404712 | 0.014551 |
1 | -2.032691 | -0.435130 | -0.428953 | -0.537191 |
2 | -0.777178 | -0.435460 | 0.848413 | -1.635667 |
3 | -0.213586 | -1.509976 | -0.635302 | -0.138209 |
4 | 0.639734 | 0.446097 | -1.312515 | 0.783796 |
95 | -0.288322 | 0.132045 | -0.405144 | -0.542612 |
96 | -0.631888 | -0.487683 | 0.684156 | -0.114015 |
97 | -0.405025 | 1.719596 | 0.451788 | 0.930674 |
98 | 0.146326 | 1.032291 | -0.474848 | 0.847260 |
99 | 0.473070 | 0.408938 | -0.504452 | -0.214760 |
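As a sketch of what such a startup script might look like: the file name 00-pandas-less.py and the helper name _less are hypothetical, and a def is used instead of a lambda only so the helper can carry a docstring. IPython runs the .py files in the startup directory when a session starts.

```python
# Hypothetical file: ~/.ipython/profile_default/startup/00-pandas-less.py
import pandas as pd


def _less(self, n=10):
    """Return the first and last n // 2 rows of the frame."""
    half = n // 2
    return pd.concat([self.head(half), self.tail(half)])


# Attach the helper so every DataFrame gains a .less() method.
pd.DataFrame.less = _less
```

After this runs, df.less() and df.less(2) should behave like the lambda version.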
Passing an argument changes the number of rows printed.
df.less(2)
| | 0 | 1 | 2 | 3 |
---|---|---|---|---|
0 | 0.044887 | 1.109229 | -1.404712 | 0.014551 |
99 | 0.473070 | 0.408938 | -0.504452 | -0.214760 |
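Two edge cases are worth knowing about. With an odd n, n // 2 rounds down, so less(3) returns only 2 rows. With a frame shorter than n, head and tail overlap and rows appear twice. The function below is one possible way to handle both; the name peek is hypothetical and it is written as a plain function rather than a method.

```python
import numpy as np
import pandas as pd


def peek(df, n=10):
    """Return about n rows: the first ceil(n/2) and the last floor(n/2)."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if len(df) <= n:
        return df  # short frame: show it whole instead of duplicating rows
    top, bottom = (n + 1) // 2, n // 2
    parts = [df.head(top)]
    if bottom:  # skip the empty tail when n is 0 or 1
        parts.append(df.tail(bottom))
    return pd.concat(parts)


df = pd.DataFrame(np.random.randn(100, 4))
small = df.head(4)

print(peek(df, 3).index.tolist())   # → [0, 1, 99]
print(peek(small).index.tolist())   # → [0, 1, 2, 3]

# For comparison, the simple version repeats rows on a short frame.
naive = pd.concat([small.head(5), small.tail(5)])
print(len(naive))                   # → 8
```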
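When writing a Jupyter notebook article for other people to read, it is better to change the output for the whole notebook with set_option, which readers can easily look up, or to narrow the output with the loc and iloc indexers (ix was removed in pandas 1.0). For drafts, or when trimming the output temporarily in IPython, combining tail and head may be the more effective approach.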
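For reference, a short sketch of narrowing the output with the indexers instead. iloc selects by position and loc by label; with the default integer index used here the two look alike, except that a loc slice includes its end label. The variable names are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(100, 4))

first_five = df.iloc[:5]        # positions 0 to 4, end excluded
labelled = df.loc[10:14]        # labels 10 to 14, end included
ends = df.iloc[[0, 1, -2, -1]]  # chosen positions from both ends

print(first_five.index.tolist())  # → [0, 1, 2, 3, 4]
print(labelled.index.tolist())    # → [10, 11, 12, 13, 14]
print(ends.index.tolist())        # → [0, 1, 98, 99]
```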