Logical pages in the nl command

Posted at 2020-12-08, last updated at 2020-12-14

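The nl command has a concept called logical pages. If all you want is to add line numbers to text, though, you can ignore it, and in everyday use of nl pages rarely come to mind. This article explains this very low-profile feature.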

How logical pages are used

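In short, nl interprets the data it reads as "pages", each made up of a header, a body, and a footer. This feature makes it possible to **number only some of the lines**. Pages can of course follow one another to form a larger piece of data such as a document, in which case the line numbers are reset on each page.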

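Each section within a page is introduced by a delimiter line placed at its start: one of the three strings \:\:\: (header), \:\: (body), or \: (footer). By default, header and footer lines are not numbered. Data that contains no delimiter strings is interpreted entirely as a single body, and in practice that is how nl is used most of the time.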
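As a minimal sketch of the layout described above, the following builds a one-page input with `printf` and keeps only the lines that `nl` numbered. The sample text (`HEADER`, `alpha`, `beta`, `FOOTER`) and the variable names are made up for illustration, and the filter assumes the default numbering styles, where header and footer lines get no number.

```shell
# Hypothetical one-page input: header, body and footer sections.
page=$(printf '%s\n' '\:\:\:' 'HEADER' '\:\:' 'alpha' 'beta' '\:' 'FOOTER')

# Number the page, then keep only the lines that start with a line number.
# With the default styles, only body lines should survive this filter.
numbered=$(printf '%s\n' "$page" | nl | grep -E '^ *[0-9]+')

printf '%s\n' "$numbered"
```

Only the two body lines (`alpha` and `beta`) should be printed, each with its line number.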

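Logical pages may not have much presence as a feature, but each nl manual devotes a fair amount of text to them. Most readers probably skim past that text, so let's take another look. The descriptions are nearly equivalent, but one point worth noting is that the coreutils manual states explicitly that a section delimiter is replaced by an empty line. Also, busybox's nl does not support logical pages.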

info nl (GNU coreutils)

   ‘nl’ decomposes its input into (logical) page sections; by default,
the line number is reset to 1 at each logical page section.  ‘nl’ treats
all of the input files as a single document; it does not reset line
numbers or logical pages between files.

   A logical page consists of three sections: header, body, and footer.
Any of the sections can be empty.  Each can be numbered in a different
style from the others.

   The beginnings of the sections of logical pages are indicated in the
input file by a line containing exactly one of these delimiter strings:

‘\:\:\:’
     start of header;
‘\:\:’
     start of body;
‘\:’
     start of footer.

   The two characters from which these strings are made can be changed
from ‘\’ and ‘:’ via options (see below), but the pattern and length of
each string cannot be changed.

   A section delimiter is replaced by an empty line on output.  Any text
that comes before the first section delimiter string in the input file
is considered to be part of a body section, so ‘nl’ treats a file that
contains no section delimiters as a single body section.
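The `-d` option referred to in the text above can be tried with a small sketch like the one below. The replacement pair `@#` and the sample lines are arbitrary choices for illustration; with that pair, the delimiter lines become `@#@#@#`, `@#@#`, and `@#`.

```shell
# Hypothetical input that uses '@#' in place of the default '\:' pair.
doc=$(printf '%s\n' '@#@#@#' 'TITLE' '@#@#' 'one' 'two' '@#' 'END')

# Without -d, the '@#...' lines are ordinary text, so every line is numbered.
plain_count=$(printf '%s\n' "$doc" | nl | grep -c -E '^ *[0-9]+')

# With -d '@#', they are section delimiters, so only the body is numbered.
paged_count=$(printf '%s\n' "$doc" | nl -d '@#' | grep -c -E '^ *[0-9]+')

printf 'numbered without -d: %s, with -d: %s\n' "$plain_count" "$paged_count"
```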

man nl (BSD)

     The nl utility treats the text it reads in terms of logical pages.  Unless
     specified otherwise, line numbering is reset at the start of each logical
     page.  A logical page consists of a header, a body and a footer section;
     empty sections are valid.  Different line numbering options are independently
     available for header, body and footer sections.

     The starts of logical page sections are signalled by input lines containing
     nothing but one of the following sequences of delimiter characters:

           Line      Start of
           \:\:\:    header
           \:\:      body
           \:        footer

     If the input does not contain any logical page section signalling directives,
     the text being read is assumed to consist of a single logical page body.

POSIX

The nl utility views the text it reads in terms of logical pages. Line numbering shall be reset at the start of each logical page. A logical page consists of a header, a body, and a footer section. Empty sections are valid. Different line numbering options are independently available for header, body, and footer (for example, no numbering of header and footer lines while numbering blank lines only in the body).

The starts of logical page sections shall be signaled by input lines containing nothing but the following delimiter characters:

|Line|Start of|
|:--|:--|
|`\:\:\:`|Header|
|`\:\:`|Body|
|`\:`|Footer|

Unless otherwise specified, nl shall assume the text being read is in a single logical page body.
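To illustrate the independently available numbering styles mentioned in the quoted text, here is a small sketch that turns on numbering for header and footer lines with `-h a` and `-f a`. The sample input is invented, and the sketch only counts numbered lines, because the numbers themselves depend on where a given implementation resets the counter.

```shell
# Hypothetical one-page input: header, body and footer sections.
page=$(printf '%s\n' '\:\:\:' 'HEADER' '\:\:' 'alpha' 'beta' '\:' 'FOOTER')

# Default styles (-h n -b t -f n): only the two body lines are numbered.
default_count=$(printf '%s\n' "$page" | nl | grep -c -E '^ *[0-9]+')

# -h a and -f a number every header and footer line as well.
all_count=$(printf '%s\n' "$page" | nl -h a -f a | grep -c -E '^ *[0-9]+')

printf 'numbered lines: default=%s, with -h a -f a=%s\n' "$default_count" "$all_count"
```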

Trying out logical pages

Displaying pages

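Let's take the data file "utl-kita" used since the first article in this series and edit it slightly: page 1 becomes the Tokaido Line (東海道線), page 2 becomes the Utsunomiya Line and Takasaki Line (宇都宮線・高崎線), and a header and footer are added to each. The result is saved as "utl-doc" and used for the tests below.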

coreutils

$ cat utl-doc
\:\:\:
【東海道線（JT）】
\:\:
東京
新橋
品川
川崎
横浜
戸塚
大船
\:
小田原方面へ直通
\:\:\:
【宇都宮線・高崎線（JU）】
\:\:
東京
上野
尾久
赤羽
浦和
さいたま新都心
大宮
\:
宇都宮線: 宇都宮方面へ直通
高崎線: 高崎方面へ直通

$ nl utl-doc

       【東海道線（JT）】

     1	東京
     2	新橋
     3	品川
     4	川崎
     5	横浜
     6	戸塚
     7	大船

       小田原方面へ直通

       【宇都宮線・高崎線（JU）】

     1	東京
     2	上野
     3	尾久
     4	赤羽
     5	浦和
     6	さいたま新都心
     7	大宮

       宇都宮線: 宇都宮方面へ直通
       高崎線: 高崎方面へ直通

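With coreutils nl, the section delimiters are replaced by empty lines, just as the manual says. With BSD nl, by contrast, the delimiters appear to be removed rather than converted into empty lines.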

BSD

$ nl utl-doc
      	【東海道線（JT）】
     1	東京
     2	新橋
     3	品川
     4	川崎
     5	横浜
     6	戸塚
     7	大船
      	小田原方面へ直通
      	【宇都宮線・高崎線（JU）】
     1	東京
     2	上野
     3	尾久
     4	赤羽
     5	浦和
     6	さいたま新都心
     7	大宮
      	宇都宮線: 宇都宮方面へ直通
      	高崎線: 高崎方面へ直通

Resetting the numbering

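Let's look at a simpler use case. Following the previous example, create a data file "utl-doc2" that is split by railway line. Using the body delimiter string \:\: lets you reset the numbering partway through the data.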

$ cat utl-doc2 
東京
新橋
品川
川崎
横浜
戸塚
大船
\:\:
東京
上野
尾久
赤羽
浦和
さいたま新都心
大宮
$ nl utl-doc2 
     1	東京
     2	新橋
     3	品川
     4	川崎
     5	横浜
     6	戸塚
     7	大船

     1	東京
     2	上野
     3	尾久
     4	赤羽
     5	浦和
     6	さいたま新都心
     7	大宮

If this does not work in your environment, please see this article as well, although reading it will not solve the problem.
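When the reset is not wanted, `nl` also has a `-p` option that keeps the numbering running across logical page delimiters. A minimal sketch, using made-up sample lines instead of the station data:

```shell
# Hypothetical two-part input separated by a body delimiter.
doc=$(printf '%s\n' 'a' 'b' '\:\:' 'c' 'd')

# -p tells nl not to restart numbering at logical page delimiters.
# awk 'NF' skips blank lines and prints the first field (the line number).
numbers=$(printf '%s\n' "$doc" | nl -p | awk 'NF {print $1}' | paste -sd ' ' -)

printf '%s\n' "$numbers"
```

With `-p`, the four text lines should be numbered 1 through 4 regardless of the delimiter in the middle.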

Empty lines converted from delimiter strings by coreutils `nl`

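The previous article also explained that nl does not number empty input lines, yet those lines are no longer empty once they have been output. So is an "empty line converted from a delimiter string" a truly empty line? Let's verify. By applying this technique of stacking nl on top of nl, nl itself can be used to tell whether a line is empty.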
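One way to set up that check is sketched below. The sample input is made up, and the sketch assumes the default `-b t` style, under which only non-empty lines receive a number. A line left unnumbered by the second pass must therefore have been truly empty in the output of the first pass.

```shell
# Hypothetical input: a body line, a body delimiter, an empty line, a body line.
input=$(printf '%s\n' 'a' '\:\:' '' 'b')

# First pass: nl numbers the non-empty body lines.
first=$(printf '%s\n' "$input" | nl)

# Second pass: only lines that are truly empty in the first pass's output
# stay unnumbered under the default -b t style.
second=$(printf '%s\n' "$first" | nl)

# Count the lines of the second pass with and without a line number.
numbered=$(printf '%s\n' "$second" | grep -c -E '^ *[0-9]+' || true)
unnumbered=$(printf '%s\n' "$second" | grep -c -v -E '^ *[0-9]+' || true)

printf 'second pass: numbered=%s, unnumbered=%s\n' "$numbered" "$unnumbered"
```

The empty input line is padded by the first pass, so it is numbered by the second pass along with `a` and `b`. An unnumbered count of 1 would indicate that the line converted from the delimiter was truly empty.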