
A script that automatically inserts TeX \index entries right after specified words

Last updated at 2015-07-15. Posted at 2015-07-11.

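Adding index entries for mendex to a TeX document by hand is tedious, so I looked for a way to automate it. As far as I could tell from what I read, there is no ready-to-use method, so I wrote a simple script in Ruby.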

Usage example

Preparation

First, prepare a TeX document that has no index entries yet.

main.tex

\documentclass[11pt,a4paper]{jsarticle}
\usepackage{makeidx}
\makeindex
\begin{document}
化学ポテンシャルは熱力学で用いられる示強性状態量の一つである。化学ポテンシャルは、アメリカの化学者ウィラード・ギブズにより導入された概念である。 化学ポテンシャルは、物質の多寡により系が潜在的に持つエネルギーの大きさの尺度となる量である。 例えば、半透膜で隔てられた二つの系の間に濃度差が有った場合、浸透圧が生じ仕事を為す事が出来る。物質が存在することにより系は潜在的にエネルギーを持つ。 その系に含まれるある成分の単位物質量あたりのギブスエネルギーがその成分の化学ポテンシャルに相当する。
示強性状態量である化学ポテンシャルと示量性状態量である物質量は互いに共役
な関係であり、掛け合わせるとエネルギーの次元となる。
\printindex
\end{document}

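Next, make a list of the words you want to index, one per line. In the reading@word form, the part after @ is the word that is searched for in the text.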

indexlist

しきょうせいじょうたいりょう@示強性状態量
しりょうせいじょうたいりょう@示量性状態量
しんとうあつ@浸透圧
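
As a rough sketch of how one line of this list maps to the text that replaces the word in the document: the helper name `index_command` below is hypothetical and only mirrors what the script appears to do with each entry.

```ruby
# Hypothetical helper (not part of addindex.rb): build the text that replaces
# one occurrence of a word, given a single line of the index list.
def index_command(entry)
  entry = entry.chomp
  # "reading@word": the word is the part after "@"; otherwise the whole line.
  word = entry.include?("@") ? entry.split("@", 2)[1] : entry
  word + '\index{' + entry + '}'
end

puts index_command("しんとうあつ@浸透圧")
puts index_command("entropy")
```

The whole line, reading included, goes inside \index{}, while only the word itself is looked up in the text.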

After that, get addindex.rb (shown below) ready, and the preparation is complete.

Running

To add the index entries, run addindex.rb with the TeX file as the first argument and the list of words to index as the second, as in the command below.
$ ruby addindex.rb main.tex indexlist
This rewrites main.tex as follows.

main.tex

\documentclass[11pt,a4paper]{jsarticle}
\usepackage{makeidx}
\makeindex
\begin{document}
化学ポテンシャルは熱力学で用いられる示強性状態量\index{しきょうせいじょうたいりょう@示強性状態量}の一つである。化学ポテンシャルは、アメリカの化学者ウィラード・ギブズにより導入された概念である。 化学ポテンシャルは、物質の多寡により系が潜在的に持つエネルギーの大きさの尺度となる量である。 例えば、半透膜で隔てられた二つの系の間に濃度差が有った場合、浸透圧\index{しんとうあつ@浸透圧}が生じ仕事を為す事が出来る。物質が存在することにより系は潜在的にエネルギーを持つ。 その系に含まれるある成分の単位物質量あたりのギブスエネルギーがその成分の化学ポテンシャルに相当する。
示強性状態量\index{しきょうせいじょうたいりょう@示強性状態量}である化学ポテンシャルと示量性状態量\index{しりょうせいじょうたいりょう@示量性状態量}である物質量は互いに共役
な関係であり、掛け合わせるとエネルギーの次元となる。
\printindex
\end{document}

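main.tex is rewritten as shown above. In other words, whenever a word listed in indexlist is found in the body text, an \index{} command is inserted right after it. The script takes care not to attach \index{} to TeX commands or to file names that appear in the source. As an additional convenience, child files pulled in with \include are also edited automatically.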
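
The core idea can be sketched in a few lines. This is a simplified illustration, not the script itself: it assumes braces are never nested (the script handles nesting), and the names `PROTECTED` and `add_index` are made up for this example.

```ruby
# Simplified sketch: protect \command{...}, {...} and $...$ spans, and insert
# \index{} only in the text between them.
# Assumption: braces are not nested.
PROTECTED = /\\\w*\{[^{}]*\}|\{[^{}]*\}|\$[^$]*\$/

def add_index(text, entries)
  # entries maps each word to its replacement, e.g. "word" => "word\index{word}".
  # Longer words are tried first so they are not shadowed by shorter ones.
  words = Regexp.union(entries.keys.sort_by { |w| -w.length })
  # Splitting on a capturing pattern keeps the protected spans in the result:
  # even positions hold plain text, odd positions hold protected spans.
  text.split(/(#{PROTECTED})/).each_with_index.map do |part, i|
    i.even? ? part.gsub(words, entries) : part
  end.join
end

entries = { "entropy" => 'entropy\index{entropy}' }
puts add_index('\label{entropy} The entropy $entropy$ grows.', entries)
```

Only the occurrence in running text receives an \index{}; the ones inside \label{...} and $...$ are left alone.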

Be sure to back up your TeX documents before running this script.
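
One possible way to take that backup from Ruby is sketched below. The helper name `backup` and the ".bak" suffix are arbitrary choices for this example, not something the script provides.

```ruby
require "fileutils"

# Hypothetical helper: copy each existing file to "<name>.bak" next to the
# original and return the list of backup paths that were written.
def backup(paths)
  paths.select { |path| File.file?(path) }.map do |path|
    backup_path = path + ".bak"
    FileUtils.cp(path, backup_path)
    backup_path
  end
end
```

Calling `backup(["main.tex"])` before running the script would leave main.tex.bak beside the original. Any child files pulled in with \include would need to be listed as well, since the script edits them too.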

Contents of addindex.rb (updated 2015-07-15)

The script simply subclasses String to make a TeX class and defines a handful of methods on it. Paste it into a text editor and save it under any file name (it does not have to be addindex.rb).
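
For readers unfamiliar with the pattern, here is a stripped-down illustration of a String subclass whose methods inspect the TeX source it holds. The class name `MiniTeX` and its method names are invented for this sketch and are not the script's actual class.

```ruby
# A stripped-down illustration of the pattern, not the script's actual class.
class MiniTeX < String
  # True when the text contains \begin{document}, i.e. it is a top-level file.
  def parent?
    include?('\begin{document}')
  end

  # File names referenced by \include{...}, ignoring anything after a "%".
  def children
    lines.flat_map do |line|
      line.sub(/(?<!\\)%.*/m, "").scan(/\\include\{(.+?)\}/).flatten
    end.map { |name| name + ".tex" }
  end
end

doc = MiniTeX.new("\\begin{document}\n\\include{intro}\n% \\include{old}\n\\end{document}\n")
p doc.parent?
p doc.children
```

Because the object is itself a string, all of String's own methods (`lines`, `gsub`, `replace`, and so on) are available inside these methods without any wrapping.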

addindex.rb

#!/usr/bin/ruby
# encoding: utf-8

class Operand < Hash
  def initialize(str)
    super() # plain Hash; do not pass str on as the default value
    @str = str
    replace(generate)
  end

  private
  
  def generate
    # Tag each [idx, len] range: false = leave untouched, true = may receive \index{}
    all_blocks = escape_blocks.product([false]) + non_escape_blocks.product([true])
    all_blocks = all_blocks.sort_by(&:first)
    raise ArgumentError, "blocks do not cover the whole text" unless valid?(all_blocks)
    all_blocks.to_h
  end

  # True when the blocks are contiguous and end exactly at the end of the string.
  def valid?(blocks)
    blocks.each_cons(2) do |h1, h2|
      return false if h1[0][0] + h1[0][1] != h2[0][0]
    end
    blocks[-1][0][0] + blocks[-1][0][1] == @str.length
  end

  def escape_blocks # escape range defined by [idx, len]
    escape = [[0,0]]
    begin
      escape_each = minimum_match(@str[escape[-1][0]+escape[-1][1],@str.length])
      break if escape_each.compact.size == 0
      escape_each[0] = escape_each[0] + escape[-1][0]+escape[-1][1]
      escape << escape_each
    end while escape[-1][0]+escape[-1][1] < @str.length 
    escape.shift
    escape
  end
  
  def non_escape_blocks # non escape range defined by [idx, len]
    escape = escape_blocks
    # No escaped region at all: the whole string is a single non-escape block.
    return [[0, @str.length]] if escape.empty?
    non_escape = [[0, escape[0][0]]]
    escape.each_cons(2) do |h1, h2|
      non_escape << [h1[0] + h1[1], h2[0] - (h1[0] + h1[1])]
    end
    non_escape << [escape[-1][0] + escape[-1][1], @str.length - (escape[-1][0] + escape[-1][1])]
    non_escape
  end

  # Returns [idx, len] of the escape pattern that matches earliest in str,
  # or [nil, nil] when none of the patterns match.
  def minimum_match(str)
    rx_escape1 = /\\\w*?(?<paren>\{(?:[^\{\}]|\g<paren>)*\})/ # escape for \***{***}
    rx_escape2 = /[\$].*?[\$]/ # escape for $***$
    rx_escape3 = /(?<paren>\{(?:[^\{\}]|\g<paren>)*\})/ # escape for {}
    matches = [rx_escape1, rx_escape2, rx_escape3].map { |rx| match_idx_len(rx, str) }
    matches.reject { |idx, _len| idx.nil? }.min_by(&:first) || [nil, nil]
  end

  # Returns [idx, len] of the first match of rx in str, or [nil, nil].
  def match_idx_len(rx, str)
    m = rx.match(str)
    m ? [m.begin(0), m[0].length] : [nil, nil]
  end
end

class TeX < String
  def initialize(str)
    super
  end

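  def child
    # Collect the files named by \include{...}, ignoring anything after a "%" comment sign.
    lines.flat_map do |line|
      line.sub(/(?<!\\)%.*/m, "").scan(/\\include\{(.+?)\}/).flatten.map { |name| name + ".tex" }
    end
  end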

  def parent?
    match(/\\begin\{document\}/)
  end

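  def indexing!(index)
    list = index_entries(index)
    revert_indexing!(index) # remove entries added by an earlier run first
    # Try longer words first so that a word containing a shorter entry is matched whole.
    rx_adder = Regexp.union(list.keys.sort_by { |word| -word.length })
    result = Operand.new(self).map do |(idx, len), indexable|
      block = self[idx, len]
      indexable ? block.gsub(rx_adder, list) : block
    end.join
    replace(result)
  end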

  def revert_indexing!(index)
    list = index_entries(index)
    rx_subtracter = Regexp.union(list.values)
    gsub!(rx_subtracter, list.invert)
  end

  private
  
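  def index_entries(str)
    # Map each word to "word\index{entry}". Blank lines are skipped because an
    # empty word would match at every position in the text.
    str.lines.map(&:chomp).reject(&:empty?).map do |entry|
      word = entry.include?("@") ? entry.split("@")[1] : entry
      [word, word + '\index{' + entry + '}']
    end.to_h
  end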
end

# load TeX file
tex_file = File.open(ARGV[0], "r+:UTF-8")
tex = TeX.new(tex_file.read)
# load index description file
index = File.read(ARGV[1], encoding: "UTF-8")

# For main TeX file
tex.indexing!(index)
tex_file.rewind
tex_file.write(tex)
tex_file.truncate(tex_file.pos) # drop leftover bytes if the new text is shorter
tex_file.close

# For child TeX file(s)
if tex.parent?
  print "\n===CHILDREN===\n"
  tex.child.each do |file_path|
    full_file_path = File.join(File.dirname(ARGV[0]), file_path)
    next unless File.exist?(full_file_path)
    puts "adding indices in " + full_file_path
    File.open(full_file_path, "r+:UTF-8") do |tex_file|
      tex_child = TeX.new(tex_file.read)
      tex_child.indexing!(index)
      tex_file.rewind
      tex_file.write(tex_child)
      tex_file.truncate(tex_file.pos) # drop leftover bytes if the new text is shorter
    end
  end
end
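
The brace patterns in `minimum_match` rely on a named group that calls itself through `\g<paren>`, which is what lets them span nested braces. A small sketch of that behaviour in isolation follows; the constant name `BRACES` and the sample string are only for illustration.

```ruby
# Matches a balanced {...} group: the named group "paren" refers to itself
# via \g<paren>, so inner brace pairs are consumed recursively.
BRACES = /(?<paren>\{(?:[^{}]|\g<paren>)*\})/

source = 'see \textbf{a {nested} group} and {another}'
match = BRACES.match(source)
p match[0]                    # the first balanced group, inner braces included
p match.begin(0)              # its character offset in the source
p source.scan(BRACES).flatten # every top-level balanced group
```

A non-recursive pattern such as `\{[^{}]*\}` would instead stop at the innermost pair, so a word inside an outer group but outside the inner one would not be protected.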
