1. kenbeese

    Mentioned cmigemo

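Addendum

@syohex has since fixed cmigemo, so the latest cmigemo can search even when the text contains line breaks. It is fast and easy to install, so I think you are better off using cmigemo.
Reference page: "Emacs migemo users should use the latest cmigemo, I think" (Emacsのmigemoユーザーは最新版のcmigemo使った方がいいと思うんだ)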

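Introduction

When I write Japanese in Emacs I break lines wherever it seems convenient,
so I had long wanted to use the Ruby version of migemo, which can search across line breaks.
Since Ruby 1.9 it no longer installs easily, so these are my notes on getting it to work.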

Steps

Download romkan.rb and bsearch.rb from http://0xcc.net/migemo/
and add the following line at the top of each file.

# -*- encoding:euc-jp -*-
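
Adding the line by hand is fine, but the step can also be scripted. Below is a rough sketch; the helper name prepend_magic_comment is my own invention (it is not part of migemo), and it assumes the two files were downloaded into the current directory. The files are read and written as raw bytes so that their EUC-JP content is not touched.

```ruby
# Hypothetical helper (not part of migemo): prepend the encoding magic
# comment to a Ruby source file unless it is already the first line.
MAGIC_COMMENT = "# -*- encoding:euc-jp -*-\n".b

def prepend_magic_comment(path)
  source = File.binread(path)   # raw bytes, no transcoding
  return false if source.start_with?(MAGIC_COMMENT)

  File.binwrite(path, MAGIC_COMMENT + source)
  true
end

# Assumption: romkan.rb and bsearch.rb sit in the current directory.
%w[romkan.rb bsearch.rb].each do |file|
  next unless File.exist?(file)

  status = prepend_magic_comment(file) ? "added" : "already present"
  puts "#{file}: #{status}"
end
```

Running it a second time should leave the files unchanged, since the helper checks for the comment first.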

Then put the files somewhere on Ruby's LOAD_PATH.
In my case I put them in usr/lib/ruby/2.0.0/site_ruby
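
If you are not sure which directories are on the load path of your Ruby, something like the following should list them. This is only a small sketch using the standard rbconfig library; the directory it prints depends on how your Ruby was built and may well differ from mine.

```ruby
require "rbconfig"

# Directory where site-local libraries for this Ruby are expected to live.
site_dir = RbConfig::CONFIG["sitelibdir"]
puts "sitelibdir: #{site_dir}"

# Every directory that `require` searches, in order.
puts $LOAD_PATH
```

Any directory printed in the second list should work as a place for romkan.rb and bsearch.rb.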

Grab migemo-0.40.tar.gz, then grab the Fedora patches, which seem to work fine, and apply them!

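tar xvzf migemo-0.40.tar.gz
cd migemo-0.40
# -p1 strips the leading directory from the paths in the patches,
# so that tests/Makefile.in is found as well
patch -p1 < ../patch1
patch -p1 < ../patch2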

For reference, the two patches are as follows.

patch1:
--- migemo-0.40/migemo-dict.rb.bz830559 2012-06-11 23:17:27.000000000 +0900
+++ migemo-0.40/migemo-dict.rb  2012-06-11 23:18:06.000000000 +0900
@@ -39,7 +39,7 @@

   private
   def decompose (line)
-    array = line.chomp.split("\t").delete_if do |x| x == nil end
+    array = line.chomp.force_encoding("EUC-JP").split("\t").delete_if do |x| x == nil end
     key = array.shift
     values = array
     raise if key == nil
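
As far as I understand it, patch1 tags each dictionary line as EUC-JP before splitting it. The snippet below is only my own illustration of what force_encoding does, not code from migemo. The sample line is made up, and it assumes Ruby 2.0 or later, where source files default to UTF-8.

```ruby
# One dictionary-style line ("reading<TAB>word"), built here as raw EUC-JP
# bytes to stand in for a line read from the migemo dictionary file.
raw = "けんさく\t検索\n".encode("EUC-JP").b

# Tagged as UTF-8, the very same bytes do not form valid characters.
as_utf8 = raw.dup.force_encoding("UTF-8")
puts as_utf8.valid_encoding?   # false

# force_encoding only changes the label on the string; the bytes stay as is.
as_euc = raw.dup.force_encoding("EUC-JP")
puts as_euc.valid_encoding?    # true

# With the right label, splitting on the tab yields key and value as text.
fields = as_euc.chomp.split("\t")
puts fields.length               # 2
puts fields[1].encode("UTF-8")   # 検索
```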

patch2:
diff -ur migemo-0.40/genchars.sh migemo-0.40-1.9.1/genchars.sh
--- migemo-0.40/genchars.sh 2001-08-13 18:30:48.000000000 +0900
+++ migemo-0.40-1.9.1/genchars.sh   2010-09-24 00:32:26.000000000 +0900
@@ -1,6 +1,6 @@
 #! /bin/sh

-ruby -rromkan -nle 'head = split[0]; if /^\w+$/ =~ head then puts head else roma = head.to_roma; puts roma, roma.to_kunrei end' migemo-dict |uniq> tmp.ascii.words
+ruby -rromkan -nle 'head = $_.split[0]; if /^\w+$/ =~ head then puts head else roma = head.to_roma; puts roma, roma.to_kunrei end' migemo-dict |uniq> tmp.ascii.words

 # Get the top 500 frequent ngrams.
 for i in 1 2 3 4 5 6 7 8; do
diff -ur migemo-0.40/migemo migemo-0.40-1.9.1/migemo
--- migemo-0.40/migemo  2003-05-27 12:01:10.000000000 +0900
+++ migemo-0.40-1.9.1/migemo    2010-09-24 00:32:26.000000000 +0900
@@ -10,7 +10,6 @@
 # the GNU General Public License version 2.
 #

-$KCODE = "e"

 require 'migemo'
 require 'getoptlong'
diff -ur migemo-0.40/migemo-cache.rb migemo-0.40-1.9.1/migemo-cache.rb
--- migemo-0.40/migemo-cache.rb 2001-07-15 02:38:56.000000000 +0900
+++ migemo-0.40-1.9.1/migemo-cache.rb   2010-09-24 00:32:26.000000000 +0900
@@ -1,5 +1,4 @@
 require 'migemo'
-$KCODE="e"
 raise if ARGV[0] == nil
 dict = ARGV[0]
 static_dict = MigemoStaticDict.new(dict)
@@ -18,10 +17,10 @@
   migemo = Migemo.new(static_dict, pattern)
   migemo.optimization = 3
   data = Marshal.dump(migemo.regex_tree)
-  output = [pattern.length].pack("N") + pattern +
-    [data.length].pack("N") + data
+  output = [pattern.bytesize].pack("N") + pattern.dup.force_encoding("ASCII-8BIT") +
+    [data.bytesize].pack("N") + data
   cache.print output
   index.print [idx].pack("N")
-  idx += output.length
+  idx += output.bytesize
 end

diff -ur migemo-0.40/migemo-convert.rb migemo-0.40-1.9.1/migemo-convert.rb
--- migemo-0.40/migemo-convert.rb   2003-05-26 15:55:22.000000000 +0900
+++ migemo-0.40-1.9.1/migemo-convert.rb 2010-09-24 00:32:26.000000000 +0900
@@ -1,3 +1,4 @@
+# -*- encoding:euc-jp -*-
 #
 # Ruby/Migemo - a library for Japanese incremental search.
 #
diff -ur migemo-0.40/migemo-dict.rb migemo-0.40-1.9.1/migemo-dict.rb
--- migemo-0.40/migemo-dict.rb  2002-10-22 14:38:14.000000000 +0900
+++ migemo-0.40-1.9.1/migemo-dict.rb    2010-09-24 00:32:26.000000000 +0900
@@ -1,3 +1,4 @@
+# -*- encoding:euc-jp -*-
 #
 # Ruby/Migemo - a library for Japanese incremental search.
 #
@@ -122,8 +123,8 @@
   def lookup (pattern)
     raise if pattern == nil
     pattern = pattern.downcase
-    idx = @index.bsearch_first do |idx|
-      key, data = decompose(idx)
+    idx = @index.bsearch_first do |idx1|
+      key, data = decompose(idx1)
       key <=> pattern
     end
     if idx
diff -ur migemo-0.40/migemo-index.rb migemo-0.40-1.9.1/migemo-index.rb
--- migemo-0.40/migemo-index.rb 2003-05-26 15:45:53.000000000 +0900
+++ migemo-0.40-1.9.1/migemo-index.rb   2010-09-24 00:32:26.000000000 +0900
@@ -19,5 +19,5 @@
   unless line =~ /^;/
     print [offset].pack("N")
   end
-  offset += line.length
+  offset += line.bytesize
 end
diff -ur migemo-0.40/migemo.rb.in migemo-0.40-1.9.1/migemo.rb.in
--- migemo-0.40/migemo.rb.in    2003-05-28 21:00:52.000000000 +0900
+++ migemo-0.40-1.9.1/migemo.rb.in  2010-09-24 00:33:04.000000000 +0900
@@ -1,3 +1,4 @@
+# -*- encoding:euc-jp -*-
 #
 # Ruby/Migemo - a library for Japanese incremental search.
 #
@@ -14,7 +15,6 @@
 require 'migemo-dict'
 require 'migemo-regex'
 require 'romkan'
-require 'jcode'
 include MigemoRegex

 class String
@@ -24,7 +24,7 @@
   end

   def quotemeta
-    self.gsub(/([^ \w])/, '\\\\\\1')
+    self.gsub(/([[:punct:]])/, '\\\\\\1')
   end

   def first
@@ -177,7 +177,7 @@
     expand_kanas.each do |x|
       compiler.push(x)
       compiler.push(x.to_katakana)
-      expand_words(@static_dict, x).each do |x| compiler.push(x) end
+      expand_words(@static_dict, x).each do |y| compiler.push(y) end
     end
     expand_words(@static_dict, @pattern).each do |x| compiler.push(x) end
     compiler.uniq
@@ -188,7 +188,7 @@
   def lookup_user_dict
     compiler = RegexCompiler.new
     expand_kanas.each do |x|
-      expand_words(@user_dict, x).each do |x| compiler.push(x) end
+      expand_words(@user_dict, x).each do |y| compiler.push(y) end
     end
     expand_words(@user_dict, @pattern).each do |x| compiler.push(x) end
     compiler.uniq
diff -ur migemo-0.40/tests/Makefile.in migemo-0.40-1.9.1/tests/Makefile.in
--- migemo-0.40/tests/Makefile.in   2003-05-29 17:09:03.000000000 +0900
+++ migemo-0.40-1.9.1/tests/Makefile.in 2010-09-24 00:32:26.000000000 +0900
@@ -203,7 +203,7 @@
 test-dict.cache: test-dict test-dict.idx ../migemo-cache.rb
    ruby -rromkan -ne 'puts $$1.to_roma if /^(.+?)  /' test-dict |\
    while read line; do\
-       echo $$line | ruby -ne 'chomp!;1.upto($$_.length) do |x| puts $$_[0,x] end';\
+       echo $$line | ruby -ne '$$_.chomp!;1.upto($$_.length) do |x| puts $$_[0,x] end';\
    done | ruby -I.. ../migemo-cache.rb test-dict

 clean-local:
diff -ur migemo-0.40/migemo-regex.rb migemo-0.40-1.9.1/migemo-regex.rb
--- migemo-0.40/migemo-regex.rb
+++ migemo-0.40-1.9.1/migemo-regex.rb
@@ -1,3 +1,4 @@
+# -*- encoding:euc-jp -*-
 #
 # Ruby/Migemo - a library for Japanese incremental search.
 #
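
Several hunks in patch2 replace length with bytesize. My reading is that from Ruby 1.9 on, String#length counts characters rather than bytes, while the index and cache files store byte offsets. Here is a small sketch of the difference; the sample lines and the start_offsets helper are made up for illustration and are not taken from migemo.

```ruby
require "stringio"

# Two made-up dictionary lines, encoded the way the dictionary file is.
lines = ["あ\tア\n", "けんさく\t検索\n"].map { |s| s.encode("EUC-JP") }

# Start offset of each line, where the block decides how a line is measured.
# Mirrors `offset += line.length` (old) versus `offset += line.bytesize` (patched).
def start_offsets(lines)
  offset = 0
  lines.map do |line|
    start = offset
    offset += yield(line)
    start
  end
end

char_offsets = start_offsets(lines) { |l| l.length }
byte_offsets = start_offsets(lines) { |l| l.bytesize }
p char_offsets   # [0, 4]
p byte_offsets   # [0, 6]

# Seeking works on byte positions, so only the byte offset finds the line start.
io = StringIO.new(lines.map(&:b).join)
io.seek(byte_offsets[1])
second = io.gets.force_encoding("EUC-JP")
puts second == lines[1]   # true
```

With the character-based offset, the seek would land in the middle of the first line.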

Install with the following commands:

./configure
LANG=ja_JP.eucJP make
make install
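
To check that the libraries ended up somewhere Ruby can find them, a quick test along these lines might help. The loadable? helper is my own name, and the sketch only checks that require succeeds, not that migemo actually works.

```ruby
# Hypothetical helper: true if `require` can find and load the feature.
def loadable?(feature)
  require feature
  true
rescue LoadError
  false
end

%w[romkan bsearch migemo].each do |feature|
  status = loadable?(feature) ? "ok" : "not found on LOAD_PATH"
  puts "#{feature}: #{status}"
end
```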

With this, migemo search in Emacs works nicely!
It was probably this much trouble because I can't really read Ruby code... Maybe I'll take this as a chance to learn it...