
More than 5 years have passed since last update.

Listing Android N name candidates from Japanese WordNet #NameAndroidN

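  1. Listing food nouns that start with N should do the job.
  2. Come to think of it, there is this thing called WordNet. (I have never used it.)
  3. It lets you look up hypernyms and hyponyms, doesn't it?
  4. If walking up the hypernyms of a word reaches "food", the word should be a food.
  5. Lots of English words I don't know will probably show up, so I want the Japanese shown as well.
  6. So: Japanese WordNet.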
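The idea in step 4 can be sketched on a toy table before touching the real database. This is only an illustration: the `noodle` chain is copied from the results in this post, the `nail` chain and the names `HYPERNYMS` and `hypernym_reachable?` are made up for the example.

```ruby
require 'set'

# Toy hypernym table. The noodle chain mirrors a row from the results;
# the nail chain is hypothetical and only serves as a non-food example.
HYPERNYMS = {
  'noodle'           => ['alimentary_paste'],
  'alimentary_paste' => ['food'],
  'food'             => ['solid'],
  'solid'            => ['matter'],
  'nail'             => ['fastener'],
  'fastener'         => ['restraint']
}.freeze

# Returns true if `target` can be reached from `word` by following
# hypernym links upward. `seen` keeps the walk from revisiting a node.
def hypernym_reachable?(word, target, table = HYPERNYMS, seen = Set.new)
  return true if word == target
  return false unless seen.add?(word) # add? returns nil when already visited

  table.fetch(word, []).any? do |parent|
    hypernym_reachable?(parent, target, table, seen)
  end
end

puts hypernym_reachable?('noodle', 'food')
puts hypernym_reachable?('nail', 'food')
```

The script later in this post does not return a boolean like this; it prints the whole chain and leaves the "food" check to grep.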

Downloading the SQLite3 version of Japanese WordNet

Download the Japanese WordNet database in SQLite3 format.

wget http://nlpwww.nict.go.jp/wn-ja/data/1.1/wnjpn.db.gz
gunzip wnjpn.db.gz

Writing the Ruby script

NameAndroidN.rb
require 'sqlite3'

$db = SQLite3::Database.new('wnjpn.db')

# Extract the ids of noun synsets (pos == 'n') whose name starts with N
def get_name_android_n
  $db.execute("select * from synset where name like ?;", 'n%').select{ |synset, pos, name, src|
     pos == 'n'
  }.map{ |synset, pos, name, src|
    synset
  }.flatten
end

# Look up the name of a synset from its id
def get_name_from_synset(synset)
  $db.execute("select name from synset where synset in (?);", synset).flatten.uniq.first
end

# Get the list of synset ids that are hypernyms of the given synset id
def get_hype_synsets(synset)
  $db.execute("select synset2 from synlink where synset1 in (?) and link in ('hype');", synset).flatten.uniq
end

# Get the Japanese lemmas that belong to the given synset id
def get_jpn_lemmas(synset)
  ids = $db.execute("select wordid from sense where synset in (?) and lang in ('jpn');", synset).flatten
  ids.map{ |wordid|
    $db.execute("select lemma from word where wordid in (?);", wordid)
  }.flatten.uniq
end

# Print the Japanese lemmas separated by spaces, followed by a comma
def print_jpn_lemmas(synset)
  print get_jpn_lemmas(synset).join(' ')
  print ','
end

# Print the synset name, then recurse depth-first into every hypernym
def print_hypes(synset)
  print get_name_from_synset(synset)
  print ','
  get_hype_synsets(synset).each do |hype_synset|
    print_hypes(hype_synset)
  end
end

get_name_android_n.each do |synset|
  print_jpn_lemmas(synset)
  print_hypes(synset)
  puts
end
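Because `print_hypes` writes straight to standard output, the chain cannot be inspected inside Ruby. A variant that returns the names as an array might look like the sketch below. The name `hypernym_chain`, the two callable parameters, the synset ids `s1` to `s5`, and the shortened toy links are all assumptions made for illustration; they are not part of the original script.

```ruby
# Collects names along every hypernym path, depth-first, in the same order
# print_hypes prints them. `name_of` and `parents_of` are hypothetical
# callables standing in for get_name_from_synset and get_hype_synsets.
def hypernym_chain(synset, name_of, parents_of)
  [name_of.call(synset)] + parents_of.call(synset).flat_map { |parent|
    hypernym_chain(parent, name_of, parents_of)
  }
end

# Toy data with made-up synset ids (a simplified version of the "nan" row).
NAMES = {
  's1' => 'nan', 's2' => 'bread', 's3' => 'baked_goods',
  's4' => 'starches', 's5' => 'food'
}.freeze
LINKS = {
  's1' => ['s2'], 's2' => %w[s3 s4], 's3' => ['s5'], 's4' => ['s5']
}.freeze

chain = hypernym_chain('s1', NAMES.method(:fetch), ->(s) { LINKS.fetch(s, []) })
puts chain.join(',')
puts chain.include?('food')
```

With the database in place, the same function could presumably be called as `hypernym_chain(synset, method(:get_name_from_synset), method(:get_hype_synsets))`, and `chain.include?('food')` would take over the job of the grep step.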

Running it

ruby NameAndroidN.rb | grep food > NameAndroidN.csv
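Note that `grep food` matches substrings, so a row would also pass if it only contained something like `seafood` or `foodstuff`. In the results shown in this post every matched row also has `food` as a field of its own, so the outcome is the same, but a stricter check is easy to sketch. The function name `food_row?` and the second sample row are hypothetical.

```ruby
# Returns true only when one comma-separated field is exactly "food".
def food_row?(line)
  line.chomp.split(',').include?('food')
end

rows = [
  # Taken from the results in this post
  ',northern_lobster,lobster,shellfish,seafood,food,solid,matter,physical_entity,entity,',
  # Hypothetical row: contains the substring "food" but no "food" field
  ',hypothetical_item,seafood_fork,cutlery,entity,'
]

rows.each { |row| puts "#{food_row?(row)}: #{row}" }
```

The same test can be applied in the pipeline with `ruby NameAndroidN.rb | ruby -ne 'print if $_.chomp.split(",").include?("food")' > NameAndroidN.csv`.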

Results

NameAndroidN.csv
滋養分 養い 栄養分 滋養 栄養 養分 営養素 営養 栄養素 栄養物 精分,nutrition,food,substance,matter,physical_entity,entity,
ナツメグ ニクズク 肉荳蒄 ナツメッグ,nutmeg,spice,seasoner,fixings,foodstuff,food,substance,matter,physical_entity,entity,
,newburg_sauce,sauce,condiment,seasoner,fixings,foodstuff,food,substance,matter,physical_entity,entity,
,northern_spy,dessert_apple,apple,edible_fruit,green_goods,food,solid,matter,physical_entity,entity,fruit,reproductive_structure,plant_organ,plant_part,natural_object,whole,physical_object,physical_entity,entity,false_fruit,fruit,reproductive_structure,plant_organ,plant_part,natural_object,whole,physical_object,physical_entity,entity,
ボイルドディナー,new_england_boiled_dinner,dish,nutrition,food,substance,matter,physical_entity,entity,
大手亡,navy_bean,common_bean,edible_bean,legume,vegetable,green_goods,food,solid,matter,physical_entity,entity,
,nacho,tortilla_chip,corn_chip,snack_food,dish,nutrition,food,substance,matter,physical_entity,entity,
,northern_lobster,lobster,shellfish,seafood,food,solid,matter,physical_entity,entity,
,nutmeg_melon,sweet_melon,melon,edible_fruit,green_goods,food,solid,matter,physical_entity,entity,fruit,reproductive_structure,plant_organ,plant_part,natural_object,whole,physical_object,physical_entity,entity,
ニッパ,nipa,inebriant,drink,food,substance,matter,physical_entity,entity,liquid,fluid,substance,matter,physical_entity,entity,part,relation,abstract_entity,entity,street_drug,drug,agent,cause,physical_entity,entity,substance,matter,physical_entity,entity,
ナスタチウム 金蓮花,nasturtium,seasoner,fixings,foodstuff,food,substance,matter,physical_entity,entity,
ニコチン酸 ナイアシン,nicotinic_acid,vitamin_b_complex,water-soluble_vitamin,vitamin,nutrition,food,substance,matter,physical_entity,entity,
,nada_daiquiri,daiquiri,cocktail,mixed_drink,inebriant,drink,food,substance,matter,physical_entity,entity,liquid,fluid,substance,matter,physical_entity,entity,part,relation,abstract_entity,entity,street_drug,drug,agent,cause,physical_entity,entity,substance,matter,physical_entity,entity,
,nonpareil,confection,kickshaw,nutrition,food,substance,matter,physical_entity,entity,
,nonpareil,chocolate_candy,chocolate,food,solid,matter,physical_entity,entity,
ネッセルローデ,nesselrode,pudding,afters,course,nutrition,food,substance,matter,physical_entity,entity,
ネクタリン,nectarine,edible_fruit,green_goods,food,solid,matter,physical_entity,entity,fruit,reproductive_structure,plant_organ,plant_part,natural_object,whole,physical_object,physical_entity,entity,
,nut_bread,quick_bread,bread,baked_goods,food,solid,matter,physical_entity,entity,starches,foodstuff,food,substance,matter,physical_entity,entity,
,nova_salmon,lox,smoked_salmon,salmon,fish,food,solid,matter,physical_entity,entity,
,new_york_strip,beefsteak,steak,cut_of_meat,meat,food,solid,matter,physical_entity,entity,
ナフトキノン,naphthoquinone,fat-soluble_vitamin,vitamin,nutrition,food,substance,matter,physical_entity,entity,
ナン,nan,bread,baked_goods,food,solid,matter,physical_entity,entity,starches,foodstuff,food,substance,matter,physical_entity,entity,
,new_england_clam_chowder,clam_chowder,chowder,soup,dish,nutrition,food,substance,matter,physical_entity,entity,
甘露 ネクター,nectar,fruit_crush,drink,food,substance,matter,physical_entity,entity,liquid,fluid,substance,matter,physical_entity,entity,part,relation,abstract_entity,entity,
ヌガー,nougat,confect,confection,kickshaw,nutrition,food,substance,matter,physical_entity,entity,
,nut_butter,paste,condiment,seasoner,fixings,foodstuff,food,substance,matter,physical_entity,entity,
ネック 首,neck,cut_of_meat,meat,food,solid,matter,physical_entity,entity,
,negus,mulled_wine,vino,inebriant,drink,food,substance,matter,physical_entity,entity,liquid,fluid,substance,matter,physical_entity,entity,part,relation,abstract_entity,entity,street_drug,drug,agent,cause,physical_entity,entity,substance,matter,physical_entity,entity,
,napoleon,french_pastry,pastry,baked_goods,food,solid,matter,physical_entity,entity,
麺 ヌードル,noodle,alimentary_paste,food,solid,matter,physical_entity,entity,
,nosh,collation,repast,nutrition,food,substance,matter,physical_entity,entity,
ナポリタンアイスクリーム,neapolitan_ice_cream,ice_cream,frozen_dessert,afters,course,nutrition,food,substance,matter,physical_entity,entity,
,nut_bar,confect,confection,kickshaw,nutrition,food,substance,matter,physical_entity,entity,
,near_beer,drink,food,substance,matter,physical_entity,entity,liquid,fluid,substance,matter,physical_entity,entity,part,relation,abstract_entity,entity,
,nosh-up,repast,nutrition,food,substance,matter,physical_entity,entity,
,newtown_wonder,cooking_apple,apple,edible_fruit,green_goods,food,solid,matter,physical_entity,entity,fruit,reproductive_structure,plant_organ,plant_part,natural_object,whole,physical_object,physical_entity,entity,false_fruit,fruit,reproductive_structure,plant_organ,plant_part,natural_object,whole,physical_object,physical_entity,entity,
ネーブル ネーブルオレンジ,navel_orange,sweet_orange,orange,citrus_fruit,edible_fruit,green_goods,food,solid,matter,physical_entity,entity,fruit,reproductive_structure,plant_organ,plant_part,natural_object,whole,physical_object,physical_entity,entity,
,nonfat_dry_milk,dry_milk,milk,dairy_product,foodstuff,food,substance,matter,physical_entity,entity,drink,food,substance,matter,physical_entity,entity,liquid,fluid,substance,matter,physical_entity,entity,part,relation,abstract_entity,entity,
ヌガー,nougat_bar,confect,confection,kickshaw,nutrition,food,substance,matter,physical_entity,entity,
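Each row has the Japanese lemmas in the first field (empty when the synset has none), the candidate name in the second, and the hypernym chain after that. As a rough sketch for skimming the list, the rows can be split as follows. `parse_row` is a hypothetical helper, and it assumes no lemma contains a comma.

```ruby
# Splits one output row into Japanese lemmas, candidate name, and hypernym chain.
# Assumes lemmas never contain a comma (an assumption, not checked here).
def parse_row(line)
  lemmas, name, *chain = line.chomp.split(',')
  { lemmas: lemmas.to_s.split(' '), name: name, chain: chain }
end

# Two rows copied from the results above
rows = [
  '麺 ヌードル,noodle,alimentary_paste,food,solid,matter,physical_entity,entity,',
  ',nacho,tortilla_chip,corn_chip,snack_food,dish,nutrition,food,substance,matter,physical_entity,entity,'
]

rows.map { |row| parse_row(row) }.each do |row|
  label = row[:lemmas].empty? ? '(no Japanese lemma)' : row[:lemmas].join('/')
  puts "#{row[:name]}: #{label}"
end
```

Filtering on `row[:lemmas].empty?` would separate the candidates that have a Japanese rendering from those that appear in English only.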

Closing thoughts

Noodle or Nougat would be the safe picks, maybe? Neapolitan Ice Cream would clash with Ice Cream Sandwich, though.
