
Reading a text column of a data frame with RMeCab

Posted at 2016-02-19, last updated at 2016-02-19

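The docDF() function takes a file, a folder of files, or a data frame as input and outputs character or word frequencies, N-gram frequencies, and similar counts.
Reference: http://rmecab.jp/wiki/index.php?RMeCabFunctions#icae4377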

> library(RMeCab)
> target <- read.csv(text ='
+ ID,Sex,Reply
+ 1,M,写真とってくれよ
+ 2,F,写真とってください
+ 3,M,大きい写真とってね
+ 4,F,写真とってください
+ 5,M,写真とってっす
+ ')

> res <- docDF(target, col = 3, type=1, N=2,pos = c("名詞","動詞"),   Genkei = 1, nDF = 1)
number of extracted terms = 3
now making a data frame. wait a while!

> res
    N1       N2      POS1        POS2 Row1 Row2 Row3 Row4 Row5
1 とっ ください 動詞-動詞 自立-非自立    0    1    0    1    0
2 とっ     くれ 動詞-動詞 自立-非自立    1    0    0    0    0
3 写真     とっ 名詞-動詞   一般-自立    1    1    1    1    1
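
As a follow-up sketch, the per-document columns can be summed in base R to get each bigram's total count. The data frame below is retyped by hand from the printed output above so that it runs without RMeCab or MeCab installed; `doc_cols` and the `Total` column are names introduced here for illustration only.

```r
# Hand-typed copy of the printed docDF() result (not produced by RMeCab here).
res <- data.frame(
  N1   = c("とっ", "とっ", "写真"),
  N2   = c("ください", "くれ", "とっ"),
  POS1 = c("動詞-動詞", "動詞-動詞", "名詞-動詞"),
  POS2 = c("自立-非自立", "自立-非自立", "一般-自立"),
  Row1 = c(0, 1, 1),
  Row2 = c(1, 0, 1),
  Row3 = c(0, 0, 1),
  Row4 = c(1, 0, 1),
  Row5 = c(0, 0, 1),
  stringsAsFactors = FALSE
)

# Pick the per-document count columns (Row1, Row2, ...).
doc_cols <- grep("^Row[0-9]+$", names(res), value = TRUE)

# Total frequency of each bigram across all replies.
res$Total <- unname(rowSums(res[, doc_cols]))

# Show the bigrams from most to least frequent.
print(res[order(-res$Total), c("N1", "N2", "Total")])
```

Selecting the columns by the `Row` prefix rather than by position keeps the sketch working if the number of replies changes.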

> # Bonus: the same bigrams in base form, joined in a single TERM column
>  (res <- docDF(target, col = 3, type=1, N=2,pos = c("名詞","動詞"), nDF = 2))
number of extracted terms = 3
now making a data frame. wait a while!

           TERM      POS1        POS2 Row1 Row2 Row3 Row4 Row5
1 とる-くださる 動詞-動詞 自立-非自立    0    1    0    1    0
2   とる-くれる 動詞-動詞 自立-非自立    1    0    0    0    0
3     写真-とる 名詞-動詞   一般-自立    1    1    1    1    1
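
In this example Row1 to Row5 line up with the five rows of `target`, so the counts can also be grouped by the `Sex` column. The following is a minimal sketch under that assumption, using only base R; the result is again retyped by hand from the output above, and `sex`, `counts`, and `by_sex` are illustrative names that do not come from RMeCab.

```r
# Hand-typed copy of the printed TERM-style result (not produced by RMeCab here).
res <- data.frame(
  TERM = c("とる-くださる", "とる-くれる", "写真-とる"),
  Row1 = c(0, 1, 1),
  Row2 = c(1, 0, 1),
  Row3 = c(0, 0, 1),
  Row4 = c(1, 0, 1),
  Row5 = c(0, 0, 1),
  stringsAsFactors = FALSE
)

# Sex column of the input data, in the same order as Row1..Row5 (assumption).
sex <- c("M", "F", "M", "F", "M")

# Term-by-document count matrix.
counts <- as.matrix(res[, paste0("Row", 1:5)])
rownames(counts) <- res$TERM

# rowsum() adds up rows per group, so transpose to documents-by-terms first,
# then transpose back to get a term-by-group table.
by_sex <- t(rowsum(t(counts), group = sex))
print(by_sex)
```

The resulting table has one row per bigram and one column per value of `Sex`, which makes differences between the groups easy to compare.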
>
