GCC: Removing Comments and Counting Words

Posted at 2019-05-04

https://researchmap.jp/joo4thhg9-1778110/

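The goal is to strip the comments from TOPPERS/SSP and count the instructions, variable names, and function names that remain in the program, as material for analysis. The work proceeds in these steps:
1 Remove comments with GCC
2 Count words with a script
3 Analyze

I looked for a program that removes comments but could not find a suitable one,
so I decided to strip them with GCC.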

#1 Removing comments with GCC

gcc-4.9 -fpreprocessed -dD -E input_file >> output_file

commenout.sh
gcc-4.9 -fpreprocessed -dD -E alarm.c >> ../cmt/text
gcc-4.9 -fpreprocessed -dD -E alarm.h >> ../cmt/text
gcc-4.9 -fpreprocessed -dD -E allfunc.h >> ../cmt/text
gcc-4.9 -fpreprocessed -dD -E arm_m.h >> ../cmt/text
gcc-4.9 -fpreprocessed -dD -E banner.c >> ../cmt/text
gcc-4.9 -fpreprocessed -dD -E banner.h >> ../cmt/text
gcc-4.9 -fpreprocessed -dD -E cfg1_out.c >> ../cmt/text
gcc-4.9 -fpreprocessed -dD -E check.h >> ../cmt/text
gcc-4.9 -fpreprocessed -dD -E cy8c5xlp.h >> ../cmt/text
gcc-4.9 -fpreprocessed -dD -E cyclic.c >> ../cmt/text
gcc-4.9 -fpreprocessed -dD -E cyclic.h >> ../cmt/text
gcc-4.9 -fpreprocessed -dD -E dataqueue.c >> ../cmt/text
gcc-4.9 -fpreprocessed -dD -E dataqueue.h >> ../cmt/text
gcc-4.9 -fpreprocessed -dD -E eventflag.c >> ../cmt/text
gcc-4.9 -fpreprocessed -dD -E eventflag.h >> ../cmt/text
gcc-4.9 -fpreprocessed -dD -E exception.c >> ../cmt/text
gcc-4.9 -fpreprocessed -dD -E exception.h >> ../cmt/text
gcc-4.9 -fpreprocessed -dD -E interrupt.c >> ../cmt/text
gcc-4.9 -fpreprocessed -dD -E interrupt.h >> ../cmt/text
gcc-4.9 -fpreprocessed -dD -E itron.h >> ../cmt/text
gcc-4.9 -fpreprocessed -dD -E kernel.h >> ../cmt/text
gcc-4.9 -fpreprocessed -dD -E kernel_cfg.c >> ../cmt/text
gcc-4.9 -fpreprocessed -dD -E kernel_cfg.h >> ../cmt/text
gcc-4.9 -fpreprocessed -dD -E kernel_impl.h >> ../cmt/text
gcc-4.9 -fpreprocessed -dD -E kernel_int.h >> ../cmt/text
gcc-4.9 -fpreprocessed -dD -E kernel_rename.h >> ../cmt/text
gcc-4.9 -fpreprocessed -dD -E kernel_unrename.h >> ../cmt/text
gcc-4.9 -fpreprocessed -dD -E log_output.c >> ../cmt/text
gcc-4.9 -fpreprocessed -dD -E log_output.h >> ../cmt/text
gcc-4.9 -fpreprocessed -dD -E logtask.c >> ../cmt/text
gcc-4.9 -fpreprocessed -dD -E logtask.h >> ../cmt/text
gcc-4.9 -fpreprocessed -dD -E prc_cfg1_out.h >> ../cmt/text
gcc-4.9 -fpreprocessed -dD -E prc_config.c >> ../cmt/text
gcc-4.9 -fpreprocessed -dD -E prc_config.h >> ../cmt/text
gcc-4.9 -fpreprocessed -dD -E prc_insn.h >> ../cmt/text
gcc-4.9 -fpreprocessed -dD -E prc_kernel.h >> ../cmt/text
gcc-4.9 -fpreprocessed -dD -E prc_rename.h >> ../cmt/text
gcc-4.9 -fpreprocessed -dD -E prc_sil.h >> ../cmt/text
gcc-4.9 -fpreprocessed -dD -E prc_stddef.h >> ../cmt/text
gcc-4.9 -fpreprocessed -dD -E prc_test.h >> ../cmt/text
gcc-4.9 -fpreprocessed -dD -E prc_timer.c >> ../cmt/text
gcc-4.9 -fpreprocessed -dD -E prc_timer.h >> ../cmt/text
gcc-4.9 -fpreprocessed -dD -E prc_unrename.h >> ../cmt/text
gcc-4.9 -fpreprocessed -dD -E queue.h >> ../cmt/text
gcc-4.9 -fpreprocessed -dD -E rx600_uart.c >> ../cmt/text
gcc-4.9 -fpreprocessed -dD -E rx600_uart.h >> ../cmt/text
gcc-4.9 -fpreprocessed -dD -E sample1.c >> ../cmt/text
gcc-4.9 -fpreprocessed -dD -E sample1.h >> ../cmt/text
gcc-4.9 -fpreprocessed -dD -E serial.c >> ../cmt/text
gcc-4.9 -fpreprocessed -dD -E serial.h >> ../cmt/text
gcc-4.9 -fpreprocessed -dD -E sil.h >> ../cmt/text
gcc-4.9 -fpreprocessed -dD -E startup.c >> ../cmt/text
gcc-4.9 -fpreprocessed -dD -E sys_manage.c >> ../cmt/text
gcc-4.9 -fpreprocessed -dD -E syslog.c >> ../cmt/text
gcc-4.9 -fpreprocessed -dD -E syslog.h >> ../cmt/text
gcc-4.9 -fpreprocessed -dD -E t_stddef.h >> ../cmt/text
gcc-4.9 -fpreprocessed -dD -E t_syslog.h >> ../cmt/text
gcc-4.9 -fpreprocessed -dD -E target_cfg1_out.h >> ../cmt/text
gcc-4.9 -fpreprocessed -dD -E target_config.c >> ../cmt/text
gcc-4.9 -fpreprocessed -dD -E target_config.h >> ../cmt/text
gcc-4.9 -fpreprocessed -dD -E target_kernel.h >> ../cmt/text
gcc-4.9 -fpreprocessed -dD -E target_rename.h >> ../cmt/text
gcc-4.9 -fpreprocessed -dD -E target_serial.c >> ../cmt/text
gcc-4.9 -fpreprocessed -dD -E target_serial.h >> ../cmt/text
gcc-4.9 -fpreprocessed -dD -E target_sil.h >> ../cmt/text
gcc-4.9 -fpreprocessed -dD -E target_stddef.h >> ../cmt/text
gcc-4.9 -fpreprocessed -dD -E target_syssvc.h >> ../cmt/text
gcc-4.9 -fpreprocessed -dD -E target_test.h >> ../cmt/text
gcc-4.9 -fpreprocessed -dD -E target_timer.h >> ../cmt/text
gcc-4.9 -fpreprocessed -dD -E target_unrename.h >> ../cmt/text
gcc-4.9 -fpreprocessed -dD -E task.c >> ../cmt/text
gcc-4.9 -fpreprocessed -dD -E task.h >> ../cmt/text
gcc-4.9 -fpreprocessed -dD -E task_manage.c >> ../cmt/text
gcc-4.9 -fpreprocessed -dD -E time_event.c >> ../cmt/text
gcc-4.9 -fpreprocessed -dD -E time_event.h >> ../cmt/text
gcc-4.9 -fpreprocessed -dD -E time_manage.c >> ../cmt/text
gcc-4.9 -fpreprocessed -dD -E tool_stddef.h >> ../cmt/text
gcc-4.9 -fpreprocessed -dD -E vasyslog.c >> ../cmt/text
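
The per-file commands above all follow one pattern, so the list could also be generated rather than typed. The following is only a sketch of that idea in Ruby, not the script actually used here: the names `COMPILER`, `strip_command`, and `strip_all` are made up for illustration, and the top-level lines only print the commands (a dry run) so nothing is executed by accident.

```ruby
# Sketch: build the same gcc invocation for each source file.
# COMPILER, strip_command and strip_all are hypothetical names.
COMPILER = "gcc-4.9"

# Returns the argument list that strips comments from one source file.
def strip_command(source)
  [COMPILER, "-fpreprocessed", "-dD", "-E", source]
end

# Runs the command for every file and appends the result to output,
# which mirrors the >> redirection in the shell script.
def strip_all(sources, output)
  sources.sort.each do |source|
    system(*strip_command(source), out: [output, "a"])
  end
end

# Dry run: print the commands instead of executing them.
%w[alarm.c alarm.h].each { |file| puts strip_command(file).join(" ") }
```

To process a directory for real, one could call something like `strip_all(Dir.glob("*.{c,h}"), "../cmt/text")`, assuming `gcc-4.9` is installed and the output directory exists.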

The processing above produced the following warnings.


sample1.h:58:0: warning: "ERRORTSK_PRIORITY" redefined
 #define ERRORTSK_PRIORITY  (2)
 ^
sample1.h:47:0: note: this is the location of the previous definition
 #define ERRORTSK_PRIORITY  (6)
 ^
sample1.h:59:0: warning: "MAIN_PRIORITY" redefined
 #define MAIN_PRIORITY   (3)
 ^
sample1.h:48:0: note: this is the location of the previous definition
 #define MAIN_PRIORITY   (7)
 ^
sample1.h:60:0: warning: "TASK1_PRIORITY" redefined
 #define TASK1_PRIORITY   (4)
 ^
sample1.h:49:0: note: this is the location of the previous definition
 #define TASK1_PRIORITY   (8)
 ^
sample1.h:61:0: warning: "TASK2_PRIORITY" redefined
 #define TASK2_PRIORITY   (5)
 ^
sample1.h:50:0: note: this is the location of the previous definition
 #define TASK2_PRIORITY   (9)
 ^
sample1.h:62:0: warning: "TASK3_PRIORITY" redefined
 #define TASK3_PRIORITY   (6)
 ^
sample1.h:51:0: note: this is the location of the previous definition
 #define TASK3_PRIORITY   (10)
 ^
sample1.h:63:0: warning: "TASK3_EXEPRIORITY" redefined
 #define TASK3_EXEPRIORITY  (5)
 ^
sample1.h:52:0: note: this is the location of the previous definition
 #define TASK3_EXEPRIORITY  (9)
 ^
t_stddef.h:234:0: warning: "assert" redefined
 #define assert(exp)  ((void) 0)
 ^
t_stddef.h:231:0: note: this is the location of the previous definition
 #define assert(exp)  ((void)((exp) ? 0 : (TOPPERS_assert_fail(#exp, \
 ^
t_stddef.h:260:0: warning: "MERCD" redefined
 #define MERCD(ercd)  ((ER)((((uint_t) (ercd))) | ~0xffU))
 ^
t_stddef.h:258:0: note: this is the location of the previous definition
 #define MERCD(ercd)  ((ER)((int8_t)(ercd)))
 ^
t_stddef.h:274:0: warning: "TMAX_RELTIM" redefined
 #define TMAX_RELTIM  ((RELTIM) LONG_MAX)
 ^
t_stddef.h:272:0: note: this is the location of the previous definition
 #define TMAX_RELTIM  ((RELTIM) UINT_MAX)
 ^

#2 Counting words with a script
###2.1 First version, taken almost unchanged from the URL cited in the script

wc.rb
#!/usr/bin/ruby
# -*- mode:ruby; coding:utf-8 -*-
# 2014.12.29 Edited by Dr. OGAWA Kiyoshi
words = Hash.new(0)
File.open("full.txt","r") do |file|
  file.read.downcase.scan(/\p{Letter}+/) do |word|
    words[word] += 1
  end
end
print "WORD\tFREQUENCY\n"
words.sort_by{|word,count| [-count,word]}.each do |word,count|
  print "#{word}\t#{count}\n"
end
#https://sites.google.com/…/ruby-guan…/tango-hindo-wo-kazoeru

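###2.2 Version that supports standard input and output
Version 0.2 dropped `_`.
The results showed that this is fine for literary text, but for programs the counts are of little use unless words are examined with `_` included, so version 0.3 keeps `_`.

The source is: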

wc.rb
#!/usr/bin/ruby
# -*- mode:ruby; coding:utf-8 -*-
# Word Counter for source code in programming language without comment
# ver.0.1 2014.12.29, ver.0.2 2014.12.29, ver.0.3 2014.12.30
# Edited by Dr. OGAWA Kiyoshi

words = Hash.new(0)
while buf = STDIN.gets
  break if buf.chomp == "exit"
  buf.scan(/\w+/) do |word|
    words[word] += 1
  end
end

print "WORD\tFREQUENCY\n"

words.sort_by{|word,count| [-count,word]}.each do |word,count|
  print "#{word}\t#{count}\n"
end
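
To make the effect of keeping `_` concrete, here is a small comparison of the two patterns that appear in this article, `\p{Letter}+` from 2.1 and `\w+` from the script above, applied to one of the `#define` lines from sample1.h. The method names `split_letters` and `split_word_chars` are made up for this sketch, and whether version 0.2 used exactly the letters-only pattern is an assumption.

```ruby
# Sketch: compare the letters-only pattern with the \w pattern.
# split_letters and split_word_chars are hypothetical helper names.

# Splits on anything that is not a letter, so digits and "_" break words.
def split_letters(line)
  line.scan(/\p{Letter}+/)
end

# Keeps letters, digits and "_" together, as identifiers are written in C.
def split_word_chars(line)
  line.scan(/\w+/)
end

line = "#define TASK1_PRIORITY (4)"
p split_letters(line)    # => ["define", "TASK", "PRIORITY"]
p split_word_chars(line) # => ["define", "TASK1_PRIORITY", "4"]
```

With the letters-only pattern the identifier falls apart into fragments such as `TASK` and `PRIORITY`, which then get counted together with unrelated identifiers.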

ps. 2015/04/06

$ ./wc.rb < n1570.txt > n1570wc.csv
./wc.rb:12:in `scan': invalid byte sequence in UTF-8 (ArgumentError)
from ./wc.rb:12:in `<main>'
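
One possible workaround for this error, assuming the input contains bytes that are not valid UTF-8, is to clean each line with `String#scrub` before scanning it. This is a sketch rather than the fix that was actually applied; `scan_words` is a hypothetical helper name, and dropping the invalid bytes (instead of replacing them with a marker) is a choice made for the example.

```ruby
# Sketch: tolerate invalid UTF-8 bytes while scanning for words.
# scan_words is a hypothetical helper name.

# Removes invalid byte sequences, then returns the words on the line.
def scan_words(line)
  line.scrub("").scan(/\w+/)
end

# "\xFF" is never valid in UTF-8, so an unscrubbed scan would raise ArgumentError.
broken = "abc \xFF def\n"
p scan_words(broken) # => ["abc", "def"]
```

If the file is known to be in another encoding, reading it with that encoding would preserve more of the text than scrubbing does.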

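#3 Analysis
Processed data (from an early stage of the work)
https://researchmap.jp/mu5yptkmp-45645/

The columns are word, frequency, category, and full spelling.
I am in the middle of sorting the words into these categories:
1 C reserved words, etc.
2 Definitions needed in a C program (not OS-specific)
3 TOPPERS OS proper names (functions, variables)
4 Constants, etc.
I am still unsure about some cases, so once everything has been classified I will consult an expert.

Words that occur only once are classified separately:
1 Defined in assembler and called from C
2 Defined but not used (intended to be called from applications, etc.)
3 Defined but not used (other than the above)
4 Constants, etc.
Category 3 will then be examined closely. Once that work is done, I will consult an expert.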
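
Part of the classification above can be pre-filled mechanically. The sketch below is one way to do that, under clearly stated assumptions: category 1 is approximated by the C99 keyword list (without the underscore-prefixed keywords), category 4 by words that start with a digit, and everything else is left unclassified for manual review. `C_KEYWORDS`, `classify`, and `split_by_frequency` are names invented for this example.

```ruby
# Sketch: pre-classify counted words; categories 2 and 3 stay manual.
# All names here are hypothetical.
C_KEYWORDS = %w[
  auto break case char const continue default do double else enum extern
  float for goto if inline int long register restrict return short signed
  sizeof static struct switch typedef union unsigned void volatile while
].freeze

# Returns 1 for a C keyword, 4 for a numeric literal, nil when undecided.
def classify(word)
  return 1 if C_KEYWORDS.include?(word)
  return 4 if word.match?(/\A\d/)
  nil
end

# Splits a word => count hash into [words seen once, words seen more often].
def split_by_frequency(counts)
  counts.partition { |_word, count| count == 1 }.map(&:to_h)
end

counts = { "static" => 12, "0x10" => 3, "task_manage" => 1 }
once, repeated = split_by_frequency(counts)
repeated.each_key { |word| puts "#{word}\t#{classify(word).inspect}" }
once.each_key { |word| puts "#{word}\tseen once, needs the separate review" }
```

The digit test is only a rough stand-in for "constants"; named constants defined with `#define` would still have to be assigned by hand.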

----wc.rb
Version that folds uppercase and lowercase together

#!/usr/bin/ruby
# Word Counter for source code in programming language without comment or standard document without 0C .
# ver.0.1 2014.12.29,
# ver.0.2 2014.12.29,
# ver.0.3 2014.12.30 standard I/O
# ver.0.4 2015.4.13 downcase
# https://sites.google.com/site/rubycocoamemo/Home/ruby-guan-lian/tango-hindo-wo-kazoeru
# Edited by Dr. OGAWA Kiyoshi

words = Hash.new(0)
while buf = STDIN.gets
  break if buf.chomp == "exit"
  buf.downcase.scan(/\w+/) do |word|
    words[word] += 1
  end
end

print "WORD\tCOUNT\n"

words.sort_by{|word,count| [-count,word]}.each do |word,count|
  print "#{word}\t#{count}\n"
end

p.s. Added 2017-07-09

$ chmod 0777 wc.rb
$ ./wc.rb pcd2.txt
./wc.rb: line 11: syntax error near unexpected token `('
./wc.rb: line 11: `words = Hash.new(0)'
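
Two things stand out in this log. The command passes pcd2.txt as an argument, while the script only reads STDIN. The error text also has the form of a shell message rather than a Ruby one, so the file may have been run by the shell, for example if `#!/usr/bin/ruby` was not on the very first line; the log does not confirm this, so treat it as an assumption. The sketch below addresses only the first point: a counter that reads the files named as arguments and falls back to standard input when there are none. The signature of `count_words` and the temporary file are made up for the example.

```ruby
# Sketch: count words from named files, or from a fallback stream.
# count_words(paths, fallback) is a hypothetical signature.
require "tempfile"

# Counts lowercased words from every file in paths,
# or from fallback (STDIN by default) when paths is empty.
def count_words(paths, fallback = $stdin)
  words = Hash.new(0)
  tally = lambda do |io|
    io.each_line do |line|
      line.downcase.scan(/\w+/) { |word| words[word] += 1 }
    end
  end
  if paths.empty?
    tally.call(fallback)
  else
    paths.each { |path| File.open(path, "r") { |file| tally.call(file) } }
  end
  words
end

# Demonstration with a temporary file standing in for pcd2.txt.
Tempfile.create("sample") do |tmp|
  tmp.write("Task task_manage TASK\n")
  tmp.flush
  count_words([tmp.path]).each { |word, count| puts "#{word}\t#{count}" }
end
```

In the script itself the call would be `count_words(ARGV)`, so that both `./wc.rb pcd2.txt` and `./wc.rb < pcd2.txt` give the same result.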

p.s. 2017

# Read the whole file; the with block closes it afterwards.
with open('sample.txt') as f:
    data = f.read()

# counting
words = {}
for word in data.split():
    words[word] = words.get(word, 0) + 1

# sort by count, highest first
d = [(v, k) for k, v in words.items()]
d.sort(reverse=True)
for count, word in d[:1000]:
    print(count, word)

Thank you very much for reading to the end.

Please press the like icon 💚 and follow me.
