Checking every file under a directory for a specific string with Python and outputting the matching lines

Posted at 2014-04-15

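Overview

The script walks every file under [DIR_NAME],
checks whether each one is a text file in one of the encodings defined in [TARGET_ENCODINGS],
searches each text file for [SEARCH_WORD],
and writes the results to the file named in [OUTPUT_NAME].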

Environment

Windows 8 + Python 2.6.x

Code

find_directory.py
#!/usr/bin/python
# -*- coding: utf-8 -*-
# vim: fileencoding=utf-8

import os, sys, codecs

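# Root directory to scan; every file below it is checked.
DIR_NAME = 'C:\\html\\HOGE\\'
# Name of the CSV report, created in the current working directory.
OUTPUT_NAME = 'result_find_file_list.csv'

# String to look for in each line.
SEARCH_WORD = '<font'

# Candidate encodings, tried in this order; the first one that decodes wins.
TARGET_ENCODINGS = [
	'utf-8',
	'shift-jis',
	'euc-jp',
	'iso2022-jp'
]

# Set to False to stop echoing matches to stdout.
FLAG_STDOUT = True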

# Shorthand used to echo matches to the console.
write = sys.stdout.write

def guess_charset(data):
	# Return the first encoding in TARGET_ENCODINGS that decodes data,
	# or 'binary' when none of them does.
	for enc in TARGET_ENCODINGS:
		try:
			data.decode(enc)
			return enc
		except UnicodeError:
			pass
	return 'binary'

# Open the CSV report (Shift-JIS) and write the header row.
out = codecs.open(OUTPUT_NAME, 'w', 'shift-jis')
out.write('path,line_number,search,target_line\r\n')

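for dirpath, dirs, files in os.walk(DIR_NAME):
	for fn in files:
		path = os.path.join(dirpath, fn)
		# Read the raw bytes once to decide whether this is a text file.
		try:
			fobj = open(path, 'rb')
			try:
				data = fobj.read()
			finally:
				fobj.close()
		except (IOError, OSError):
			continue
		enc = guess_charset(data)
		if enc == 'binary':
			continue
		# Read the file again as text, counting lines from 1.
		count = 0
		try:
			for line in codecs.open(path, 'r', enc):
				count = count + 1
				if SEARCH_WORD not in line:
					continue
				try:
					# Double quotes become single quotes so the CSV field stays intact.
					text = line.replace('"', "'").replace('\r', '').replace('\n', '')
					output = '"' + path + '","' + str(count) + '","' + SEARCH_WORD + '","' + text + '"\r\n'
					out.write(output)
					if FLAG_STDOUT:
						write(output)
				except UnicodeError:
					# The row cannot be encoded for the report or the console; skip only this row.
					continue
		except (IOError, OSError, UnicodeError):
			continue

out.close()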
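
To see how the try-each-encoding-in-order check behaves, here is a minimal sketch that runs the same idea on a few in-memory byte strings instead of files. The sample names and contents are hypothetical, and the function is restated so that the snippet is self-contained. It is intended to run on both Python 2.6+ and Python 3.

```python
# -*- coding: utf-8 -*-
# Same candidate list and order as the script above.
TARGET_ENCODINGS = ['utf-8', 'shift-jis', 'euc-jp', 'iso2022-jp']


def guess_charset(data):
    """Return the first candidate encoding that decodes data, else 'binary'."""
    for enc in TARGET_ENCODINGS:
        try:
            data.decode(enc)
            return enc
        except UnicodeError:
            pass
    return 'binary'


# Hypothetical sample inputs, built in memory instead of read from disk.
# u'\u3042' is HIRAGANA LETTER A.
samples = {
    'ascii_html': b'<font color="red">',
    'sjis_text': u'\u3042'.encode('shift-jis'),
    'eucjp_text': u'\u3042'.encode('euc-jp'),
    'jis_text': u'\u3042'.encode('iso2022-jp'),
    'raw_bytes': b'\xff\xfe\x00\x01',
}

results = dict((name, guess_charset(data)) for name, data in samples.items())
for name in sorted(results):
    print('%s -> %s' % (name, results[name]))
```

With these samples, the EUC-JP bytes are reported as `shift-jis` and the ISO-2022-JP bytes as `utf-8`, because an earlier candidate in the list also happens to decode them. The result is therefore best read as "decodable as text" rather than as the file's true encoding. An ASCII search word such as `<font` should still be found in such files, but the non-ASCII part of the target_line column may come out garbled.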

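Notes

As usual, the exception handling is rough.
There is plenty of room for refactoring,
but I want to use this in production tomorrow, so I am posting it as it is for now.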
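
One possible direction for that refactoring is sketched below. It is not the author's code and makes several assumptions: Python 3 rather than the Python 2.6 used in the article, the `csv` module for quoting instead of hand-built rows, and the hypothetical function names `find_lines` and `write_report`. The demo at the end runs on a throwaway directory with made-up sample files.

```python
# -*- coding: utf-8 -*-
# Sketch only: assumes Python 3; find_lines and write_report are hypothetical names.
import csv
import io
import os
import shutil
import tempfile

TARGET_ENCODINGS = ['utf-8', 'shift-jis', 'euc-jp', 'iso2022-jp']


def guess_charset(data):
    """Return the first candidate encoding that decodes data, else 'binary'."""
    for enc in TARGET_ENCODINGS:
        try:
            data.decode(enc)
            return enc
        except UnicodeError:
            pass
    return 'binary'


def find_lines(root, word):
    """Yield (path, line_number, line) for every text line containing word."""
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            try:
                with open(path, 'rb') as fobj:
                    data = fobj.read()
            except OSError:
                continue  # unreadable file: skip it
            enc = guess_charset(data)
            if enc == 'binary':
                continue
            # Decode the bytes already in memory instead of reopening the file.
            for number, line in enumerate(data.decode(enc).splitlines(), 1):
                if word in line:
                    yield path, number, line


def write_report(root, word, out_path, out_encoding='shift-jis'):
    """Write matches under root to a CSV file and return the number of rows."""
    rows = 0
    # errors='replace' turns characters the output encoding lacks into '?'.
    with io.open(out_path, 'w', encoding=out_encoding,
                 errors='replace', newline='') as out:
        writer = csv.writer(out)  # handles quoting of commas and double quotes
        writer.writerow(['path', 'line_number', 'search', 'target_line'])
        for path, number, line in find_lines(root, word):
            writer.writerow([path, number, word, line])
            rows += 1
    return rows


# Demo on a throwaway directory (hypothetical sample files).
tmp = tempfile.mkdtemp()
try:
    src = os.path.join(tmp, 'src')
    os.mkdir(src)
    with open(os.path.join(src, 'a.html'), 'wb') as f:
        f.write(u'<p>ok</p>\n<font size="2">\u3042</font>\n'.encode('shift-jis'))
    with open(os.path.join(src, 'b.bin'), 'wb') as f:
        f.write(b'\xff\xfe<font')
    report = os.path.join(tmp, 'report.csv')
    row_count = write_report(src, '<font', report)
    with io.open(report, encoding='shift-jis', newline='') as f:
        report_rows = list(csv.reader(f))
finally:
    shutil.rmtree(tmp)

print(row_count)
for row in report_rows:
    print(row)
```

Splitting the search (`find_lines`) from the output (`write_report`) keeps each failure mode in one place: unreadable files are skipped where they are opened, and characters the report encoding cannot represent are replaced instead of aborting the rest of the file.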
