

Exporting a CSV list of file names, last-modified timestamps, and character encodings with Python 3


anaconda3 is assumed to be installed already.
For details on anaconda, see "Installing anaconda on a Mac".

Installing chardet to detect character encodings

zsh
$ anaconda search -t conda chardet # search anaconda for chardet packages
Using Anaconda API: https://api.anaconda.org
Run 'anaconda show <USER/PACKAGE>' to get more details:
Packages:
     Name                      |  Version | Package Types   | Platforms
     ------------------------- |   ------ | --------------- | ---------------
     anaconda/chardet          |    3.0.4 | conda           | linux-ppc64le, linux-64, win-32, osx-64, linux-32, win-64
...

$ anaconda show anaconda/chardet # pick one of the chardet packages found and look up how to install it
Using Anaconda API: https://api.anaconda.org
Name:    chardet
Summary:
Access:  public
Package Types:  conda
Versions:
   + 2.3.0
   + 3.0.2
   + 3.0.3
   + 3.0.4

To install this package with conda run:
     conda install --channel https://conda.anaconda.org/anaconda chardet

$ conda install --channel https://conda.anaconda.org/anaconda chardet # copy and paste the install command shown above
Fetching package metadata .........
Solving package specifications: ..........

Package plan for installation in environment /Users/berry/.pyenv/versions/anaconda3-4.2.0:

The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    conda-env-2.6.0            |                0          601 B  anaconda
    chardet-3.0.4              |           py35_0         188 KB  anaconda
    requests-2.14.2            |           py35_0         725 KB  anaconda
    pyopenssl-16.2.0           |           py35_0          70 KB  anaconda
    conda-4.3.22               |           py35_0         516 KB  anaconda
    ------------------------------------------------------------
                                           Total:         1.5 MB

The following NEW packages will be INSTALLED:

    chardet:   3.0.4-py35_0  anaconda
    conda-env: 2.6.0-0       anaconda

The following packages will be UPDATED:

    conda:     4.2.9-py35_0           --> 4.3.22-py35_0 anaconda
    pyopenssl: 16.0.0-py35_0          --> 16.2.0-py35_0 anaconda
    requests:  2.11.1-py35_0          --> 2.14.2-py35_0 anaconda

Proceed ([y]/n)? y

Fetching packages ...
conda-env-2.6. 100% |############################################################################| Time: 0:00:00 304.59 kB/s
chardet-3.0.4- 100% |############################################################################| Time: 0:00:02  83.22 kB/s
requests-2.14. 100% |############################################################################| Time: 0:00:52  14.02 kB/s
pyopenssl-16.2 100% |############################################################################| Time: 0:00:02  25.32 kB/s
conda-4.3.22-p 100% |############################################################################| Time: 0:00:18  28.66 kB/s
Extracting packages ...
[      COMPLETE      ]|###############################################################################################| 100%
Unlinking packages ...
[      COMPLETE      ]|###############################################################################################| 100%
Linking packages ...
[      COMPLETE      ]|###############################################################################################| 100%

$ 
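
Once the install finishes, it may be worth confirming that Python can actually see the package. The following is a minimal sketch using only the standard library; the helper name `is_installed` is hypothetical and not part of the original article. If chardet is found, it also calls `chardet.detect` on a short byte string as a quick smoke test.

```python
import importlib.util


def is_installed(module_name):
    """Return True if the named top-level module can be found on the import path."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == '__main__':
    print('chardet installed:', is_installed('chardet'))
    if is_installed('chardet'):
        import chardet
        # detect() takes bytes and returns a dict that includes 'encoding' and 'confidence'
        print(chardet.detect('こんにちは、世界'.encode('utf-8')))
```

If the first line prints `False`, the package was probably installed into a different environment than the one running the script.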

Exporting file names, last-modified timestamps, and character encodings to CSV

  • Export a list of the files under './file/' to CSV
python3
# support python3

import os
import datetime
import csv
from chardet.universaldetector import UniversalDetector

_path = './file/'

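# Detect the character encoding of a file (returns 'Unknown' if chardet cannot decide)
def univ_detect(file_path):
    ud = UniversalDetector()
    with open(file_path, 'rb') as fd:
        for b in fd:
            ud.feed(b)
            if ud.done:
                break
    ud.close()
    # result['encoding'] is None for empty or undecidable input
    return ud.result['encoding'] or 'Unknown'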

# Build one CSV row: [last-modified timestamp, path relative to _path, encoding]
def make_row(path):
    last_modified = os.path.getmtime(path)
    dt = datetime.datetime.fromtimestamp(last_modified).strftime('%Y%m%d_%H:%M:%S')

    # chardet is only run on regular files
    if os.path.isdir(path):
        encode = 'Directory'
    else:
        encode = univ_detect(path)

    return [dt, os.path.relpath(path, _path), ' '.join(['~', str(encode), '~'])]

# get a file name, path and last-modified timestamp
# for every entry under file_path and one level of subdirectories
def get_file_list(file_path):
    all_files = []
    for g in os.listdir(file_path):
        g_path = os.path.join(file_path, g)
        all_files.append(make_row(g_path))

        # subdirectory (one level deep only)
        if os.path.isdir(g_path):
            for j in os.listdir(g_path):
                all_files.append(make_row(os.path.join(g_path, j)))
    return all_files

all_files = get_file_list(_path)
print(all_files)

csv_file = [['Last_modified', 'file_path', 'encode']] # Header
csv_file.extend(all_files)
# newline='' lets the csv module control line endings itself
with open('file_checker.csv', 'w', newline='', encoding='utf-8') as h:
    writer = csv.writer(h, lineterminator='\n')
    writer.writerows(csv_file)
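
The script above only looks one level below './file/'. If deeper nesting needs to be covered, `os.walk` is one way to do it. The following is a rough sketch rather than a drop-in replacement: the function name `walk_rows` and its `detect` parameter are hypothetical, introduced here for illustration. `detect` is any callable that takes a file path and returns an encoding name (or None), so `univ_detect` from the script could be passed in.

```python
import datetime
import os


def walk_rows(root, detect=None):
    """Yield [timestamp, relative path, encoding] for every entry under root.

    `detect` is a hypothetical hook: a callable taking a file path and
    returning an encoding name or None. Without it, files get 'Unknown'.
    """
    for dirpath, dirnames, filenames in os.walk(root):
        # Report directories first, then files, each in sorted order
        for name in sorted(dirnames) + sorted(filenames):
            path = os.path.join(dirpath, name)
            mtime = os.path.getmtime(path)
            stamp = datetime.datetime.fromtimestamp(mtime).strftime('%Y%m%d_%H:%M:%S')
            if os.path.isdir(path):
                encoding = 'Directory'
            else:
                encoding = (detect(path) if detect else None) or 'Unknown'
            yield [stamp, os.path.relpath(path, root), encoding]
```

Under these assumptions, `list(walk_rows('./file/', detect=univ_detect))` would produce rows that can be handed to `csv.writer(...).writerows` in the same way as before, with directories at any depth included.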