The Minimum Python Knowledge a Data Scientist Needs

Last updated at 2026-05-25, posted at 2026-05-05

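These are some of the notes I took a few years ago while reading 赤石雅典's book『最短コースでわかる　Pythonプログラミングとデータ分析』. I hope you find them useful.

If your goal is to become a data scientist, you do not need to master Python to the level of a software engineer. What matters is how well you can use the third-party libraries for data processing.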

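Variables

Assign a value to a variable (the variable goes on the left-hand side, the value on the right).

# Assign the value 123 to the variable x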
x = 123

Reference a variable and display the result with the print function.

x = 123
# Display the value with print
print(x)

Combine a variable reference with an assignment.

x = 123
y = x + 10
print(y) # 133

# Overwrite a variable using its own current value
w = 10
w = w + 1
print(w) # 11

Data types and arithmetic operations

Check the data type (integer int, floating point float, string str, boolean bool).

a = 123
b = 123.45
c = 'xyz'
d = True
print(type(a)) # <class 'int'>
print(type(b)) # <class 'float'>
print(type(c)) # <class 'str'>
print(type(d)) # <class 'bool'>

Basic arithmetic operations (addition +, subtraction -, multiplication *, division /).

# + also concatenates strings
print('ABC' + 'xyz') # ABCxyz
print(123 - 10)      # 113
print(8 * 123)       # 984
print(10 / 7)        # 1.428...

Floor division (//), remainder (%), and exponentiation (**).

print(10 // 7)  # 1
print(10 % 7)   # 3
print(2 ** 10)  # 1024
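
As a small supplement to these operators, the sketch below shows how `//` and `%` relate to each other and how they behave with a negative operand. The built-in `divmod` and the variable names are not part of the original notes and are used here for illustration only.

```python
# Quotient and remainder in one call with the built-in divmod
q, r = divmod(10, 7)
print(q, r)  # 1 3

# The identity a == (a // b) * b + (a % b) holds for these operators
a, b = 10, 7
identity_holds = (a // b) * b + (a % b) == a
print(identity_holds)  # True

# // rounds toward negative infinity, so a negative operand
# gives a different result from simple truncation
neg_q = -10 // 7
neg_r = -10 % 7
print(neg_q, neg_r)  # -2 4
```

This matters in data work when bucketing values that may be negative, since the remainder takes the sign of the divisor.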

Augmented assignment operator (+=).

w = 10
w += 1
print(w) # 11

Function calls and method calls

Examples of function calls (print with multiple arguments, len, and type conversion with int and float).

print('x = ', 123, 'y = ', 10, sep='\n')
print(len('文字列のサンプル')) # 8 (the sample string has 8 characters)
print(int('123'))           # 123
print(float(12))            # 12.0

Method calls on the string type (upper, find).

s1 = 'I like an apple.'
print(s1.upper())         # I LIKE AN APPLE.
# Find the index where 'apple' starts (indices start at 0)
print(s1.find('apple'))   # 10

Slightly more advanced loops

Loop with the enumerate function, which yields each element together with its index (its position).

s1 = '文字列'
for i, s in enumerate(s1):
    print('Character', i + 1, 'is', s)
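
The 1-based position can also be obtained without adding 1 by hand. The following is a minimal sketch that assumes the `start` parameter of `enumerate`; the sample string `'abc'` and the name `pairs` are illustrative.

```python
s1 = 'abc'

# start=1 makes the counter begin at 1 instead of 0
for i, s in enumerate(s1, start=1):
    print(i, s)

# Collecting the pairs in a list shows what enumerate yields
pairs = list(enumerate(s1, start=1))
print(pairs)  # [(1, 'a'), (2, 'b'), (3, 'c')]
```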

Comparison, logical operations, and conditional branching

Get a boolean value from a comparison operator.

x = 1
c = x < 10
print(c) # True

Combine multiple conditions with a logical operator (and).

x = 10
# Is x at least 10 and odd?
c = (x >= 10) and (x % 2 == 1)
print(c) # False
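
The notes only show `and`. For completeness, here is a short sketch of the other two logical operators, `or` and `not`, plus a chained comparison. The variable names are illustrative and do not come from the original.

```python
x = 10
is_even = (x % 2 == 0)

# or: True when at least one condition holds
c_or = (x < 5) or is_even
# not: inverts a boolean value
c_not = not is_even
print(c_or)   # True
print(c_not)  # False

# Python also allows chained comparisons
in_range = 0 <= x < 100
print(in_range)  # True
```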

Conditional branching with if, elif, and else.

x = 123
m = x % 3
if m == 0:
    print('x is a multiple of 3')
elif m == 1:
    print('the remainder is 1')
else:
    print('the remainder is 2')

Factorial calculation (for loop)

Basics of looping with the for statement and the range function.

N = 10
# range(1, N + 1) generates the integers from 1 to 10
for i in range(1, N + 1):
    print(i)

Accumulate a result inside a loop (computing factorials).

N = 10
fact = 1
for i in range(1, N + 1):
    fact *= i
    print('The factorial of', i, 'is', fact)
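
One way to check the loop result is to compare it with `math.factorial` from the standard library. This is a verification sketch, not something the original notes include.

```python
import math

N = 10
fact = 1
for i in range(1, N + 1):
    fact *= i

# math.factorial is part of the standard library
print(fact)                       # 3628800
print(fact == math.factorial(N))  # True
```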

Lists and loops

Define a list and reference elements by index.

l1 = [1, 3, 5, 7, 9]
print(type(l1)) # <class 'list'>
print(l1[0])    # 1
print(l1[-1])   # 9 (the last element)

Reference part of a list with slices.

l1 = [1, 3, 5, 7, 9]
print(l1[1:4])  # [3, 5, 7]
print(l1[:2])   # [1, 3]
print(l1[2:])   # [5, 7, 9]
print(l1[:-1])  # [1, 3, 5, 7]

Operations on lists (len, concatenation with +) and a method (append).

l1 = [1, 3]
l2 = ['First', 'Second']
print(len(l2))  # 2
print(l1 + l2)  # [1, 3, 'First', 'Second']

l4 = []
l4.append('Third')
print(l4)       # ['Third']

Strings can also be referenced by index or slice, just like lists.

s1 = 'I like an apple.'
print(s1[7])     # a
print(s1[10:15]) # apple

Check whether an element exists with the in operator.

l2 = ['First', 'Second', 'Third']
print('Second' in l2) # True

Basic for loop that iterates over a list directly.

l2 = ['First', 'Second', 'Third']
for e in l2:
    print(e)

A for loop that combines the range function with len.

l2 = ['First', 'Second', 'Third']
for i in range(len(l2)):
    print(i, l2[i])
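
The same index-and-element loop is often written with `enumerate`, and two lists can be walked in parallel with the built-in `zip`. The sketch below is illustrative; the list `l1` and the name `combined` are assumptions added for the example.

```python
l2 = ['First', 'Second', 'Third']

# enumerate yields (index, element) pairs, matching range(len(l2))
for i, e in enumerate(l2):
    print(i, e)

# zip pairs up elements of two lists position by position
l1 = [1, 3, 5]
combined = list(zip(l1, l2))
print(combined)  # [(1, 'First'), (3, 'Second'), (5, 'Third')]
```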

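Tuples, sets, and dictionaries

Define and reference a tuple (its elements cannot be changed).

t1 = (1, 3, 5, 7, 9)
# A tuple with a single element needs a trailing comma
t2 = (11,)
print(t1[1:3]) # (3, 5)
# t1[1] = 4 # raises TypeError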
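
To see the immutability in action without stopping the script, the assignment can be wrapped in try/except. This is a minimal sketch; the names `error_message` and `t3` are illustrative.

```python
t1 = (1, 3, 5, 7, 9)

try:
    t1[1] = 4  # tuples do not support item assignment
except TypeError as e:
    error_message = str(e)
    print('TypeError:', error_message)

# To "change" a tuple, build a new one instead
t3 = t1[:1] + (4,) + t1[2:]
print(t3)  # (1, 4, 5, 7, 9)

# (11) without a comma is just the integer 11
print(type((11)), type((11,)))  # <class 'int'> <class 'tuple'>
```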

Tuple unpacking in assignment and in loops.

x, y = (10, 123)

data = [('T254', 12), ('A727', 6)]
for key, value in data:
    print('key=', key, 'value=', value)

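Remove duplicate elements with a set.

s1 = {5, 2, 1, 2, 3}
print(s1) # {1, 2, 3, 5} (the display order of a set is not guaranteed)

# A technique for removing duplicates from a list
l5 = [5, 2, 1, 2, 3, 9, 3, 4]
l7 = list(set(l5))
print(l7) # e.g. [1, 2, 3, 4, 5, 9] (the original order is not preserved)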
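
Because a set has no defined order, the result of `list(set(...))` should not be relied on for ordering. The sketch below shows two common alternatives, assuming Python 3.7 or later for the insertion-ordered dict; the names `l_sorted` and `l_ordered` are illustrative.

```python
l5 = [5, 2, 1, 2, 3, 9, 3, 4]

# Sorted result, independent of how the set happens to be ordered
l_sorted = sorted(set(l5))
print(l_sorted)  # [1, 2, 3, 4, 5, 9]

# dict keys are unique and keep insertion order (Python 3.7+),
# so this keeps the order of first appearance
l_ordered = list(dict.fromkeys(l5))
print(l_ordered)  # [5, 2, 1, 3, 9, 4]
```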

Define and reference a dictionary.

d1 = {'A': 100, 'B': 200}
print(type(d1)) # <class 'dict'>
print(d1['A'])  # 100

Add, update, and delete dictionary items.

d1 = {'A': 100, 'B': 200}
d1['C'] = 300  # add
d1['A'] = 150  # update
del d1['B']    # delete
print(d1)      # {'A': 150, 'C': 300}

A dictionary method (keys()) and the in operator.

d1 = {'A': 100, 'B': 200}
print(d1.keys())      # dict_keys(['A', 'B'])
print('A' in d1.keys()) # True
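
A few related dictionary idioms are sketched below: `in` applied to the dictionary itself, `get` with a default, and looping over `items()`. These methods are standard, but the example and the names `missing` and `total` are additions for illustration.

```python
d1 = {'A': 100, 'B': 200}

# 'in' on the dictionary itself checks the keys
print('A' in d1)  # True

# get returns a default instead of raising KeyError for a missing key
missing = d1.get('Z', 0)
print(missing)  # 0

# items yields (key, value) tuples that can be unpacked in a for loop
total = 0
for key, value in d1.items():
    print(key, value)
    total += value
print(total)  # 300
```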

Function definitions

Define your own function (one argument, one return value).

def f(x):
    y = x ** 2 - 2 * x
    return y

print(f(3)) # 3

A function that takes multiple arguments.

def head(s, n):
    r = s[:n]
    return r

print(head('ABCDEFG', 4)) # ABCD

A function that returns multiple values (they are returned as a tuple).

def div(s):
    p = s.find('@')
    pre = s[:p]
    post = s[p + 1:]
    return pre, post

p1, p2 = div('abc@xyz.com')
print(p1, p2) # abc xyz.com
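
One caveat with this `div` function is that `find` returns -1 when `'@'` is absent, so the slices quietly produce a misleading result. The sketch below demonstrates that and offers `div_safe`, a hypothetical variant built on `str.partition`, as one possible guard.

```python
def div(s):
    p = s.find('@')
    pre = s[:p]
    post = s[p + 1:]
    return pre, post

# find returns -1 when '@' is missing, so the slices silently misbehave
print(div('abc'))  # ('ab', 'abc')

# div_safe is a hypothetical variant, not part of the original notes
def div_safe(s):
    pre, sep, post = s.partition('@')
    if sep == '':
        raise ValueError('no @ found in: ' + s)
    return pre, post

print(div_safe('abc@xyz.com'))  # ('abc', 'xyz.com')
```

Raising an error is only one design choice; returning `None` or an empty string for the missing part would work equally well depending on the data.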

A function with no return value.

def print_type_value(x):
    print('type: ', type(x))
    print('value: ', x)

print_type_value(123)
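
A function without a return statement still evaluates to something when called: it returns `None`. The short sketch below makes this visible; the variable `result` is illustrative.

```python
def print_type_value(x):
    print('type: ', type(x))
    print('value: ', x)

# Capturing the result of a function that has no return statement
result = print_type_value(123)
print(result)          # None
print(result is None)  # True
```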

A note on variables inside functions (scope).

def add2(x):
    # The function can read the variable z defined outside it
    w = x + z
    return w
    return w

z = 10
print(add2(1)) # 11
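
Reading an outer variable works, but assigning to the same name inside a function creates a separate local variable. The sketch below illustrates this; `shadow` and `add2_explicit` are hypothetical functions added for the example.

```python
z = 10

def shadow(x):
    # Assigning to z inside the function creates a new local variable;
    # the outer z is not modified
    z = 99
    return x + z

print(shadow(1))  # 100
print(z)          # 10

# Hypothetical alternative: pass the value in explicitly
def add2_explicit(x, offset):
    return x + offset

print(add2_explicit(1, z))  # 11
```

Passing values as arguments tends to make functions easier to test and reuse than relying on variables defined elsewhere.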

List creation with a comprehension (a concise notation).

cabins = ['B5', 'C22', 'G6']
# Build a new list from the first character of each element
floors = [x[0] for x in cabins]
print(floors) # ['B', 'C', 'G']
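
For comparison, the sketch below writes the same extraction as an ordinary for loop and then adds a comprehension with a condition. The names `floors_loop` and `small`, and the idea of filtering on the cabin number, are illustrative assumptions.

```python
cabins = ['B5', 'C22', 'G6']

# Equivalent for loop with append
floors_loop = []
for x in cabins:
    floors_loop.append(x[0])
print(floors_loop)  # ['B', 'C', 'G']

# A comprehension can also transform and filter:
# keep the numeric part of cabins whose number is below 10
small = [int(x[1:]) for x in cabins if int(x[1:]) < 10]
print(small)  # [5, 6]
```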
