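These are personal notes, written without other readers in mind. I skim the book repeatedly, so the notes get updated little by little. The bold parts and the parts with code are the points that caught my attention. The code is sometimes taken straight from the book, sometimes adapted from it, and sometimes annotated with my own comments.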
1. Pythonic Thinking
1. Know Which Version of Python
- There are two major versions of Python still in active use: Python 2 and Python 3.
- There are multiple popular runtimes for Python: CPython, Jython, IronPython, PyPy, etc.
- Be sure that the command-line executable for running Python on your system is the version you expect it to be.
- Prefer Python 3 for your next project because that is the primary focus of the Python community.
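One way to check at runtime which interpreter is actually running is the `sys` module. A minimal sketch; the `require_python3` helper is a made-up name for illustration:

```python
import sys


def require_python3():
    """Raise if the running interpreter is older than Python 3 (hypothetical helper)."""
    if sys.version_info < (3,):
        raise RuntimeError('Python 3 is required, found %s' % sys.version)
    return sys.version_info.major


print(sys.version_info)  # Structured version information
print(sys.version)       # Human-readable version string
```

On the command line, `python --version` and `python3 --version` report the same information for each executable.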
2. Follow the PEP 8 Style Guide
- Always follow the PEP 8 style guide when writing Python code.
- Sharing a common style with the larger Python community facilitates collaboration with others.
- Using a consistent style makes it easier to modify your own code later.
3. Know the Differences Between `bytes`, `str`, and `unicode`
- In Python 3, `bytes` contains sequences of 8-bit values, `str` contains sequences of Unicode characters. `bytes` and `str` instances can't be used together with operators (like `>` or `+`).
- In Python 2, `str` contains sequences of 8-bit values, `unicode` contains sequences of Unicode characters. `str` and `unicode` can be used together with operators if the `str` only contains 7-bit ASCII characters.
- Use helper functions to ensure that the inputs you operate on are the type of character sequence you expect (8-bit values, UTF-8 encoded characters, Unicode characters, etc.).
# Python 3
def to_str(bytes_or_str):
    if isinstance(bytes_or_str, bytes):
        value = bytes_or_str.decode('utf-8')
    else:
        value = bytes_or_str
    return value  # Instance of str
def to_bytes(bytes_or_str):
    if isinstance(bytes_or_str, str):
        value = bytes_or_str.encode('utf-8')
    else:
        value = bytes_or_str
    return value  # Instance of bytes
- If you want to read or write binary data to/from a file, always open the file using a binary mode (like `'rb'` or `'wb'`).
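A small sketch of the binary-mode point, assuming Python 3. The scratch file path is created only for this demonstration:

```python
import os
import tempfile

payload = b'\xf1\xf2\xf3\xf4\xf5'  # Raw bytes that are not valid UTF-8 text

# Hypothetical scratch file, used only for this demonstration
path = os.path.join(tempfile.mkdtemp(), 'random.bin')

text_mode_failed = False
try:
    with open(path, 'w') as f:   # Text mode expects str, not bytes
        f.write(payload)
except TypeError:
    text_mode_failed = True

with open(path, 'wb') as f:      # 'wb': bytes are written as-is
    f.write(payload)

with open(path, 'rb') as f:      # 'rb': bytes are read back unchanged
    data = f.read()
```

In Python 3 the text-mode write fails with a `TypeError`, while the `'wb'`/`'rb'` round trip returns exactly the bytes that were written.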
4. Write Helper Functions Instead of Complex Expressions
- Python's syntax makes it all too easy to write single-line expressions that are overly complicated and difficult to read.
- Move complex expressions into helper functions, especially if you need to use the same logic repeatedly.
- The `if`/`else` expression provides a more readable alternative to using Boolean operators like `or` and `and` in expressions.
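To make this concrete, here is a sketch that parses a query string. The `get_first_int` helper and the sample query string are illustrative choices, not a fixed API:

```python
from urllib.parse import parse_qs


def get_first_int(values, key, default=0):
    """Return the first value stored under key as an int, or default if missing or blank."""
    found = values.get(key, [''])
    if found[0]:
        return int(found[0])
    return default


my_values = parse_qs('red=5&blue=0&green=', keep_blank_values=True)

# Hard to read: everything crammed into one expression
red_one_liner = int(my_values.get('red', [''])[0] or 0)

# Easier to read and reusable
red = get_first_int(my_values, 'red')
green = get_first_int(my_values, 'green')      # Blank value falls back to the default
opacity = get_first_int(my_values, 'opacity')  # Missing key falls back to the default
```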
5. Know How to Slice Sequences
- Avoid being verbose: Don't supply `0` for the `start` index or the length of the sequence for the `end` index.
- Slicing is forgiving of `start` or `end` indexes that are out of bounds, making it easy to express slices on the front or back boundaries of a sequence (like `a[:20]` or `a[-20:]`).
- Assigning to a `list` slice will replace that range in the original sequence with what's referenced even if their lengths are different.
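The three points above in a few lines (the sample list is arbitrary):

```python
a = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']

first_four = a[:4]     # Preferred over a[0:4]
last_three = a[-3:]    # Preferred over a[5:len(a)]
forgiving = a[:20]     # Out-of-bounds end index is clamped, no IndexError

b = list(a)            # Work on a copy so `a` stays intact
b[2:7] = [99, 22, 14]  # Five items replaced by three; the list shrinks
```

Note that indexing a single out-of-range position (like `a[20]`) still raises `IndexError`; only slices are forgiving.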
6. Avoid Using `start`, `end`, and `stride` in a Single Slice
- Specifying `start`, `end`, and `stride` in a slice can be extremely confusing.
- Prefer using positive `stride` values in slices without `start` or `end` indexes. Avoid negative `stride` values if possible.
- Avoid using `start`, `end`, and `stride` together in a single slice. If you need all three parameters, consider doing two assignments (one to slice, another to stride) or using `islice` (see Item 46) from the `itertools` built-in module.
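A sketch of the alternatives to a three-part slice, using an arbitrary sample list:

```python
from itertools import islice

a = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']

every_other = a[::2]          # Stride only, no start or end

# Instead of a[2:-2:2], split the work into two steps
trimmed = a[2:-2]             # Slice first
strided = trimmed[::2]        # Then stride

# islice takes start, stop, and step, but rejects negative values
lazy = list(islice(a, 2, 6, 2))
```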
7. Use List Comprehensions Instead of `map` and `filter`
- List comprehensions are clearer than the `map` and `filter` built-in functions because they don't require extra `lambda` expressions.
- List comprehensions allow you to easily skip items from the input list, a behavior `map` doesn't support without help from `filter`.
- Dictionaries and sets also support comprehension expressions.
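Side by side, with made-up sample data:

```python
a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# map + filter need lambdas
even_squares_map = list(map(lambda x: x ** 2, filter(lambda x: x % 2 == 0, a)))

# The comprehension says the same thing without them
even_squares = [x ** 2 for x in a if x % 2 == 0]

# Dictionary and set comprehensions
ranks = {'ghost': 1, 'habanero': 2, 'cayenne': 3}
name_by_rank = {rank: name for name, rank in ranks.items()}
name_lengths = {len(name) for name in ranks}
```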
8. Avoid More Than Two Expressions in List Comprehensions
- List comprehensions support multiple levels of loops and multiple conditions per loop level.
- List comprehensions with more than two expressions are very difficult to read and should be avoided.
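A sketch of where I would draw the line (sample data is arbitrary): two expressions are still readable, while three levels read better as ordinary loops.

```python
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# Two expressions: still fine
flat = [x for row in matrix for x in row]
filtered = [x for x in flat if x > 4 if x % 2 == 0]

# Three levels of nesting: a plain loop is easier to follow
my_lists = [
    [[1, 2, 3], [4, 5, 6]],
    [[7, 8, 9], [10, 11, 12]],
]
flat_deep = []
for sublist1 in my_lists:
    for sublist2 in sublist1:
        flat_deep.extend(sublist2)
```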
9. Consider Generator Expressions for Large Comprehensions
- List comprehensions can cause problems for large inputs by using too much memory.
- Generator expressions avoid memory issues by producing outputs one at a time as an iterator.
- Generator expressions can be composed by passing the iterator from one generator expression into the `for` subexpression of another.
- Generator expressions execute very quickly when chained together.
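A sketch of chaining, where a short list stands in for what would normally be a large file or stream:

```python
lines = ['alpha', 'be', 'gamma rays', '']    # Stand-in for a large input

lengths = (len(line) for line in lines)      # Nothing is computed yet
roots = ((n, n ** 0.5) for n in lengths)     # Chained onto the first generator

first = next(roots)    # Advances both generators by exactly one item
rest = list(roots)     # Consumes whatever remains
```

Because generators are stateful, `roots` is exhausted after this; a second pass would need new generator expressions.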
10. Prefer `enumerate` Over `range`
- `enumerate` provides concise syntax for looping over an iterator and getting the index of each item from the iterator as you go.
- Prefer `enumerate` instead of looping over a `range` and indexing into a sequence.
- You can supply a second parameter to `enumerate` to specify the number from which to begin counting (zero is the default).
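Both styles next to each other (the list contents are arbitrary):

```python
flavors = ['vanilla', 'chocolate', 'pecan', 'strawberry']

# Looping over a range and indexing
by_range = []
for i in range(len(flavors)):
    by_range.append('%d: %s' % (i + 1, flavors[i]))

# enumerate with a start value of 1
by_enumerate = []
for i, flavor in enumerate(flavors, 1):
    by_enumerate.append('%d: %s' % (i, flavor))
```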
11. Use `zip` to Process Iterators in Parallel
- The `zip` built-in function can be used to iterate over multiple iterators in parallel.
- In Python 3, `zip` is a lazy generator that produces tuples. In Python 2, `zip` returns the full result as a list of tuples.
- `zip` truncates its output silently if you supply it with iterators of different lengths.
- The `zip_longest` function from the `itertools` built-in module lets you iterate over multiple iterators in parallel regardless of their lengths (see Item 46).
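A Python 3 sketch of the truncation behavior and of `zip_longest` (names in the list are sample data):

```python
from itertools import zip_longest

names = ['Cecilia', 'Lise', 'Marie']
letters = [len(n) for n in names]

longest_name, max_letters = None, 0
for name, count in zip(names, letters):
    if count > max_letters:
        longest_name, max_letters = name, count

names.append('Rosalind')                    # letters is now one item shorter
truncated = list(zip(names, letters))       # 'Rosalind' is silently dropped
padded = list(zip_longest(names, letters))  # Missing values are filled with None
```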
12. Avoid else Blocks After for and while Loops
- Python has special syntax that allows `else` blocks to immediately follow `for` and `while` loop interior blocks.
- The `else` block after a loop only runs if the loop body did not encounter a `break` statement.
- Avoid using `else` blocks after loops because their behavior isn't intuitive and can be confusing.
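A sketch comparing a loop `else` with a helper function that returns early. Both function names are illustrative:

```python
def coprime_with_else(a, b):
    for i in range(2, min(a, b) + 1):
        if a % i == 0 and b % i == 0:
            result = False
            break
    else:
        # Runs only when the loop finished without hitting break
        result = True
    return result


def coprime(a, b):
    # Same logic, but an early return makes the intent obvious
    for i in range(2, min(a, b) + 1):
        if a % i == 0 and b % i == 0:
            return False
    return True
```

The `else` block also runs when the loop body never executes at all (an empty sequence), which is one more reason the construct surprises readers.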
13. Take Advantage of Each Block in `try`/`except`/`else`/`finally`
- The `try`/`finally` compound statement lets you run cleanup code regardless of whether exceptions were raised in the `try` block.
- The `else` block helps you minimize the amount of code in `try` blocks and visually distinguish the success case from the `try`/`except` blocks.
- An `else` block can be used to perform additional actions after a successful `try` block but before common cleanup in a `finally` block.
import json
UNDEFINED = object()
def divide_json(path):
    handle = open(path, 'r+')  # May raise IOError
    try:
        data = handle.read()  # May raise UnicodeDecodeError
        op = json.loads(data)  # May raise ValueError
        value = (op['numerator'] / op['denominator'])  # May raise ZeroDivisionError
    except ZeroDivisionError as e:
        return UNDEFINED
    else:
        op['result'] = value
        result = json.dumps(op)
        handle.seek(0)
        handle.write(result)  # May raise IOError
        return value
    finally:
        handle.close()  # Always runs
`try` -> Error?
    no:  -> `else` -> `finally`
    yes: -> `ZeroDivisionError`? yes: -> `return UNDEFINED` -> `finally`
2. Functions
14. Prefer Exceptions to Returning None
- Functions that return `None` to indicate special meaning are error prone because `None` and other values (e.g., zero, the empty string) all evaluate to `False` in conditional expressions.
- Raise exceptions to indicate special situations instead of returning `None`. Expect the calling code to handle exceptions properly when they're documented.
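A sketch of the pitfall and the exception-based alternative. The function names are made up for the comparison:

```python
def divide_or_none(a, b):
    try:
        return a / b
    except ZeroDivisionError:
        return None


def divide_strict(a, b):
    try:
        return a / b
    except ZeroDivisionError as e:
        # Signal the special case explicitly instead of returning None
        raise ValueError('Invalid inputs') from e


result = divide_or_none(0, 5)   # A perfectly valid result: 0.0
looks_like_error = not result   # But a falsy check treats it like None
```

With `divide_strict`, the caller writes `try`/`except ValueError` and no longer needs to inspect the return value for a sentinel.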
15. Know How Closures Interact with Variable Scope
- Closure functions can refer to variables from any of the scopes in which they were defined.
- By default, closures can't affect enclosing scopes by assigning variables.
- In Python 3, use the `nonlocal` statement to indicate when a closure can modify a variable in its enclosing scopes.
def sort_priority(numbers, group):
    found = False
    def helper(x):
        if x in group:
            found = True  # <- Has no effect on the outer `found`: this defines a new local variable in helper's scope.
            return (0, x)
        return (1, x)
    numbers.sort(key=helper)
    return found  # Always False
# A better way
def sort_priority2(numbers, group):
    found = False
    def helper(x):
        nonlocal found  # <- Declares that `found` refers to the variable in the enclosing scope
        if x in group:
            found = True
            return (0, x)
        return (1, x)
    numbers.sort(key=helper)
    return found
- In Python 2, use a mutable value (like a single-item list) to work around the lack of the `nonlocal` statement.
- Avoid using `nonlocal` statements for anything beyond simple functions.
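A sketch of the mutable-value workaround. It is written for Python 2 but also runs unchanged on Python 3; the function name is illustrative:

```python
def sort_priority_mutable(numbers, group):
    found = [False]  # Single-item list used as a mutable cell
    def helper(x):
        if x in group:
            found[0] = True  # Mutates the list; no assignment to the name `found`
            return (0, x)
        return (1, x)
    numbers.sort(key=helper)
    return found[0]


numbers = [8, 3, 1, 2, 5, 4, 7, 6]
group = {2, 3, 5, 7}
was_found = sort_priority_mutable(numbers, group)
```

This works because reading `found` follows normal scope lookup to the enclosing function, and item assignment changes the list object rather than rebinding the name.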
16. Consider Generators Instead of Returning Lists
- Using generators can be clearer than the alternative of returning lists of accumulated results.
- The iterator returned by a generator produces the set of values passed to `yield` expressions within the generator function's body.
- Generators can produce a sequence of outputs for arbitrarily large inputs because their working memory doesn't include all inputs and outputs.
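The same function written both ways, as a sketch (the sample sentence is arbitrary): it finds the index of the first letter of every word.

```python
def index_words(text):
    # List version: accumulates every result before returning
    result = []
    if text:
        result.append(0)
    for index, letter in enumerate(text):
        if letter == ' ':
            result.append(index + 1)
    return result


def index_words_iter(text):
    # Generator version: no result list, values are produced on demand
    if text:
        yield 0
    for index, letter in enumerate(text):
        if letter == ' ':
            yield index + 1


address = 'Four score and seven years ago'
from_list = index_words(address)
from_generator = list(index_words_iter(address))
```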
17. Be Defensive When Iterating Over Arguments
- Beware of functions that iterate over input arguments multiple times. If these arguments are iterators, you may see strange behavior and missing values.
- Python's iterator protocol defines how containers and iterators interact with the `iter` and `next` built-in functions, `for` loops, and related expressions.
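A sketch of the "missing values" problem when a function iterates over its argument twice. The function names and numbers are made up for the demonstration:

```python
def normalize_naive(numbers):
    total = sum(numbers)  # First pass: exhausts an iterator completely
    return [100 * value / total for value in numbers]  # Second pass


def read_visits(values):
    # Stand-in for a generator that reads numbers from a file
    for value in values:
        yield value


visits = [15, 35, 80]
from_list = normalize_naive(visits)                   # Works: a list can be iterated twice
from_iterator = normalize_naive(read_visits(visits))  # Silently produces an empty list
```

No exception is raised in the second call, because iterating over an exhausted iterator is indistinguishable from iterating over an empty one.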
class SalesReader(object):
    def __init__(self, data_path):
        self.data_path = data_path
    # Implementing the __iter__ method as a generator.
    # Each call to __iter__ returns a new iterator object which implements the __next__ method.
    # The consumer in the caller (sum() and the for loop in normalize) repeatedly calls the next built-in function on that iterator object.
    def __iter__(self):
        with open(self.data_path) as f:
            for line in f:
                # Example file contents, one value per line: 5000, 3000, 1500, 500
                yield int(line)
def normalize(sales_reader):
    total = sum(sales_reader)  # First pass: iter() gives a fresh iterator
    result = []
    for s in sales_reader:  # Second pass: iter() gives another fresh iterator
        percent = 100 * s / total
        result.append(percent)
    return result
path = 'path/to/sales/file'
sales = SalesReader(path)
percentages = normalize(sales)
print(percentages)  # -> [50.0, 30.0, 15.0, 5.0]
- You can detect that a value is an iterator (instead of a container) if calling `iter` on it twice produces the same result, which can then be progressed with the `next` built-in function.
nums = [5000, 3000, 1500, 500]
iter(nums) is iter(nums)  # -> False (a container returns a new iterator each time)
sales = SalesReader('path/to/sales/file')
iter(sales) is iter(sales)  # -> False
i = iter(nums)
iter(i) is iter(i)  # -> True (an iterator returns itself)
def normalize_defensive(container):
    if iter(container) is iter(container):  # -> Making sure it's not an iterator
        raise TypeError('Must supply a container')
    total = sum(container)
    result = []
    for value in container:
        percent = 100 * value / total
        result.append(percent)
    return result
18. Reduce Visual Noise with Variable Positional Arguments
- Functions can accept a variable number of positional arguments by using `*args` in the `def` statement.
- You can use the items from a sequence as the positional arguments for a function with the `*` operator.
- Using the `*` operator with a generator may cause your program to run out of memory and crash. (-> The arguments are always turned into a tuple)
- Adding new positional parameters to functions that accept `*args` can introduce hard-to-find bugs.
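A sketch of `*args` in both directions. The `log` and `count_args` functions are illustrative:

```python
def log(message, *values):
    """Format a message followed by any number of extra values."""
    if not values:
        return message
    values_str = ', '.join(str(x) for x in values)
    return '%s: %s' % (message, values_str)


favorites = [7, 33, 99]
plain = log('Hi there')                         # No extra arguments required
listed = log('My numbers are', 1, 2)            # Extra positionals land in values
unpacked = log('Favorite numbers', *favorites)  # * spreads a sequence into positionals


def count_args(*args):
    # args is always a tuple, so a generator passed with * is consumed in full
    return type(args), len(args)


kind, size = count_args(*(i for i in range(10)))
```

If `log` later gained a new positional parameter before `*values`, every existing call with extra values would keep running but would shift its arguments by one, which is the hard-to-find bug the last bullet warns about.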
19. Provide Optional Behavior with Keyword Arguments
- Function arguments can be specified by position or by keyword.
- Keywords make it clear what the purpose of each argument is when it would be confusing with only positional arguments.
- Keyword arguments with default values make it easy to add new behaviors to a function, especially when the function has existing callers.
- Optional keyword arguments should always be passed by keyword instead of by position.
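A sketch of extending a function with optional keyword arguments without breaking existing callers. The function and its parameters are sample names:

```python
def flow_rate(weight_diff, time_diff, period=1, units_per_kg=1):
    """Return flow in units per period; the defaults give kilograms per second."""
    return ((weight_diff * units_per_kg) / time_diff) * period


weight_diff = 0.5
time_diff = 3

per_second = flow_rate(weight_diff, time_diff)             # Original call still works
per_hour = flow_rate(weight_diff, time_diff, period=3600)  # New behavior, opted into by keyword
pounds_per_hour = flow_rate(weight_diff, time_diff,
                            period=3600, units_per_kg=2.2)
```

Writing `flow_rate(weight_diff, time_diff, 3600, 2.2)` is also legal, but it hides which number is which; passing the optional values by keyword avoids that.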
20. Use `None` and Docstrings to Specify Dynamic Default Arguments
- Default arguments are only evaluated once: during function definition at module load time. This can cause odd behaviors for dynamic values (like `{}` or `[]`).
- Use `None` as the default value for keyword arguments that have a dynamic value. Document the actual default behavior in the function's docstring.
import json
def decode(data, default=None):
    """Load JSON data from a string.
    Args:
        data: JSON data to decode.
        default: Value to return if decoding fails.
            Defaults to an empty dictionary.
    """
    if default is None:
        default = {}
    try:
        return json.loads(data)
    except ValueError:
        return default
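For contrast, a sketch of what goes wrong with a mutable default. `decode_shared` is a deliberately broken variant written only for this demonstration:

```python
import json


def decode_shared(data, default={}):  # The {} is created once, at definition time
    try:
        return json.loads(data)
    except ValueError:
        return default


foo = decode_shared('bad data')
foo['stuff'] = 5
bar = decode_shared('also bad')
bar['meep'] = 1

same_object = foo is bar  # Both names point at the one shared default dict
```

Every failed decode returns the same dictionary, so a change made through one result shows up in all the others.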
21. Enforce Clarity with Keyword-Only Arguments
- Keyword arguments make the intention of a function call more clear.
- Use keyword-only arguments to force callers to supply keyword arguments for potentially confusing functions, especially those that accept multiple Boolean flags.
- Python 3 supports explicit syntax for keyword-only arguments in functions.
def division(number, divisor, *, ignore_overflow=False, ignore_zero_division=False):
    try:
        return number / divisor
    except OverflowError:
        if ignore_overflow:
            return 0
        else:
            raise
    except ZeroDivisionError:
        if ignore_zero_division:
            return float('inf')
        else:
            raise
- Python 2 can emulate keyword-only arguments for functions by using `**kwargs` and manually raising `TypeError` exceptions.
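A sketch of that `**kwargs` emulation. It targets Python 2 but also runs on Python 3; the function name is illustrative:

```python
def safe_division_kwargs(number, divisor, **kwargs):
    # Pull out the recognized options; anything left over is a caller error
    ignore_overflow = kwargs.pop('ignore_overflow', False)
    ignore_zero_division = kwargs.pop('ignore_zero_division', False)
    if kwargs:
        raise TypeError('Unexpected **kwargs: %r' % kwargs)
    try:
        return number / divisor
    except OverflowError:
        if ignore_overflow:
            return 0
        raise
    except ZeroDivisionError:
        if ignore_zero_division:
            return float('inf')
        raise


result = safe_division_kwargs(1, 0, ignore_zero_division=True)
```

Passing the flags positionally fails because the function only accepts two positional arguments, and a misspelled keyword is caught by the leftover check.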