Table of contents
- Overview of Python
- Basic syntax
- Datatypes
- 3.1 Bool
- 3.2 Numeric
- 3.3 List
- 3.4 String
- 3.5 Tuple
- 3.6 Dictionary
- 3.7 Set
- 3.8 Type conversion
- Control statements
- 4.1 IF statement
- 4.2 FOR loop
- 4.3 WHILE loop
- 4.4 ELSE clause
- 4.5 BREAK statement
- 4.6 CONTINUE statement
- 4.7 PASS statement
- Comprehension
- 5.1 List comprehension
- 5.2 Dict comprehension
- 5.3 Set comprehension
- Function
1. Overview of Python
- Python: a high-level, interpreted, general-purpose, multi-paradigm programming language
Python is well suited to science, data analysis & finance
- Well-designed syntax
- Good trade-off between rapid development & performance
- A rich ecosystem of libraries for these fields
  - Science: NumPy, SciPy, Statsmodels, IPython Notebook...
  - Machine Learning: scikit-learn, Theano, pylearn2...
  - NLP & Text processing: NLTK, Gensim, spaCy...
  - Finance: pandas, TA-Lib, Zipline...
Some famous services using Python:
- Rackspace
- Spotify
- Dropbox
- Reddit...
- For the full list of Python Success Stories, see https://www.python.org/about/success/
Recently, Python has also been used in trading
- Open platforms for algorithmic trading support Python extensively
- A large number of quantitative analysts, developers & companies use Python for finance
Lesson from Quora
- Traditional development process:
  - Researchers use Matlab, R or SAS for experiments
  - Engineers convert the experimental code to Java or C for production
- Process with Python: a good intermediate option
  - Do experiments with Python
  - Refactor the experimental code for production
Ref:
10 more lessons learned from building Machine Learning systems, Xavier Amatriain@Quora, The Machine Learning Conference
2. Basic syntax
2.1 Your first program in Python
print('Hello world')
Hello world
2.2 Indentation
- Python uses leading whitespace to mark the nesting level of code blocks, instead of BEGIN..END, { } or other markers used in other languages.
- Indentation may use spaces or tabs, but the two should not be mixed; PEP 8 recommends 4 spaces per level.
a = 3
if a > 0:
    print('Positive')
else:
    print('Negative')
Positive
def sum(a, b):
    result = a + b
    return result
print(sum(5, 8))
13
2.3 A semicolon is not required at the end of each statement
a = 5
b = 9
c = a * b
print(c)
45
2.4 snake_case should be used for naming
width_of_rect = 5
height_of_rect = 4
area_of_rect = width_of_rect * height_of_rect
print(area_of_rect)
20
def get_area(width, height):
    return width * height
print(get_area(5, 4))
20
2.5 Comments
- Use # to comment out one line of code
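- Use a triple-quoted string to comment out multiple lines (strictly speaking it is a string literal that Python evaluates and discards, not a true comment)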
# Calculate area of a rectangle
area_of_rect = width_of_rect * height_of_rect
p = (width_of_rect + height_of_rect) * 2 # perimeter of the rectangle
'''
Calculate area of a rectangle.
It is the product of the width & the height
'''
area_of_rect = width_of_rect * height_of_rect
"""
Calculate perimeter of a rectangle.
It is twice the sum of the width & the height
"""
perimeter_of_rect = (width_of_rect + height_of_rect) * 2
2.6 Operators
- Arithmetic operators
a = 5
b = 3
a + b # 8 (Addition)
a - b # 2 (Subtraction)
a * b # 15 (Multiplication)
a / b # 1 (Division: two ints give an int in Python 2; Python 3 returns 1.666...)
4.3 / 2 # 2.15
a // b # 1 (floor division)
4.3 // 2 # 2.0
a % b # 2 (Modulus)
a ** b # 125 (Exponentiation)
125
- Comparison operators
a > b # True
a <= b # False
a < b # False
a >= b # True
a == b # False
a != b # True
True
- Bitwise operators
a = 27 #11011
b = 14 #01110
a & b #01010 = 10 (Bitwise AND)
a | b #11111 = 31 (Bitwise OR)
a ^ b #10101 = 21 (Bitwise XOR)
~a #111..11100100 = -28 (Bitwise inversion)
-28
- Shift operators
a = 27 #0011011
a >> 2 #0000110 = 6 (Shift right)
a << 2 #1101100 = 108 (Shift left)
108
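The bit patterns written in the comments above can be checked with the built-in format function. This is a minimal sketch; the helper name to_bits is hypothetical and only used for illustration here.

```python
def to_bits(value, width=5):
    """Return a non-negative int as a zero-padded binary string (illustrative helper)."""
    return format(value, '0{}b'.format(width))

a = 27
b = 14
print(to_bits(a))          # 11011
print(to_bits(b))          # 01110
print(to_bits(a & b))      # 01010 -> bits set in both a and b
print(to_bits(a | b))      # 11111 -> bits set in a or b
print(to_bits(a ^ b))      # 10101 -> bits set in exactly one of a, b
print(~a == -a - 1)        # True: inverting an int x always gives -x - 1
print(to_bits(a >> 2, 7))  # 0000110
print(to_bits(a << 2, 7))  # 1101100
```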
3. Datatypes
3.1 Bool
- The bool datatype in Python can take 2 values: True and False
a = (5 == 6)
a
False
b = (5 == 5)
b
True
type(a)
bool
- Operations on boolean values
a and b
False
a or b
True
not a
True
3.2 Numeric
int
a = 100
type(a)
int
a = 10 ** 10
a
10000000000
type(a)
int
b = 10 ** 100
type(b)
long
import sys
sys.maxint
9223372036854775807
long
Python supports arbitrarily large integer values through the long datatype (in Python 3, int and long are merged into a single unbounded int type)
unbound_value = 10 ** 100000
type(unbound_value)
long
another_unbound_value = 2 ** 10000000
# not enough space & time to print out this huge value
type(another_unbound_value)
long
float
a = 3.432
type(a)
float
b = a ** 334.32
b
1.1070478028957259e+179
type(b)
float
sys.float_info
sys.float_info(max=1.7976931348623157e+308, max_exp=1024, max_10_exp=308, min=2.2250738585072014e-308, min_exp=-1021, min_10_exp=-307, dig=15, mant_dig=53, epsilon=2.220446049250313e-16, radix=2, rounds=1)
10 / 3 # integer division (Python 2 behaviour; in Python 3 write 10 // 3)
3
To get a float result when dividing integers, make one operand a float
10.0 / 3
3.3333333333333335
10 / float(3)
3.3333333333333335
Infinity value
infinity = float("inf")
infinity / 3
inf
infinity * 100
inf
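The limits described in this section can be probed directly. The sketch below assumes Python 3 (where the unbounded integer type is simply int) and uses only the standard math and sys modules; the variable names are illustrative.

```python
import math
import sys

# Integers grow as needed: 10 ** 100 has 101 digits and arithmetic stays exact.
big = 10 ** 100
print(len(str(big)))        # 101
print(big + 1 > big)        # True

# Floats are bounded: multiplying past the maximum overflows to infinity.
overflowed = sys.float_info.max * 10
print(math.isinf(overflowed))   # True

infinity = float('inf')
print(infinity > big)           # True: inf compares greater than any integer

# inf - inf is undefined, so the result is "not a number" (nan).
not_a_number = infinity - infinity
print(math.isnan(not_a_number))      # True
print(not_a_number == not_a_number)  # False: nan is never equal to itself
```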
3.3 List
Creating list
- A list is an ordered sequence of items
list_of_ints = [1,2,3,4,5,6,7,8]
list_of_strings = ['one', 'two', 'three']
- A list in Python may contain items of different types, and even nested lists.
mixed_list = ['a', 1, 2.323, ['sublist', 3, ['nested_list', 2.5]]]
Accessing list
my_list = [1, 3, 7, 2, 10] # List in Python is 0-based
len(my_list)
5
my_list[0] # the first item
my_list[4] # the last item
10
- Access the last item
my_list[-1]
10
A sublist can be accessed through the slice operator ":"
- my_list[i:j]: extracts a sublist with items from $i^{th}$ to $(j-1)^{th}$
# [1, 3, 7, 2, 10]
my_list[1:3] # Get items at index 1 -> 2
[3, 7]
- my_list[:j]: returns a sublist from the beginning to $(j-1)^{th}$
my_list[:3] # Get items at index 0 -> 2
[1, 3, 7]
- my_list[i:]: returns a sublist from $i^{th}$ to the end
# [1, 3, 7, 2, 10]
my_list[2:] # Get items at index 2 -> 4 (end of the list)
[7, 2, 10]
- Omitting both indices returns a copy of the whole list
# [1, 3, 7, 2, 10]
my_list[:]
[1, 3, 7, 2, 10]
- All items except the last one
my_list[:-1]
[1, 3, 7, 2]
Advanced slice operator:
- [i:j:k]: slice from $i^{th}$ to $(j-1)^{th}$ with step k
- Reverse a list
# [1, 3, 7, 2, 10]
my_list[::-1]
[10, 2, 7, 3, 1]
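The step form [i:j:k] covers more than reversal. A short sketch on the same my_list, with illustrative variable names, showing a few other step values:

```python
my_list = [1, 3, 7, 2, 10]

every_second = my_list[::2]      # indices 0, 2, 4
odd_positions = my_list[1::2]    # indices 1, 3
bounded = my_list[1:4:2]         # indices 1, 3 (index 4 is excluded)
backwards = my_list[3:0:-1]      # indices 3, 2, 1 (index 0 is excluded)

print(every_second)   # [1, 7, 10]
print(odd_positions)  # [3, 2]
print(bounded)        # [3, 2]
print(backwards)      # [2, 7, 3]
```

With a negative step the slice walks from i down to j+1, so the start index must be larger than the stop index.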
Check the existence of items in list
- x in list: check the existence (boolean)
- x not in list: check the non-existence (boolean)
# [1, 3, 7, 2, 10]
10 in my_list
True
15 in my_list
False
Iterating list
# [1, 3, 7, 2, 10]
for item in my_list:
    print(item)
1
3
7
2
10
total = 0 # named total so the built-in sum function is not shadowed
for i in range(len(my_list)):
    total += my_list[i]
total
23
- Note that a nested list counts as a single item when we traverse the parent list
parent_list = ['chicken', 'buffalo', ['cat', 'dog'], 'pork']
for (i, item) in enumerate(parent_list):
    print(str(i) + ': ' + str(item))
0: chicken
1: buffalo
2: ['cat', 'dog']
3: pork
Delete items from list
- del listVar[i]: delete the $i^{th}$ item from the list
- del listVar[i:j]: delete the items from $i^{th}$ to $(j-1)^{th}$ from the list
# [1, 3, 7, 2, 10]
del my_list[2]
my_list
[1, 3, 2, 10]
del my_list[1:3]
my_list
[1, 10]
Methods on list
- list.append(value): appends an element to the end of the list
- list.count(x): counts the number of occurrences of x in the list
- list.index(x): returns the index of the first occurrence of x in the list
- list.insert(i, x): inserts x at the $i^{th}$ position
- list.pop(): removes the last element from the list and returns it
- list.remove(x): finds and removes the first occurrence of x from the list
- list.reverse(): reverses the elements of the list in place
- list.sort(): sorts the list in place in ascending order (alphabetical for strings, numerical for numbers)
- list1 + list2: concatenates 2 lists into a new list
my_list = [1, 3, 7, 2, 10]
my_list.append(3)
my_list
[1, 3, 7, 2, 10, 3]
my_list.count(3)
2
my_list.index(3)
# my_list.index(100) -> ValueError because 100 does not exist in my_list
1
# [1, 3, 7, 2, 10, 3]
my_list.insert(3, 50)
my_list
[1, 3, 7, 50, 2, 10, 3]
a = my_list.pop()
a # my_list will be: [1, 3, 7, 50, 2, 10]
3
my_list.append(3) # [1, 3, 7, 50, 2, 10, 3]
my_list.remove(3) # removes the first occurrence only
my_list
[1, 7, 50, 2, 10, 3]
my_list.reverse()
my_list
[3, 10, 2, 50, 7, 1]
my_list.sort()
my_list
[1, 2, 3, 7, 10, 50]
total_list = my_list + [3, 5, 4]
total_list
[1, 2, 3, 7, 10, 50, 3, 5, 4]
- Note that when we assign a list to a new variable, both variables refer to the same list object. So every change made through one variable is visible through the other.
new_list = my_list
new_list
[1, 2, 3, 7, 10, 50]
my_list.remove(3)
new_list
[1, 2, 7, 10, 50]
- If we want an independent copy of the list, we should use the list function (this makes a shallow copy: nested lists are still shared)
new_list = list(my_list)
new_list
[1, 2, 7, 10, 50]
my_list.remove(10)
my_list
[1, 2, 7, 50]
new_list
[1, 2, 7, 10, 50]
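The difference between sharing a reference and copying matters most when a list contains other lists. The sketch below compares plain assignment, the list function, and copy.deepcopy from the standard library (deepcopy is not used in the original notes; it is shown here only for contrast).

```python
import copy

original = [1, 2, ['cat', 'dog']]
alias = original                 # same list object, just another name
shallow = list(original)         # new outer list, but the inner list is shared
deep = copy.deepcopy(original)   # outer and inner lists are both new

original.append(3)               # changes the outer list only
original[2].append('pig')        # changes the shared inner list

print(alias)    # [1, 2, ['cat', 'dog', 'pig'], 3]
print(shallow)  # [1, 2, ['cat', 'dog', 'pig']]
print(deep)     # [1, 2, ['cat', 'dog']]
```

For flat lists of numbers or strings, list(my_list) is enough; deepcopy is only needed when nested mutable items must be independent too.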
3.4 String
A string works as a sequence of characters, like a list
animal = 'tiger'
animal[3] # character at index 3 (the 4th character)
'e'
len(animal)
5
animal[-1] # last character
'r'
Access characters of a string through the slice operator, as with a list
animal[1:3] # ig
animal[:3] # tig
animal[1:] # iger
animal[:] # tiger
'tiger'
Scan through all characters of a string
for l in animal:
    print(l)
t
i
g
e
r
Basic functions on string
- string.count('x') - counts the number of occurrences of 'x' in the string
- string.find('x') - returns the index of the first occurrence of 'x' (or -1 if it is not found)
- string.lower() - returns a lowercase copy of the string (the original is unchanged)
- string.upper() - returns an uppercase copy of the string (the original is unchanged)
- string.replace('a', 'b') - returns a copy with all occurrences of 'a' replaced by 'b'
- string.strip() - returns a copy with leading/trailing white space removed
animal = 'elephant'
animal.count('e')
2
animal.find('e')
0
animal.upper()
'ELEPHANT'
'ELEPHANT'.lower()
'elephant'
animal.replace('e', 'a')
'alaphant'
animal = ' elephant '
animal.strip()
'elephant'
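Because strings are immutable, every method above returns a new string and leaves the original untouched. A minimal sketch, assuming Python 3 for the print call; the variable names are illustrative.

```python
animal = ' Elephant '

# Methods can be chained; each call returns a new string.
cleaned = animal.strip().lower()
print(repr(animal))        # ' Elephant ' -> the original is unchanged
print(cleaned)             # elephant
print(cleaned.find('z'))   # -1 -> find reports a missing character with -1

# Assigning to a single character is not allowed.
assignment_failed = False
try:
    cleaned[0] = 'E'
except TypeError as error:
    assignment_failed = True
    print('Cannot modify a string in place:', error)

# Build a new string instead.
capitalized = 'E' + cleaned[1:]
print(capitalized)         # Elephant
```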
3.5 Tuple
- A tuple in Python is an immutable sequence of values. Values of a tuple are separated by commas.
tup = ('one', 1) #2-tuple
tup
('one', 1)
tup = ('John', 25, 180) #3-tuple
tup
('John', 25, 180)
- Like a list, values of a tuple can be accessed through an index
tup[0]
'John'
tup[2]
180
- Get number of elements in tuple
len(tup)
3
- However, we cannot delete or update elements of a tuple because a tuple is immutable
# del tup[1] -> error
# tup[1] = 26 -> error
- Loop through elements in a tuple
for item in tup:
    print(item)
John
25
180
3.6 Dictionary
A dictionary in Python is a datatype that maps hashable keys to arbitrary objects.
A dictionary works like a hash table in other languages.
Creating dictionary
- Create a dictionary from key/value pairs. If keys are strings, quotes are required
my_dict = {'one': 1, 'two': 2, 'three': 3}
my_dict
{'one': 1, 'three': 3, 'two': 2}
- Dictionary keys may be integers or floats
another_dict = {1: 'one', 2: 'two', 3: 'three', 3.14: 'pi'}
another_dict
{1: 'one', 2: 'two', 3: 'three', 3.14: 'pi'}
- Create a dictionary from the dict function with named parameters.
my_dict = dict(one=1, two=2, three=3)
my_dict
{'one': 1, 'three': 3, 'two': 2}
- The dict function also accepts a dictionary of key/value pairs
my_dict = dict({'one': 1, 'two': 2, 'three': 3})
my_dict
{'one': 1, 'three': 3, 'two': 2}
- Create dictionary from a list of tuples
my_dict = dict([('one', 1), ('two', 2), ('three', 3)])
my_dict
{'one': 1, 'three': 3, 'two': 2}
Accessing dictionary values
- Get value from a key
my_dict['one']
1
- Check the existence of keys
'three' in my_dict
True
'four' in my_dict
False
- Get number of pairs in dictionary
len(my_dict)
3
Changing dictionary
- Update value of a key
my_dict['one'] = 'ichi'
my_dict
{'one': 'ichi', 'three': 3, 'two': 2}
- Add new key to a dictionary
my_dict['four'] = 4
my_dict
{'four': 4, 'one': 'ichi', 'three': 3, 'two': 2}
- Delete values from dictionary
del my_dict['two']
my_dict
{'four': 4, 'one': 'ichi', 'three': 3}
Iterating in dictionary
for k in my_dict:
    print(k + ': ' + str(my_dict[k]))
four: 4
three: 3
one: ichi
for k,v in my_dict.items():
    print(k + ': ' + str(v))
four: 4
three: 3
one: ichi
Functions on dictionary
- dict.keys(): get all keys of the dictionary (returns a list; in Python 3, a view object)
- dict.values(): get all values of the dictionary (returns a list; in Python 3, a view object)
- dict.items(): get a list of key/value pairs (in Python 3, a view object)
- dict.iteritems(): get an item iterator for the dictionary. This is used for iterating only (Python 2 only).
- dict.iterkeys(): get a key iterator for the dictionary. This is used for iterating only (Python 2 only).
- dict.itervalues(): get a value iterator for the dictionary. This is used for iterating only (Python 2 only).
- dict.update(new_dict): update a dictionary with some new key/value pairs
- dict.copy(): clone the dictionary; the returned dict is independent of the original one
- dict.clear(): delete all keys, i.e. reset the dictionary
my_dict = {'one': 1, 'two': 2, 'three': 3}
my_dict.keys()
['three', 'two', 'one']
my_dict.values()
[3, 2, 1]
my_dict.items()
[('three', 3), ('two', 2), ('one', 1)]
for k, v in my_dict.iteritems():
    print(k + ': ' + str(v))
three: 3
two: 2
one: 1
for k in my_dict.iterkeys():
    print(k)
three
two
one
for v in my_dict.itervalues():
    print(v)
3
2
1
my_dict.update(one='ichi', five='go')
my_dict
{'five': 'go', 'one': 'ichi', 'three': 3, 'two': 2}
my_dict.update({'two':'ni', 'six':'roku'})
my_dict
{'five': 'go', 'one': 'ichi', 'six': 'roku', 'three': 3, 'two': 'ni'}
- Note that when we assign a dict to a new variable, both variables refer to the same dict object. So every change made through one variable is visible through the other.
new_dict = my_dict
my_dict['one'] = 1
my_dict
{'five': 'go', 'one': 1, 'six': 'roku', 'three': 3, 'two': 'ni'}
new_dict
{'five': 'go', 'one': 1, 'six': 'roku', 'three': 3, 'two': 'ni'}
- If we want an independent copy of the dict, we should use the copy function
new_dict = my_dict.copy()
my_dict.clear()
my_dict
{}
new_dict
{'five': 'go', 'one': 1, 'six': 'roku', 'three': 3, 'two': 'ni'}
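The iteritems, iterkeys and itervalues methods shown in this section exist only in Python 2. As a hedged sketch, assuming Python 3, the same tasks can be written with keys, values and items, which return lazy view objects there; sorted is used only to make the printed order predictable.

```python
my_dict = {'one': 1, 'two': 2, 'three': 3}

# Views can be turned into sorted lists when a concrete list is needed.
keys = sorted(my_dict.keys())
values = sorted(my_dict.values())
pairs = sorted(my_dict.items())
print(keys)    # ['one', 'three', 'two']
print(values)  # [1, 2, 3]
print(pairs)   # [('one', 1), ('three', 3), ('two', 2)]

# Iterating over items() replaces iteritems().
for k, v in sorted(my_dict.items()):
    print(k + ': ' + str(v))

# In Python 3 a view stays connected to the dictionary it came from.
live_keys = my_dict.keys()
my_dict['four'] = 4
print('four' in live_keys)   # True in Python 3
```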
3.7 Set
A set can be thought of as a dictionary that stores keys only, without values
- Creating set
my_set = set()
my_set = {1, 5, 3, 8}
my_set = set([1, 5, 3, 8])
my_set
{1, 3, 5, 8}
- Add item to set
my_set.add(9)
my_set
{1, 3, 5, 8, 9}
- Remove an item from set
my_set.remove(8)
my_set
{1, 3, 5, 9}
- Note that a set does not keep its items in order, so we cannot access set items by index.
# my_set[1] -> error
- Get number of items from set
len(my_set)
4
Operations on set
- a & b: return the intersection of a & b, <=> a.intersection(b)
- a | b: return the union of a & b, <=> a.union(b)
- a - b: return the difference of a & b, <=> a.difference(b)
- a ^ b: return the symmetric difference of a & b (items in exactly one of them), <=> a.symmetric_difference(b)
- a == b: return whether a & b are the same set
- a <= b: check whether a is a subset of b or not, <=> a.issubset(b)
- a >= b: check whether a is a superset of b or not, <=> a.issuperset(b)
a = {1, 3, 4, 5}
b = {1, 3, 6}
a & b
{1, 3}
a | b
{1, 3, 4, 5, 6}
a - b
{4, 5}
a ^ b
{4, 5, 6}
a == b
False
# a = {1, 3, 4, 5}
c = {1, 3, 4, 5}
a == c
True
a = {1, 3, 4, 5}
b = {1, 3, 6}
a <= b
False
d = {1, 5, 4, 7, 3}
a <= d
True
a <= c
True
a > c
False
a >= c
True
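The method forms listed above give the same results as the operators, with one practical difference: the methods accept any iterable, while the operators require both sides to be sets. A short sketch with illustrative data:

```python
a = {1, 3, 4, 5}
b = {1, 3, 6}

# Each operator has an equivalent method.
print(a.intersection(b) == (a & b))          # True
print(a.union(b) == (a | b))                 # True
print(a.difference(b) == (a - b))            # True
print(a.symmetric_difference(b) == (a ^ b))  # True
print(a.issubset({1, 5, 4, 7, 3}))           # True

# Methods also accept lists or other iterables.
merged = sorted(a.union([6, 7]))
print(merged)                                # [1, 3, 4, 5, 6, 7]

# A common use of sets: removing duplicates from a list.
words = ['thud', 'meow', 'thud', 'hiss']
unique_words = sorted(set(words))
print(unique_words)                          # ['hiss', 'meow', 'thud']
```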
3.8 Type conversion
int('5')
5
float('5.6')
5.6
str(5.7)
'5.7'
str(7)
'7'
str([1, 3, 5])
'[1, 3, 5]'
set([1, 3, 5, 3])
{1, 3, 5}
list({'one': 1, 'two': 2, 'three': 3})
['three', 'two', 'one']
list({'one', 'two', 'three'})
['one', 'three', 'two']
list(('one', 'two', 'three'))
['one', 'two', 'three']
tuple([1, 3, 5, 3])
(1, 3, 5, 3)
dict([('one', 1), ('two', 2), ('three', 3)])
{'one': 1, 'three': 3, 'two': 2}
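Conversions fail with an exception when the input does not have the expected form, for example int('5.6') raises ValueError. The sketch below wraps int in a small helper; the name to_int and its default parameter are hypothetical, not part of the original notes.

```python
def to_int(text, default=None):
    """Convert text to int, returning default when conversion is impossible."""
    try:
        return int(text)
    except (TypeError, ValueError):
        # ValueError: malformed string such as 'abc' or '5.6'
        # TypeError: unsupported input type such as None
        return default

print(to_int('5'))         # 5
print(to_int('5.6'))       # None -> int() does not parse decimal strings
print(to_int('abc', 0))    # 0
print(to_int(None, -1))    # -1

# Going through float first works, and int() then truncates toward zero.
print(int(float('5.6')))   # 5
print(int(-5.6))           # -5
```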
4. Control statements
4.1 IF statement
- if .. else
- if .. elif .. else: substitute for the switch .. case statement in other languages
a = 5
if a == 5:
    print('a is five')
else:
    print('a is not five')
a is five
a = 6
if a == 5:
    print('a is five')
elif a > 5:
    print('a is greater than five')
else:
    print('a is less than five')
a is greater than five
- Logical expressions in a conditional statement can be combined with the logical operators and, or, not
a = 5
b = 7
if a == 5 and b > 6:
    print("Matched!")
else:
    print("NOT Matched!")
Matched!
if not (a == 5) or (b <= 6):
    print("Matched!")
else:
    print("NOT Matched!")
NOT Matched!
4.2 FOR loop
- Loop through 0 -> n-1
n = 3
for i in range(n):
    print(i)
0
1
2
- The range function returns the integers in a given interval (a list in Python 2; a lazy range object in Python 3)
range(5)
[0, 1, 2, 3, 4]
range(2, 5)
[2, 3, 4]
- Loop through items of a list
words = ['We', 'are', 'learning', 'Python']
for w in words:
    print(w)
We
are
learning
Python
4.3 WHILE loop
i = 0
while i < 4:
    print(i)
    i += 1
0
1
2
3
4.4 ELSE clause
- The ELSE clause in a FOR loop is executed once after the last iteration, provided the loop was not exited with BREAK
for w in words:
    print(w)
else:
    print('Out of word')
We
are
learning
Python
Out of word
- The ELSE clause in a WHILE loop is executed once when the loop condition becomes false, provided the loop was not exited with BREAK
i = 0
while i < 4:
    print(i)
    i += 1
else:
    print('Finished while')
0
1
2
3
Finished while
4.5 BREAK statement
- Exit the current loop
for i in range(1, 5):
    if i % 3 == 0:
        break
    else:
        print(i)
1
2
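BREAK also interacts with the loop ELSE clause from section 4.4: when a loop is left through break, its else block is skipped. This makes for .. else a compact way to write a search. The function name find_first_multiple below is hypothetical, chosen only for this sketch.

```python
def find_first_multiple(numbers, divisor):
    """Return the first item divisible by divisor, or None if there is none."""
    for n in numbers:
        if n % divisor == 0:
            result = n
            break              # found: the else block below is skipped
    else:
        result = None          # loop ended without break: nothing was found
    return result

print(find_first_multiple([1, 2, 3, 4], 3))   # 3
print(find_first_multiple([1, 2, 4], 3))      # None
print(find_first_multiple([], 3))             # None -> else also runs for an empty loop
```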
4.6 CONTINUE statement
- Ignore the rest of code in current iteration and jump to the next iteration
for i in range(5):
    if i % 2 == 0:
        continue
    else:
        print(i)
        # Do something more here
1
3
4.7 PASS statement
- pass is a statement that does nothing. It is used in a place that requires at least 1 statement when we don't want to do anything there
# Here, I want to demonstrate the syntax of the FOR loop but I don't want to output anything, to save space
for i in range(10):
    pass
5. Comprehension
5.1 List comprehension
x = [1, 2, 4, 5]
x_square = [i * i for i in x]
x_square
[1, 4, 16, 25]
The syntax of the above comprehension is very close to the set-builder definition of x_square in math:
$$x_{square} = \{\, i^2 \mid i \in x \,\}$$
city = 'tokyo'
upper = [s.capitalize() for s in city]
upper
['T', 'O', 'K', 'Y', 'O']
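- Using if in a list comprehension. In case the if clause is placed before the for clause (a conditional expression), else is required.
contents = ['農林水産省', '年度', '食料自給率','消費'] # Japanese for: Ministry of Agriculture, Forestry and Fisheries; fiscal year; food self-sufficiency rate; consumption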
indices = [i if w == '消費' else None for i, w in enumerate(contents)]
print(indices)
[None, None, None, 3]
- In case the if clause is placed after the for clause (a filter), else can be omitted.
indices = [i for i, w in enumerate(contents) if w == '消費']
print(indices)
[3]
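The two positions of if do different jobs: placed after for it filters items out, placed before for it chooses a value for every item. The sketch below spells out both comprehensions as ordinary loops, using a small ASCII list as illustrative data.

```python
contents = ['a', 'b', 'c', 'b']

# Equivalent of: [i for i, w in enumerate(contents) if w == 'b']
filtered = []
for i, w in enumerate(contents):
    if w == 'b':               # filter: non-matching items are dropped
        filtered.append(i)

# Equivalent of: [i if w == 'b' else None for i, w in enumerate(contents)]
mapped = []
for i, w in enumerate(contents):
    mapped.append(i if w == 'b' else None)   # every item yields a value

print(filtered)   # [1, 3]
print(mapped)     # [None, 1, None, 3]
```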
5.2 Dict comprehension
upper_count = {s.capitalize(): city.count(s) for s in city}
upper_count
{'K': 1, 'O': 2, 'T': 1, 'Y': 1}
5.3 Set comprehension
upper_set = {s.capitalize() for s in city}
upper_set
{'K', 'O', 'T', 'Y'}
6. Function
- Use def to define a function. The syntax is as follows:
def function_name(parameters):
    command block
Function without parameter
def simple_func():
    print("First line from simple_func")
    print("Second line from simple_func")
simple_func()
First line from simple_func
Second line from simple_func
Function with parameter
def sum(a, b):
    return a + b
sum(10, 5)
15
Default parameter value
def sum_with_def(a, b=0):
    return a + b
sum_with_def(10)
10
Positional arguments: order of parameters is used
def menu(drink, entree, dessert):
    print('Drink: %s' % drink)
    print('Entree: %s' % entree)
    print('Dessert: %s' % dessert)
my_menu = menu('champagne', 'chicken', 'cake')
Drink: champagne
Entree: chicken
Dessert: cake
Keyword arguments: name of parameters is used
mime = menu(dessert='cake', drink='champagne', entree='chicken')
Drink: champagne
Entree: chicken
Dessert: cake
Gather positional arguments with *
def print_pos_argument(*args): # args will be a tuple
    print(args)
print_pos_argument(1, 3, 'abc')
print_pos_argument('champagne', 'chicken', 'cake')
(1, 3, 'abc')
('champagne', 'chicken', 'cake')
Gather keyword arguments with **
def print_keyword_arguments(**args): # args will be a dictionary
    print(args)
print_keyword_arguments(dessert='cake', drink='champagne', entree='chicken')
{'dessert': 'cake', 'drink': 'champagne', 'entree': 'chicken'}
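Regular parameters, * and ** can be combined in one signature, and the same symbols work in reverse at the call site to unpack a tuple or a dictionary into arguments. The function describe_order and its parameter names are hypothetical, made up for this sketch.

```python
def describe_order(customer, *dishes, **options):
    """Build a one-line summary from mixed positional and keyword arguments."""
    summary = customer + ' ordered ' + ', '.join(dishes)
    for name in sorted(options):      # sorted for a predictable order
        summary += '; ' + name + '=' + str(options[name])
    return summary

print(describe_order('John', 'chicken', 'cake', drink='champagne', table=4))
# John ordered chicken, cake; drink=champagne; table=4

# * and ** at the call site unpack existing collections into arguments.
dishes = ('chicken', 'cake')
options = {'drink': 'champagne'}
print(describe_order('John', *dishes, **options))
# John ordered chicken, cake; drink=champagne
```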
Function as parameter
def add(a, b):
    return a + b
def call_something(func, a, b):
    return func(a, b)
value = call_something(add, 4, 5)
value
9
Inner function
def twice_of_sum(a, b):
    def sum(c, d):
        return c + d
    return a + b + sum(a, b)
value = twice_of_sum(4, 5)
value
18
Lambda: anonymous function
g = lambda x: x ** 2
g(3)
9
def edit_story(words, func):
    for w in words:
        print(func(w))
stairs = ['thud', 'meow', 'thud', 'hiss']
edit_story(stairs, lambda x: x.capitalize() + '!') # Thud! \n Meow! \n Thud! \n Hiss!
Thud!
Meow!
Thud!
Hiss!
map(lambda x: x * x, [1, 2, 3, 4])
[1, 4, 9, 16]
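The list output above is what Python 2 prints. As a hedged sketch, assuming Python 3, map (like range) returns a lazy object instead, so it is wrapped in list to see the items; a list comprehension gives the same result.

```python
squares = map(lambda x: x * x, [1, 2, 3, 4])   # a lazy iterator in Python 3
squares_list = list(squares)
print(squares_list)                  # [1, 4, 9, 16]

# The iterator is consumed by the first pass over it.
second_pass = list(squares)
print(second_pass)                   # [] in Python 3

# The same result written as a list comprehension.
squares_comp = [x * x for x in [1, 2, 3, 4]]
print(squares_comp)                  # [1, 4, 9, 16]

# range behaves similarly: wrap it in list() to materialize the numbers.
print(list(range(2, 5)))             # [2, 3, 4]
```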