4
6

Delete article

Deleted articles cannot be recovered.

Draft of this article would be also deleted.

Are you sure you want to delete this article?

More than 5 years have passed since last update.

Azure SQL Serverのデータテーブルをpyodbcで抽出して、numpy array/pandas dataframeにしてみる

Last updated at Posted at 2017-03-13

pyodbcというpythonライブラリで、Azure SQL Server内のデータテーブルを引っこ抜くまでが出来たところから、そのテーブルをnumpyarray形式、もしくはpandasDataFrame形式に変換するところのメモです。

ライブラリ、環境、関数の定義はこっちに書いてあります
、、ので、そちらをまずはご参照下さい。

array/DataFrame形式にするための準備

## libraries
import pyodbc
import numpy as np
import pandas as pd

## initial setting
## 適宜設定値は変えて下さい
server = 'abc_server.database.windows.net'  
database = 'abc_database'  
username = 'abc_user'  
password = 'abc_password'  

## 関数定義
### DB connectionを定義
def db_connection(sv=server, db=database, un=username, pw=password):    
    cnxn = pyodbc.connect('DRIVER={ODBC Driver 13 for SQL Server};SERVER='+sv+';DATABASE='+db+';UID='+un+';PWD='+ pw)
    return cnxn.cursor()

### SQLを発行
def query_output(sql):
    cursor.execute(sql)
    row = cursor.fetchone()
    while row:  
        print row[0]  
        row = cursor.fetchone()

→設定の意味などは、試行前提などをご参照下さい。

Case 2: テーブルの中身をpandas dataframeに突っ込む

### DB connectionしときます
cursor = db_connection()

### SQLの設定
sql = 'select * from SalesLT.Customer;'

cursor.execute(sql)
rows = cursor.fetchall()

この辺でrowsの中身を確認します:

>>> rows

[(1, False, u'Mr.', u'Orlando', u'N.', u'Gee', None, u'A Bike Store', u'adventure-works\\pamela0', u'orlando0@adventure-works.com', u'245-555-0173', u'L/Rlwxzp4w7RWmEgXX+/A7cXaePEPcp+KwQhl2fJL7w=', u'1KjXYs4=', u'3F5AE95E-B87D-4AED-95B4-C3797AFCB74F', datetime.datetime(2005, 8, 1, 0, 0)),
 (2, False, u'Mr.', u'Keith', None, u'Harris', None, u'Progressive Sports', u'adventure-works\\david8', u'keith0@adventure-works.com', u'170-555-0127', u'YPdtRdvqeAhj6wyxEsFdshBDNXxkCXn+CRgbvJItknw=', u'fs1ZGhY=', u'E552F657-A9AF-4A7D-A645-C429D6E02491', datetime.datetime(2006, 8, 1, 0, 0)),
 (3, False, u'Ms.', u'Donna', u'F.', u'Carreras', None, u'Advanced Bike Components', u'adventure-works\\jillian0', u'donna0@adventure-works.com', u'279-555-0130', u'LNoK27abGQo48gGue3EBV/UrlYSToV0/s87dCRV7uJk=', u'YTNH5Rw=', u'130774B1-DB21-4EF3-98C8-C104BCD6ED6D', datetime.datetime(2005, 9, 1, 0, 0)),
 :

という感じで、配列の中に配列(らしきもの。正確にはpyodbc.Rowという形式)なので、
試しにnumpyのarrayに突っ込んでみます

array_format = np.array(rows)
dataframe_format = pd.DataFrame(array_format)

するとnumpyarray形式は、、


>>>array_format
array([[1, False, u'Mr.', ..., u'1KjXYs4=',
        u'3F5AE95E-B87D-4AED-95B4-C3797AFCB74F',
        datetime.datetime(2005, 8, 1, 0, 0)],
       [2, False, u'Mr.', ..., u'fs1ZGhY=',
        u'E552F657-A9AF-4A7D-A645-C429D6E02491',
        datetime.datetime(2006, 8, 1, 0, 0)],
       [3, False, u'Ms.', ..., u'YTNH5Rw=',
        u'130774B1-DB21-4EF3-98C8-C104BCD6ED6D',
        datetime.datetime(2005, 9, 1, 0, 0)],
       ..., 
       [30116, False, u'Ms.', ..., u'HDCU1Bk=',
        u'EC409609-D25D-41B8-9D15-A1AA6E89FC77',
        datetime.datetime(2007, 7, 1, 0, 0)],
       [30117, False, u'Mr.', ..., u'iES3IZA=',
        u'6F08E2FB-1CD3-4F6E-A2E6-385669598B19',
        datetime.datetime(2005, 8, 1, 0, 0)],
       [30118, False, u'Ms.', ..., u'QhHP+y8=',
        u'2495B4EB-FE8B-459E-A1B6-DBA25C04E626',
        datetime.datetime(2006, 9, 1, 0, 0)]], dtype=object)
>>> array_format.shape
(847L, 15L)

また、pandasDataFrameの方は


>>>dataframe_format.head()
  0      1    2        3     4           5     6                           7   \
0  1  False  Mr.  Orlando    N.         Gee  None                A Bike Store   
1  2  False  Mr.    Keith  None      Harris  None          Progressive Sports   
2  3  False  Ms.    Donna    F.    Carreras  None    Advanced Bike Components   
3  4  False  Ms.    Janet    M.       Gates  None       Modular Cycle Systems   
4  5  False  Mr.     Lucy  None  Harrington  None  Metropolitan Sports Supply   

                         8                             9             10  \
0   adventure-works\pamela0  orlando0@adventure-works.com  245-555-0173   
1    adventure-works\david8    keith0@adventure-works.com  170-555-0127   
2  adventure-works\jillian0    donna0@adventure-works.com  279-555-0130   
3  adventure-works\jillian0    janet1@adventure-works.com  710-555-0173   
4      adventure-works\shu0     lucy0@adventure-works.com  828-555-0186   

                                             11        12  \
0  L/Rlwxzp4w7RWmEgXX+/A7cXaePEPcp+KwQhl2fJL7w=  1KjXYs4=   
1  YPdtRdvqeAhj6wyxEsFdshBDNXxkCXn+CRgbvJItknw=  fs1ZGhY=   
2  LNoK27abGQo48gGue3EBV/UrlYSToV0/s87dCRV7uJk=  YTNH5Rw=   
3  ElzTpSNbUW1Ut+L5cWlfR7MF6nBZia8WpmGaQPjLOJA=  nm7D5e4=   
4  KJqV15wsX3PG8TS5GSddp6LFFVdd3CoRftZM/tP0+R4=  cNFKU4w=   

                                     13                   14  
0  3F5AE95E-B87D-4AED-95B4-C3797AFCB74F  2005-08-01 00:00:00  
1  E552F657-A9AF-4A7D-A645-C429D6E02491  2006-08-01 00:00:00  
2  130774B1-DB21-4EF3-98C8-C104BCD6ED6D  2005-09-01 00:00:00  
3  FF862851-1DAA-4044-BE7C-3E85583C054D  2006-07-01 00:00:00  
4  83905BDC-6F5E-4F71-B162-C98DA069F38A  2006-09-01 00:00:00  

>>> dataframe_format.shape
(847, 15)

、、となって大丈夫そうですね。

Reference

4
6
0

Register as a new user and use Qiita more conveniently

  1. You get articles that match your needs
  2. You can efficiently read back useful information
  3. You can use dark theme
What you can do with signing up
4
6

Delete article

Deleted articles cannot be recovered.

Draft of this article would be also deleted.

Are you sure you want to delete this article?