Querying database in chunks

Sometimes you want to fetch data from a table with a query like "SELECT * FROM some_table WHERE some_column IN (...)", but there is a limit on how many records you can fetch at a time. For instance, suppose you have a table of equity prices and want data for 10K equities, but you can only query 500 equities at a time. You then need to split the 10K keys into chunks of 500, run the query once per chunk, and finally concatenate the results into a single DataFrame.

Below is how you can achieve that.


import pandas as pd


def get_chunked_list(original_list, chunk_size: int = 400):
    # Split the list into consecutive chunks of at most chunk_size items.
    chunked_list = [original_list[i:i + chunk_size] for i in range(0, len(original_list), chunk_size)]
    return chunked_list


def get_db_records_in_chunks(record_keys, query_func, con, chunk_size=500, *query_args):
    # Run the query once per chunk of keys and concatenate the results.
    chunked_record_keys = get_chunked_list(record_keys, chunk_size=chunk_size)
    df_list = []
    for record_keys_chunk in chunked_record_keys:
        query = query_func(record_keys_chunk, *query_args)
        df_list.append(pd.read_sql(query, con))
    # Return an empty DataFrame instead of None when there were no keys to fetch.
    return pd.concat(df_list, ignore_index=True) if df_list else pd.DataFrame()
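
To see the helpers in action, here is a self-contained sketch against an in-memory SQLite database. The `prices` table, the `EQ...` tickers, and the `build_query` function are hypothetical stand-ins for your real table and query builder; the helper functions are repeated so the example runs on its own.

```python
import sqlite3
import pandas as pd


def get_chunked_list(original_list, chunk_size=400):
    # Split the list into consecutive chunks of at most chunk_size items.
    return [original_list[i:i + chunk_size] for i in range(0, len(original_list), chunk_size)]


def get_db_records_in_chunks(record_keys, query_func, con, chunk_size=500, *query_args):
    # Run the query once per chunk of keys and concatenate the results.
    df_list = []
    for record_keys_chunk in get_chunked_list(record_keys, chunk_size=chunk_size):
        df_list.append(pd.read_sql(query_func(record_keys_chunk, *query_args), con))
    return pd.concat(df_list, ignore_index=True) if df_list else pd.DataFrame()


# Hypothetical equity-price table for demonstration.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE prices (ticker TEXT, price REAL)")
con.executemany("INSERT INTO prices VALUES (?, ?)",
                [(f"EQ{i}", float(i)) for i in range(10)])
con.commit()


def build_query(keys):
    # Build a SELECT ... WHERE ticker IN (...) query for one chunk of keys.
    placeholders = ", ".join(f"'{k}'" for k in keys)
    return f"SELECT * FROM prices WHERE ticker IN ({placeholders})"


tickers = [f"EQ{i}" for i in range(10)]
# With chunk_size=3, the 10 tickers are fetched in 4 queries of at most 3 keys.
df = get_db_records_in_chunks(tickers, build_query, con, 3)
print(len(df))  # → 10
```

Note that because `*query_args` follows `chunk_size` in the signature, any extra arguments for `query_func` must be passed positionally after the chunk size, as shown above.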