Introduction
When we need an overview or summary of a data frame, the first pandas function we usually try is describe(). It gives us a simple overview of our data. But if we want more, we need to do some tasks manually.
So how can we do this better and on the fly?
Skimpy is the answer. It gives an extended report in one line of code.
Why Skimpy when we already have pandas-profiling?
Most of us know that pandas-profiling does a great job when we need more details about a dataset. It's an excellent library for data science tasks. But what if we need something more straightforward? A simpler one. I think skimpy is helpful in that case.
Demo
Without further discussion, let's move on to the demonstration. First, you need to install the skimpy library.
!pip install skimpy
Now, let us use skimpy's generate_test_data function to create the data set for our demo.
from skimpy import skim, generate_test_data
df = generate_test_data()
df.head()
Generally, after creating a data frame, I run two functions, head() and describe(), to get an overview. describe() gives us a neat view of the essential statistics.
df.describe()
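As a side note on what describe() covers: on a frame with mixed column types it summarises only the numeric columns by default, and passing include="all" adds count/unique/top/freq rows for the others. The sketch below uses a small hypothetical frame named toy (an assumption for illustration, not the data produced by generate_test_data) so it runs with pandas alone.

```python
import pandas as pd

# Hypothetical toy frame for illustration; not the skimpy test data.
toy = pd.DataFrame(
    {
        "length": [1.5, 2.0, None, 4.5],
        "count": [3, 7, 1, 9],
        "label": ["a", "b", "b", None],
    }
)

# Default: only the numeric columns ("length", "count") are summarised.
numeric_summary = toy.describe()

# include="all": non-numeric columns get count/unique/top/freq rows as well.
full_summary = toy.describe(include="all")

print(numeric_summary)
print(full_summary)
```

Either way, describe() does not report missing-value counts directly or draw any distribution, which is the gap the rest of this demo is about.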
Usually, after that, we need to dig deeper into the data: do some EDA, create histograms, and so on.
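To give an idea of that manual work, here is a minimal sketch using plain pandas and NumPy. The toy frame and the manual_summary helper are hypothetical names made up for this example; they are not part of skimpy or of the demo data.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame for illustration; not the skimpy test data.
toy = pd.DataFrame(
    {
        "length": [1.5, 2.0, None, 4.5],
        "count": [3, 7, 1, 9],
        "label": ["a", "b", "b", None],
    }
)


def manual_summary(frame):
    """Hypothetical helper: dtype, missing count and missing % per column."""
    return pd.DataFrame(
        {
            "dtype": frame.dtypes.astype(str),
            "missing": frame.isna().sum(),
            "missing_pct": frame.isna().mean() * 100,
        }
    )


summary = manual_summary(toy)

# Bin one numeric column by hand to get a rough idea of its distribution.
counts, bin_edges = np.histogram(toy["length"].dropna(), bins=3)

print(summary)
print(counts, bin_edges)
```

Even for three columns this takes a handful of separate calls, and the histogram step has to be repeated for every numeric column.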
Let us see how skimpy helps us generate an extended summary.
skim(df)
Isn't that great? The skim function returns an extended summary. It includes histograms, missing data, basic statistics, and the data types.
This is a new library, but it seems very useful, so I wanted to share it with you.
You can get more details from https://github.com/aeturrell/skimpy
Have a nice day!
*This article was written by @nuwan, a member of @qualitia_cdev.