Data analysis workflows and patterns for processing, analyzing, and visualizing data using Python data science libraries.
This skill provides guidance and tools for data analysis workflows using Python's data science ecosystem.
Use this skill when you need to:
This skill supports the following data formats:
| Format | Extension | Library |
|---|---|---|
| CSV | .csv | pandas |
| JSON | .json | pandas |
| Parquet | .parquet | pandas + pyarrow |
| Excel | .xlsx, .xls | pandas + openpyxl (.xlsx) or xlrd (.xls) |
| SQL | Database connection | pandas + sqlalchemy |
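The file-based formats in the table can be loaded through a single entry point. The following is a minimal sketch, not part of the skill itself: `READERS` and `load_table` are hypothetical names introduced here for illustration, and the optional engines (pyarrow, openpyxl, xlrd) are assumed to be installed when the matching format is used. SQL sources are left out because they need a connection rather than a path.

```python
from pathlib import Path

import pandas as pd

# Hypothetical mapping from file extension to the pandas reader for it.
READERS = {
    ".csv": pd.read_csv,
    ".json": pd.read_json,
    ".parquet": pd.read_parquet,  # requires an engine such as pyarrow
    ".xlsx": pd.read_excel,       # requires openpyxl
    ".xls": pd.read_excel,        # requires xlrd
}


def load_table(path, **kwargs):
    """Load a file into a DataFrame, choosing the reader by extension.

    Extra keyword arguments are passed through to the pandas reader.
    """
    suffix = Path(path).suffix.lower()
    try:
        reader = READERS[suffix]
    except KeyError:
        raise ValueError(f"Unsupported file extension: {suffix!r}") from None
    return reader(path, **kwargs)
```

For example, `load_table('data.csv')` behaves like `pd.read_csv('data.csv')`, while an unknown extension raises a `ValueError` instead of failing inside pandas.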
A typical data analysis workflow follows these steps:
import pandas as pd
import matplotlib.pyplot as plt
# Load data
df = pd.read_csv('data.csv')
# Explore
df.info()  # info() prints its summary itself and returns None
print(df.describe())
# Visualize
df.plot(kind='hist')
plt.show()
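The explore and visualize steps above can also be wrapped in a reusable function. This is a sketch under stated assumptions: `summarize_and_plot` and its `out_path` parameter are hypothetical names, only numeric columns are considered, and the plot is saved to a file with the non-interactive `Agg` backend rather than shown in a window.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no display is needed

import matplotlib.pyplot as plt
import pandas as pd


def summarize_and_plot(df, out_path="histograms.png"):
    """Return summary statistics and save a histogram of the numeric columns."""
    numeric = df.select_dtypes(include="number")
    if numeric.empty:
        raise ValueError("No numeric columns to summarize or plot")

    summary = numeric.describe()

    # Draw all numeric columns as overlaid histograms on one set of axes.
    fig, ax = plt.subplots()
    numeric.plot(kind="hist", ax=ax, alpha=0.5, bins=10)
    fig.savefig(out_path)
    plt.close(fig)  # release the figure so repeated calls do not accumulate

    return summary
```

Calling `summarize_and_plot(pd.read_csv('data.csv'))` would return the same table as `df.describe()` for the numeric columns and write the histogram to `histograms.png`.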