Pandas and Polars for tabular data: when to use which, indexing and dtypes, joins and reshaping, lazy Polars, I/O and Parquet, nulls, Arrow interop, and performance. Use for DataFrame/Series/LazyFrame code. Triggers on pandas, polars, DataFrame, LazyFrame, parquet, groupby, merge, join.

- pandas: loc / iloc, assign, and dtypes; understand views vs copies and Copy-on-Write (2.x). See reference-pandas-core.md.
- Polars: scan_* + lazy() for big data; build with select / with_columns / filter and pl.col; collect at the boundary. See reference-polars.md.
- I/O: uv add pyarrow when the stack needs it. See reference-io-dtypes-nulls.md.
- Interop: from_pandas / to_pandas or Arrow; avoid ping-pong in inner loops. See reference-interop-performance.md.

| Resource | Role |
|---|---|
| Python spec | Explicit pandas index/columns; lazy Polars for heavy queries |
| numpy-scientific | to_numpy(), dtypes, contiguous buffers |
| matplotlib-scientific | df.plot(ax=ax) and Polars .to_pandas() for plotting when needed |
| numpy-docstrings | Public API docstrings for DataFrame-returning functions |
| general-python | uv, ty, python-reviewer |
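
The pandas guidance above (label vs. position indexing, assign, explicit dtypes, and writing through .loc so Copy-on-Write behaves predictably) can be sketched roughly as follows. This assumes pandas 2.x; the city/temp_c frame and all variable names are illustrative and not part of this skill.

```python
import pandas as pd

# Hypothetical sample data; the column names are illustrative only.
df = pd.DataFrame(
    {"city": ["Oslo", "Lima", "Pune"], "temp_c": [4.0, 19.5, 31.0]},
    index=["a", "b", "c"],
)

# Label-based selection: rows by boolean mask, columns by name.
warm = df.loc[df["temp_c"] > 10, ["city", "temp_c"]]

# Position-based selection: first row, second column.
first_temp = df.iloc[0, 1]

# assign returns a new frame instead of mutating df in place.
with_f = df.assign(temp_f=lambda d: d["temp_c"] * 9 / 5 + 32)

# Set dtypes explicitly rather than relying on inference.
typed = df.astype({"city": "string", "temp_c": "float32"})

# Write through .loc on the frame itself; chained assignment such as
# df[mask]["temp_c"] = ... does not update df under Copy-on-Write.
df.loc[df["city"] == "Oslo", "temp_c"] = 5.0
```

Keeping every write in a single .loc call is one way to avoid depending on whether an intermediate selection is a view or a copy.
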

| Topic | File |
|---|---|
| pandas vs Polars, boundaries | reference-when-which.md |
| Index, loc/iloc, dtypes, COW, assignment | reference-pandas-core.md |
| GroupBy, merge/join, concat, pivot | reference-pandas-group-join.md |
| LazyFrame, expressions, group, join | reference-polars.md |
| CSV/Parquet, nulls, dtypes, Arrow | reference-io-dtypes-nulls.md |
| Conversion, streaming, typing | reference-interop-performance.md |

Use uv add pandas, uv add polars, uv add pyarrow as needed; do not hand-edit version pins in pyproject.toml.