Fast, reproducible scientific Python environments with pixi - conda and PyPI unified
Master pixi, the modern package manager that unifies conda and PyPI ecosystems for fast, reproducible scientific Python development. Learn how to manage complex scientific dependencies, create isolated environments, and build reproducible workflows using pyproject.toml integration.
Official Documentation: https://pixi.sh
GitHub: https://github.com/prefix-dev/pixi
# Installation must be performed separately
# On the server, load via Lmod if pixi is not already on your PATH
module load Dev/pixi
# Initialize new project with pyproject.toml
pixi init --format pyproject
# Initialize from an existing conda environment file
pixi init --format pyproject --import environment.yml
# Add dependencies
pixi add numpy scipy pandas # conda packages
pixi add --pypi pytest-cov # PyPI-only packages
pixi add --feature dev pytest ruff # dev environment
# Install all dependencies
pixi install
# Run commands in environment
pixi run python script.py
pixi run pytest
# Shell with environment activated
pixi shell
# Add tasks
pixi task add test "pytest tests/"
pixi task add docs "sphinx-build docs/ docs/_build"
# Run tasks
pixi run test
pixi run docs
# Update dependencies
pixi update numpy # update specific
pixi update # update all
# List packages
pixi list
pixi tree numpy # show dependency tree
Need compiled scientific libraries (NumPy, SciPy, GDAL)?
├─ YES → Use pixi (conda-forge has pre-built binaries)
└─ NO → Consider uv for pure Python projects
Need multi-language support (Python + R, Julia, C++)?
├─ YES → Use pixi (supports conda ecosystem)
└─ NO → uv sufficient for Python-only
Need multiple environments (dev, test, prod, GPU, CPU)?
├─ YES → Use pixi features for environment management
└─ NO → Single environment projects work with either
Need reproducible environments across platforms?
├─ CRITICAL → Use pixi (lockfiles include all platforms)
└─ LESS CRITICAL → uv also provides lockfiles
Want to use both conda-forge AND PyPI packages?
├─ YES → Use pixi (seamless integration)
└─ ONLY PYPI → uv is simpler and faster
Legacy conda environment files (environment.yml)?
├─ YES → pixi can import and modernize
└─ NO → Start fresh with pixi or uv
Pixi resolves dependencies from both conda-forge and PyPI in a single unified graph, ensuring compatibility. In a pyproject.toml manifest, conda packages live under [tool.pixi.dependencies] and PyPI packages under [tool.pixi.pypi-dependencies]:
[project]
name = "my-science-project"
[tool.pixi.dependencies]
numpy = ">=1.24" # from conda-forge (optimized builds)
pandas = ">=2.0" # from conda-forge
[tool.pixi.pypi-dependencies]
my-custom-pkg = ">=1.0" # PyPI-only package
Why this matters for scientific Python: conda-forge supplies pre-built, optimized binaries (BLAS/LAPACK-backed NumPy and SciPy, compiled GDAL and netCDF stacks), while PyPI-only packages remain available from the same manifest, and the solver checks that both sources agree on shared dependencies.
Pixi generates pixi.lock with dependency specifications for all platforms (Linux, macOS, Windows, different architectures):
# pixi.lock includes:
# - linux-64
# - osx-64, osx-arm64
# - win-64
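The set of platforms solved into the lockfile comes from the manifest; a minimal sketch of the relevant table (channel and platform values illustrative):

```toml
[tool.pixi.project]
channels = ["conda-forge"]
# Every platform listed here is solved and written into pixi.lock,
# even if you only develop on one of them.
platforms = ["linux-64", "osx-arm64", "win-64"]
```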
Benefits: one lockfile serves every collaborator regardless of operating system, and CI running on a different platform installs exactly the versions solved on your machine.
Create multiple environments using features without duplicating dependencies:
[tool.pixi.feature.test.dependencies]
pytest = ">=7.0"
pytest-cov = ">=4.0"
[tool.pixi.feature.gpu.dependencies]
pytorch-cuda = "11.8.*"
[tool.pixi.environments]
test = ["test"]
gpu = ["gpu"]
gpu-test = ["gpu", "test"] # combines features
Define reusable commands as tasks:
[tool.pixi.tasks]
test = "pytest tests/ -v"
format = "ruff format src/ tests/"
lint = "ruff check src/ tests/"
docs = "sphinx-build docs/ docs/_build"
analyze = { cmd = "python scripts/analyze.py", depends-on = ["test"] }
Pixi uses rattler (a Rust-based conda resolver), making dependency resolution often 10-100x faster than classic conda.
Pixi reads standard (PEP 621) Python project metadata from pyproject.toml, so the same file drives pip, build backends, and pixi without duplication.
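A minimal sketch of how the two namespaces coexist in one file (names illustrative):

```toml
[project]                  # standard PEP 621 metadata: read by pip, build backends, etc.
name = "my-science-project"
version = "0.1.0"
requires-python = ">=3.11"

[tool.pixi.project]        # pixi-specific configuration, ignored by other tools
channels = ["conda-forge"]
platforms = ["linux-64"]

[tool.pixi.dependencies]   # conda-forge packages, resolved alongside PyPI deps
numpy = ">=1.24"
```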
# Create new project
mkdir climate-analysis && cd climate-analysis
pixi init --format pyproject
# Add scientific stack
pixi add python=3.11 numpy pandas matplotlib xarray
# Add development tools
pixi add --feature dev pytest ipython ruff
# Create analysis script
cat > analyze.py << 'EOF'
import pandas as pd
import matplotlib.pyplot as plt
# Your analysis code
data = pd.read_csv("data.csv")
data.plot()
plt.savefig("output.png")
EOF
# Run in pixi environment
pixi run python analyze.py
# Or activate shell
pixi shell
python analyze.py
# Initialize project
pixi init ml-project --format pyproject
cd ml-project
# Add base dependencies
pixi add python=3.11 numpy pandas scikit-learn matplotlib jupyter
# Add the pytorch and nvidia channels to the manifest
pixi project channel add pytorch nvidia
# Add CPU PyTorch for selected platforms
pixi add --platform linux-64 --platform osx-arm64 pytorch torchvision cpuonly
# Create GPU feature
pixi add --feature gpu pytorch-cuda=11.8
# Add development tools
pixi add --feature dev pytest black mypy
# Configure environments in pyproject.toml
cat >> pyproject.toml << 'EOF'
[tool.pixi.environments]
default = { solve-group = "default" }
gpu = { features = ["gpu"], solve-group = "default" }
dev = { features = ["dev"], solve-group = "default" }
EOF
# Install and run
pixi install
pixi run python train.py # uses default (CPU)
pixi run --environment gpu python train.py # uses GPU
Scenario: You have an existing project with requirements.txt or environment.yml
Solution:
# From requirements.txt
cd existing-project
pixi init --format pyproject
# Import from requirements.txt
while IFS= read -r package; do
# Skip comments and empty lines
[[ "$package" =~ ^#.*$ ]] || [[ -z "$package" ]] && continue
# Try conda first, fallback to PyPI
pixi add "$package" 2>/dev/null || pixi add --pypi "$package"
done < requirements.txt
# From environment.yml
pixi init --format pyproject --import environment.yml
# Verify installation
pixi install
pixi run python -c "import numpy, pandas, scipy; print('Success!')"
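For a longer dependency list, it can help to check every expected package in one pass. This small helper is illustrative (not part of pixi); it reports each module's version, "unknown" if it imports without a __version__ attribute, or None if the import fails:

```python
import importlib


def check_packages(names):
    """Return {module_name: version_or_None} for each requested module.

    None means the module failed to import; "unknown" means it imported
    but exposes no __version__ attribute.
    """
    results = {}
    for name in names:
        try:
            mod = importlib.import_module(name)
            results[name] = getattr(mod, "__version__", "unknown")
        except ImportError:
            results[name] = None
    return results
```

Run it inside the environment, e.g. `pixi run python -c "..."` with `check_packages(["numpy", "pandas", "scipy"])`, and treat any None as a package that still needs `pixi add`.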
Best Practice: Review generated pyproject.toml and organize dependencies:
[tool.pixi.dependencies]
[tool.pixi.pypi-dependencies]
[tool.pixi.feature.dev.dependencies]
Scenario: Different environments for development, testing, production, and GPU computing
Implementation:
[project]
name = "research-pipeline"
version = "0.1.0"
requires-python = ">=3.11"
[tool.pixi.dependencies]
python = ">=3.11"
numpy = ">=1.24"
pandas = ">=2.0"
xarray = ">=2023.1"
# Development tools
[tool.pixi.feature.dev.dependencies]
ipython = ">=8.0"
jupyter = ">=1.0"
ruff = ">=0.1"
[tool.pixi.feature.dev.pypi-dependencies]
jupyterlab-vim = ">=0.16"
# Testing tools
[tool.pixi.feature.test.dependencies]
pytest = ">=7.4"
pytest-cov = ">=4.1"
pytest-xdist = ">=3.3"
hypothesis = ">=6.82"
# GPU dependencies
[tool.pixi.feature.gpu.dependencies]
pytorch-cuda = "11.8.*"
cudatoolkit = "11.8.*"
[tool.pixi.feature.gpu.pypi-dependencies]
nvidia-ml-py = ">=12.0"
# Production optimizations
[tool.pixi.feature.prod.dependencies]
python = "3.11.*" # pin exact version
# Define environments combining features
[tool.pixi.environments]
default = { solve-group = "default" }
dev = { features = ["dev"], solve-group = "default" }
test = { features = ["test"], solve-group = "default" }
gpu = { features = ["gpu"], solve-group = "gpu" }
gpu-dev = { features = ["gpu", "dev"], solve-group = "gpu" }
prod = { features = ["prod"], solve-group = "prod" }
# Tasks for each environment
[tool.pixi.tasks]
dev-notebook = { cmd = "jupyter lab", env = { JUPYTER_CONFIG_DIR = ".jupyter" } }
test = "pytest tests/ -v --cov=src"
test-parallel = "pytest tests/ -n auto"
train-cpu = "python train.py --device cpu"
train-gpu = "python train.py --device cuda"
benchmark = "python benchmark.py"
Usage:
# Development
pixi run --environment dev dev-notebook
# Testing
pixi run --environment test test
# GPU training
pixi run --environment gpu train-gpu
# Production
pixi run --environment prod benchmark
Scenario: Developing a scientific Python package with proper packaging, testing, and documentation
Structure:
[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"
[project]
name = "mylib"
version = "0.1.0"
description = "Scientific computing library"
dependencies = [
"numpy>=1.24",
"scipy>=1.11",
]
[project.optional-dependencies]
viz = ["matplotlib>=3.7", "seaborn>=0.12"]
# Development dependencies
[tool.pixi.feature.dev.dependencies]
ipython = "*"
ruff = "*"
mypy = "*"
# Testing dependencies
[tool.pixi.feature.test.dependencies]
pytest = ">=7.4"
pytest-cov = ">=4.1"
pytest-benchmark = ">=4.0"
hypothesis = ">=6.82"
# Documentation dependencies
[tool.pixi.feature.docs.dependencies]
sphinx = ">=7.0"
sphinx-rtd-theme = ">=1.3"
numpydoc = ">=1.5"
sphinx-gallery = ">=0.14"
[tool.pixi.feature.docs.pypi-dependencies]
myst-parser = ">=2.0"
# Build dependencies
[tool.pixi.feature.build.dependencies]
build = "*"
twine = "*"
[tool.pixi.environments]
default = { features = [], solve-group = "default" }
dev = { features = ["dev", "test", "docs"], solve-group = "default" }
test = { features = ["test"], solve-group = "default" }
docs = { features = ["docs"], solve-group = "default" }
# Tasks for development workflow
[tool.pixi.tasks]
# Development
install-dev = "pip install -e ."
format = "ruff format src/ tests/"
lint = "ruff check src/ tests/"
typecheck = "mypy src/"
# Testing
test = "pytest tests/ -v"
test-cov = "pytest tests/ --cov=src --cov-report=html --cov-report=term"
test-fast = "pytest tests/ -x -v"
benchmark = "pytest tests/benchmarks/ --benchmark-only"
# Documentation
docs-build = "sphinx-build docs/ docs/_build/html"
docs-serve = { cmd = "python -m http.server 8000 -d docs/_build/html", depends-on = ["docs-build"] }
docs-clean = "rm -rf docs/_build docs/generated"
# Build and release
build = "python -m build"
publish-test = { cmd = "twine upload --repository testpypi dist/*", depends-on = ["build"] }
publish = { cmd = "twine upload dist/*", depends-on = ["build"] }
# Combined workflows
ci = { depends-on = ["format", "lint", "typecheck", "test-cov"] }
pre-commit = { depends-on = ["format", "lint", "test-fast"] }
Workflow:
# Initial setup
pixi install --environment dev
pixi run install-dev
# Development cycle
pixi run format # format code
pixi run lint # check style
pixi run typecheck # type checking
pixi run test # run tests
# Or run all checks
pixi run ci
# Build documentation
pixi run docs-build
pixi run docs-serve # view at http://localhost:8000
# Release workflow
pixi run build
pixi run publish-test # test on TestPyPI
pixi run publish # publish to PyPI
Scenario: Optimize dependency sources for performance and availability
Strategy:
[tool.pixi.dependencies]
# Core scientific stack: prefer conda-forge (optimized builds)
numpy = ">=1.24" # MKL or OpenBLAS optimized
scipy = ">=1.11" # optimized BLAS/LAPACK
pandas = ">=2.0"
matplotlib = ">=3.7" # compiled components
scikit-learn = ">=1.3" # optimized algorithms
# Geospatial/climate: conda-forge essential (C/Fortran deps)
xarray = ">=2023.1"
netcdf4 = ">=1.6"
h5py = ">=3.9"
rasterio = ">=1.3" # GDAL dependency
# Data processing: conda-forge preferred
dask = ">=2023.1"
numba = ">=0.57" # LLVM dependency
[tool.pixi.pypi-dependencies]
# Pure Python packages or PyPI-only packages
my-custom-tool = ">=1.0"
experimental-lib = { git = "https://github.com/user/repo.git" }
internal-pkg = { path = "../internal-pkg", editable = true }
Decision Rules:
Use conda-forge (pixi add) for: compiled or optimized libraries (NumPy, SciPy, scikit-learn), anything with C/C++/Fortran dependencies (GDAL, netCDF4, HDF5), and packages that need system libraries.
Use PyPI (pixi add --pypi) for: pure Python packages, packages published only on PyPI, and git or local editable installs.
Scenario: Ensure research is reproducible across time and machines
Implementation:
[project]
name = "nature-paper-2024"
version = "1.0.0"
description = "Analysis for Nature Paper 2024"
requires-python = ">=3.11,<3.12" # pin Python version range
[tool.pixi.dependencies]
python = "==3.11.6" # exact Python version
numpy = "==1.26.2" # exact versions for reproducibility
pandas = "==2.1.4"
scipy = "==1.11.4"
matplotlib = "==3.8.2"
scikit-learn = "==1.3.2"
[tool.pixi.pypi-dependencies]
# Pin exact versions; pixi.lock records hashes for full reproducibility
seaborn = "==0.13.0"
# Analysis environments
[tool.pixi.feature.analysis.dependencies]
jupyter = "1.0.0"
jupyterlab = "4.0.9"
[tool.pixi.feature.analysis.pypi-dependencies]
jupyterlab-vim = "==0.16.0"
# Environments
[tool.pixi.environments]
default = { solve-group = "default" }
analysis = { features = ["analysis"], solve-group = "default" }
# Reproducible tasks
[tool.pixi.tasks]
# Data processing pipeline
download-data = "python scripts/01_download.py"
preprocess = { cmd = "python scripts/02_preprocess.py", depends-on = ["download-data"] }
analyze = { cmd = "python scripts/03_analyze.py", depends-on = ["preprocess"] }
visualize = { cmd = "python scripts/04_visualize.py", depends-on = ["analyze"] }
full-pipeline = { depends-on = ["download-data", "preprocess", "analyze", "visualize"] }
# Notebook execution
run-notebooks = "jupyter nbconvert --execute --to notebook --inplace notebooks/*.ipynb"
Best Practices:
# Generate lockfile
pixi install
# Commit lockfile to repository
git add pixi.lock pyproject.toml
git commit -m "Lock environment for reproducibility"
# Anyone can recreate exact environment
git clone https://github.com/user/nature-paper-2024.git
cd nature-paper-2024
pixi install # installs exact versions from pixi.lock
# Run complete pipeline
pixi run full-pipeline
# Archive for long-term preservation
pixi project export conda-environment > environment.yml # backup as conda format
Scenario: Team members on Linux, macOS (Intel/ARM), and Windows
Configuration:
[project]
name = "cross-platform-science"
[tool.pixi.dependencies]
python = ">=3.11"
numpy = ">=1.24"
pandas = ">=2.0"
# Platform-specific dependencies
# Platform-specific dependencies
[tool.pixi.target.linux-64.dependencies]
# Select the MKL build of BLAS on Linux
libblas = { version = "*", build = "*mkl" }
[tool.pixi.target.osx-arm64.dependencies]
# Select Apple's Accelerate BLAS on Apple Silicon
libblas = { version = "*", build = "*accelerate" }
[tool.pixi.target.win-64.dependencies]
# Windows-specific packages
pywin32 = "*"
# Tasks with platform-specific behavior
[tool.pixi.tasks]
test = "pytest tests/"
[tool.pixi.target.linux-64.tasks]
test-gpu = "pytest tests/ --gpu"
[tool.pixi.target.win-64.tasks]
test = "pytest tests/ --timeout=30" # slower on Windows CI
Platform Selectors:
# Supported platforms, declared as a list in the project table
[tool.pixi.project]
platforms = ["linux-64", "linux-aarch64", "osx-64", "osx-arm64", "win-64"]
Scenario: Complex scientific workflows with data dependencies
Implementation:
[tool.pixi.tasks]
# Data acquisition
download-raw = "python scripts/download.py --source=api"
validate-raw = { cmd = "python scripts/validate.py data/raw/", depends-on = ["download-raw"] }
# Data processing pipeline
clean-data = { cmd = "python scripts/clean.py", depends-on = ["validate-raw"] }
transform = { cmd = "python scripts/transform.py", depends-on = ["clean-data"] }
feature-engineering = { cmd = "python scripts/features.py", depends-on = ["transform"] }
# Analysis
train-model = { cmd = "python scripts/train.py", depends-on = ["feature-engineering"] }
evaluate = { cmd = "python scripts/evaluate.py", depends-on = ["train-model"] }
visualize = { cmd = "python scripts/visualize.py", depends-on = ["evaluate"] }
# Testing at each stage
test-cleaning = "pytest tests/test_clean.py"
test-transform = "pytest tests/test_transform.py"
test-features = "pytest tests/test_features.py"
test-model = "pytest tests/test_model.py"
# Combined workflows
all-tests = { depends-on = ["test-cleaning", "test-transform", "test-features", "test-model"] }
full-pipeline = { depends-on = ["download-raw", "validate-raw", "clean-data", "transform", "feature-engineering", "train-model", "evaluate", "visualize"] }
pipeline-with-tests = { depends-on = ["all-tests", "full-pipeline"] }
# Multiple prerequisites for a single task
[tool.pixi.tasks.download-supplementary]
cmd = "python scripts/download_supplement.py"
[tool.pixi.tasks.process-all]
depends-on = ["download-raw", "download-supplementary"] # both finish before process-all
Running Workflows:
# Run entire pipeline
pixi run full-pipeline
# Run with testing
pixi run pipeline-with-tests
# Inspect available tasks and their definitions
pixi task list
Scenario: Use pixi for complex dependencies, uv for fast pure Python workflows
Hybrid Approach:
[project]
name = "hybrid-project"
# Heavy scientific deps via pixi/conda
[tool.pixi.dependencies]
python = ">=3.11"
numpy = ">=1.24"
scipy = ">=1.11"
gdal = ">=3.7" # complex C++ dependency
netcdf4 = ">=1.6" # Fortran dependency
[tool.pixi.pypi-dependencies]
# Pure Python packages
requests = ">=2.31"
pydantic = ">=2.0"
typer = ">=0.9"
[tool.pixi.feature.dev.dependencies]
ruff = "*"
mypy = "*"
[tool.pixi.feature.dev.pypi-dependencies]
pytest = ">=7.4"
[tool.pixi.tasks]
# Use uv for fast pure Python operations
install-dev = "uv pip install -e ."
sync-deps = "uv pip sync requirements.txt"
add-py-dep = "uv pip install"
Workflow:
# Pixi manages environment with conda packages
pixi install
# Activate pixi environment
pixi shell
# Inside pixi shell, use uv for fast pure Python operations
uv pip install requests httpx pydantic # fast pure Python installs
uv pip freeze > requirements-py.txt
# Or define as tasks
pixi run install-dev
When to use this pattern: you need conda-forge for compiled dependencies (GDAL, netCDF4) but want uv's speed when iterating on pure Python packages inside the already-activated pixi environment.
Scenario: Reproducible testing in GitHub Actions, GitLab CI, etc.
GitHub Actions Example:
# .github/workflows/test.yml
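A minimal sketch of such a workflow using the official prefix-dev/setup-pixi action (the version tag and the "test" environment/task names are assumptions, check your project and the action's releases):

```yaml
name: test
on: [push, pull_request]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: prefix-dev/setup-pixi@v0.8.1    # installs pixi, restores from pixi.lock
        with:
          environments: test                  # only solve/install the test environment
      - run: pixi run --environment test test # run the project's "test" task
```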