Extract classes, functions, or code blocks from large modules into separate files. This skill should be used during refactoring to break up monolithic files while maintaining import compatibility and test coverage.
Safely extract code from large modules into smaller, focused files while preserving imports, tests, and backwards compatibility.
Large Python modules become difficult to maintain, test, and understand. This skill provides a systematic approach to extracting components into separate modules while:
Use this skill when:
Before extracting, understand what you're working with:
# Read the module and identify:
# 1. Classes/functions to extract
# 2. Internal dependencies (what they use from the same module)
# 3. External dependencies (imports from other modules)
# 4. Dependents (what uses the code to extract)
# Example analysis for prism/core/telescope.py
# Target: Extract TelescopeConfig class
# Internal deps: Uses Grid from same module
# External deps: torch, dataclasses
# Dependents: Telescope class, tests/test_telescope.py
Create the destination file with proper structure:
# prism/core/telescope_config.py
"""Telescope configuration dataclass.
This module was extracted from prism/core/telescope.py.
"""
from __future__ import annotations
# Standard library
from dataclasses import dataclass, field
from typing import Optional
# Third-party
import torch
# Local imports (if any internal deps)
# from .grid import Grid # If Grid was also extracted
@dataclass
class TelescopeConfig:
"""Configuration for telescope simulation.
Parameters
----------
n_pixels : int
Number of pixels in aperture grid
aperture_radius_pixels : float
Aperture radius in pixels
...
"""
n_pixels: int = 512
aperture_radius_pixels: float = 25.0
# ... rest of class
Modify the original module to re-export the extracted component:
# prism/core/telescope.py (AFTER extraction)
"""Telescope simulation for PRISM.
Note: TelescopeConfig has been moved to telescope_config.py
but is re-exported here for backwards compatibility.
"""
from __future__ import annotations
# Re-export for backwards compatibility
from .telescope_config import TelescopeConfig
# ... rest of module continues to use TelescopeConfig normally
Ensure the new module is accessible:
# prism/core/__init__.py
from .telescope import Telescope, TelescopeConfig # TelescopeConfig still works
from .telescope_config import TelescopeConfig as TelescopeConfig # Also direct import
# Explicit __all__ for documentation
__all__ = [
"Telescope",
"TelescopeConfig",
# ...
]
Find and update any direct imports within the codebase:
# Find all imports of the extracted component
grep -r "from.*telescope.*import.*TelescopeConfig" prism/
grep -r "from prism.core.telescope import" prism/
Update if needed (though re-exports should handle most cases):
# If a module imported directly, it still works:
from prism.core.telescope import TelescopeConfig # Still works (re-export)
# Or use new location:
from prism.core.telescope_config import TelescopeConfig # Direct import
Ensure tests still pass and update imports if needed:
# tests/unit/test_telescope.py
# Both import styles should work:
from prism.core.telescope import TelescopeConfig # Via re-export
from prism.core.telescope_config import TelescopeConfig # Direct
def test_telescope_config_defaults():
"""Test TelescopeConfig default values."""
config = TelescopeConfig()
assert config.n_pixels == 512
assert config.aperture_radius_pixels == 25.0
# 1. Type check the affected modules
uv run mypy prism/core/telescope.py prism/core/telescope_config.py
# 2. Run related tests
uv run pytest tests/unit/test_telescope.py -v
# 3. Check for import errors
uv run python -c "from prism.core.telescope import TelescopeConfig; print('OK')"
uv run python -c "from prism.core.telescope_config import TelescopeConfig; print('OK')"
# 4. Run full test suite
uv run pytest tests/ -v
Always re-export extracted components from the original location:
# GOOD: Maintains backwards compatibility
# original_module.py
from .new_module import ExtractedClass
# Any code using `from original_module import ExtractedClass` still works
Follow this import order in new modules:
"""Module docstring."""
from __future__ import annotations
# Standard library
import os
from dataclasses import dataclass
from pathlib import Path
from typing import Optional, Tuple
# Third-party
import numpy as np
import torch
from torch import Tensor
# Local imports (prism package)
from prism.config.constants import nm, um
from prism.utils.transforms import fft, ifft
# Relative imports (same package)
from .grid import Grid
from .propagator import Propagator
When extracting, watch for circular imports:
# PROBLEM: Circular import
# module_a.py imports ClassB from module_b.py
# module_b.py imports ClassA from module_a.py
# SOLUTION 1: Move shared code to third module
# shared.py contains common dependencies
# Both module_a and module_b import from shared
# SOLUTION 2: Use TYPE_CHECKING for type hints only
from typing import TYPE_CHECKING
if TYPE_CHECKING:
from .module_b import ClassB # Only for type hints
def function(b: "ClassB") -> None: # String annotation
pass
Use TYPE_CHECKING for expensive imports needed only for hints:
from __future__ import annotations
from typing import TYPE_CHECKING
if TYPE_CHECKING:
# These imports only happen during type checking
from torch import Tensor
from .heavy_module import HeavyClass
def function(tensor: Tensor, obj: HeavyClass) -> None:
# At runtime, annotations are strings (due to __future__.annotations)
pass
Most common extraction - configuration classes:
BEFORE:
telescope.py (600 lines)
- TelescopeConfig (50 lines)
- Telescope (550 lines)
AFTER:
telescope_config.py (60 lines)
- TelescopeConfig
telescope.py (560 lines)
- from .telescope_config import TelescopeConfig (re-export)
- Telescope
Extract abstract base for multiple implementations:
BEFORE:
trainers.py (800 lines)
- AbstractTrainer (abstract, 100 lines)
- ProgressiveTrainer (350 lines)
- EpochalTrainer (350 lines)
AFTER:
trainers/
__init__.py (re-exports all)
base.py (110 lines)
- AbstractTrainer
progressive.py (360 lines)
- ProgressiveTrainer
epochal.py (360 lines)
- EpochalTrainer
Move helper functions to utils:
BEFORE:
measurement_system.py (500 lines)
- MeasurementSystem class
- _compute_overlap() helper
- _validate_centers() helper
- _normalize_masks() helper
AFTER:
measurement_system.py (400 lines)
- MeasurementSystem class
- from .measurement_utils import compute_overlap, validate_centers, normalize_masks
measurement_utils.py (120 lines)
- compute_overlap()
- validate_centers()
- normalize_masks()
Move constants and enumerations:
BEFORE:
optics.py (700 lines)
- PropagationMethod enum
- DEFAULT_WAVELENGTH constant
- SUPPORTED_REGIMES dict
- ... optical classes
AFTER:
optics_constants.py (50 lines)
- PropagationMethod
- DEFAULT_WAVELENGTH
- SUPPORTED_REGIMES
optics.py (660 lines)
- from .optics_constants import PropagationMethod, DEFAULT_WAVELENGTH
- ... optical classes
Before starting:
During extraction:
from __future__ import annotations__init__.py if neededAfter extraction:
Extracting ExperimentConfig from prism/config/experiment.py:
# 1. Analyze
grep -n "class ExperimentConfig" prism/config/experiment.py
grep -r "ExperimentConfig" prism/ tests/
# 2. Create new file
# prism/config/experiment_config.py with ExperimentConfig class
# 3. Update source
# Add: from .experiment_config import ExperimentConfig
# 4. Update __init__.py
# Verify: from .experiment import ExperimentConfig works
# 5. Validate
uv run mypy prism/config/experiment.py prism/config/experiment_config.py
uv run pytest tests/unit/test_config.py -v
uv run python -c "from prism.config import ExperimentConfig; print('OK')"