Evaluation of BaSiC illumination correction caching - NOT RECOMMENDED for sparse markers. Trigger: optimizing BaSiC, caching illumination correction
| Item | Details |
|---|---|
| Date | 2025-12-11 |
| Goal | Evaluate whether caching BaSiC illumination profiles across cycles/channels improves processing speed |
| Environment | CuPy GPU acceleration, multiplex immunofluorescence data |
| Status | FAILED - Caching NOT recommended |
BaSiC (Background and Shading Correction) computes flatfield and darkfield correction profiles for each image stack. The hypothesis was that similar channels across cycles might share illumination profiles, allowing cached profiles to skip computation.
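How a flatfield and darkfield pair is used can be sketched in a few lines. This is a minimal NumPy sketch that assumes the standard BaSiC image model (measured = true × flatfield + darkfield). The function name `apply_profiles` and the synthetic vignette are illustrative only and are not part of the original pipeline.

```python
import numpy as np

def apply_profiles(images, flatfield, darkfield):
    """Invert the assumed model: measured = true * flatfield + darkfield."""
    images = np.asarray(images, dtype=np.float32)
    return (images - darkfield) / flatfield

# Synthetic profiles: a smooth vignette as flatfield, a constant darkfield.
yy, xx = np.mgrid[0:64, 0:64]
r2 = ((yy - 31.5) ** 2 + (xx - 31.5) ** 2) / (31.5 ** 2)
flatfield = (1.0 - 0.3 * r2).astype(np.float32)
flatfield /= flatfield.mean()  # flatfields are conventionally mean-normalized
darkfield = np.full((64, 64), 5.0, dtype=np.float32)

# Simulate a measured stack from a known "true" stack, then recover it.
rng = np.random.default_rng(0)
true_stack = rng.uniform(100, 200, size=(4, 64, 64)).astype(np.float32)
measured = true_stack * flatfield + darkfield

recovered = apply_profiles(measured, flatfield, darkfield)
max_abs_err = float(np.abs(recovered - true_stack).max())
print(f"max abs reconstruction error: {max_abs_err:.2e}")
```

With the matching profiles the true stack is recovered up to float32 rounding; the caching question is what happens when the profiles come from a different stack.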
Caching BaSiC profiles causes 15-20% intensity errors in sparse markers (15.3% and 18.7% for CD68 in the cross-application test).
Sparse markers have unique illumination profiles: channels with few positive cells (e.g., rare immune markers) have very different intensity distributions than dense markers (e.g., DAPI).
Cross-channel variation: Even the same marker across cycles shows illumination variation due to:
Error compounds: using the wrong flatfield introduces a systematic bias that propagates through all downstream analysis.
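Why the bias is systematic rather than random can be worked out from the standard BaSiC image model, which is assumed here. The symbols are introduced for illustration and do not appear in the original notes: $S, D$ are the true flatfield and darkfield of the stack, and $S', D'$ are the cached ones taken from another stack.

```latex
\begin{align}
I^{\mathrm{meas}}(x) &= I^{\mathrm{true}}(x)\,S(x) + D(x) \\
\hat{I}(x) &= \frac{I^{\mathrm{meas}}(x) - D'(x)}{S'(x)}
            = I^{\mathrm{true}}(x)\,\frac{S(x)}{S'(x)} + \frac{D(x) - D'(x)}{S'(x)}
\end{align}
```

The factor $S(x)/S'(x)$ varies with position, so a single global rescaling cannot remove it. The additive term $(D - D')/S'$ is independent of the signal, so its relative weight is largest where $I^{\mathrm{true}}$ is small.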
# Test setup
# load_channel, basic_correct, apply_correction and compute_error_vs_ground_truth
# are assumed to be defined elsewhere in the pipeline.
channels_tested = ["DAPI", "CD3", "CD20", "CD68"]  # Dense to sparse

# Compute per-channel profiles
profiles_per_channel = {}
for ch in channels_tested:
    images = load_channel(ch)
    flatfield, darkfield = basic_correct(images)
    profiles_per_channel[ch] = (flatfield, darkfield)

# Test cross-application
for ch1 in channels_tested:
    for ch2 in channels_tested:
        if ch1 != ch2:
            # Apply ch1's profile to ch2's images
            corrected = apply_correction(
                load_channel(ch2),
                profiles_per_channel[ch1]
            )
            error = compute_error_vs_ground_truth(corrected, ch2)
            print(f"{ch1} -> {ch2}: {error:.1%} error")
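The notes do not show how `compute_error_vs_ground_truth` is defined. One plausible metric, assuming the ground truth is the same channel corrected with its own profile, is the mean absolute deviation relative to the mean reference intensity. The function `relative_error` below is a hypothetical stand-in, not the helper used in the test.

```python
import numpy as np

def relative_error(corrected, reference, eps=1e-6):
    """Mean absolute deviation from the reference, relative to its mean intensity."""
    corrected = np.asarray(corrected, dtype=np.float64)
    reference = np.asarray(reference, dtype=np.float64)
    return float(np.abs(corrected - reference).mean() / (reference.mean() + eps))

# Synthetic check: a reference image and a copy with a uniform 10% gain error.
rng = np.random.default_rng(1)
reference = rng.uniform(50, 150, size=(32, 32))
biased = reference * 1.10

err_same = relative_error(reference, reference)
err_gain = relative_error(biased, reference)
print(f"identical: {err_same:.1%}")
print(f"10% gain:  {err_gain:.1%}")
```

A metric normalized this way returns a fraction, which is what the `{error:.1%}` format in the loop above expects.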
| Source Profile | Applied To | Error Rate |
|---|---|---|
| DAPI | CD3 | 8.2% |
| DAPI | CD20 | 12.4% |
| DAPI | CD68 (sparse) | 18.7% |
| CD3 | CD68 (sparse) | 15.3% |
| Same channel | Same channel | 0% (baseline) |
| Attempt | Why it Failed | Lesson Learned |
|---|---|---|
| Cache by channel name | Same marker varies across cycles | Each acquisition is unique |
| Cache by intensity histogram | Sparse markers have distinct histograms | Can't match on statistics |
| Interpolate between profiles | Non-linear relationship | No simple interpolation works |
| Use DAPI as universal reference | DAPI is dense, others are sparse | Density matters for BaSiC |
Since caching doesn't work, focus on these GPU optimizations:
# Use n-dimensional DCT instead of sequential 1D
from cupyx.scipy.fft import dct, dctn, idctn
# Old (slower): two sequential 1-D transforms
dct_result = dct(dct(image, axis=0), axis=1)
# New (faster): one 2-D transform over both axes
dct_result = dctn(image, axes=(0, 1))
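The swap is safe because the multi-dimensional DCT is separable: transforming along axis 0 and then axis 1 gives the same coefficients as one 2-D transform. The sketch below checks this on the CPU with an explicit, unnormalized DCT-II matrix in plain NumPy. It does not call `cupyx` itself, and `dct2_matrix` is a helper written for this illustration.

```python
import numpy as np

def dct2_matrix(n):
    """Unnormalized DCT-II matrix: C[k, j] = 2 * cos(pi * k * (2j + 1) / (2n))."""
    k = np.arange(n)[:, None]
    j = np.arange(n)[None, :]
    return 2.0 * np.cos(np.pi * k * (2 * j + 1) / (2 * n))

rng = np.random.default_rng(2)
image = rng.standard_normal((8, 12))
c_rows = dct2_matrix(image.shape[0])
c_cols = dct2_matrix(image.shape[1])

# Sequential: transform along axis 0, then along axis 1.
step1 = c_rows @ image         # 1-D DCT down each column
sequential = step1 @ c_cols.T  # 1-D DCT along each row

# Joint: one contraction over both axes at once.
joint = np.einsum("ki,lj,ij->kl", c_rows, c_cols, image)

same = bool(np.allclose(sequential, joint))
print("sequential == joint:", same)
```

Since the results agree, any speedup from `dctn` comes from issuing one call instead of two, not from computing something different.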
# Process multiple z-planes in parallel
from concurrent.futures import ThreadPoolExecutor

with ThreadPoolExecutor(max_workers=4) as executor:
    futures = [executor.submit(basic_correct, plane) for plane in z_planes]
    results = [f.result() for f in futures]
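An equivalent pattern uses `executor.map`, which returns results in input order without handling futures explicitly. The sketch below is standard-library only; `correct_plane` is a hypothetical stand-in for the per-plane BaSiC call and simply subtracts each plane's minimum.

```python
from concurrent.futures import ThreadPoolExecutor

def correct_plane(plane):
    """Hypothetical stand-in for the per-plane correction: subtract the plane minimum."""
    lowest = min(plane)
    return [value - lowest for value in plane]

z_planes = [[3, 5, 4], [10, 12, 11], [7, 7, 9]]

# executor.map yields results in input order, so plane i stays at index i.
with ThreadPoolExecutor(max_workers=4) as executor:
    results = list(executor.map(correct_plane, z_planes))

print(results)
```

Whether threads actually overlap depends on the per-plane work releasing the GIL, so the gain should be measured on real data rather than assumed.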
# Avoid repeated allocations
import cupy as cp

class BaSiCGPU:
    def __init__(self, image_shape):
        # Allocate once; reuse for every image of this shape
        self.buffer = cp.empty(image_shape, dtype=cp.float32)
        self.fft_buffer = cp.empty(image_shape, dtype=cp.complex64)
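How a pre-allocated buffer gets reused can be sketched with the `out=` argument of array ufuncs. NumPy is used here as a CPU stand-in so the example runs without a GPU; CuPy ufuncs accept `out=` as well and should behave the same way. The class `ReusableBuffers` and its method are illustrative and not taken from the original `BaSiCGPU`.

```python
import numpy as np

class ReusableBuffers:
    """Holds one scratch array that is overwritten on every call."""

    def __init__(self, image_shape):
        self.buffer = np.empty(image_shape, dtype=np.float32)

    def subtract_darkfield(self, image, darkfield):
        # out= writes into the existing array instead of allocating a new one.
        np.subtract(image, darkfield, out=self.buffer)
        return self.buffer

shape = (16, 16)
worker = ReusableBuffers(shape)
dark = np.full(shape, 2.0, dtype=np.float32)

first = worker.subtract_darkfield(np.full(shape, 10.0, dtype=np.float32), dark)
first_value = float(first[0, 0])  # read before the next call overwrites it
second = worker.subtract_darkfield(np.full(shape, 20.0, dtype=np.float32), dark)
second_value = float(second[0, 0])
```

The trade-off is that the returned array is only valid until the next call, so a result that must be kept needs an explicit copy.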
Even in these cases, validate carefully before using cached profiles.
BaSiC_Caching_Validation_Test.ipynb