Parse Flow Cytometry Standard (FCS) files v2.0–3.1 and extract events/metadata for preprocessing workflows (e.g., when you need NumPy arrays, channel info, or CSV/DataFrame export from cytometry files).
Event data is returned as a NumPy ndarray with shape (events, channels).

Requirements:
- python >= 3.9
- flowio (install via pip/uv; version depends on your environment)
- numpy >= 1.20
- pandas >= 1.5

"""
End-to-end example:
1) Read an FCS file (metadata + events)
2) Convert to a Pandas DataFrame and export CSV
3) Filter events and write a new FCS file
4) Handle multi-dataset files
"""
from pathlib import Path
import numpy as np
import pandas as pd
from flowio import (
    FlowData,
    create_fcs,
    read_multiple_data_sets,
    MultipleDataSetsError,
    FCSParsingError,
    DataOffsetDiscrepancyError,
)

FCS_PATH = "sample.fcs"


def read_fcs_safely(path: str) -> FlowData:
    try:
        return FlowData(path)
    except DataOffsetDiscrepancyError:
        # Common workaround for files with inconsistent offsets
        return FlowData(path, ignore_offset_discrepancy=True)
    except FCSParsingError:
        # Looser mode if the file is malformed
        return FlowData(path, ignore_offset_error=True)


def main() -> None:
    # --- 1) Read file (single dataset) ---
    try:
        flow = read_fcs_safely(FCS_PATH)
    except MultipleDataSetsError:
        # --- 4) Multi-dataset handling ---
        datasets = read_multiple_data_sets(FCS_PATH)
        flow = datasets[0]  # pick the first dataset for this demo

    print("File:", getattr(flow, "name", Path(FCS_PATH).name))
    print("FCS version:", flow.version)
    print("Events:", flow.event_count)
    print("Channels:", flow.channel_count)
    print("PnN labels:", flow.pnn_labels)

    # Metadata (TEXT segment). FlowIO lower-cases TEXT keys and drops the
    # leading "$", so $CYT is stored under "cyt" and $DATE under "date".
    print("Instrument ($CYT):", flow.text.get("cyt", "N/A"))
    print("Acquisition date ($DATE):", flow.text.get("date", "N/A"))

    # --- 2) Events -> NumPy -> DataFrame -> CSV ---
    events = flow.as_array(preprocess=True)  # default preprocessing behavior
    df = pd.DataFrame(events, columns=flow.pnn_labels)
    df.to_csv("events.csv", index=False)
    print("Wrote CSV:", "events.csv")

    # --- 3) Filter and write a new FCS ---
    # Example: threshold on first scatter channel if available, else channel 0
    fsc_idx = flow.scatter_indices[0] if getattr(flow, "scatter_indices", []) else 0
    threshold = np.percentile(events[:, fsc_idx], 50)  # median threshold
    mask = events[:, fsc_idx] > threshold
    filtered = events[mask]

    # create_fcs (FlowIO 1.x) takes an open binary file handle, a flattened
    # 1-D event sequence, and the metadata under the metadata_dict keyword.
    with open("filtered.fcs", "wb") as fh:
        create_fcs(
            fh,
            filtered.flatten(),
            flow.pnn_labels,
            opt_channel_names=flow.pns_labels,
            metadata_dict={**flow.text, "src": "Filtered via FlowIO example"},
        )
    print("Wrote FCS:", "filtered.fcs")

    # --- Metadata-only read (memory efficient) ---
    meta_only = FlowData(FCS_PATH, only_text=True)
    print("Metadata-only read: $DATE =", meta_only.text.get("date", "N/A"))


if __name__ == "__main__":
    main()
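
The script above requests preprocessed events with as_array(preprocess=True). One transformation the FCS standard defines for channels stored on a log scale is the $PnE conversion, value = f2 * 10^(f1 * raw_value / R), where $PnE = "f1,f2" and R is the channel range from $PnR. The following is a minimal pure-Python sketch of that rule for a single value. It is not FlowIO's implementation: the helper names parse_pne and log_scale are hypothetical, and treating an invalid f2 of 0 as 1 is assumed from the FCS 3.1 recommendation.

```python
def parse_pne(pne: str) -> tuple[float, float]:
    """Split a $PnE string such as "4,1" into (decades, offset)."""
    decades, offset = (float(part) for part in pne.split(","))
    return decades, offset


def log_scale(raw_value: float, decades: float, offset: float,
              channel_range: float) -> float:
    """Apply the $PnE log conversion to one raw channel value.

    Hypothetical helper for illustration; not part of FlowIO.
    """
    if decades == 0:
        # $PnE = "0,0" marks a linear channel, so the value is left unchanged
        return raw_value
    if offset == 0:
        # Assumption: an invalid offset of 0 is treated as 1
        offset = 1.0
    return offset * 10 ** (decades * raw_value / channel_range)


# A 4-decade channel with range 1024: mid-scale maps to 10^2
decades, offset = parse_pne("4,1")
for raw in (0, 512, 1024):
    print(raw, "->", log_scale(raw, decades, offset, 1024))
```

With these inputs the raw values 0, 512, and 1024 land at roughly 1, 100, and 10000, which shows why log-encoded channels look compressed until preprocessing is applied.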
An FCS file is organized into segments:
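- HEADER: the FCS version and the byte offsets of the other segments
- TEXT: keyword/value metadata (e.g., $DATE, $CYT, $PnN, $PnS, $PnR, $PnG, $PnE)
- DATA: the recorded event values
- ANALYSIS: optional results of post-acquisition analysis

In FlowIO, these are exposed via FlowData attributes such as: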
- flow.header (HEADER info)
- flow.text (TEXT keyword dictionary)
- flow.analysis (ANALYSIS keyword dictionary, if present)
- flow.as_array(...) (decoded event matrix)

When preprocessing is enabled (as_array(preprocess=True)), FlowIO applies common FCS transformations:
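- Log scaling ($PnE): value = f2 * 10^(f1 * raw_value / R), where $PnE = "f1,f2" (f1 is the number of decades, f2 the linear value at channel 0) and R is the channel range ($PnR).

To disable all transformations and obtain raw decoded values: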
    flow.as_array(preprocess=False)

FlowIO provides convenience indices for common channel types:
- flow.scatter_indices (e.g., FSC/SSC)
- flow.fluoro_indices (fluorescence channels)
- flow.time_index (time channel index or None)

These indices can be used to slice the event matrix:
    events[:, flow.scatter_indices]
    events[:, flow.fluoro_indices]

Some files contain inconsistent offsets between HEADER and TEXT:
- ignore_offset_discrepancy=True to tolerate a HEADER/TEXT offset mismatch.
- use_header_offsets=True to prefer HEADER offsets.
- ignore_offset_error=True to bypass offset-related failures more aggressively.

To exclude known null/empty channels during parsing:
    FlowData(path, null_channel_list=[...])

If a file contains multiple datasets, constructing FlowData(path) may raise MultipleDataSetsError. Use:
- read_multiple_data_sets(path) to load all datasets, or
- FlowData(path, nextdata_offset=...) to load a specific dataset using $NEXTDATA offsets.

Two common patterns for writing FCS files:
- flow.write_fcs("out.fcs", metadata={...}) to save the loaded events again, optionally with updated metadata.
- create_fcs(...) to generate a new file from modified events (FlowIO does not modify event data in-place).
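
The HEADER/TEXT offset mismatch mentioned earlier is easier to picture with the HEADER layout in view: a 6-byte version string, 4 spaces, then six right-justified 8-byte ASCII offsets (TEXT, DATA, and ANALYSIS start and end). The sketch below decodes a synthetic header using only the standard library and compares its DATA offsets with $BEGINDATA/$ENDDATA. The helpers parse_header and data_offsets_agree are hypothetical and not part of FlowIO, and the TEXT dictionary is hand-written for the demo rather than read from a file.

```python
OFFSET_FIELDS = (
    "text_start", "text_end",
    "data_start", "data_end",
    "analysis_start", "analysis_end",
)


def parse_header(header: bytes) -> dict:
    """Decode the fixed-width FCS HEADER into a version and six offsets."""
    info = {"version": header[0:6].decode("ascii")}
    for i, name in enumerate(OFFSET_FIELDS):
        start = 10 + 8 * i  # offsets begin after version + 4 spaces
        field = header[start:start + 8].decode("ascii").strip()
        info[name] = int(field) if field else 0  # blank fields read as 0
    return info


def data_offsets_agree(info: dict, text: dict) -> bool:
    """Check HEADER DATA offsets against $BEGINDATA/$ENDDATA from TEXT."""
    return (
        info["data_start"] == int(text["$BEGINDATA"])
        and info["data_end"] == int(text["$ENDDATA"])
    )


# Synthetic 58-byte header for the demo (not taken from a real file)
header = b"FCS3.1" + b" " * 4 + b"".join(
    f"{n:>8}".encode("ascii") for n in (58, 1023, 1024, 9023, 0, 0)
)
info = parse_header(header)
print(info["version"], info["data_start"], info["data_end"])

# Matching TEXT offsets, then a one-byte mismatch in $ENDDATA
print(data_offsets_agree(info, {"$BEGINDATA": "1024", "$ENDDATA": "9023"}))
print(data_offsets_agree(info, {"$BEGINDATA": "1024", "$ENDDATA": "9024"}))
```

A mismatch like the second case is the kind of situation the ignore_offset_discrepancy and use_header_offsets options are described as handling.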