Parse FCS (Flow Cytometry Standard) files v2.0-3.1. Extract events as NumPy arrays, read metadata/channels, convert to CSV/DataFrame, for flow cytometry data preprocessing.
FlowIO is a lightweight Python library for reading and writing Flow Cytometry Standard (FCS) files. It parses FCS metadata, extracts event data, and creates new FCS files with minimal dependencies. The library supports FCS versions 2.0, 3.0, and 3.1, which suits it to backend services, data pipelines, and basic cytometry file operations.
This skill should be used when:
Related Tools: For advanced flow cytometry analysis, including compensation, gating, and FlowJo/GatingML support, recommend the FlowKit library as a companion to FlowIO.
uv pip install flowio
Requires Python 3.9 or later.
from flowio import FlowData
# Read FCS file
flow_data = FlowData('experiment.fcs')
# Access basic information
print(f"FCS Version: {flow_data.version}")
print(f"Events: {flow_data.event_count}")
print(f"Channels: {flow_data.pnn_labels}")
# Get event data as NumPy array
events = flow_data.as_array() # Shape: (events, channels)
import numpy as np
from flowio import create_fcs
# Prepare data
data = np.array([[100, 200, 50], [150, 180, 60]]) # 2 events, 3 channels
channels = ['FSC-A', 'SSC-A', 'FL1-A']
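# Create FCS file (create_fcs takes a binary file handle and flattened 1-D event data)
with open('output.fcs', 'wb') as fh:
    create_fcs(fh, data.flatten(), channels)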
The FlowData class provides the primary interface for reading FCS files.
Standard Reading:
from flowio import FlowData
# Basic reading
flow = FlowData('sample.fcs')
# Access attributes
version = flow.version # '3.0', '3.1', etc.
event_count = flow.event_count # Number of events
channel_count = flow.channel_count # Number of channels
pnn_labels = flow.pnn_labels # Short channel names
pns_labels = flow.pns_labels # Descriptive stain names
# Get event data
events = flow.as_array() # Preprocessed (gain, log scaling applied)
raw_events = flow.as_array(preprocess=False) # Raw data
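Once events are available as rows of channel values, writing them to CSV needs only the standard library. The following is a minimal sketch, not part of FlowIO: `events_to_csv` is a hypothetical helper name, and it assumes the rows are any iterable of equal-length sequences (a 2-D NumPy array iterates row by row, so it should fit this shape).

```python
import csv
import io


def events_to_csv(rows, labels) -> str:
    """Serialize event rows to CSV text with channel labels as the header."""
    labels = list(labels)
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(labels)
    for row in rows:
        row = list(row)
        # Every event must carry exactly one value per channel.
        if len(row) != len(labels):
            raise ValueError(
                f"expected {len(labels)} values per event, got {len(row)}"
            )
        writer.writerow(row)
    return buffer.getvalue()


# Two synthetic events with three channels each
csv_text = events_to_csv(
    [[100, 200, 50], [150, 180, 60]],
    ["FSC-A", "SSC-A", "FL1-A"],
)
```

With a parsed file, a call such as `events_to_csv(flow.as_array(), flow.pnn_labels)` would be the intended usage under the assumptions above.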
Memory-Efficient Metadata Reading:
When only metadata is needed (no event data):
# Only parse TEXT segment, skip DATA and ANALYSIS
flow = FlowData('sample.fcs', only_text=True)
# Access metadata
metadata = flow.text  # Dictionary of TEXT segment keywords
# FlowIO stores keywords lowercased and without the '$' prefix
print(metadata.get('date'))  # Acquisition date ($DATE)
print(metadata.get('cyt'))  # Instrument name ($CYT)
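The FCS standard writes keywords as `$DATE` or `$CYT`, while a parsed dictionary may hold them in a normalized form such as `date`. A lookup that tolerates either spelling avoids silent `None` results. This is a minimal sketch: `get_keyword` is a hypothetical helper, not a FlowIO function, and it assumes the metadata is a plain `dict` with string keys.

```python
def get_keyword(text: dict, keyword: str, default=None):
    """Look up an FCS keyword, ignoring case and an optional '$' prefix."""
    wanted = keyword.lstrip("$").lower()
    for key, value in text.items():
        if key.lstrip("$").lower() == wanted:
            return value
    return default


# Works for both normalized and standard-style keys
normalized = {"date": "19-OCT-2025", "cyt": "Synthetic Instrument"}
standard = {"$DATE": "19-OCT-2025", "$CYT": "Synthetic Instrument"}

date_a = get_keyword(normalized, "$DATE")
date_b = get_keyword(standard, "date")
missing = get_keyword(normalized, "$OP", default="Unknown")
```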
Handling Problematic Files:
Some FCS files have offset discrepancies or errors:
# Ignore offset discrepancies between HEADER and TEXT sections
flow = FlowData('problematic.fcs', ignore_offset_discrepancy=True)
# Use HEADER offsets instead of TEXT offsets
flow = FlowData('problematic.fcs', use_header_offsets=True)
# Ignore offset errors entirely
flow = FlowData('problematic.fcs', ignore_offset_error=True)
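To see where such discrepancies come from, it helps to look at the HEADER itself. In the FCS standard the HEADER is fixed-width ASCII: the version string occupies bytes 0-5, and six right-justified 8-byte fields starting at byte 10 give the start and end offsets of TEXT, DATA, and ANALYSIS. The TEXT segment repeats the DATA offsets in `$BEGINDATA` and `$ENDDATA`, and the two sources can disagree. The sketch below is standard-library only and is not FlowIO's implementation; `parse_fcs_header` and `data_offsets_disagree` are hypothetical names, and the TEXT dictionary is assumed to use the standard `$`-prefixed keys.

```python
def parse_fcs_header(header: bytes) -> dict:
    """Parse the fixed-width HEADER segment of an FCS 2.0/3.0/3.1 file."""
    if len(header) < 58:
        raise ValueError("an FCS HEADER is at least 58 bytes long")
    magic = header[0:6].decode("ascii")
    if not magic.startswith("FCS"):
        raise ValueError(f"not an FCS file: {magic!r}")

    names = ("text_start", "text_end", "data_start",
             "data_end", "analysis_start", "analysis_end")
    parsed = {"version": magic[3:]}
    for i, name in enumerate(names):
        # Six right-justified 8-byte ASCII fields begin at byte 10.
        field = header[10 + 8 * i:18 + 8 * i].decode("ascii").strip()
        parsed[name] = int(field) if field else 0
    return parsed


def data_offsets_disagree(header: dict, text: dict) -> bool:
    """Compare HEADER data offsets with $BEGINDATA/$ENDDATA from TEXT."""
    if header["data_start"] == 0 and header["data_end"] == 0:
        # Zeros in the HEADER defer to TEXT (offsets too large for 8 digits).
        return False
    return (header["data_start"] != int(text["$BEGINDATA"])
            or header["data_end"] != int(text["$ENDDATA"]))


# Synthetic 58-byte header, for illustration only
example = b"FCS3.1    " + b"".join(
    b"%8d" % n for n in (58, 1023, 1024, 5023, 0, 0)
)
info = parse_fcs_header(example)
```

A file for which `data_offsets_disagree` returns `True` is the kind of file the options above are meant for.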
Excluding Null Channels:
# Exclude specific channels during parsing
flow = FlowData('sample.fcs', null_channel_list=['Time', 'Null'])
FCS files contain rich metadata in the TEXT segment.
Common Metadata Keywords:
flow = FlowData('sample.fcs')
# File-level metadata
text_dict = flow.text
acquisition_date = text_dict.get('date', 'Unknown')  # $DATE; keys are lowercased without '$'
instrument = text_dict.get('cyt', 'Unknown')  # $CYT
data_type = flow.data_type # 'I', 'F', 'D', 'A'
# Channel metadata
for i in range(flow.channel_count):
    pnn = flow.pnn_labels[i]  # Short name (e.g., 'FSC-A')
    pns = flow.pns_labels[i]  # Descriptive name (e.g., 'Forward Scatter')
    pnr = flow.pnr_values[i]  # Range/max value
    print(f"Channel {i}: {pnn} ({pns}), Range: {pnr}")
Channel Type Identification:
FlowIO automatically categorizes channels:
# Get indices by channel type
scatter_idx = flow.scatter_indices # [0, 1] for FSC, SSC
fluoro_idx = flow.fluoro_indices # [2, 3, 4] for FL channels
time_idx = flow.time_index # Index of time channel (or None)
# Access specific channel types
events = flow.as_array()
scatter_data = events[:, scatter_idx]
fluorescence_data = events[:, fluoro_idx]
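Channel categorization of this kind can be approximated from the PnN labels alone. The sketch below shows one plausible heuristic and is an assumption, not FlowIO's actual rule: labels starting with `FSC` or `SSC` count as scatter, a label equal to `Time` is the time channel, and everything else is treated as fluorescence. `classify_channels` is a hypothetical name.

```python
def classify_channels(pnn_labels):
    """Split channel indices into scatter, fluorescence, and time by label."""
    scatter, fluoro, time_index = [], [], None
    for i, label in enumerate(pnn_labels):
        name = label.strip().lower()
        if name == "time":
            time_index = i
        elif name.startswith(("fsc", "ssc")):
            scatter.append(i)
        else:
            # Anything that is neither scatter nor time is assumed fluorescent.
            fluoro.append(i)
    return scatter, fluoro, time_index


scatter_idx, fluoro_idx, time_idx = classify_channels(
    ["FSC-A", "SSC-A", "FL1-A", "FL2-A", "Time"]
)
```

A label-based rule like this misclassifies instruments with unusual naming, so the indices should be checked against the panel before they are used for analysis.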
ANALYSIS Segment:
If present, access processed results:
if flow.analysis:
    analysis_keywords = flow.analysis  # Dictionary of ANALYSIS keywords
    print(analysis_keywords)
Generate FCS files from NumPy arrays or other data sources.
Basic Creation:
import numpy as np
from flowio import create_fcs
# Create event data (rows=events, columns=channels)
events = np.random.rand(10000, 5) * 1000
# Define channel names
channel_names = ['FSC-A', 'SSC-A', 'FL1-A', 'FL2-A', 'Time']
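# Create FCS file (create_fcs takes a binary file handle and flattened 1-D event data)
with open('output.fcs', 'wb') as fh:
    create_fcs(fh, events.flatten(), channel_names)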
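FCS list-mode data is stored event by event, so a table with events as rows and channels as columns is serialized in row-major order: all channel values of event 1, then all of event 2, and so on. The sketch below illustrates that layout with the standard library only. It is not FlowIO's code; `flatten_events` is a hypothetical helper, and the 4-byte float packing mirrors the single-precision output format rather than any specific writer.

```python
from array import array


def flatten_events(rows, n_channels):
    """Flatten event rows into a 1-D, row-major list of floats."""
    flat = []
    for row in rows:
        row = list(row)
        # Reject ragged input instead of silently shifting channel values.
        if len(row) != n_channels:
            raise ValueError(
                f"expected {n_channels} values per event, got {len(row)}"
            )
        flat.extend(float(value) for value in row)
    return flat


# 2 events x 3 channels become 6 consecutive values
flat = flatten_events([[100, 200, 50], [150, 180, 60]], n_channels=3)

# Packed as single-precision floats, each value occupies 4 bytes
payload = array("f", flat).tobytes()
```

The event count can always be recovered as `len(flat) // n_channels`, which is why the channel list must match the column count exactly.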
With Descriptive Channel Names:
# Add optional descriptive names (PnS)
channel_names = ['FSC-A', 'SSC-A', 'FL1-A', 'FL2-A', 'Time']
descriptive_names = ['Forward Scatter', 'Side Scatter', 'FITC', 'PE', 'Time']
with open('output.fcs', 'wb') as fh:
    create_fcs(fh,
               events.flatten(),
               channel_names,
               opt_channel_names=descriptive_names)
With Custom Metadata:
# Add TEXT segment metadata
metadata = {
    '$SRC': 'Python script',
    '$DATE': '19-OCT-2025',
    '$CYT': 'Synthetic Instrument',
    '$INST': 'Laboratory A'
}
with open('output.fcs', 'wb') as fh:
    create_fcs(fh,
               events.flatten(),
               channel_names,
               opt_channel_names=descriptive_names,
               metadata_dict=metadata)
Note: FlowIO exports as FCS 3.1 with single-precision floating-point data.
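Single-precision output has a practical consequence: a 32-bit float carries a 24-bit significand, so integers above 2**24 (16,777,216) are no longer all representable, and many decimal fractions are rounded. The sketch below demonstrates the effect with the standard library; `to_float32` is a hypothetical helper used only for illustration.

```python
import struct


def to_float32(value: float) -> float:
    """Round-trip a value through a 4-byte IEEE 754 single-precision float."""
    return struct.unpack("<f", struct.pack("<f", float(value)))[0]


exact = to_float32(16_777_216)    # 2**24 is stored exactly
rounded = to_float32(16_777_217)  # 2**24 + 1 rounds to the nearest float32
fraction = to_float32(0.1)        # 0.1 is stored as the nearest float32
```

Values such as high-resolution time stamps or 32-bit integer counts may therefore change slightly when exported, which is worth checking before relying on exact equality after a round trip.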
Modify existing FCS files and re-export them.
Approach 1: Using write_fcs() Method:
from flowio import FlowData
# Read original file
flow = FlowData('original.fcs')
# Write with updated metadata
flow.write_fcs('modified.fcs', metadata={'$SRC': 'Modified data'})
Approach 2: Extract, Modify, and Recreate:
For modifying event data:
from flowio import FlowData, create_fcs
# Read and extract data
flow = FlowData('original.fcs')
events = flow.as_array(preprocess=False)
# Modify event data
events[:, 0] = events[:, 0] * 1.5 # Scale first channel
# Create new FCS file with modified data
with open('modified.fcs', 'wb') as fh:
    create_fcs(fh,
               events.flatten(),
               flow.pnn_labels,
               opt_channel_names=flow.pns_labels,
               metadata_dict=flow.text)
Some FCS files contain multiple datasets in a single file.
Detecting Multi-Dataset Files:
from flowio import FlowData, MultipleDataSetsError