Bulk RNA-seq QC Pipeline

Step 1: Load Count Matrix

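Load the bulk RNA-seq count matrix from the data path using scanpy (sc.read_h5ad). Print the shape as genes x samples; note that AnnData stores samples as rows and genes as columns, so adata.shape is (samples, genes) and must be read in reverse or transposed. Show the first few rows of the matrix and the sample metadata (adata.obs).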
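The loading step could be sketched roughly as follows. The helper names (load_adata, counts_from_adata, describe) are hypothetical and not part of the original pipeline, and the path is whatever the data path is in your setup. scanpy is imported lazily so that the transposition helper can be used without it.

```python
import pandas as pd


def load_adata(path):
    """Read an .h5ad file with scanpy (assumes scanpy is installed)."""
    import scanpy as sc  # lazy import: only needed when actually reading a file
    return sc.read_h5ad(path)


def counts_from_adata(adata) -> pd.DataFrame:
    """Return a genes x samples DataFrame from an AnnData-like object."""
    # AnnData keeps samples (observations) as rows; transpose to put genes on rows.
    return adata.to_df().T


def describe(adata) -> pd.DataFrame:
    """Print the shape, the first rows, and the sample metadata."""
    counts = counts_from_adata(adata)
    print(f"Shape (genes x samples): {counts.shape}")
    print(counts.head())
    print(adata.obs.head())
    return counts
```

A typical call would be counts = describe(load_adata(path)), which also yields the genes x samples matrix that Step 2 operates on.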

Step 2: Basic QC Metrics

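import pandas as pd
import numpy as np

# `counts` is assumed to be a genes x samples pandas DataFrame of raw counts,
# e.g. counts = adata.to_df().T from the AnnData object loaded in Step 1.
total_counts = counts.sum(axis=0)          # library size per sample
detected_genes = (counts > 0).sum(axis=0)  # genes with at least one read, per sample
print(f"Samples: {counts.shape[1]}, Genes: {counts.shape[0]}")
print(f"Median total counts/sample: {total_counts.median():.0f}")
print(f"Median detected genes/sample: {detected_genes.median():.0f}")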

Step 3: Normalize Counts

Apply CPM (counts per million) normalization followed by a log2(x+1) transform. Store the normalized matrix. Print the shape. skip_save = True for this instruction.
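A minimal sketch of this normalization, assuming counts is a genes x samples pandas DataFrame as in Step 2. The function name log_cpm and the toy matrix are illustrative only. A sample with zero total counts would produce NaN values here and should be filtered beforehand.

```python
import numpy as np
import pandas as pd


def log_cpm(counts: pd.DataFrame) -> pd.DataFrame:
    """CPM-normalize a genes x samples count matrix, then apply log2(x + 1)."""
    # Library size: total counts per sample (one value per column).
    lib_sizes = counts.sum(axis=0)
    # Scale each column so that it sums to one million.
    cpm = counts.div(lib_sizes, axis="columns") * 1e6
    return np.log2(cpm + 1)


# Toy matrix with made-up numbers, only to show the call.
toy = pd.DataFrame(
    {"sample_a": [10, 0, 90], "sample_b": [5, 5, 0]},
    index=["gene1", "gene2", "gene3"],
)
normalized = log_cpm(toy)
print(normalized.shape)
```

Undoing the log transform (2**normalized - 1) gives back columns that each sum to one million, which is a quick sanity check on the result.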

Step 4: PCA Plot
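The source gives no details for this step. One possible sketch, assuming the input is the genes x samples log2(CPM+1) matrix from Step 3, computes the principal components with a plain NumPy SVD and draws the first two with matplotlib. The function names, the output file name pca.png, and the choice of SVD over a library PCA routine are all assumptions.

```python
import numpy as np
import pandas as pd


def pca_coordinates(log_cpm: pd.DataFrame, n_components: int = 2):
    """Project samples onto the top principal components.

    log_cpm is assumed to be a genes x samples matrix.
    Returns (scores DataFrame, explained-variance ratios).
    """
    # Samples become rows; centre each gene so PCA reflects variance, not mean level.
    x = log_cpm.T.to_numpy(dtype=float)
    x = x - x.mean(axis=0)
    # SVD of the centred matrix: u * s gives the per-sample scores.
    u, s, _ = np.linalg.svd(x, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    explained = (s ** 2) / (s ** 2).sum()
    cols = [f"PC{i + 1}" for i in range(n_components)]
    return (
        pd.DataFrame(scores, index=log_cpm.columns, columns=cols),
        explained[:n_components],
    )


def plot_pca(scores: pd.DataFrame, explained, out_path: str = "pca.png") -> None:
    """Scatter PC1 against PC2 and save the figure (assumes matplotlib is installed)."""
    import matplotlib.pyplot as plt  # lazy import: only needed for plotting

    fig, ax = plt.subplots()
    ax.scatter(scores["PC1"], scores["PC2"])
    # Label each point with its sample name.
    for name, row in scores.iterrows():
        ax.annotate(str(name), (row["PC1"], row["PC2"]))
    ax.set_xlabel(f"PC1 ({explained[0]:.1%} variance)")
    ax.set_ylabel(f"PC2 ({explained[1]:.1%} variance)")
    fig.savefig(out_path)
```

Samples that sit far from the rest on PC1 or PC2 are candidates for a closer look against the QC metrics from Step 2.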
