Extend scRNA-seq developmental trajectories with BulkTrajBlend by generating intermediate cells from bulk RNA-seq, training beta-VAE and GNN models, and interpolating missing states.
Invoke this skill when users need to bridge gaps in single-cell developmental trajectories using matched bulk RNA-seq. It follows `t_bulktrajblend.ipynb`, showcasing how BulkTrajBlend deconvolves dentate gyrus bulk samples, identifies overlapping communities with a GNN, and interpolates "interrupted" cell states.
Instructions

1. Prepare libraries and inputs
   - Import `omicverse as ov`, `scanpy as sc`, `scvelo as scv`, and helper functions like `from omicverse.utils import mde`; run `ov.plot_set()`.
   - Load the reference scRNA-seq AnnData (`scv.datasets.dentategyrus()`) and raw bulk counts with `ov.utils.read(...)` followed by `ov.bulk.Matrix_ID_mapping(...)` for gene ID harmonisation.
2. Configure BulkTrajBlend
   - Instantiate `ov.bulk2single.BulkTrajBlend(bulk_seq=bulk_df, single_seq=adata, bulk_group=['dg_d_1','dg_d_2','dg_d_3'], celltype_key='clusters')`.
   - Explain that the `bulk_group` names correspond to raw bulk columns and that the method expects unscaled counts.
   - Run `bulktb.vae_configure(cell_target_num=100)` (or pass a dictionary) to define expected cell counts per cluster. Mention that omitting the argument triggers TAPE-based estimation.
3. Train or load the beta-VAE
   - Train with `bulktb.vae_train(batch_size=512, learning_rate=1e-4, hidden_size=256, epoch_num=3500, vae_save_dir='...', vae_save_name='dg_btb_vae', generate_save_dir='...', generate_save_name='dg_btb')`.
   - Reuse saved weights with `bulktb.vae_load('.../dg_btb_vae.pth')` and point out the need to regenerate cells with consistent random seeds for reproducibility.
4. Generate synthetic cells
   - Generate filtered cells with `bulktb.vae_generate(leiden_size=25)` and inspect compositions with `ov.bulk2single.bulk2single_plot_cellprop(...)`.
   - Save the generated cells for reuse (`adata.write_h5ad`).
5. Configure and train the GNN
   - Call `bulktb.gnn_configure(max_epochs=2000, use_rep='X', neighbor_rep='X_pca', gpu=0, ...)` to set hyperparameters.
   - Train with `bulktb.gnn_train()`; reload checkpoints with `bulktb.gnn_load('save_model/gnn.pth')`.
   - Produce the overlapping community assignments with `bulktb.gnn_generate()`.
6. Visualise community structure
   - Compute an MDE embedding: `bulktb.nocd_obj.adata.obsm['X_mde'] = mde(bulktb.nocd_obj.adata.obsm['X_pca'])`.
   - Plot clusters against the discovered communities with `sc.pl.embedding(..., color=['clusters','nocd_n'], palette=ov.utils.pyomic_palette())`, and also plot filtered subsets that exclude synthetic labels with hyphens.
7. Interpolate missing states
   - Run `bulktb.interpolation('OPC')` (replace with the target lineage) to synthesise continuity, then preprocess the interpolated AnnData (HVG selection, scaling, PCA).
   - Compute embeddings with `mde`, visualise with `ov.utils.embedding`, and compare to the original atlas.
8. Analyse trajectories
   - Run `ov.single.pyVIA` on both original and interpolated data to derive pseudotime, followed by `get_pseudotime`, `sc.pp.neighbors`, `ov.utils.cal_paga`, and `ov.utils.plot_paga` for topology validation.
9. Troubleshooting
   - If beta-VAE training is unstable, lower `learning_rate` or reduce `hidden_size`.
   - Use the same generated dataset before calling `gnn_train`; regenerating cells changes the graph and can break checkpoint loading.
   - Rare populations may need adjusted `cell_target_num` thresholds or a smaller `leiden_size` filter to be retained.

References

- Tutorial notebook: `t_bulktrajblend.ipynb`
- Example data: `omicverse_guide/docs/Tutorials-bulk2single/data/`
- Quick reference: `reference.md`
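
Example: pre-flight checks

Several of the requirements above (the `bulk_group` names must be columns of the raw bulk matrix, counts must be unscaled, `cell_target_num` may be a dictionary, hyphenated labels are filtered before plotting) can be checked with plain Python before any model is trained. The following is a minimal sketch, not part of omicverse: all four helper names are hypothetical, the raw-count test is only a heuristic, and the dictionary layout (cluster name mapped to target cell count) is an assumption based on the "expected cell counts per cluster" wording.

```python
def missing_bulk_groups(bulk_columns, bulk_group):
    """Return the bulk_group names that are absent from the bulk matrix columns."""
    available = set(bulk_columns)
    return [name for name in bulk_group if name not in available]


def looks_like_raw_counts(values):
    """Heuristic check: raw counts are non-negative and integer-valued.

    Log-transformed or scaled matrices usually contain fractions or negatives.
    """
    return all(v >= 0 and float(v).is_integer() for v in values)


def build_cell_targets(cluster_labels, default=100, overrides=None):
    """Build a {cluster: target cell count} dictionary.

    Assumption: this is the shape expected when a dictionary is passed as
    cell_target_num; verify against the omicverse documentation.
    """
    targets = {cluster: default for cluster in sorted(set(cluster_labels))}
    for cluster, count in (overrides or {}).items():
        if cluster not in targets:
            raise ValueError(f"unknown cluster in overrides: {cluster!r}")
        targets[cluster] = count
    return targets


def drop_hyphenated_labels(labels):
    """Keep only labels without a hyphen, mirroring the filtered plots in step 6."""
    return [label for label in labels if "-" not in label]


if __name__ == "__main__":
    columns = ["dg_d_1", "dg_d_2", "dg_d_3"]
    print(missing_bulk_groups(columns, ["dg_d_1", "dg_d_4"]))
    print(looks_like_raw_counts([0, 3, 12]))
    print(build_cell_targets(["OPC", "Astrocytes", "OPC"], overrides={"OPC": 50}))
    print(drop_hyphenated_labels(["OPC", "OPC-Astrocytes", "Neuroblast"]))
```

With real data, the column names would come from `bulk_df.columns`, the cluster labels from `adata.obs['clusters']`, and the resulting dictionary would be passed as `bulktb.vae_configure(cell_target_num=targets)`. An empty list from `missing_bulk_groups` means every requested bulk sample was found.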