Running jobs on the Janelia cluster
From outside the cluster (ssh wrapper)
ssh login1 'bash -l -c "bjobs ..."'
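The wrapper above can be made reusable with a small shell function. This is a sketch, not a supported tool: the `lsf` name and the `DRY_RUN` switch are inventions for illustration, and the naive `$*` quoting breaks on arguments containing spaces.

```shell
# Sketch of a wrapper for running LSF commands from outside the cluster.
# DRY_RUN=1 prints the remote invocation instead of executing it; the real
# call requires ssh access to login1. Naive quoting: args with spaces break.
lsf() {
    local remote="bash -l -c '$*'"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "ssh login1 $remote"
    else
        ssh login1 "$remote"
    fi
}

# Example (requires cluster access): lsf bjobs -u all
```

The inner `bash -l -c` matters: a login shell loads the profile that puts the LSF binaries on PATH.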
Running a job on a GPU node:
bsub -n $SLOTS -gpu "num=1" -q $QUEUE -W $MINUTES -R "affinity[core(1)]" -J $JOBNAME -o $LOGFILE $COMMAND
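With the placeholders filled in, a submission might look like the sketch below. All values (queue, job name, log path, `python train.py`) are illustrative; the command string is composed first so it can be inspected before running it on a login node.

```shell
# Illustrative values; adjust for your job. Paths and names are placeholders.
SLOTS=12 QUEUE=gpu_a100 MINUTES=240
JOBNAME=train_model LOGFILE=/groups/mylab/logs/train.log

# Compose the submission so it can be inspected before submitting.
CMD="bsub -n $SLOTS -gpu \"num=1\" -q $QUEUE -W $MINUTES -R \"affinity[core(1)]\" -J $JOBNAME -o $LOGFILE python train.py"
echo "$CMD"
```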
Job status
bjobs
bjobs -l $JOBID
bjobs -o "job_name stat exec_host" -noheader
Kill jobs
bkill $JOBID — kill one
bkill 0 — kill all yours
bkill -J "jobname_*" 0 — kill by name pattern
Launch interactive job:
ssh -Y login1 'bsub -XF -Is -n 8 -gpu "num=1" -q gpu_a100 -W 48:00 /bin/bash'
Template placeholders: $SLOTS = number of slots, $QUEUE = queue name, $MINUTES = runtime limit, $JOBNAME = job name.
| Queue | GPU | VRAM | Price/GPU/hr | Slots/GPU | RAM/slot |
|---|---|---|---|---|---|
| gpu_a100 | A100 | 80GB | $0.20 | 12 | 40GB |
| gpu_l4 | L4 | 24GB | $0.10 | 8 | 15GB |
| gpu_l4_16 | L4 | 24GB | $0.10 | 16 | 15GB |
| gpu_l4_large | L4 | 24GB | $0.10 | 64 | 15GB |
| gpu_h100 | H100 | 80GB | $0.50 | 12 | 40GB |
| gpu_h200 | H200 | 141GB | $0.80 | 12 | 40GB |
| gpu_t4 | T4 | 16GB | $0.10 | 48 | 15GB |
| gpu_short | All | - | $0.10 | 8 | 15GB |
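The Price/GPU/hr column makes cost estimation a one-liner: cost = price per GPU-hour × GPUs × hours. A sketch using the A100 rate from the table:

```shell
# Rough cost estimate from the table: price_per_gpu_hour * gpus * hours.
# Example: 4 A100s at $0.20/GPU/hr for 10 hours.
awk 'BEGIN { printf "$%.2f\n", 0.20 * 4 * 10 }'
# prints $8.00
```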
| Queue | Runtime Limit | Description |
|---|---|---|
| interactive | Default 8h, max 48h | GUI/interactive apps. Limit: 128 slots or 4 jobs per user |
| local | 14 days | Default for jobs without runtime. CPU-optimized nodes. Limit: 5999 slots per user |
| short | 1 hour | Jobs < 1 hour. No slot limit per user. Gets priority scheduling |
-W sets the hard runtime limit (minutes or HH:MM format); -We sets a runtime estimate.

| Option | Description |
|---|---|
| -J <name> | Job name (avoid: usernames, spaces, "spark", "janelia", "master", "int") |
| -n <slots> | Number of slots (1-128). Env var: LSB_DJOB_NUMPROC |
| -o <file> | Stdout file (suppresses email notification) |
| -e <file> | Stderr file |
| -W <min> | Hard runtime limit (minutes or HH:MM) |
| -We <min> | Runtime estimate (helps scheduler, won't kill job) |
| Setting | Description | Janelia Notes |
|---|---|---|
| num=num_gpus | Number of GPUs | Max = GPUs per host |
| mode=shared\|exclusive_process | GPU sharing mode | Default: exclusive_process |
| mps=yes\|no | Multi-Process Service | Default: no (bugs in the past) |
| j_exclusive=yes\|no | Exclusive GPU access | Do not change; always exclusive |
| gmodel=full_model_name | Request specific GPU model | Only needed for gpu_short; use full model name |
| gmem=mem_value | Minimum GPU memory | Use with gpu_short only; e.g. gmem=16G |
| nvlink=yes | Require NVLink | Not needed; A100/H100/H200 always have NVLink |
Default -gpu settings: "num=1:mode=exclusive_process:mps=no:j_exclusive=yes"
| Type | Description |
|---|---|
| Batch | Single segment, executed once |
| Array | Parallel independent tasks with same workload |
| Parallel | Cooperating tasks (MPI), must run simultaneously |
| Interactive | User login to compute node |
# Single-threaded
bsub -n 1 -J <name> -o /dev/null 'command > output'
# Multi-threaded
bsub -n <1-128> -J <name> -o /dev/null 'command > output'
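For multi-threaded jobs the thread count should match -n; inside the job, LSB_DJOB_NUMPROC holds that value. A sketch of a job script's first lines (the --threads flag name is hypothetical; use whatever your program accepts):

```shell
# Inside a job script: size the thread pool from the slots LSF granted.
# LSB_DJOB_NUMPROC is set by LSF; default to 1 when testing locally.
NTHREADS="${LSB_DJOB_NUMPROC:-1}"
echo "running with $NTHREADS threads"
# typical use (flag name is hypothetical): ./mycommand --threads "$NTHREADS"
```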
bsub -n <slots> -J "jobname[1-n]" -o /dev/null 'command file.$LSB_JOBINDEX > output.$LSB_JOBINDEX'
Limit concurrent members with %val:
bsub -J "myArray[1-1000]%15" /path/to/mybinary input.$LSB_JOBINDEX
Max array size: 1 million elements.
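Each array member sees its own $LSB_JOBINDEX, so one command template fans out over numbered inputs. Simulated locally below; in a real job LSF sets the variable itself:

```shell
# Simulate what one array member would execute.
# LSF sets LSB_JOBINDEX in real jobs; we set it here only for illustration.
LSB_JOBINDEX=3
echo "would run: ./mybinary input.$LSB_JOBINDEX > output.$LSB_JOBINDEX"
# prints: would run: ./mybinary input.3 > output.3
```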
By default the submitting environment is passed to the job.
| Variable | Description |
|---|---|
| $LSB_JOBID | Job ID number |
| $LSB_JOBINDEX | Array task index |
| $LSB_JOBINDEX_STEP | Array step value |
| $LSB_BATCH_JID | Combined job ID and array index |
| $LSB_DJOB_NUMPROC | Value of -n (slots) |
| $LSB_JOBNAME | Value of -J (job name) |
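Logging these at job startup makes it easier to match log files back to jobs. A sketch; the assignments below only simulate what LSF sets in a real job:

```shell
# Log job identity at startup for easier debugging.
# These assignments simulate the LSF-provided environment; a real job inherits them.
LSB_JOBID=12345 LSB_JOBNAME=myjob LSB_DJOB_NUMPROC=4
echo "job $LSB_JOBID ($LSB_JOBNAME) using $LSB_DJOB_NUMPROC slots"
# prints: job 12345 (myjob) using 4 slots
```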
If a job fails with errors about /run/user/<userid>, run unset XDG_RUNTIME_DIR before submitting or inside the job.
Use bjobs (or bjobs -u all to see everyone's jobs). Common states: RUN, PEND, UNKNOWN.

| Task | Command |
|---|---|
| Delete all your jobs | bkill 0 |
| Delete individual job | bkill <job id> |
| Delete array job | bkill <job id> |
| Delete single array task | bkill "<job id>[<task#>]" |
| Delete range of tasks | bkill "12354[1-15, 321, 500-600]" |
| Delete by job name | bkill -J <jobname> 0 |
| Delete by queue | bkill -q <queue> 0 |
lsload -gpuload <hostname>
gpu_ut = processing utilization
CUDA_VISIBLE_DEVICES_ORIG gives the GPU ID inside the job
bjobs -l <jobid> shows GPU assignment under EXTERNAL MESSAGES
Request slots matching the ratio in the GPU queue table; over-requesting strands GPUs.
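The slot count to request follows directly from the Slots/GPU column of the GPU queue table: slots = GPUs × slots-per-GPU. For example, 2 A100s on gpu_a100 (12 slots per GPU):

```shell
# slots to request = gpus * slots_per_gpu (Slots/GPU column in the queue table)
GPUS=2 SLOTS_PER_GPU=12
echo "request: -n $(( GPUS * SLOTS_PER_GPU )) -gpu \"num=$GPUS\""
# prints: request: -n 24 -gpu "num=2"
```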
| Path | Backed up | Notes |
|---|---|---|
| /groups/ | Yes (nightly, 30-day offsite) | Primary storage for scientific data |
| /nrs/ | No | Cheaper tier for computationally reproducible data |
| /scratch/$USER/ | No | Node-local SSD, ~25GB/slot, clean up after job |
| /tmp/ | No | Do not use; use /scratch/ instead |
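Since /scratch/ is node-local and not cleaned automatically, a job script can guarantee cleanup with a trap. A sketch; mktemp stands in for the scratch directory a real job would use (e.g. something under /scratch/$USER/):

```shell
# Guarantee scratch cleanup when the job script exits normally.
# A real job would use a directory under /scratch/$USER/; mktemp stands in here.
WORKDIR=$(mktemp -d)
trap 'rm -rf "$WORKDIR"' EXIT   # cleanup runs when the script exits
echo "intermediate data" > "$WORKDIR/part1"
# ... compute, then copy results to /groups or /nrs before exiting ...
```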
Use dtn.int.janelia.org for copying data to/from nearline storage.
Singularity needs the --nv flag for GPU access: singularity exec --nv -B /groups -B /nrs -B /scratch image.sif command
CUDA is installed in /usr/local/ on all compute nodes (default at /usr/local/cuda); versioned installs live at /usr/local/cuda-11 and /usr/local/cuda-12, or use module load cuda-<version>.