Find and use existing GPU-powered computational biology tools for the Whitehead fry cluster, or scaffold a new one. Each tool is a single-purpose Python package that wraps one model or library, takes a YAML config, and produces one kind of output (embeddings, segmentations, predictions) for downstream analysis. Install with conda+uv, run with sbatch.
Install and run any of these on fry right now:
| Tool | What it produces | Wraps | Repo |
|---|---|---|---|
| goudacell | Cell segmentation masks | Cellpose | cheeseman-lab/goudacell |
| emmentalembed | Protein embeddings | ESM | cheeseman-lab/emmentalembed |
Clone and install, substituting the tool's repo name for `TOOL`:

```bash
git clone https://github.com/cheeseman-lab/TOOL.git
cd TOOL
conda create -n TOOL -c conda-forge python=3.11 uv pip -y
conda activate TOOL
uv pip install -e ".[gpu]"
```
Then configure and run:
```bash
cp configs/example_config.yaml my_config.yaml
# Edit my_config.yaml with your paths
sbatch scripts/run.sh my_config.yaml
```
When you need GPU compute for a new task (e.g., structure prediction, variant effect scoring, image classification), scaffold a new tool following the same pattern.
```
TOOL/
├── src/TOOL/
│   ├── __init__.py          # __version__ = "0.1.0"
│   ├── cli.py               # typer CLI entry point
│   └── config.py            # dataclass-backed YAML config
├── configs/
│   └── example_config.yaml  # ship a working example
├── scripts/
│   ├── run.sh               # sbatch GPU job script
│   └── jupyter_gpu.sh       # interactive GPU notebook
├── tests/
│   ├── conftest.py
│   ├── test_install.py      # smoke tests
│   └── test_config.py       # config validation
├── pyproject.toml
├── README.md
└── CLAUDE.md
```
pyproject.toml:

```toml
[build-system]
requires = ["setuptools>=61.0"]
build-backend = "setuptools.build_meta"

[project]
name = "TOOL"
version = "0.1.0"
description = "One-line: what output does this produce?"
readme = "README.md"
requires-python = ">=3.10"
license = { text = "MIT" }
dependencies = [
    "pyyaml>=6.0",
    "typer>=0.9.0",
    "rich>=13.0.0",
]

[project.optional-dependencies]
gpu = [
    "torch==2.7.0",
    # The model/library this tool wraps
]
dev = [
    "pytest>=8.0.0",
    "ruff>=0.4.0",
]

[project.scripts]
TOOL = "TOOL.cli:app"

[tool.setuptools]
package-dir = {"" = "src"}
packages = ["TOOL"]

[tool.ruff]
line-length = 100

[tool.ruff.lint]
select = ["E", "F", "I", "D"]
pydocstyle = { convention = "google" }

[tool.ruff.lint.per-file-ignores]
"tests/*.py" = ["D100", "D103"]

[tool.pytest.ini_options]
testpaths = ["tests"]
```
Key decisions:

- `[gpu]` extra — CPU-only install must work for testing/dev
- `>=` floor pins for most deps; exact `==` pins only for torch or model-specific libs
- `src/` layout — prevents accidental imports from the repo root

src/TOOL/cli.py:
```python
import typer
from rich.console import Console

app = typer.Typer(help="TOOL — one-line description")
console = Console()


@app.command()
def run(config: str = typer.Argument(..., help="Path to config YAML")):
    """Run the tool."""
    from TOOL.config import Config

    cfg = Config.from_yaml(config)
    # ... load model, process input, write output
    console.print("[green]Done![/green]")


@app.command()
def version():
    """Show version and environment info."""
    from TOOL import __version__

    console.print(f"TOOL v{__version__}")


if __name__ == "__main__":
    app()
```

The imports are deferred into the command bodies on purpose: `TOOL --help` and `TOOL version` stay fast even when the GPU stack is slow to import.
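If typer is installed, the two-command pattern above can be exercised in-process with typer's test runner; the command bodies here are placeholders, not the real implementations:

```python
import typer
from typer.testing import CliRunner

app = typer.Typer(help="Demo of the TOOL CLI pattern")


@app.command()
def run(config: str = typer.Argument(..., help="Path to config YAML")):
    """Placeholder run command."""
    typer.echo(f"would load {config}")


@app.command()
def version():
    """Placeholder version command."""
    typer.echo("TOOL v0.1.0")


# Invoke the app without a subprocess or an installed entry point
runner = CliRunner()
result = runner.invoke(app, ["version"])
assert result.exit_code == 0
assert "TOOL v0.1.0" in result.output
```

Registering `run` and `version` as separate `@app.command()`s is what turns the entry point into a subcommand-style CLI (`TOOL run my_config.yaml`, `TOOL version`).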
src/TOOL/config.py:

```python
from dataclasses import dataclass, asdict

import yaml


@dataclass
class Config:
    """Tool configuration."""

    input_dir: str = "."
    output_dir: str = "./output"
    gpu: bool = False
    # ... tool-specific params (model name, batch size, etc.)

    @classmethod
    def from_yaml(cls, path: str) -> "Config":
        with open(path) as f:
            data = yaml.safe_load(f)
        return cls(**data)

    def to_yaml(self, path: str) -> None:
        with open(path, "w") as f:
            yaml.dump(asdict(self), f, default_flow_style=False)
```
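A matching `configs/example_config.yaml` is just the `Config` fields as top-level keys; the paths below are placeholders, and each tool adds its own parameters:

```yaml
input_dir: /path/to/input
output_dir: /path/to/output
gpu: true
```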
scripts/run.sh — GPU batch job:
```bash
#!/bin/bash
#SBATCH --job-name=TOOL
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4
#SBATCH --mem=32gb
#SBATCH --time=04:00:00
#SBATCH --partition=YOUR_GPU_PARTITION
#SBATCH --gres=gpu:1
#SBATCH --output=TOOL-%j.out
# Usage: sbatch scripts/run.sh /path/to/config.yaml

set -e

if [ -z "$1" ]; then
    echo "Usage: sbatch scripts/run.sh /path/to/config.yaml"
    exit 1
fi

CONFIG_PATH="$(realpath "$1")"
CONFIG_DIR="$(dirname "$CONFIG_PATH")"
cd "$CONFIG_DIR"

source ~/.bashrc
conda activate TOOL

echo "================================================"
echo "TOOL — $(date)"
echo "Host: $(hostname)"
echo "Config: ${CONFIG_PATH}"
echo "GPU: $(nvidia-smi --query-gpu=name --format=csv,noheader 2>/dev/null || echo 'none')"
echo "================================================"

TOOL run "${CONFIG_PATH}"
echo "Completed: $(date)"
```
scripts/jupyter_gpu.sh — interactive GPU notebook. Once the job starts, the node hostname and chosen port appear in the `TOOL_jupyter-<jobid>.out` log; open an SSH tunnel to that node and port to reach the notebook:
```bash
#!/bin/bash
#SBATCH --job-name=TOOL_jupyter
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4
#SBATCH --mem=32gb
#SBATCH --time=04:00:00
#SBATCH --partition=YOUR_GPU_PARTITION
#SBATCH --gres=gpu:1
#SBATCH --output=TOOL_jupyter-%j.out

source ~/.bashrc
conda activate TOOL
unset XDG_RUNTIME_DIR

NOTEBOOK_DIR="${SLURM_SUBMIT_DIR:-$(pwd)}"
jupyter-lab \
    --no-browser \
    --port-retries=0 \
    --ip=0.0.0.0 \
    --port=$(shuf -i 8900-10000 -n 1) \
    --notebook-dir="${NOTEBOOK_DIR}"
```
tests/test_install.py:
"""Smoke tests: does the package install and import correctly?"""
def test_import():
import TOOL
def test_version():
from TOOL import __version__
parts = __version__.split(".")
assert len(parts) == 3
def test_cli_entry_point():
import subprocess
result = subprocess.run(["TOOL", "--help"], capture_output=True, text=True)
assert result.returncode == 0
def test_dependencies_importable():
import yaml
import typer
import rich
tests/test_config.py:
```python
from TOOL.config import Config


def test_load_example_config(example_config):
    cfg = Config.from_yaml(str(example_config))
    assert cfg.input_dir is not None


def test_config_roundtrip(tmp_path):
    cfg = Config()
    path = tmp_path / "test_config.yaml"
    cfg.to_yaml(str(path))
    cfg2 = Config.from_yaml(str(path))
    assert cfg == cfg2
```
Required for a finished tool:

- a README.md with install steps and an sbatch example
- `TOOL` replaced with the actual tool name everywhere
- `cli.py` and `config.py` implemented
- a working `example_config.yaml`
- passing tests: `uv pip install -e ".[dev]"`, then `pytest tests/ -v`
- a `[gpu]` extra that adds torch + model deps
- a cluster run path: `sbatch scripts/run.sh config.yaml`
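Assembled from the commands above, a minimal `README.md` skeleton for a new tool might look like this (the headings are a suggestion, not a fixed template):

```markdown
# TOOL

One line: what output does this produce?

## Install

    conda create -n TOOL -c conda-forge python=3.11 uv pip -y
    conda activate TOOL
    uv pip install -e ".[gpu]"   # [gpu] adds torch + the wrapped model

## Run

    cp configs/example_config.yaml my_config.yaml
    # edit my_config.yaml with your paths
    sbatch scripts/run.sh my_config.yaml

## Develop

    uv pip install -e ".[dev]"
    pytest tests/ -v
```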