Name: Build And Dependency
Author: NVIDIA



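# Derive the registry host from your 'gitlab' remote
# (the sed pattern assumes an SSH-style URL of the form user@host:path):
GITLAB_HOST=$(git remote get-url gitlab | sed 's/.*@\(.*\):.*/\1/')

# Pull the CI dev image from that host
docker pull "${GITLAB_HOST}/adlr/megatron-lm/mcore_ci_dev:main"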
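
The `sed` expression above only matches SCP-style SSH remotes (`user@host:path`); given an HTTPS remote it passes the whole URL through unchanged. A minimal sketch of a more tolerant helper follows. The function name `derive_host` and the host `gitlab.example.com` are hypothetical and not part of the repository.

```shell
# Hypothetical helper: extract the host from a git remote URL.
# Handles scheme://[user@]host[:port]/path and SCP-style user@host:path.
derive_host() {
  url=$1
  case "$url" in
    *://*)
      rest=${url#*://}    # drop the scheme
      rest=${rest%%/*}    # drop the path
      rest=${rest##*@}    # drop an optional user@
      printf '%s\n' "${rest%%:*}"   # drop an optional :port
      ;;
    *@*:*)
      rest=${url#*@}      # drop user@
      printf '%s\n' "${rest%%:*}"   # keep everything before the first ':'
      ;;
    *)
      echo "unrecognized remote URL: $url" >&2
      return 1
      ;;
  esac
}

derive_host 'git@gitlab.example.com:adlr/megatron-lm.git'      # prints gitlab.example.com
derive_host 'https://gitlab.example.com/adlr/megatron-lm.git'  # prints gitlab.example.com
```

With this in place, `GITLAB_HOST=$(derive_host "$(git remote get-url gitlab)")` would be a drop-in replacement for the pipeline above.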

# dev image (default)
docker build \
  --target main \
  --build-arg FROM_IMAGE_NAME=$(cat docker/.ngc_version.dev) \
  --build-arg IMAGE_TYPE=dev \
  -f docker/Dockerfile.ci.dev \
  -t megatron-lm:local .

# lts image
docker build \
  --target main \
  --build-arg FROM_IMAGE_NAME=$(cat docker/.ngc_version.lts) \
  --build-arg IMAGE_TYPE=lts \
  -f docker/Dockerfile.ci.dev \
  -t megatron-lm:local-lts .
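
The two builds above differ only in the image type, the version file, and the tag. As a sketch, a hypothetical helper `build_image_cmd` (not part of the repository) can print the matching command so it can be inspected before it is run:

```shell
# Hypothetical helper: print the docker build command for an image type.
# It only prints; nothing is built until the output is executed.
build_image_cmd() {
  image_type=${1:-dev}
  case "$image_type" in
    dev) tag=megatron-lm:local ;;
    lts) tag=megatron-lm:local-lts ;;
    *)
      echo "unknown image type: $image_type (expected dev or lts)" >&2
      return 1
      ;;
  esac
  # The $(cat ...) is emitted literally and expands when the command is run.
  printf 'docker build --target main --build-arg FROM_IMAGE_NAME=$(cat %s) --build-arg IMAGE_TYPE=%s -f docker/Dockerfile.ci.dev -t %s .\n' \
    "docker/.ngc_version.$image_type" "$image_type" "$tag"
}

build_image_cmd lts
```

To run the printed command from the repository root, one option is `eval "$(build_image_cmd lts)"`.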

# Run a command in the locally built image with Docker
docker run --rm --gpus all \
  -v "$(pwd)":/workspace \
  -w /workspace \
  megatron-lm:local \
  bash -c "<your command>"

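# Or open an interactive shell on a Slurm cluster
# (the --container-* flags are provided by the Pyxis plugin)
srun \
  --nodes=1 --gpus-per-node=8 \
  --container-image megatron-lm:local \
  --container-mounts "$(pwd)":/workspace \
  --container-workdir /workspace \
  --pty bash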

# Alternatively, export the local Docker image to a squashfs file with enroot
# and pass that file to srun
enroot import -o megatron-lm.sqsh dockerd://megatron-lm:local
srun \
  --nodes=1 --gpus-per-node=8 \
  --container-image "$(pwd)/megatron-lm.sqsh" \
  --container-mounts "$(pwd)":/workspace \
  --container-workdir /workspace \
  --pty bash
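
The two `srun` variants above differ only in what is passed to `--container-image`. A small sketch, assuming a hypothetical helper named `container_image_arg`, picks the squashfs file when it exists and falls back to the image name otherwise:

```shell
# Hypothetical helper: choose the value for srun --container-image.
# Prefers an exported squashfs file; falls back to the local image name.
container_image_arg() {
  sqsh=${1:-$PWD/megatron-lm.sqsh}
  if [ -f "$sqsh" ]; then
    printf '%s\n' "$sqsh"
  else
    printf '%s\n' "megatron-lm:local"
  fi
}

# Example use (not executed here):
#   srun --nodes=1 --gpus-per-node=8 \
#     --container-image "$(container_image_arg)" \
#     --container-mounts "$(pwd)":/workspace \
#     --container-workdir /workspace --pty bash
```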

| Group | Purpose |
|---|---|
| `training` | Runtime training extras |
| `dev` | Full dev environment (TransformerEngine, ModelOpt, …) |
| `lts` | LTS-safe subset (no ModelOpt) |
| `test` | pytest, coverage, nemo-run |
| `linting` | ruff, black, isort, pylint |
| `build` | Cython, pybind11, nvidia-mathdx |

# Full dev + test environment
uv sync --locked --group dev --group test

# Linting only
uv sync --locked --only-group linting

# LTS environment
uv sync --locked --group lts --group test
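
The commands above use two different flags: `--group` adds a dependency group on top of the project's own dependencies, while `--only-group` installs just that group and omits the project. The sketch below makes that choice explicit. The helper name `uv_sync_cmd` and its `with`/`only` keywords are hypothetical; it only prints the command.

```shell
# Hypothetical helper: print a uv sync command for a set of groups.
#   with GROUP...  -> project dependencies plus the named groups (--group)
#   only GROUP...  -> just the named groups, project omitted (--only-group)
uv_sync_cmd() {
  if [ "$#" -lt 2 ]; then
    echo "usage: uv_sync_cmd with|only GROUP..." >&2
    return 1
  fi
  mode=$1
  shift
  case "$mode" in
    with) flag=--group ;;
    only) flag=--only-group ;;
    *)
      echo "usage: uv_sync_cmd with|only GROUP..." >&2
      return 1
      ;;
  esac
  cmd="uv sync --locked"
  for group in "$@"; do
    cmd="$cmd $flag $group"
  done
  printf '%s\n' "$cmd"
}

uv_sync_cmd with dev test   # prints: uv sync --locked --group dev --group test
uv_sync_cmd only linting    # prints: uv sync --locked --only-group linting
```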

1. Acquire a container image (see Step 1 above).
2. Launch the container interactively (see Step 2 above).
3. Update the lock file inside the container, then commit it:

# Inside the container:
uv add <package>          # adds to pyproject.toml and resolves
uv lock                   # regenerates uv.lock
# Exit the container, then on the host:
git add pyproject.toml uv.lock
git commit -S -s -m "build: add <package> dependency"
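
Because `pyproject.toml` and `uv.lock` are meant to be committed together, a quick guard before committing can catch a forgotten lock file. This is a sketch only: the function name `check_lock_staged` is hypothetical, and it assumes any staged `pyproject.toml` change should be accompanied by `uv.lock`, which is stricter than necessary for edits that do not touch dependencies.

```shell
# Hypothetical guard: fail when pyproject.toml is staged without uv.lock.
# $1 is a newline-separated list of staged paths.
check_lock_staged() {
  staged=$1
  if printf '%s\n' "$staged" | grep -qx 'pyproject\.toml'; then
    if ! printf '%s\n' "$staged" | grep -qx 'uv\.lock'; then
      echo "pyproject.toml is staged but uv.lock is not; run 'uv lock' inside the container" >&2
      return 1
    fi
  fi
  return 0
}

# Example use on the host, before 'git commit' (not executed here):
#   check_lock_staged "$(git diff --cached --name-only)"
```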

# Resolving a merge conflict in uv.lock: do not merge the file by hand.
# On the host:
git checkout origin/main -- uv.lock   # take main's version as the base
# then inside the container:
uv lock                               # re-resolve on top of your pyproject.toml changes
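
After re-locking, it can be worth confirming that no conflict markers were left behind before staging the file. A minimal sketch, assuming a hypothetical helper named `has_conflict_markers`:

```shell
# Hypothetical check: succeed if the file still contains git conflict markers.
has_conflict_markers() {
  grep -Eq '^(<<<<<<<|=======|>>>>>>>)( |$)' "$1"
}

# Example use (not executed here):
#   if has_conflict_markers uv.lock; then
#     echo "uv.lock still has conflict markers" >&2
#   fi
```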

# Check mode (no changes applied)
BASE_REF=main CHECK_ONLY=true SKIP_DOCS=false bash tools/autoformat.sh

# Fix mode
BASE_REF=main CHECK_ONLY=false bash tools/autoformat.sh

| Problem | Cause | Fix |
|---|---|---|
| `uv sync --locked` fails | Dependency conflict or stale `uv.lock` | Re-run `uv lock` inside the container and commit the updated lock file |
| `ModuleNotFoundError` after pip install | pip installed outside the uv-managed venv | Use `uv add` and `uv sync`, never bare `pip install` |
| `uv: command not found` inside container | Wrong container image | Use the `megatron-lm` image built from `Dockerfile.ci.dev` |
| `No space left on device` during uv ops | Cache fills the container's `/root/.cache/` | Mount a host cache dir via `-v $HOME/.cache/uv:/root/.cache/uv` |
| Pre-commit fails with linting errors | Code style violations | Run `BASE_REF=main CHECK_ONLY=false bash tools/autoformat.sh` |
| `docker build` fails with secret-related error | `Dockerfile.ci.dev` has a `jet` stage that requires an internal secret | Add `--target main` to stop before the `jet` stage |
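
For the cache pitfall, the sketch below prepares the host directory and prints the bind-mount value. The helper name `uv_cache_mount` is hypothetical. Creating the directory first matters because Docker creates a missing `-v` source directory itself, owned by root.

```shell
# Hypothetical helper: ensure the host uv cache dir exists and print the
# value for docker's -v flag.
uv_cache_mount() {
  host_cache=${1:-$HOME/.cache/uv}
  mkdir -p "$host_cache" || return 1
  printf '%s\n' "$host_cache:/root/.cache/uv"
}

# Example use (not executed here):
#   docker run --rm --gpus all \
#     -v "$(pwd)":/workspace \
#     -v "$(uv_cache_mount)" \
#     -w /workspace \
#     megatron-lm:local \
#     bash -c "uv sync --locked --group dev --group test"
```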

Build And Dependency

Build & Dependency Guide

Why Containers

Step 1 — Acquire an Image

Step 2 — Launch the Container

Dependency Management

uv Dependency Groups

Adding a New Dependency

Resolving a merge conflict in uv.lock

Linting

Common Pitfalls
