Calculate the entropy of atomic-structure descriptors and select maximally diverse subsets for active learning using the QUESTS method.

The QUESTS method computes entropy-based descriptors for atomic structures and uses them to select maximally diverse subsets from a candidate pool. This is the primary tool for active-learning data curation: choosing which structures to label with DFT or ML potentials so as to maximally expand configuration-space coverage.
For selecting the most diverse structures from a pool, always use `active_learning.py` (described first below). It implements an iterative greedy entropy-maximisation algorithm with GPU acceleration and is well suited to filtering structure pools.

The `quests` CLI (available in the `deepmd` conda environment) provides complementary utilities — computing dataset entropy, scoring candidates, exporting descriptors, etc. — but is not used for pool-based structure selection. Check its commands with:
```shell
conda run -n deepmd quests --help
# or, if the environment is already active:
quests --help
```
## active_learning.py

Use this script to select the most diverse structures from a candidate pool. It implements an iterative greedy entropy-maximisation algorithm and automatically uses the GPU (CUDA via PyTorch) when available, falling back to the CPU otherwise.
```shell
python skills/quests/active_learning.py filter-by-entropy <iter_confs> [options]
```
The script prints a JSON object to stdout and exits 0 on success or 1 on error. Always parse the JSON to retrieve the output file path and confirm that `"status"` is `"success"`.
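The greedy loop can be illustrated in a few lines of NumPy: at each step, add the candidate whose inclusion most increases the dataset entropy. This is a toy sketch with random stand-in descriptors and a Gaussian kernel, not the script's actual implementation (which batches `--chunk-size` structures per iteration and runs on GPU):

```python
import numpy as np

def entropy(X, h=0.015):
    """Kernel estimate of dataset entropy: H = -mean_i log mean_j K(x_i, x_j)."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)  # pairwise squared distances
    K = np.exp(-d2 / (2 * h ** 2))
    return float(-np.log(K.mean(axis=1)).mean())

def greedy_select(pool, reference, max_sel, h=0.015):
    """Iteratively pick the pool item that most increases total entropy."""
    selected = []
    current = reference.copy()
    for _ in range(max_sel):
        gains = []
        for i, x in enumerate(pool):
            if i in selected:
                gains.append(-np.inf)          # never pick the same item twice
                continue
            gains.append(entropy(np.vstack([current, x[None]]), h))
        best = int(np.argmax(gains))
        selected.append(best)
        current = np.vstack([current, pool[best][None]])
    return selected

rng = np.random.default_rng(0)
pool = rng.normal(size=(20, 4))        # stand-in per-structure descriptors
reference = rng.normal(size=(5, 4))    # stand-in for the existing dataset
picked = greedy_select(pool, reference, max_sel=3)
print(picked)
```

The sketch adds one structure per step for simplicity; the script adds `--chunk-size` structures per iteration.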
### filter-by-entropy arguments
| Flag | Type | Default | Description |
|---|---|---|---|
| `iter_confs` | paths | required | One or more candidate structure files (positional, space-separated, any ASE-readable format) |
| `--reference` | paths | `[]` | Reference structure files already in the dataset (used to compute the baseline entropy; excluded from selection) |
| `--chunk-size` | int | 10 | Structures added per iteration |
| `--k` | int | 32 | Number of nearest neighbours for the descriptor |
| `--cutoff` | float | 5.0 | Cutoff radius in Å |
| `--batch-size` | int | 1000 | Batch size for entropy computation |
| `--h` | float | 0.015 | Bandwidth parameter h |
| `--max-sel` | int | 50 | Maximum number of structures to select |
JSON response fields: `status`, `message`, `selected_atoms` (path to the output extxyz file), and `entropy` (a per-iteration dict with keys `iter_00`, `iter_01`, … plus `num_confs`).
### Examples
```shell
# Select up to 50 diverse structures from a pool
python skills/quests/active_learning.py filter-by-entropy candidates.extxyz --max-sel 50

# Select relative to an existing reference/training dataset
python skills/quests/active_learning.py filter-by-entropy \
    new_structures.extxyz --reference training_set.extxyz --max-sel 100

# Multiple candidate files, tighter bandwidth
python skills/quests/active_learning.py filter-by-entropy \
    pool1.extxyz pool2.extxyz --h 0.01 --cutoff 6.0 --max-sel 200

# Parse the output path from the JSON response
result=$(python skills/quests/active_learning.py filter-by-entropy candidates.extxyz --max-sel 50)
echo "$result" | python -c "import sys,json; d=json.load(sys.stdin); print(d['selected_atoms'])"
```
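From a Python driver, the same call-and-parse pattern can be wrapped in a helper. `filter_by_entropy` below is a hypothetical convenience wrapper, not part of the script itself; the sample JSON string mirrors the documented response fields, with an illustrative output path:

```python
import json
import subprocess

def filter_by_entropy(candidates, max_sel=50,
                      script="skills/quests/active_learning.py"):
    """Run filter-by-entropy and return the path to the selected structures."""
    cmd = ["python", script, "filter-by-entropy", *candidates,
           "--max-sel", str(max_sel)]
    proc = subprocess.run(cmd, capture_output=True, text=True)
    result = json.loads(proc.stdout)
    if proc.returncode != 0 or result.get("status") != "success":
        raise RuntimeError(result.get("message", "selection failed"))
    return result["selected_atoms"]

# The same parsing applies to any captured response string (illustrative values):
sample = '{"status": "success", "message": "ok", "selected_atoms": "selected.extxyz"}'
parsed = json.loads(sample)
assert parsed["status"] == "success"
print(parsed["selected_atoms"])
```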
## quests CLI reference

The `quests` CLI (in the `deepmd` conda environment) provides complementary utilities for analysing and scoring datasets.
All commands follow the pattern:

```shell
quests <subcommand> [OPTIONS] <positional args>
```

Use `quests <subcommand> --help` to see the full options for any command.
| Command | Purpose |
|---|---|
| `active_learning` | Iterative active-learning loop: sample, score, and select new structures |
| `entropy` | Compute the entropy of a structure dataset |
| `dH` | Compute the per-frame entropy gain (ΔH) of a test set relative to a reference |
| `approx_dH` | Fast approximate ΔH using a graph-based nearest-neighbour index |
| `make_descriptors` | Compute and export per-atom QUESTS descriptors |
| `bandwidth` | Estimate a good bandwidth h from the mean atomic volume |
| `compress` | Compress a dataset |
| `entropy_sampler` | Sample structures by entropy |
| `learning_curve` | Compute learning-curve statistics |
| `mcmc` | Monte Carlo structure generation |
| `overlap` | Compute the overlap between two datasets |
### quests entropy — dataset entropy

Compute the scalar entropy H of a structure file.

```shell
quests entropy [OPTIONS] FILE
```
| Option | Default | Description |
|---|---|---|
| `-c, --cutoff` | 5.0 | Neighbour-list cutoff (Å) |
| `-k, --nbrs` | 32 | Number of neighbours for the descriptor |
| `-b, --bandwidth` | 0.015 | Kernel bandwidth h |
| `-j, --jobs` | all | Number of parallel jobs |
| `--batch_size` | 20000 | Distance batch size |
| `-o, --output` | — | Path to the JSON output file |
| `--overwrite` | — | Overwrite existing output |
```shell
quests entropy dataset.extxyz -o entropy.json
```
### quests dH — entropy gain ΔH

Compute the per-frame entropy contribution of TEST structures relative to a REFERENCE dataset; the structures with the highest ΔH are the most novel.

```shell
quests dH [OPTIONS] TEST REFERENCE
```
| Option | Default | Description |
|---|---|---|
| `-c, --cutoff` | 5.0 | Neighbour-list cutoff (Å) |
| `-k, --nbrs` | 32 | Number of neighbours for the descriptor |
| `-b, --bandwidth` | 0.015 | Kernel bandwidth h |
| `-j, --jobs` | all | Number of parallel jobs |
| `--batch_size` | 20000 | Distance batch size |
| `-o, --output` | — | Path to the JSON output file |
| `--overwrite` | — | Overwrite existing output |
```shell
# Score new candidates against the existing training set
quests dH candidates.extxyz training_set.extxyz -o scores.json
```
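Once `scores.json` is written, the highest-ΔH frames are the ones worth labelling. The sketch below ranks hypothetical per-frame ΔH values with NumPy; check the actual field layout of `scores.json` on a small run first, since its schema is not documented here:

```python
import numpy as np

# Hypothetical per-frame ΔH values; the real ones come from scores.json
# (inspect its schema before relying on any field names).
dH = np.array([0.02, 0.31, 0.05, 0.27, 0.11])

# Rank frames by novelty: highest entropy gain first
order = np.argsort(dH)[::-1]
print(order[:3].tolist())  # → [1, 3, 4]: indices of the three most novel frames
```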
### quests approx_dH — fast approximate ΔH

Same as `dH`, but uses a graph-based approximate nearest-neighbour index for speed.

```shell
quests approx_dH [OPTIONS] TEST REFERENCE
```
Additional options relative to `dH`:
| Option | Default | Description |
|---|---|---|
| `-n, --uq_nbrs` | 3 | Neighbours for the UQ descriptor |
| `-g, --graph_nbrs` | 10 | Neighbours for the graph index |
```shell
quests approx_dH candidates.extxyz training_set.extxyz -o scores.json
```
### quests active_learning — iterative selection loop

Run a full active-learning loop: generate new candidate structures via MCMC and select the most informative ones relative to a reference dataset.

```shell
quests active_learning [OPTIONS] REFERENCE
```
| Option | Default | Description |
|---|---|---|
| `-s, --structures` | — | Number of structures to sample from the reference |
| `-n, --n_steps` | 1000 | Monte Carlo steps |
| `-t, --target` | 30 | Target ΔH for new-structure generation |
| `-g, --generations` | 10 | Number of active-learning generations |
| `-c, --cutoff` | 5.0 | Neighbour-list cutoff (Å) |
| `-k, --nbrs` | 32 | Number of neighbours for the descriptor |
| `-b, --bandwidth` | 0.015 | Kernel bandwidth h |
| `-j, --jobs` | all | Number of parallel jobs |
| `--batch_size` | 20000 | Distance batch size |
| `-o, --output` | — | Path to the JSON output file |
| `--overwrite` | — | Overwrite existing output |
| `--full` | — | Output the union of the original and new datasets |
```shell
quests active_learning training_set.extxyz -g 5 -t 20 -o new_structures.json
```
### quests make_descriptors — export descriptors

Compute and save per-atom QUESTS descriptors for a structure file.

```shell
quests make_descriptors [OPTIONS] FILE
```
| Option | Default | Description |
|---|---|---|
| `-c, --cutoff` | 5.0 | Neighbour-list cutoff (Å) |
| `-k, --nbrs` | 32 | Number of neighbours |
| `-j, --jobs` | all | Number of parallel jobs |
| `-r, --reshape` | — | Reshape to (n_frames, n_atoms, d); requires a uniform atom count |
| `-o, --output` | — | Output file path |
```shell
quests make_descriptors dataset.extxyz -o descriptors.npy
```
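The exported `.npy` file can be loaded directly with NumPy. The sketch below round-trips a stand-in array shaped like the `--reshape` output, `(n_frames, n_atoms, d)`; the real descriptor dimension `d` depends on `-k` and the cutoff:

```python
import numpy as np

# Stand-in for a file written by make_descriptors with --reshape;
# the descriptor dimension (33 here) is arbitrary for illustration.
X = np.random.default_rng(0).normal(size=(4, 8, 33))
np.save("descriptors.npy", X)

loaded = np.load("descriptors.npy")
n_frames, n_atoms, d = loaded.shape
flat = loaded.reshape(-1, d)   # per-atom view: (n_frames * n_atoms, d)
print(flat.shape)              # → (32, 33)
```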
### quests bandwidth — estimate bandwidth

Estimate a suitable bandwidth h from the mean atomic volume of a dataset.

```shell
quests bandwidth [OPTIONS] ATOMIC_VOLUME
```
| Option | Description |
|---|---|
| `-c, --cutoff` | Use a cutoff function instead of a Gaussian fit |
```shell
quests bandwidth 20.5   # pass the mean atomic volume in ų
```
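The `ATOMIC_VOLUME` argument is the mean volume per atom in ų, i.e. cell volume divided by atom count, averaged over frames. A minimal sketch with stand-in cells (in practice, read the cells from your dataset, e.g. with ASE):

```python
import numpy as np

# Stand-in cubic cells with 4 atoms each (e.g. conventional fcc cells);
# real cells would come from ase.io.read(path, index=":").
cells = [np.diag([4.05, 4.05, 4.05]), np.diag([4.10, 4.10, 4.10])]
natoms = [4, 4]

# Mean atomic volume = |det(cell)| / n_atoms, averaged over frames
vols = [abs(np.linalg.det(c)) / n for c, n in zip(cells, natoms)]
mean_vol = float(np.mean(vols))
print(round(mean_vol, 2))  # → 16.92 ų per atom, the argument to `quests bandwidth`
```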
## Parameter guidance

| Parameter | Typical range | Effect |
|---|---|---|
| `--h` / `-b` (bandwidth) | 0.005 – 0.05 | Smaller → finer discrimination; larger → broader diversity measure. Use `quests bandwidth` to estimate. |
| `--cutoff` / `-c` | 4.0 – 8.0 Å | Larger → more environment context, slower |
| `--k` / `-k` (nbrs) | 16 – 64 | More neighbours → richer descriptor, slower |
| `--max-sel` | — | Hard cap on the number of selected structures; set this to your labelling budget |
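The effect of the bandwidth can be seen in a toy NumPy example. With a Gaussian kernel of the same general form as kernel entropy estimators (the exact kernel used by `quests` may differ), a smaller `h` makes environments look more distinct from one another, so the entropy estimate rises:

```python
import numpy as np

def entropy(X, h):
    """Kernel entropy estimate at bandwidth h."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    K = np.exp(-d2 / (2 * h ** 2))
    return float(-np.log(K.mean(axis=1)).mean())

# Stand-in descriptors for 50 atomic environments
X = np.random.default_rng(1).normal(scale=0.05, size=(50, 8))
for h in (0.005, 0.015, 0.05):
    print(h, round(entropy(X, h), 3))   # entropy decreases as h grows
```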