Name: Home Security Ai Benchmark
Author: SharpAI



# LLM-only (VLM tests skipped)
node scripts/run-benchmark.cjs

# With VLM tests (base URL without /v1 suffix)
node scripts/run-benchmark.cjs --vlm http://localhost:5405

# Custom LLM gateway
node scripts/run-benchmark.cjs --gateway http://localhost:5407

# Skip report auto-open
node scripts/run-benchmark.cjs --no-open

| Variable | Default | Description |
| --- | --- | --- |
| `AEGIS_GATEWAY_URL` | `http://localhost:5407` | LLM gateway (OpenAI-compatible) |
| `AEGIS_LLM_URL` | — | Direct llama-server LLM endpoint |
| `AEGIS_LLM_API_TYPE` | `openai` | LLM provider type (builtin, openai, etc.) |
| `AEGIS_LLM_MODEL` | — | LLM model name |
| `AEGIS_LLM_API_KEY` | — | API key for cloud LLM providers |
| `AEGIS_LLM_BASE_URL` | — | Cloud provider base URL (e.g. `https://api.openai.com/v1`) |
| `AEGIS_VLM_URL` | (disabled) | VLM server base URL |
| `AEGIS_VLM_MODEL` | — | Loaded VLM model ID |
| `AEGIS_SKILL_ID` | — | Skill identifier (enables skill mode) |
| `AEGIS_SKILL_PARAMS` | `{}` | JSON params from skill config |

| Argument | Default | Description |
| --- | --- | --- |
| `--gateway URL` | `http://localhost:5407` | LLM gateway |
| `--vlm URL` | (disabled) | VLM server base URL |
| `--out DIR` | `~/.aegis-ai/benchmarks` | Results directory |
| `--report` | (auto in skill mode) | Force report generation |
| `--no-open` | — | Don't auto-open report in browser |

AEGIS_GATEWAY_URL=http://localhost:5407
AEGIS_VLM_URL=http://localhost:5405
AEGIS_SKILL_ID=home-security-benchmark
AEGIS_SKILL_PARAMS={}
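
The environment contract above could be read into a single config object along the following lines. This is a minimal sketch, not the benchmark's actual code: `readSkillEnv` is a hypothetical helper, the defaults are copied from the environment variable table, and falling back to `{}` on malformed `AEGIS_SKILL_PARAMS` is an assumption.

```javascript
// Sketch: collect the Aegis-provided environment into one config object.
// readSkillEnv is a hypothetical name; defaults mirror the documented table.
function readSkillEnv(env) {
  const skillId = env.AEGIS_SKILL_ID || null;

  let params = {};
  try {
    params = JSON.parse(env.AEGIS_SKILL_PARAMS || "{}");
  } catch (err) {
    // Assumption: malformed JSON falls back to an empty params object.
    params = {};
  }

  return {
    gatewayUrl: env.AEGIS_GATEWAY_URL || "http://localhost:5407",
    vlmUrl: env.AEGIS_VLM_URL || null, // null: VLM is disabled, VLM tests skipped
    skillId,
    skillMode: skillId !== null, // a set AEGIS_SKILL_ID enables skill mode
    params,
  };
}

const config = readSkillEnv({
  AEGIS_GATEWAY_URL: "http://localhost:5407",
  AEGIS_VLM_URL: "http://localhost:5405",
  AEGIS_SKILL_ID: "home-security-benchmark",
  AEGIS_SKILL_PARAMS: "{}",
});
console.log(config);
```

In a real script the argument would be `process.env`; passing a plain object here keeps the sketch easy to test.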

{"event": "ready", "model": "Qwen3.5-4B-Q4_1", "system": "Apple M3"}
{"event": "suite_start", "suite": "Context Preprocessing"}
{"event": "test_result", "suite": "...", "test": "...", "status": "pass", "timeMs": 123}
{"event": "suite_end", "suite": "...", "passed": 4, "failed": 0}
{"event": "complete", "passed": 126, "total": 131, "timeMs": 322000, "reportPath": "/path/to/report.html"}
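
A consumer of this stream might parse it one line at a time and keep a running tally. The sketch below assumes only the event names and fields shown in the samples above; `summarizeEvents` is a hypothetical helper, the test name `"example"` is illustrative, and skipping lines that are not valid JSON is an assumption rather than documented behavior.

```javascript
// Sketch: consume the skill's stdout as JSON lines and tally what was seen.
// summarizeEvents is a hypothetical name, not part of the benchmark itself.
function summarizeEvents(stdoutText) {
  const summary = { model: null, suites: [], statusCounts: {}, complete: null };

  for (const line of stdoutText.split("\n")) {
    const trimmed = line.trim();
    if (trimmed === "") continue;

    let event;
    try {
      event = JSON.parse(trimmed);
    } catch (err) {
      continue; // Assumption: non-JSON lines are log noise and can be skipped.
    }

    switch (event.event) {
      case "ready":
        summary.model = event.model;
        break;
      case "test_result":
        // Count results per status value without assuming which values exist.
        summary.statusCounts[event.status] =
          (summary.statusCounts[event.status] || 0) + 1;
        break;
      case "suite_end":
        summary.suites.push({
          suite: event.suite,
          passed: event.passed,
          failed: event.failed,
        });
        break;
      case "complete":
        summary.complete = {
          passed: event.passed,
          total: event.total,
          reportPath: event.reportPath,
        };
        break;
      default:
        break; // suite_start and unknown events need no bookkeeping here
    }
  }
  return summary;
}

// Illustrative input modeled on the sample events.
const sample = [
  '{"event": "ready", "model": "Qwen3.5-4B-Q4_1", "system": "Apple M3"}',
  '{"event": "suite_start", "suite": "Context Preprocessing"}',
  '{"event": "test_result", "suite": "Context Preprocessing", "test": "example", "status": "pass", "timeMs": 123}',
  '{"event": "suite_end", "suite": "Context Preprocessing", "passed": 1, "failed": 0}',
].join("\n");

console.log(summarizeEvents(sample));
```

When reading from a live child process, the same per-line logic would apply to each line emitted on stdout instead of to one buffered string.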

| Suite | Tests | Domain |
| --- | --- | --- |
| Context Preprocessing | 6 | Conversation dedup accuracy |
| Topic Classification | 4 | Topic extraction & change detection |
| Knowledge Distillation | 5 | Fact extraction, slug matching |
| Event Deduplication | 8 | Security event classification |
| Tool Use | 16 | Tool selection & parameter extraction |
| Chat & JSON Compliance | 11 | Persona, memory, structured output |
| Security Classification | 12 | Threat level assessment |
| Narrative Synthesis | 4 | Multi-camera event summarization |
| Prompt Injection Resistance | 4 | Adversarial prompt defense |
| Multi-Turn Reasoning | 4 | Context resolution over turns |
| Error Recovery & Edge Cases | 4 | Graceful failure handling |
| Privacy & Compliance | 3 | PII handling, consent |
| Alert Routing & Subscription | 5 | Channel targeting, schedule CRUD |
| Knowledge Injection to Dialog | 5 | KI-personalized responses |
| VLM-to-Alert Triage | 5 | Urgency classification from VLM |
| VLM Scene Analysis | 47 | Frame entity detection & description (outdoor + indoor safety) |

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| `mode` | select | `llm` | Which suites to run: `llm` (96 tests), `vlm` (47 tests), or `full` (143 tests) |
| `noOpen` | boolean | `false` | Skip auto-opening the HTML report in browser |
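
The per-mode test counts can be cross-checked against the suite table. The sketch below is illustrative only: `SUITES`, `suitesForMode`, and `countTests` are hypothetical names, and tagging every suite except VLM Scene Analysis as `llm` (including VLM-to-Alert Triage) is an assumption inferred from the 96/47 split, not something the source states directly.

```javascript
// Sketch: map the `mode` parameter to suites and sum their test counts.
// Counts are copied from the suite table; the kind tags are an assumption.
const SUITES = [
  { name: "Context Preprocessing", tests: 6, kind: "llm" },
  { name: "Topic Classification", tests: 4, kind: "llm" },
  { name: "Knowledge Distillation", tests: 5, kind: "llm" },
  { name: "Event Deduplication", tests: 8, kind: "llm" },
  { name: "Tool Use", tests: 16, kind: "llm" },
  { name: "Chat & JSON Compliance", tests: 11, kind: "llm" },
  { name: "Security Classification", tests: 12, kind: "llm" },
  { name: "Narrative Synthesis", tests: 4, kind: "llm" },
  { name: "Prompt Injection Resistance", tests: 4, kind: "llm" },
  { name: "Multi-Turn Reasoning", tests: 4, kind: "llm" },
  { name: "Error Recovery & Edge Cases", tests: 4, kind: "llm" },
  { name: "Privacy & Compliance", tests: 3, kind: "llm" },
  { name: "Alert Routing & Subscription", tests: 5, kind: "llm" },
  { name: "Knowledge Injection to Dialog", tests: 5, kind: "llm" },
  { name: "VLM-to-Alert Triage", tests: 5, kind: "llm" },
  { name: "VLM Scene Analysis", tests: 47, kind: "vlm" },
];

// Return the suites selected by a mode value: "llm", "vlm", or "full".
function suitesForMode(mode) {
  if (mode === "full") return SUITES;
  if (mode === "llm" || mode === "vlm") {
    return SUITES.filter((suite) => suite.kind === mode);
  }
  throw new Error(`Unknown mode: ${mode}`);
}

// Sum the test counts of the suites selected by a mode.
function countTests(mode) {
  return suitesForMode(mode).reduce((sum, suite) => sum + suite.tests, 0);
}

for (const mode of ["llm", "vlm", "full"]) {
  console.log(mode, countTests(mode));
}
// prints: llm 96, vlm 47, full 143 (one pair per line)
```

The totals agree with the figures quoted in the `mode` description, which suggests the table and the parameter documentation are in sync.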

Home Security Ai Benchmark

Setup

Verification

Quick Start

As an Aegis Skill (automatic)

Standalone

Configuration

Environment Variables (set by Aegis)

User Configuration (config.yaml)

CLI Arguments (standalone fallback)

Protocol

Aegis → Skill (env vars)

Skill → Aegis (stdout, JSON lines)

Test Suites (143 Tests)

Results

Requirements
