URL-driven website cloning with visual + functional verification
<Use_When>
$visual-verdict for screenshot-only tasks)
</Use_When><Do_Not_Use_When>
$visual-verdict directly<Scope_Limits> v1 scope: Single page clone of the provided URL.
Included:
Excluded:
Legal notice: Only clone sites you own or have explicit permission to replicate. Respect copyright and trademarks. </Scope_Limits>
ToolSearch("browser") or ToolSearch("playwright") to discover available browser tools.Playwright MCP is required. Configure it:
codex mcp add playwright npx "@playwright/mcp@latest"
browser_navigate, browser_snapshot, browser_take_screenshot, browser_evaluate, browser_wait_for. Optional: browser_click, browser_network_requests.
</Prerequisites>
<Tool_Usage>
ToolSearch("browser") or ToolSearch("playwright") to discover deferred Playwright MCP tools.browser_snapshot (accessibility tree) for structural understanding — it is far more token-efficient than screenshots.browser_take_screenshot only when visual verification is needed (Pass 1 baseline, Pass 4 comparison).browser_evaluate for DOM/style extraction — pass the scripts from this skill EXACTLY as written (do not modify them).state_write / state_read for web-clone state persistence between iterations.<State_Management> Persist extraction and progress data so the pipeline can resume if interrupted.
.omx/state/{scope}/web-clone-extraction.json containing:
target_url, extracted_at timestampscreenshot_path (path to target-full.png)landmark_count (number of nav, main, footer, form elements)interactive_count (number of detected interactive elements)extraction_size_kb (approximate size of DOM extraction data).omx/state/{scope}/web-clone-verdicts.json.visual portion of the composite verdict to .omx/state/{scope}/ralph-progress.json for ralph compatibility, mapping visual.score → top-level score and visual.verdict → top-level verdict.completed_at or failed_at timestamp.
</State_Management><Context_Budget> Pass 1 extraction can produce very large data. Apply these limits proactively:
depth parameter from 8 to 4 and re-extract. Focus on top-level structure.isVisible: true).Capture the target page's structure, styles, interactions, and visual baseline.
browser_navigate to target_url.browser_wait_for with appropriate condition (network idle or timeout of 5s) to ensure full render including lazy-loaded content.browser_snapshot — captures the semantic tree (roles, names, values, interactive states). This is your primary structural reference.browser_take_screenshot with fullPage: true — save as reference baseline target-full.png.browser_evaluate with the following script. COPY THIS SCRIPT EXACTLY — do not modify it:
(() => {
const walk = (el, depth = 0) => {
if (depth > 8 || !el.tagName) return null;
const cs = window.getComputedStyle(el);
return {
tag: el.tagName.toLowerCase(),
id: el.id || undefined,
classes: [...el.classList].slice(0, 5),
styles: {
display: cs.display, position: cs.position,
width: cs.width, height: cs.height,
padding: cs.padding, margin: cs.margin,
fontSize: cs.fontSize, fontFamily: cs.fontFamily,
fontWeight: cs.fontWeight, lineHeight: cs.lineHeight,
color: cs.color, backgroundColor: cs.backgroundColor,
border: cs.border, borderRadius: cs.borderRadius,
flexDirection: cs.flexDirection, justifyContent: cs.justifyContent,
alignItems: cs.alignItems, gap: cs.gap,
gridTemplateColumns: cs.gridTemplateColumns,
},
text: el.childNodes.length === 1 && el.childNodes[0].nodeType === 3
? el.textContent?.trim().slice(0, 100) : undefined,
children: [...el.children].map(c => walk(c, depth + 1)).filter(Boolean),
};
};
return walk(document.body);
})()
browser_evaluate to catalog all interactable elements. COPY THIS SCRIPT EXACTLY — do not modify it:
(() => {
const results = [];
document.querySelectorAll(
'button, a[href], input, select, textarea, [role="button"], ' +
'[onclick], [aria-haspopup], [aria-expanded], details, dialog'
).forEach(el => {
results.push({
tag: el.tagName.toLowerCase(),
type: el.type || el.getAttribute('role') || 'interactive',
text: (el.textContent || '').trim().slice(0, 80),
href: el.href || undefined,
ariaLabel: el.getAttribute('aria-label') || undefined,
isVisible: el.offsetParent !== null,
});
});
return results;
})()
Keep all extraction results in working memory for Pass 2.
Analyze extraction results and decompose into a component plan.
Identify page regions: From DOM tree + accessibility snapshot, identify major sections:
Map components: For each region, define:
Create interaction map: From interactive elements list:
href<form> with inputs, labels, validationExtract design tokens: Identify recurring values:
Define file structure:
{output_dir}/
├── index.html (or App.tsx / App.vue)
├── styles/
│ ├── globals.css (reset + tokens)
│ └── components.css (or scoped styles)
├── scripts/
│ └── interactions.js (toggle, modal, dropdown logic)
└── assets/ (placeholder images)
Adapt to tech_stack if specified (React components, Vue SFCs, etc.).
Implement the clone from the plan. Work component-by-component.
<img> for external images<a> tags (to # anchors or stubs for v1)<label>, input types, placeholder textCompare the clone against the original across three dimensions.
Serve the clone: Start a local server for the generated project:
npx serve {output_dir} -l 3456 --no-clipboard
If npx serve is unavailable, fall back to: python3 -m http.server 3456 -d {output_dir}.
The clone will be accessible at http://localhost:3456.
Visual verification:
browser_navigate to clone URL.$visual-verdict with: reference_images=["target-full.png"], generated_screenshot="clone-full.png", category_hint="web-clone".Structural verification: Compare landmark counts:
<nav>, <main>, <footer>, <form>, <button>, <a> in both original and clone.Functional spot-check: Test 2–3 detected interactions via Playwright:
browser_click and browser_snapshot to verify state changes.Emit composite verdict:
{
"visual": {
"score": 82,
"verdict": "revise",
"category_match": true,
"differences": ["Header spacing tighter than original"],
"suggestions": ["Increase nav gap to 24px"]
},
"functional": {
"tested": 3,
"passed": 2,
"failures": ["Dropdown does not open on click"]
},
"structure": {
"landmark_match": true,
"missing": [],
"extra": []
},
"overall_verdict": "revise",
"priority_fixes": [
"Fix dropdown toggle interaction",
"Increase header nav spacing"
]
}
Fix highest-impact issues and re-verify.
priority_fixes. Do not refactor working code.overall_verdict is pass OR max 5 iterations reached.<Output_Contract> After each verification pass, emit a composite web-clone verdict JSON:
{
"visual": {
"score": 0,
"verdict": "revise",
"category_match": false,
"differences": ["..."],
"suggestions": ["..."],
"reasoning": "short explanation"
},
"functional": {
"tested": 0,
"passed": 0,
"failures": ["..."]
},
"structure": {
"landmark_match": false,
"missing": ["..."],
"extra": ["..."]
},
"overall_verdict": "revise",
"priority_fixes": ["..."]
}
Rules:
visual follows the VisualVerdict shape from $visual-verdictfunctional.tested/passed are counts; failures list specific interaction failuresstructure.landmark_match is true when all major HTML landmarks (nav, main, footer, forms) are presentoverall_verdict: pass when visual.score >= 85 AND functional.failures is empty AND structure.landmark_match is truepriority_fixes: ordered by impact, drives the next iteration
</Output_Contract><Iteration_Thresholds>
<Error_Handling>
browser_wait_for with extended timeout) and retry once.Pass 1: Navigate to HN. Extract: table-based layout, orange (#ff6600) nav bar, story list with links + points + comments, footer. Screenshot saved.
Pass 2: Regions: nav bar (logo + links), story table (30 rows × title + meta), footer. Tokens: orange #ff6600, gray #828282, Verdana font, 10pt base. Interaction map: story links (external), comment links, "more" pagination.
Pass 3: Generate index.html with HN-style table layout, CSS matching extracted colors/fonts, working <a> tags for stories.
Pass 4: Visual score=78 (font size off, spacing between stories too tight). Functional 2/2 (links work). Structure match=true.
Pass 5 iteration 1: Fix font to Verdana 10pt, increase row padding → score=88. Functional 2/2. Structure match. → overall_verdict: pass. Done.
</Example>
<Final_Checklist>
output_diroverall_verdict is pass, or max 5 iterations reached with best-effort reportralph-progress.jsonweb-clone-extraction.json
</Final_Checklist>browser_network_requests — note XHR/fetch calls for reference. Do not attempt to replicate backends.