AML/fraud detection and financial anomaly investigation via GDS geometric analysis: a 3-phase process with 25 typology recipes. Use when the user asks to "detect fraud", "screen for money laundering", "find suspicious accounts", "run AML scan", "investigate suspicious transactions", "typology detection", "passive scan for fraud", "detect insider trading", "find anomalous billing", or "financial crime detection". Use this skill for ANY financial anomaly investigation, not just AML. Requires hypertopos MCP server with a financial transaction sphere (account + pair + chain patterns).
Specialized AML/fraud investigation using GDS geometric navigation. Requires an active hypertopos MCP session with a financial transaction sphere.
GDS fraud investigation is a 3-phase process: build right, scan passively, verify actively. Active navigation does NOT increase recall — it only helps verify suspects and eliminate false positives. The real detection power comes from multi-source passive scanning across all geometry layers.
Before using graph traversal tools (find_geometric_path, discover_chains), verify the
sphere has an edge table for the relevant event pattern:
edge_stats(pattern_id="<event_pattern>")
If this returns edge counts and degree stats, graph tools are available. If it errors or
returns empty, the pattern lacks from/to FK structure — fall back to find_counterparties
and extract_chains instead.
All typology recipes use relative thresholds based on population geometry, not hardcoded amounts. This makes them dataset-agnostic and currency-independent.
- Use delta_rank_pct > N or conformal_p < X instead of fixed amounts.
- Percentile-style conditions (e.g. delta_rank_pct > 95) and geometry flags (is_anomaly = true) stand in for absolute cut-offs.

When a recipe says "total_amount in top percentile for population", interpret it relative to this sphere's population, not as an absolute amount. Use get_polygon → anomaly_dimensions to see which dims drive the anomaly: the geometry already normalizes for scale.
One call screens the entire population:
passive_scan("<anchor_line>", threshold=2)
This reads all geometry layers (account, pair, chain patterns) and returns entities
flagged by 2+ independent sources. threshold=2 = multi-source confirmed.
passive_scan source types:
- geometry (default): anomaly flag from pattern geometry
- borderline: near-threshold entities (rank >= threshold, not flagged)
- points: entity column rules without geometry (e.g., multi-currency filter)
- compound: geometry expansion intersected with column rules

Example (multi-source scan with mixed types):
passive_scan(home_line_id="<anchor_line>", sources='[
{"type": "geometry", "pattern_id": "<anchor_pattern>"},
{"type": "borderline", "pattern_id": "<anchor_pattern>", "rank_threshold": 80},
{"type": "points", "line_id": "<anchor_line>", "rules": {"<suspicious_property>": [">=", 2]}, "combine": "AND"}
]')
Auto-discover with borderline (also auto-detects graph contagion source if edge table exists):
passive_scan(home_line_id="<anchor_line>", include_borderline=true, borderline_rank_threshold=80)
After passive_scan, batch-score the top suspects by neighborhood contamination:
contagion_score_batch(suspect_keys, pattern_id)
Entities with high contagion ratio are network hubs, not isolated actors — prioritize them.
As-of reconstruction: contagion_score, contagion_score_batch, entity_flow, degree_velocity, propagate_influence, and find_counterparties all accept an optional timestamp_cutoff parameter (Unix seconds). When set, only edges with timestamp <= cutoff are considered — use this to reconstruct what an entity's neighborhood looked like at the time of a known incident, or to validate a detection recipe retroactively against a prior snapshot of the graph.
If passive_scan is not available, manually combine:
1. find_anomalies("<anchor_pattern>", top_n=50) → anomalous entities
2. For pair patterns: execute EVERY profiling_alert call (rank_by_property on each alerted dim)
If total_found >> 50: aggregate_anomalies("<pair_pattern>", group_by="<anchor_key>")
3. find_anomalies("<chain_pattern>", top_n=50) → anomalous chains → expand chain_keys
4. Intersect: entities in 2+ sources = high confidence
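The intersection in step 4 can be sketched in plain Python. This is a minimal illustration, assuming each tool result has already been reduced to a list of entity keys; the function name and the sample keys are hypothetical, not part of the hypertopos API.

```python
from collections import Counter

def multi_source_suspects(sources, min_sources=2):
    """Return {entity_key: source_count} for keys seen in >= min_sources sources.

    sources: dict mapping a source name (e.g. "account", "pair", "chain")
    to an iterable of entity keys flagged by that source (assumed shape).
    """
    counts = Counter()
    for keys in sources.values():
        counts.update(set(keys))  # a key counts once per source
    return {key: n for key, n in counts.items() if n >= min_sources}

# Hypothetical keys collected from the three manual steps above.
flagged = {
    "account": ["A1", "A2", "A3"],
    "pair": ["A2", "A4", "A2"],
    "chain": ["A2", "A3"],
}
confirmed = multi_source_suspects(flagged)
```

Deduplicating within each source matters for pair patterns, where one account can appear in many anomalous pairs but should still count as a single source.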
When using find_anomalies for fraud screening, set fdr_alpha=0.05 to apply Benjamini-Hochberg FDR control — false positives waste investigator time and erode trust in the alert pipeline, so controlling the false discovery rate is critical. When requesting K>10 results, use select="diverse" to surface different fraud typologies (structuring, layering, round-tripping) instead of 50 variants of the same high-volume pattern; this leverages submodular facility location to maximize typological coverage in the result set. Both parameters also apply to attract_boundary, find_hubs, and find_drifting_entities.
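For intuition about what fdr_alpha controls, here is a minimal sketch of the textbook Benjamini-Hochberg step-up procedure. It is not the tool's implementation, which is not shown in this document; the function name and the input p-values are illustrative assumptions.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans: True where the hypothesis is rejected.

    Textbook BH step-up: sort p-values, find the largest rank k with
    p_(k) <= (k / m) * alpha, and reject every hypothesis ranked <= k.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff_rank = rank
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[idx] = True
    return rejected

# Illustrative p-values only; in practice these would come from conformal_p.
p_vals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
flags = benjamini_hochberg(p_vals, alpha=0.05)
```

With these inputs only the two smallest p-values survive, even though five of them are below 0.05 individually. That gap is the false-positive load the FDR control removes.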
For each suspect from Phase 1:
cross_pattern_profile(suspect_key, line_id="<anchor_line>")
Interpret the result:
- source_count >= 3 → investigate immediately (anomalous in all layers)
- source_count == 2 → high priority
- source_count == 1 → likely false positive, deprioritize
- risk_score > 0.5 → high anomaly density across patterns
- connected_risk > 80 → counterparties are also anomalous (network signal)

Sort suspects by risk_score descending for investigation order.
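The triage rules above can be applied mechanically once profiles are collected. The sketch below assumes each cross_pattern_profile result has been flattened into a dict with source_count and risk_score; the "key" field and the priority labels are hypothetical names chosen for illustration.

```python
def triage(profiles):
    """Order suspects by risk_score (descending) and attach a priority label."""
    def priority(profile):
        if profile["source_count"] >= 3:
            return "immediate"
        if profile["source_count"] == 2:
            return "high"
        return "deprioritize"  # single-source: likely false positive

    ranked = sorted(profiles, key=lambda p: p["risk_score"], reverse=True)
    return [(p["key"], priority(p)) for p in ranked]

# Hypothetical profiles for three suspects.
profiles = [
    {"key": "a", "source_count": 1, "risk_score": 0.2},
    {"key": "b", "source_count": 3, "risk_score": 2.1},
    {"key": "c", "source_count": 2, "risk_score": 0.6},
]
queue = triage(profiles)
```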
Note: Recipes below use angle-bracket placeholders (e.g.
<anchor_line>,<event_pattern>) for sphere-specific names. Replace them with the actual line, pattern, and column names from your sphere before calling the tools.
Full investigation of a single suspect:
1. cross_pattern_profile(pk) → multi-source risk overview
2. goto(pk, "<anchor_line>") then
get_polygon("<anchor_pattern>") → anomaly_dimensions (WHY anomalous)
3. find_counterparties(pk, "<event_line>",
"<from_col>", "<to_col>",
pattern_id="<anchor_pattern>") → WHO they transact with
4. contagion_score(pk, "<event_pattern>") → what fraction of neighbors are anomalous
5. find_witness_cohort(pk, "<anchor_pattern>") → peers with similar anomaly profile
6. find_novel_entities("<event_pattern>",
top_n=10, sample_size=1000) → neighborhood deviation screen
7. dive_solid(pk, "<anchor_pattern>") → WHEN behavior changed
8. investigation_coverage(pk, "<event_pattern>",
explored_keys=checked) → coverage check, add unexplored to leads
Graph confirmation chain (steps 4→5→6):
- contagion_score > 0.3 → neighborhood is infected, not an isolated outlier
- witness_cohort_size > 3 → anomaly signature is shared by non-connected peers
- find_novel_entities surfaces entities whose geometry deviates from what their neighbors predict; it catches entities that contagion and witness miss

High contagion (>0.3) + large witness cohort + high novelty = confirmed network pattern. Low contagion (<0.2) + empty cohort = isolated anomaly, deprioritize. Borderline contagion (0.2–0.3) = expand cautiously to one hop only before deciding.
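The decision rule above can be written as a small function. This is a sketch under stated assumptions: novelty is reduced to a boolean (whether the suspect appears in find_novel_entities output), and combinations the text does not cover return "inconclusive". Both choices are assumptions, not documented behavior.

```python
def graph_verdict(contagion, cohort_size, is_novel):
    """Map the three graph signals to a next action using the thresholds above."""
    if contagion > 0.3 and cohort_size > 3 and is_novel:
        return "confirmed_network"
    if contagion < 0.2 and cohort_size == 0:
        return "isolated"
    if 0.2 <= contagion <= 0.3:
        return "expand_one_hop"
    return "inconclusive"  # mixed signals: not covered by the stated rules
```

For example, `graph_verdict(0.25, 2, False)` falls in the borderline band and asks for a single-hop expansion rather than a full network build-out.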
Key signals:
anomaly_dimensions + bregman_contribution shows WHICH behavioral features drive the detection
- kind: poisson dim anomalous = count/frequency structure deviation (structuring, burst)
- kind: bernoulli dim anomalous = binary flag triggered (cross-border, FX flag, unusual channel)
- kind: gaussian dim extreme = magnitude anomaly (large amount, high velocity)
- Look at pct_of_total in the Bregman breakdown, not just the highest abs_delta
- Counterparties with is_anomaly=true = network confirmation

From a confirmed suspect, expand the investigation network:
1. find_counterparties(suspect) → list of transaction partners
2. Filter anomalous counterparties → subset with is_anomaly=true
3. For each anomalous counterparty:
cross_pattern_profile(cp_key) → are THEY multi-source flagged?
4. discover_chains(suspect, pattern_id="<event_pattern>",
time_window_hours=72, max_hops=5) → runtime chain discovery (preferred)
5. Filter: chains with cyclic structure → round-trip chains
discover_chains vs extract_chains: discover_chains runs temporal BFS at query
time on the edge table — no pre-built chain_lines required. Use it as the primary chain
discovery tool. Fall back to extract_chains(seed_nodes=[suspect]) when you need
population-level chain statistics or when the sphere has pre-built chain patterns.
Tracing connections between two suspects:
find_geometric_path(from_key=suspect_A, to_key=suspect_B,
pattern_id="<event_pattern>", scoring="anomaly")
This traces how two suspects are connected through the transaction graph. scoring="anomaly"
prioritizes paths through anomalous intermediaries — the most suspicious route between them.
Use scoring="geometric" to find paths through geometrically unusual entities, or
scoring="shortest" for the most direct connection.
Before closing an alert, check for exculpatory evidence:
1. find_similar_entities(suspect, "<anchor_pattern>",
filter_expr="is_anomaly = false", top_n=20)
2. If 10+ normal entities have identical shape → suspect is likely
a legitimate high-activity entity, not fraudulent
3. Check: are counterparties ALL normal? → further FP evidence
Metric and dimension focus for FP elimination:
- metric="cosine": compare anomaly profile shape ignoring magnitude. Entities with the same pattern but different scale will be cosine-close. Use when "same type of activity" matters more than "same scale."
- dim_mask=[<dims from anomaly_dimensions>]: focus similarity on the dimensions that drive the anomaly, ignoring irrelevant ones. Read the target entity's anomaly_dimensions first, then pass those labels as the mask.
- find_anomalies(metric="Linf"): rank by max single-dimension spike. Catches entities with one extreme dimension that the L2 norm dilutes. Use for single-behavior typologies.
- find_anomalies(metric="bregman"): rank by Bregman divergence. Better than L2 on patterns mixing counts (poisson), amounts (gaussian), and flags (bernoulli). Check dimension_kinds in sphere_overview; if mixed kinds, try bregman first.

25 AML typology recipes are documented in references/typologies.md. Load it when the user requests a specific typology or asks for a comprehensive AML scan.
Quick reference:
| Pattern type | Key signal | Investigation tool |
|---|---|---|
| FAN-OUT | high out_degree + many dest_banks | find_counterparties → check all targets; edge_stats confirms degree distribution |
| FAN-IN | high in_sources | find_counterparties → who feeds this account?; edge_stats for inbound concentration |
| CYCLE | pair anomalies + chain is_cyclic | discover_chains(direction="outgoing") → filter cyclic; find_geometric_path(scoring="anomaly") traces the ring |
| STACK | delta_rank_pct 90-95 (borderline) | discover_chains(min_hops=3) → check chain membership at runtime |
| BIPARTITE | pair anomalies + community | aggregate(group_by_property="community_id") |
| RANDOM | chain anomalies | discover_chains(max_hops=6) → longest reachable chains |
| BRIDGE | entity straddles two communities | cluster_bridges(pattern_id) → bridges with anomalous status on both sides are high-risk connectors |
Use these when the sphere has event patterns with edge tables (edge_stats returns has_edge_table: true).
| Recipe | Signal | When to use |
|---|---|---|
| R1 Mirror Transaction | A→B and B→A same amount same day | Circular flow detection |
| R2 Pass-Through | receive + send within 2 hours | Rapid pass-through / layering |
| R3 Burst Detection | many tx to same target in 24h | Structuring / smurfing |
| R4 Weighted Reciprocity | balanced in/out with same counterparty | Round-tripping / wash trading |
| R5 Financial Profile | entity total in/out/net flow | Risk profiling / mule detection |
| R6 Concentration Risk | single counterparty dominates flow | Over-reliance / control |
| R7 Benford's Law | first-digit distribution of amounts | Fabricated transactions |
| R8 Witness Cohort | high witness overlap + trajectory convergence, no existing edge | Fraud cohort expansion — surface peers sharing the target's anomaly signature |
Pattern: Entity A sends to B, and B sends back to A the same (or similar) amount within the same day. Classic circular flow indicator.
Tool sequence:
1. discover_chains(primary_key, pattern_id, direction="both", time_window_hours=24, min_hops=2): find short loops
2. Keep chains where keys[0] == keys[-1] (cyclic) or where the chain returns to a known counterparty
3. anomalous_edges(from_key, to_key, pattern_id): inspect individual transactions between the mirror pair
4. abs(edge_a.amount - edge_b.amount) / max(amounts) < 0.05 → strong mirror signal

Interpretation: Mirror ratio > 0.95 with same-day timing is a strong indicator. Check if both edges are individually anomalous (is_anomaly=true in event geometry).
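The amount comparison in this recipe is simple arithmetic. A minimal sketch, assuming the two edge amounts have already been read from anomalous_edges output (the function names are illustrative):

```python
def relative_gap(amount_ab, amount_ba):
    """abs(a - b) / max(a, b); returns 1.0 when there is no positive amount."""
    largest = max(amount_ab, amount_ba)
    if largest <= 0:
        return 1.0
    return abs(amount_ab - amount_ba) / largest

def is_mirror(amount_ab, amount_ba, tolerance=0.05):
    """True when the two legs differ by less than `tolerance` of the larger leg."""
    return relative_gap(amount_ab, amount_ba) < tolerance
```

The mirror ratio quoted in the interpretation is `1 - relative_gap(...)`, so a gap below 0.05 is the same condition as a ratio above 0.95.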
Pattern: Entity receives funds and sends within 2 hours. The entity is a conduit, not a destination.
Tool sequence:
1. discover_chains(primary_key, pattern_id, time_window_hours=2, min_hops=2, max_chains=50): find rapid chains
2. entity_flow(primary_key, pattern_id): check if net_flow ≈ 0 (pass-through entities have balanced flow)
3. anomalous_edges(from_key, to_key, pattern_id): inspect individual transactions at the bottleneck hop

Interpretation: net_flow near zero + chains with tight time windows = layering. degree_velocity showing acceleration confirms recent ramp-up.
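The 2-hour window can also be checked directly on raw edges. The sketch below assumes incoming and outgoing transactions are available as (timestamp_seconds, amount) tuples; that shape is an assumption for illustration, not the documented output of any tool.

```python
def pass_through_pairs(incoming, outgoing, window_s=2 * 3600):
    """Pair each receipt with every send that follows it within window_s seconds.

    incoming / outgoing: lists of (timestamp_seconds, amount) tuples.
    Returns a list of (t_in, t_out) timestamp pairs.
    """
    pairs = []
    for t_in, _amount_in in incoming:
        for t_out, _amount_out in outgoing:
            if 0 <= t_out - t_in <= window_s:
                pairs.append((t_in, t_out))
    return pairs

# Hypothetical edges: one receipt at t=0, sends one hour and three hours later.
received = [(0, 100.0)]
sent = [(3600, 98.0), (3 * 3600, 50.0)]
rapid = pass_through_pairs(received, sent)
```

The nested loop is quadratic, which is acceptable for a single suspect's edges but not for population-wide screening.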
Pattern: Many transactions to the same target within 24 hours, each below a reporting threshold.
Tool sequence:
1. discover_chains(primary_key, pattern_id, time_window_hours=24, max_chains=100): find all outgoing activity
2. anomalous_edges(from_key, target_key, pattern_id, top_n=50): get all edges to the repeated target

Interpretation: 5+ transactions to same target in 24h with amounts clustered just below a round reporting threshold is a strong structuring signal.
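Counting transactions to one target inside any 24-hour window is a sliding-window problem. A minimal sketch, assuming the edge timestamps to a single target are available as Unix seconds (the function name is hypothetical):

```python
def max_burst(timestamps, window_s=24 * 3600):
    """Largest number of timestamps falling inside any window of window_s seconds."""
    ts = sorted(timestamps)
    best = 0
    start = 0
    for end in range(len(ts)):
        # Shrink the window from the left until it spans at most window_s.
        while ts[end] - ts[start] > window_s:
            start += 1
        best = max(best, end - start + 1)
    return best

HOUR = 3600
# Hypothetical edges to one target: five within 4 hours, one 30 hours in.
burst = max_burst([0, 1 * HOUR, 2 * HOUR, 3 * HOUR, 4 * HOUR, 30 * HOUR])
```

A result of 5 or more meets the count condition in the interpretation; the amount clustering still has to be checked separately against the relevant reporting threshold.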
Pattern: Balanced bidirectional flow between two entities — min(out, in) / max(out, in) close to 1.0.
Tool sequence:
1. entity_flow(primary_key, pattern_id): get per-counterparty net flow
2. Compute reciprocity = min(out, in) / max(out, in)
3. anomalous_edges(from_key, counterparty_key, pattern_id): inspect the transactions

Interpretation: Reciprocity > 0.8 between two entities = suspicious round-tripping. Cross-reference with contagion_score: if the counterparty is also contagious, the pair is high-priority.
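The reciprocity filter can be sketched as follows, assuming entity_flow output has been reshaped into a dict of counterparty → (outgoing_total, incoming_total). That shape and the function names are assumptions for illustration.

```python
def reciprocity(outgoing, incoming):
    """min(out, in) / max(out, in); 0.0 when there is no flow at all."""
    largest = max(outgoing, incoming)
    if largest <= 0:
        return 0.0
    return min(outgoing, incoming) / largest

def round_trip_candidates(flows, min_reciprocity=0.8):
    """Return counterparties whose bidirectional flow exceeds min_reciprocity."""
    return sorted(
        cp for cp, (out, inc) in flows.items()
        if reciprocity(out, inc) > min_reciprocity
    )

# Hypothetical per-counterparty totals.
flows = {"B": (1000.0, 900.0), "C": (1000.0, 100.0), "D": (0.0, 0.0)}
candidates = round_trip_candidates(flows)
```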
Pattern: Entity's total flow reveals its role: source (high out, low in), sink (high in, low out), or mule (high both, near-zero net).
Tool sequence:
1. entity_flow(primary_key, pattern_id): get totals
2. cross_pattern_profile(primary_key, line_id): anomaly status across all patterns
3. Classify: net_flow > 0.7 * outgoing_total → source; net_flow < -0.7 * incoming_total → sink; else → mule candidate

Interpretation: Mule candidates (balanced flow, multiple patterns flagged) warrant propagate_influence to map the network they serve.
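The classification rule can be expressed directly. This sketch assumes net_flow is defined as outgoing minus incoming, which is the sign convention implied by "source = high out, low in" but is not stated explicitly in the recipe.

```python
def classify_role(outgoing_total, incoming_total):
    """Classify an entity as source, sink, or mule candidate from its totals.

    Assumes net_flow = outgoing_total - incoming_total (positive = net sender).
    """
    net_flow = outgoing_total - incoming_total
    if net_flow > 0.7 * outgoing_total:
        return "source"
    if net_flow < -0.7 * incoming_total:
        return "sink"
    return "mule_candidate"  # includes the degenerate case of zero flow
```

An entity with no flow at all also lands in the last branch, so callers may want to filter out zero-activity entities before classifying.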
Pattern: A single counterparty dominates an entity's flow — potential control relationship.
Tool sequence:
1. entity_flow(primary_key, pattern_id, top_n=5): get top counterparties by abs(net_flow)
2. Compute concentration = abs(top_1_net_flow) / (outgoing_total + incoming_total)
3. contagion_score(primary_key, pattern_id): check if the concentrated counterparty is anomalous

Interpretation: Concentration > 0.6 means one counterparty controls >60% of flow. If that counterparty is anomalous (contagion), the entity is at high risk.
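As a worked instance of the concentration formula, the sketch below takes the three numbers the recipe names; the function name is illustrative and the zero-flow guard is an added assumption.

```python
def concentration(top_1_net_flow, outgoing_total, incoming_total):
    """abs(top_1_net_flow) / (outgoing_total + incoming_total), 0.0 if no flow."""
    total = outgoing_total + incoming_total
    if total <= 0:
        return 0.0
    return abs(top_1_net_flow) / total

# Hypothetical totals: the top counterparty nets -700 out of 1000 total flow.
share = concentration(-700.0, 600.0, 400.0)
is_concentrated = share > 0.6
```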
Pattern: Natural financial data follows Benford's Law for first digits. Fabricated transactions often don't.
Tool sequence:
1. find_counterparties(primary_key, line_id, from_col, to_col, pattern_id): get top counterparties
2. anomalous_edges(from_key, to_key, pattern_id, top_n=50): collect per-transaction amounts
3. Pool edge.amount values across counterparties, compute first-digit distribution

Note: discover_chains returns only total_amount per chain (aggregate sum), not individual transaction amounts. Use anomalous_edges to get per-transaction amounts needed for Benford analysis.
Interpretation: Chi-squared test against Benford expected frequencies. p-value < 0.05 = amounts are likely not organic. Most effective with 100+ transactions.
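A standard-library sketch of the chi-squared statistic against Benford's expected frequencies follows. Instead of computing a p-value it compares the statistic with 15.507, the chi-squared critical value for 8 degrees of freedom at the 0.05 level. The function names and the sample amounts are illustrative assumptions.

```python
import math
from collections import Counter

# Benford's expected first-digit probabilities: log10(1 + 1/d) for d = 1..9.
BENFORD = {d: math.log10(1 + 1 / d) for d in range(1, 10)}
CHI2_CRIT_DF8_P05 = 15.507  # critical value, 8 degrees of freedom, alpha = 0.05

def first_digit(amount):
    """Leading non-zero digit of a non-zero amount, via scientific notation."""
    return int(f"{float(abs(amount)):.15e}"[0])

def benford_chi2(amounts):
    """Chi-squared statistic of the first-digit counts against Benford's Law."""
    digits = [first_digit(a) for a in amounts if a]  # skip zero amounts
    n = len(digits)
    if n == 0:
        raise ValueError("need at least one non-zero amount")
    counts = Counter(digits)
    return sum(
        (counts.get(d, 0) - n * p) ** 2 / (n * p) for d, p in BENFORD.items()
    )

# Illustrative inputs: amounts all starting with 9 vs. a log-uniform spread.
skewed = [9000 + i for i in range(200)]
log_uniform = [10 ** (i / 300) for i in range(300)]
```

A statistic above `CHI2_CRIT_DF8_P05` corresponds to p < 0.05 in the interpretation above. With fewer than about 100 amounts the expected count for digit 9 drops below 5 and the test becomes unreliable.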
Pattern: A confirmed launderer X has a ring of accomplices. Some are already in X's counterparty network (visible via find_counterparties). Others share X's anomaly signature — same witness dimensions, drifting in the same geometric direction — but are NOT yet connected to X via transactions. find_witness_cohort surfaces these geometric peers, ranked by composite witness/delta/trajectory/anomaly score, with already-connected entities filtered out.
Honest scope: This is investigative cohort expansion, NOT edge forecasting. The function does NOT predict that X and the cohort members will transact in the future. It surfaces existing peers worth investigating, not future connections.
Tool sequence:
1. Start from a confirmed suspect X (from typology recipes R1–R7 or external intel)
2. find_witness_cohort(X, anchor_pattern_id, top_n=10): top peers excluding existing counterparties
3. Review members[]: focus on entries with witness_overlap >= 0.5 AND is_anomaly == true
4. For each promising member Y: cross_pattern_profile(Y, line_id) to verify multi-pattern confirmation
5. find_witness_cohort(Y, anchor_pattern_id): recursive expansion to map the cohort

Interpretation: A cohort member with witness_overlap = 1.0 and trajectory_alignment > 0.95 is a strong candidate: it shares the same structural anomaly signature and the same geometric drift direction as X. The lack of an existing edge is the agent-guidance value: existing counterparties are skipped (often legitimate), so the cohort is denser in unknown peers worth investigating.
False positive guard: Two competitors or two unrelated entities can also share witness profiles. Use cohort members as INVESTIGATIVE RANKING, not as evidence. Combine with domain context before escalation. Many cohort members will not be laundering even when the seed is — the function narrows the search space, it does not eliminate the need for human verification.
Why this is unique vs other tools:
- find_similar_entities + is_anomaly = true: returns shape twins via plain ANN. find_similar_entities does not exclude existing counterparties (often legitimate), does not score witness overlap, and does not weight trajectory alignment.

Maintain three lists throughout the investigation session:
- checked[]: entities where Entity 360 is complete (polygon + explain + counterparties done). Never re-investigate.
- leads[]: entities flagged by tools but not yet investigated. Each lead carries a lead_score (see Decision Scoring). Sources: find_anomalies, passive_scan, find_witness_cohort, propagate_influence, investigation_coverage.unexplored_anomalous.
- dead_ends[]: entities investigated and found uninteresting for this thread (delta_rank_pct < 70, no contagion, no temporal signal). Never revisit.

Protocol:
- Before investigating an entity, check checked[] and dead_ends[]. Skip if present.
- After a deep-dive, move the entity from leads[] to checked[].
- Add newly surfaced entities to leads[] (deduplicating against all three lists).
- Run investigation_coverage(pk, pattern_id, explored_keys=checked) after every deep-dive. If coverage_pct < 0.5 and unexplored_anomalous is non-empty, add those to leads[].
- Pass checked[] as context.

Proactive limits to prevent runaway investigations:
| Guard | Threshold | Action |
|---|---|---|
| Depth limit | 3 hops from seed entity | Stop expanding, summarize findings |
| Strength gate | delta_rank_pct < 70 | Skip entity UNLESS contagion_score > 0.3 or in witness cohort |
| Contagion gate | contagion_score < 0.2 | Do NOT proceed to network expansion — entity is isolated |
| Consecutive call limit | 3 calls to same tool on same entity | Move to next lead |
| Stale lead expiry | Lead untouched for 10+ tool calls | Demote below fresh leads |
| Force-switch | 5 consecutive calls with no new anomalous entities | STOP current thread, switch to highest-scoring lead |
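The checked / leads / dead_ends bookkeeping described earlier in this section can be kept in a small in-memory structure. The class below is a hypothetical helper, not part of the hypertopos tooling; it only illustrates the deduplication and hand-off rules.

```python
class InvestigationState:
    """Tracks checked / leads / dead_ends for one investigation session."""

    def __init__(self):
        self.checked = []    # fully investigated entities
        self.dead_ends = []  # investigated, uninteresting for this thread
        self.leads = {}      # entity key -> lead_score

    def add_lead(self, key, lead_score):
        """Queue a lead unless it was already closed; keep the best score."""
        if key in self.checked or key in self.dead_ends:
            return False
        self.leads[key] = max(lead_score, self.leads.get(key, 0.0))
        return True

    def next_lead(self):
        """Return the highest-scoring open lead, or None if the queue is empty."""
        if not self.leads:
            return None
        return max(self.leads, key=self.leads.get)

    def close(self, key, dead_end=False):
        """Move an entity out of leads into checked or dead_ends."""
        self.leads.pop(key, None)
        target = self.dead_ends if dead_end else self.checked
        if key not in target:
            target.append(key)
```

Usage would follow the protocol: call `add_lead` for every entity a tool surfaces, `next_lead` to pick the next deep-dive, and `close` when the deep-dive finishes. The guards in the table (depth limit, force-switch, stale-lead expiry) are not modeled here.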
Phase 1 Risk Triage sorts by risk_score for initial suspect selection. Once investigation begins and graph/temporal data becomes available, lead_score supersedes risk_score as the authoritative ordering.
Rank leads by composite score to decide what to investigate next:
lead_score = 0.35 × anomaly_strength
+ 0.25 × graph_support
+ 0.25 × temporal_signal
+ 0.15 × novelty_bonus
| Component | Source | Value |
|---|---|---|
| anomaly_strength | delta_rank_pct / 100 | 0.0–1.0 |
| graph_support | contagion_score | 0.0–1.0 (0 if unchecked) |
| temporal_signal | appears in find_drifting_entities or detect_trajectory_anomaly | 0.0 or 1.0 |
| novelty_bonus | appears in find_novel_entities or find_witness_cohort | 0.0 or 1.0 |
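As a worked instance of the composite score, the sketch below plugs the four components into the weights given above. The function signature is an illustrative assumption; passing `contagion=None` models the "0 if unchecked" rule from the table.

```python
def lead_score(delta_rank_pct, contagion=None, temporal=False, novel=False):
    """Composite lead score using the 0.35 / 0.25 / 0.25 / 0.15 weights."""
    anomaly_strength = delta_rank_pct / 100
    graph_support = contagion if contagion is not None else 0.0  # 0 if unchecked
    temporal_signal = 1.0 if temporal else 0.0
    novelty_bonus = 1.0 if novel else 0.0
    return (
        0.35 * anomaly_strength
        + 0.25 * graph_support
        + 0.25 * temporal_signal
        + 0.15 * novelty_bonus
    )

# Hypothetical lead: 96th percentile, contagion 0.4, drifting, not novel.
score = lead_score(96, contagion=0.4, temporal=True)
```

Here the score works out to 0.35 × 0.96 + 0.25 × 0.4 + 0.25 × 1.0 = 0.686. A lead whose contagion has not been checked yet is scored conservatively, which is one reason to re-score after each graph call.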
Triage levels (anomaly_confidence available for populations <= 50K):
- lead_score >= 0.7 AND anomaly_confidence >= 0.9: CRITICAL, investigate immediately
- lead_score >= 0.7 AND anomaly_confidence >= 0.7: HIGH, investigate in current session
- lead_score >= 0.4 AND anomaly_confidence >= 0.5: MEDIUM, investigate if time permits
- lead_score < 0.4 OR anomaly_confidence < 0.5: LOW, skip unless explicitly asked

Protocol:
- Recompute lead_score when new data arrives (e.g. graph_support).
- Report the queue status as: "Next: <entity> (score X.XX) | Queue: N leads remaining"

Common pitfalls:
- is_anomaly alone misses most fraud: check each pattern separately and combine signals
- find_similar_entities returns shape twins, not new suspects: use find_counterparties for network expansion
- A high delta_norm can mean a legitimate high-activity entity: cross-reference with business context
- extract_chains without seed_nodes causes hub monopolization: prefer discover_chains, which takes a single primary_key
- find_geometric_path with scoring="shortest" finds direct connections but misses suspicious intermediaries: use scoring="anomaly" for fraud investigation
- discover_chains time_window_hours defaults broadly: narrow it (e.g. 24-72h) to focus on rapid layering patterns

User says: "Screen this sphere for money laundering"
Actions:
1. passive_scan("accounts", threshold=2): multi-source screening
2. cross_pattern_profile(pk, "accounts"): triage by source_count and risk_score
3. get_polygon(pk, "account_pattern"): read anomaly_dimensions
4. find_counterparties(pk, "transactions", "from_account", "to_account", pattern_id="account_pattern"): network check

Result: "Screened 515K accounts. 847 flagged by 2+ sources. Top suspect: account 800737690 (source_count=3, risk_score=2.1, connected_risk=87). Anomaly driven by: n_currencies_out (4.1 sigma), burst_tx_out (3.8 sigma)."
User says: "Check for round-tripping patterns"
Actions:
1. discover_chains(suspect, pattern_id="transaction_pattern", time_window_hours=24, max_hops=5, min_hops=2): runtime chain discovery
2. Filter chains with n_distinct_categories >= 2
3. cross_pattern_profile(first_key): multi-source confirmation
4. find_geometric_path(from_key=A, to_key=A, scoring="anomaly"): trace the ring path through anomalous intermediaries

Result: "Found 12 cyclic chains under 24h. 3 chains with source_count >= 2 flagged as ROUND_TRIPPING_3PARTY. Top chain: A→B→C→A, total_amount top 1%, 3 currencies involved."
User says: "Is account 8013C4030 really suspicious?"
Actions:
1. cross_pattern_profile("8013C4030", "accounts"): source_count=1 (single source)
2. find_similar_entities("8013C4030", "account_pattern", filter_expr="is_anomaly = false", top_n=20): 18 normal accounts with identical shape
3. find_counterparties("8013C4030", ...): all counterparties normal

Result: "Likely false positive. Single-source only, 18 normal geometric twins, all counterparties clean. Recommend close alert."
passive_scan not available

Cause: Tool may not be configured in this MCP version.
Solution: Manually combine find_anomalies across all patterns (account, pair, chain) and intersect results. See Phase 1 manual fallback recipe.
extract_chains returns empty or times out

Cause: No chain pattern built in sphere, or missing seed_nodes parameter (full BFS hangs on hubs).
Solution: Use discover_chains(primary_key, pattern_id, max_hops=5) instead — it runs
temporal BFS on the edge table at query time and does not require pre-built chain_lines.
If discover_chains is not available, pass seed_nodes=[suspect_list] to extract_chains.
Verify edge table exists via edge_stats(pattern_id).
Cause: Sphere has only one pattern (account only). Pair and chain patterns not built.
Solution: Rebuild sphere with composite_lines (pairs) and chain_lines (chains). Single-pattern detection has significantly lower recall than multi-pattern.
find_counterparties returns too many results

Cause: Hub account with hundreds of counterparties.
Solution: Focus on anomalous counterparties only (filter by is_anomaly=true in results). Use cross_pattern_profile for quick triage instead of enumerating all counterparties.