Deepfake detection and media safety — detect AI-generated audio, images, video, and text, trace synthesis sources, apply watermarks, verify speaker identity, and analyze media intelligence using Resemble AI
Analyze audio, image, video, and text for synthetic manipulation, AI-generated content, watermarks, speaker identity, and media intelligence using the Resemble AI platform.
"NEVER DECLARE MEDIA AS REAL OR FAKE WITHOUT A COMPLETED DETECTION RESULT."
Do not guess, infer, or speculate about media authenticity. Every authenticity claim must be backed by a completed Resemble detect job with a returned label, score, and status: "completed". If the detection is still processing, wait. If it failed, say so — do not substitute your own judgment.
Use this skill whenever the user's request involves any of the capabilities below:
Do NOT use for text-to-speech generation, voice cloning, or speech-to-text transcription — those are separate Resemble capabilities.
| User wants to... | Use this | API endpoint |
|---|---|---|
| Check if media is AI-generated / deepfake | Deepfake Detection | POST /detect |
| Know which AI platform made fake audio | Audio Source Tracing | POST /detect with flag |
| Get speaker info, emotion, transcription from media | Intelligence | POST /intelligence |
| Ask questions about a completed detection | Detect Intelligence | POST /detects/{uuid}/intelligence |
| Apply an invisible watermark to media | Watermark Apply | POST /watermark/apply |
| Check if media contains a watermark | Watermark Detect | POST /watermark/detect |
| Verify a speaker's identity against known profiles | Identity Search | POST /identity/search |
| Check if text is AI-generated | Text Detection | POST /text_detect |
| Create a voice identity profile for future matching | Identity Create | POST /identity |
When multiple capabilities apply (e.g., user wants deepfake detection AND intelligence), combine them in a single POST /detect call using the intelligence: true flag rather than making separate requests.
```
https://app.resemble.ai/api/v2
Authorization: Bearer <RESEMBLE_API_KEY>
```

If the user provides a local file path instead of a URL, inform them the file must be hosted at a public HTTPS URL first. Do not attempt to upload local files to the API.
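The base URL and auth header above can be assembled into a request like the following minimal Python sketch. This is not an official client: `build_detect_request` is a hypothetical helper, and actually sending the request (e.g. with the `requests` library) is left as a comment.

```python
import os

BASE_URL = "https://app.resemble.ai/api/v2"

def build_detect_request(media_url, **options):
    """Assemble the pieces of a POST /detect call (pure; no network I/O)."""
    api_key = os.environ.get("RESEMBLE_API_KEY", "<RESEMBLE_API_KEY>")
    if not media_url.startswith("https://"):
        # The API only accepts publicly hosted HTTPS URLs, not local paths.
        raise ValueError("media must be hosted at a public HTTPS URL")
    return {
        "endpoint": f"{BASE_URL}/detect",
        "headers": {
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        "payload": {"url": media_url, **options},
    }

req = build_detect_request("https://example.com/media.mp4", visualize=True)
# Send with e.g.: requests.post(req["endpoint"], headers=req["headers"], json=req["payload"])
```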
When the Resemble MCP server is connected, use these tools instead of raw API calls:
| Tool | Purpose |
|---|---|
| `resemble_docs_lookup` | Get comprehensive docs for any detect sub-topic |
| `resemble_search` | Search across all documentation |
| `resemble_api_endpoint` | Get exact OpenAPI spec for any endpoint |
| `resemble_api_search` | Find endpoints by keyword |
| `resemble_get_page` | Read specific documentation pages |
| `resemble_list_topics` | List all available topics |
Tool usage pattern: Use resemble_docs_lookup with topic "detect" to get the full picture, then resemble_api_endpoint for exact request/response schemas before making API calls.
The core capability. Submit any audio, image, or video for AI-generated content analysis.
```
POST /detect
Content-Type: application/json
Authorization: Bearer <API_KEY>

{
  "url": "https://example.com/media.mp4",
  "visualize": true,
  "intelligence": true,
  "audio_source_tracing": true
}
```
Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| `url` | string | Yes | HTTPS URL to audio, image, or video file |
| `callback_url` | string | No | Webhook URL for async completion notification |
| `visualize` | boolean | No | Generate heatmap/visualization artifacts |
| `intelligence` | boolean | No | Run multimodal intelligence analysis alongside detection |
| `audio_source_tracing` | boolean | No | Identify which AI platform synthesized fake audio |
| `frame_length` | integer | No | Audio/video analysis window size in seconds (1–4, default 2) |
| `start_region` | number | No | Start of segment to analyze (seconds) |
| `end_region` | number | No | End of segment to analyze (seconds) |
| `model_types` | string | No | `"image"` or `"talking_head"` (for face-swap detection) |
| `use_reverse_search` | boolean | No | Enable reverse image search (image only) |
| `use_ood_detector` | boolean | No | Enable out-of-distribution detection |
| `zero_retention_mode` | boolean | No | Auto-delete media after detection completes |
Supported formats:
Detection is asynchronous. Poll GET /detect/{uuid} until status is "completed" or "failed".
```
GET /detect/{uuid}
Authorization: Bearer <API_KEY>
```
Polling best practice: Start at 2s intervals, back off to 5s, then 10s. Most detections complete within 10–60 seconds depending on media length.
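The polling guidance above can be sketched as a small loop. This is an illustrative sketch, not library code: `fetch_status` is a hypothetical callable wrapping `GET /detect/{uuid}`, and the exact schedule (three 2s polls, three 5s polls, then 10s) is an assumed reading of "start at 2s, back off to 5s, then 10s".

```python
import itertools

def poll_intervals():
    """Delay schedule: a few 2s polls, then 5s, then 10s thereafter."""
    yield from (2, 2, 2, 5, 5, 5)
    yield from itertools.repeat(10)

def poll_until_done(fetch_status, sleep=lambda s: None, max_wait=300):
    """Poll fetch_status() until it reports a terminal state."""
    waited = 0
    for delay in poll_intervals():
        status = fetch_status()
        if status in ("completed", "failed"):
            return status
        if waited >= max_wait:
            raise TimeoutError("detection did not finish within max_wait")
        sleep(delay)
        waited += delay
```

The injectable `sleep` keeps the loop testable; in production you would pass `time.sleep`.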
Audio results — in metrics:
```json
{
  "label": "fake",
  "score": ["0.92", "0.88", "0.95"],
  "consistency": "0.91",
  "aggregated_score": "0.92",
  "image": "https://..."
}
```
- `label`: `"fake"` or `"real"` — the verdict
- `score`: per-chunk prediction scores (array)
- `aggregated_score`: overall confidence (0.0–1.0, higher = more likely synthetic)
- `consistency`: how consistent the prediction is across chunks
- `image`: visualization heatmap URL (if `visualize: true`)

Image results — in `image_metrics`:
```json
{
  "type": "ImageAnalysis",
  "label": "fake",
  "score": 0.87,
  "image": "https://...",
  "ifl": { "score": 0.82, "heatmap": "https://..." },
  "reverse_image_search_sources": [
    { "url": "...", "title": "...", "verdict": "known_fake", "similarity": 0.95 }
  ]
}
```
- `label` / `score`: verdict and confidence
- `ifl`: Invisible Frequency Layer analysis with heatmap
- `reverse_image_search_sources`: known sources found online (if `use_reverse_search: true`)

Video results — in `video_metrics`:
```json
{
  "label": "fake",
  "score": 0.89,
  "certainty": 0.91,
  "children": [
    {
      "type": "VideoResult",
      "conclusion": "Fake",
      "score": 0.89,
      "timestamp": 2.5,
      "children": [...]
    }
  ]
}
```
- Each result node carries `timestamp`, `score`, `certainty`, and may have nested `children`
- Video detections include both `metrics` (audio track) and `video_metrics` (visual track)

| Score Range | Interpretation |
|---|---|
| 0.0 – 0.3 | Strong indication of authentic/real media |
| 0.3 – 0.5 | Inconclusive — recommend additional analysis |
| 0.5 – 0.7 | Likely synthetic — flag for review |
| 0.7 – 1.0 | High confidence synthetic/AI-generated |
Always present scores with context. Say "The detection returned a score of 0.87, indicating high confidence that this audio is AI-generated" — never just "it's fake."
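The interpretation bands can be encoded in a small helper so scores are always reported with context. `interpret_score` and `summarize` are illustrative names, and treating the bands as half-open intervals is an assumption (the table does not specify boundary behavior).

```python
def interpret_score(score):
    """Map an aggregated score onto the interpretation bands in the table."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be in [0.0, 1.0]")
    if score < 0.3:
        return "strong indication of authentic/real media"
    if score < 0.5:
        return "inconclusive; recommend additional analysis"
    if score < 0.7:
        return "likely synthetic; flag for review"
    return "high confidence synthetic/AI-generated"

def summarize(label, score):
    """Phrase the verdict with context, never a bare 'fake'/'real'."""
    return (f"The detection returned a score of {score:.2f}, "
            f"indicating {interpret_score(score)} (label: {label!r}).")
```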
Analyze media for rich structured insights independent of or alongside detection.
```
POST /intelligence
Content-Type: application/json
Authorization: Bearer <API_KEY>

{
  "url": "https://example.com/audio.mp3",
  "json": true
}
```
Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| `url` | string | One of | HTTPS URL to media file |
| `media_token` | string | One of | Token from secure upload (alternative to URL) |
| `detect_id` | string | No | UUID of existing detect to associate |
| `media_type` | string | No | `"audio"`, `"video"`, or `"image"` (auto-detected) |
| `json` | boolean | No | Return structured fields (default: false for audio/video, true for image) |
| `callback_url` | string | No | Webhook for async mode |
Audio/Video structured response (json: true):
- `speaker_info` — speaker description (age, gender)
- `language` / `dialect` — detected language
- `emotion` — detected emotional state
- `speaking_style` — conversational, formal, etc.
- `context` — inferred context of the speech
- `message` — content summary
- `abnormalities` — anomalies detected in the media
- `transcription` — full transcript
- `translation` — translation if non-English
- `misinformation` — misinformation analysis

Image structured response:
- `scene_description` — what the image shows
- `subjects` — people/objects identified
- `authenticity_analysis` — visual authenticity assessment
- `context_and_setting` — environment description
- `abnormalities` — visual anomalies
- `misinformation` — misinformation analysis

After a detection completes, ask natural-language questions about it:
```
POST /detects/{detect_uuid}/intelligence
Content-Type: application/json
Authorization: Bearer <API_KEY>

{
  "query": "How confident is the model that this audio is fake?"
}
```
This returns a question UUID. Poll GET /detects/{detect_uuid}/intelligence/{question_uuid} until status is "completed" to get the answer.
Good questions to suggest:
Status flow: `pending` → `processing` → `completed` (or `failed`)
Prerequisite: The detection must have status: "completed". Submitting a question against a processing or failed detection returns a 422 error.
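The completed-detection prerequisite can also be enforced client-side before submitting a question. `build_question_request` is a hypothetical helper that mirrors the API's 422 behavior locally:

```python
def build_question_request(detect_uuid, query, detect_status):
    """Build a question request, enforcing the completed-detection
    prerequisite locally (the API itself returns 422 otherwise)."""
    if detect_status != "completed":
        raise RuntimeError(f"detection is {detect_status!r}; wait for 'completed'")
    return {
        "endpoint": f"/detects/{detect_uuid}/intelligence",
        "payload": {"query": query},
    }
```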
When audio is detected as synthetic (label: "fake"), identify which AI platform generated it.
Enable it by setting audio_source_tracing: true in the POST /detect request.
Result appears in the detection response under audio_source_tracing:
```json
{
  "label": "elevenlabs",
  "error_message": null
}
```
Known source labels include: resemble_ai, elevenlabs, real, and others as the model expands.
Important: Source tracing only runs when audio is labeled as "fake". If the audio is "real", no source tracing result will appear.
Standalone query:
- `GET /audio_source_tracings` — list all source tracing reports
- `GET /audio_source_tracings/{uuid}` — get a specific report

Apply invisible watermarks to media for provenance tracking, or detect existing watermarks.
```
POST /watermark/apply
Content-Type: application/json
Authorization: Bearer <API_KEY>
Prefer: wait

{
  "url": "https://example.com/image.png",
  "strength": 0.3,
  "custom_message": "my-organization"
}
```
| Parameter | Type | Required | Description |
|---|---|---|---|
| `url` | string | Yes | HTTPS URL to media file |
| `strength` | number | No | Watermark strength 0.0–1.0 (image/video only, default 0.2) |
| `custom_message` | string | No | Custom message to embed (image/video only, default `"resembleai"`) |
- Add the `Prefer: wait` header for a synchronous response
- Without it, poll `GET /watermark/apply/{uuid}/result`
- The result includes a `watermarked_media` URL to download the watermarked file

To check for an existing watermark:

```
POST /watermark/detect
Content-Type: application/json
Authorization: Bearer <API_KEY>
Prefer: wait

{
  "url": "https://example.com/suspect-image.png"
}
```
Audio detection result:

```json
{ "has_watermark": true, "confidence": 0.95 }
```

Image/Video detection result:

```json
{ "has_watermark": true }
```
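Because audio and image/video results differ in shape, a small helper can normalize them into one summary string. `watermark_verdict` is an illustrative name, not part of the API:

```python
def watermark_verdict(result):
    """Summarize a POST /watermark/detect result. Audio results carry a
    confidence field; image/video results only the boolean."""
    found = result.get("has_watermark", False)
    confidence = result.get("confidence")
    if found and confidence is not None:
        return f"watermark present (confidence {confidence:.2f})"
    if found:
        return "watermark present"
    return "no watermark detected"
```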
Create voice identity profiles and match incoming audio against them.
Beta feature — requires joining the preview program. Inform the user if they encounter access errors.
```
POST /identity
Content-Type: application/json
Authorization: Bearer <API_KEY>

{
  "audio_url": "https://example.com/known-speaker.wav",
  "name": "Jane Doe"
}
```
```
POST /identity/search
Content-Type: application/json
Authorization: Bearer <API_KEY>

{
  "audio_url": "https://example.com/unknown-speaker.wav",
  "top_k": 5
}
```
Response:
```json
{
  "success": true,
  "item": [
    { "uuid": "...", "name": "Jane Doe", "confidence": 0.92, "distance": 0.08 }
  ]
}
```
Lower distance = closer match. Higher confidence = stronger match.
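A sketch of picking the best match from the response under this reading: lowest distance wins, after discarding weak matches. The 0.8 confidence floor is an arbitrary example, not an API default:

```python
def best_identity_match(items, min_confidence=0.8):
    """Pick the strongest match: lowest distance wins, after discarding
    matches below the confidence floor."""
    candidates = [m for m in items if m.get("confidence", 0.0) >= min_confidence]
    if not candidates:
        return None
    return min(candidates, key=lambda m: m["distance"])
```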
Detect whether text content is AI-generated or human-written.
Beta feature — requires the `detect_beta_user` role or a billing plan that includes the `dfd_text` product.
```
POST /text_detect
Content-Type: application/json
Authorization: Bearer <API_KEY>
```
Add the Prefer: wait header for a synchronous (blocking) response. Without it, the job runs asynchronously — poll or use a callback.
Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| `text` | string | Yes | Text to analyze (max 100,000 characters) |
| `thinking` | string | No | Always use `"low"` (default) |
| `threshold` | float | No | Decision threshold 0.0–1.0 (default: 0.5) |
| `callback_url` | string | No | Webhook URL for async completion notification |
| `privacy_mode` | boolean | No | If true, text content is not stored after analysis |
Response:
```json
{
  "success": true,
  "item": {
    "uuid": "abc-123",
    "status": "completed",
    "prediction": "ai",
    "confidence": 0.91,
    "text_content": "This is some text to analyze.",
    "privacy_mode": false,
    "created_at": "...",
    "updated_at": "..."
  }
}
```
- `prediction`: `"ai"` or `"human"` — the verdict
- `confidence`: 0.0–1.0, higher = more confident in the prediction
- `status`: `"processing"`, `"completed"`, or `"failed"`

If you did not use `Prefer: wait`, poll until status is `"completed"` or `"failed"`:
```
GET /text_detect/{uuid}
Authorization: Bearer <API_KEY>
```

To list past detections:

```
GET /text_detect
Authorization: Bearer <API_KEY>
```
Returns paginated text detections for the team.
If `callback_url` was provided, a POST is sent on completion:

```json
{ "success": true, "item": { ... } }
```

On failure:

```json
{ "success": false, "item": { ... }, "error": "Error message here" }
```
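Handling both callback shapes can be sketched as one function. `handle_text_detect_callback` is a hypothetical helper name:

```python
def handle_text_detect_callback(body):
    """Interpret a text-detection callback body (success or failure shape)."""
    if not body.get("success", False):
        return f"detection failed: {body.get('error', 'unknown error')}"
    item = body["item"]
    if item["status"] != "completed":
        return f"still {item['status']}"
    return f"prediction: {item['prediction']} (confidence {item['confidence']:.2f})"
```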
For a comprehensive analysis, combine all capabilities:
```json
{
  "url": "https://example.com/suspect.mp4",
  "visualize": true,
  "intelligence": true,
  "audio_source_tracing": true,
  "use_reverse_search": true
}
```
After submitting:

1. Poll until `status` is `"completed"`
2. Read `metrics` / `image_metrics` / `video_metrics` for the verdict
3. Read `intelligence.description` for structured media analysis
4. If the label is `"fake"`, check `audio_source_tracing.label` for the source platform
5. Run `POST /watermark/detect` if provenance is relevant

For a fast pass/fail:
Submit a minimal request:

```json
{ "url": "..." }
```

Then read `label` and `aggregated_score` (audio) or `label` and `score` (image/video).

For creators who want to prove their content is authentic:
1. Apply a watermark with `POST /watermark/apply`
2. Later, run `POST /watermark/detect` against any copy

Best practices:

- Present scores with nuance: a `"fake"` label with score 0.51 means something very different from score 0.95
- Use `zero_retention_mode` for sensitive media: always suggest this flag when the user indicates the media is sensitive or private
- Combine flags: set `intelligence: true` and `audio_source_tracing: true` on the detection call instead of separate requests

When presenting results to users:
| Error | Cause | Resolution |
|---|---|---|
| 400 | Invalid request body or missing url | Check required parameters |
| 401 | Invalid or missing API key | Verify RESEMBLE_API_KEY |
| 404 | Detection UUID not found | Verify the UUID from the creation response |
| 422 | Detection not completed (for Intelligence) | Wait for detection to reach completed status |
| 429 | Rate limited | Back off and retry with exponential delay |
| 500 | Server error | Retry once, then report to user |
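The 429 guidance above can be implemented with a small retry wrapper. `do_request` is a hypothetical callable returning a `(status_code, body)` tuple, and the injectable `sleep` keeps the sketch testable:

```python
import time

def call_with_retry(do_request, max_retries=4, base_delay=1.0, sleep=time.sleep):
    """Retry on HTTP 429 with exponential backoff; other statuses return
    immediately. do_request() returns a (status_code, body) tuple."""
    for attempt in range(max_retries + 1):
        status, body = do_request()
        if status != 429:
            return status, body
        sleep(base_delay * (2 ** attempt))  # 1s, 2s, 4s, 8s, ...
    return status, body  # still rate-limited after all retries
```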
- Set `zero_retention_mode: true` to auto-delete media after analysis; the URL is redacted and `media_deleted` is set to `true` post-completion.
- Set `privacy_mode: true` on text detection to prevent text content from being stored after analysis.
- If using `callback_url`, ensure the endpoint is HTTPS and authenticated on the receiving end.