Objective

Prevent data hallucination and inefficiency by mandating that agents inspect, validate, and fully extract data from provided reference files before attempting web searches or generating synthetic data. When extraction is incomplete, use fallback strategies before making assumptions.

Critical Rule

If a reference file is provided in the task context, it is the source of truth. Do not fabricate data or search the web for information that may exist within the provided attachments. If extraction appears incomplete, attempt alternative methods before proceeding.

Workflow Steps

1. Scan Context for Attachments

At the start of every task, explicitly list all files provided in the context window or attachment panel.

Check for spreadsheets (.xlsx, .csv), documents (.pdf, .docx, .pptx), or data dumps (.json and similar).
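
The scan step can be sketched in Python as follows. This is a minimal sketch, assuming the attachments are available as files in a local directory; the function name `scan_attachments` and the category labels are hypothetical, and only the extensions named in this step are mapped.

```python
from pathlib import Path

# Extension-to-category map; covers only the formats named in this step.
# The category labels are illustrative, not part of any tool.
CATEGORIES = {
    ".xlsx": "spreadsheet",
    ".csv": "spreadsheet",
    ".pdf": "document",
    ".docx": "document",
    ".pptx": "document",
    ".json": "data dump",
}


def scan_attachments(directory):
    """Return a sorted list of (file name, category) for every file in `directory`."""
    found = []
    for path in sorted(Path(directory).iterdir()):
        if path.is_file():
            # Unknown extensions are still listed, so nothing is silently skipped.
            category = CATEGORIES.get(path.suffix.lower(), "other")
            found.append((path.name, category))
    return found
```

Listing files with an unrecognized extension under "other", rather than dropping them, keeps the inventory explicit before relevance is evaluated.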

Tool	Known Limitations	Fallback Strategy
`read_file`	May truncate at ~1000-5000 chars depending on format	Use `execute_code_sandbox` with format-specific library
`execute_code_sandbox`	May have missing dependencies or sandbox errors	Use `shell_agent` or `run_shell` for CLI tools
`shell_agent`	Slower, but more flexible with system tools	Use for `pdftotext`, `unzip` XML extraction, `in2csv`, etc.
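
The fallback order in the table can be expressed as a chain of extractors tried in sequence. The following is a minimal sketch, assuming each tool is wrapped as a Python callable that takes a path and returns text; `extract_with_fallback`, the `is_complete` validator, and the attempt log are hypothetical names introduced for illustration, not part of the tools listed above.

```python
def extract_with_fallback(path, extractors, is_complete):
    """Try each (name, extractor) pair in order.

    Returns (name, text, attempts) for the first extractor whose output passes
    `is_complete`, or (None, None, attempts) if every extractor fails.
    `attempts` records why each earlier extractor was rejected.
    """
    attempts = []
    for name, extractor in extractors:
        try:
            text = extractor(path)
        except Exception as exc:  # e.g. missing dependency or sandbox error
            attempts.append((name, f"error: {exc}"))
            continue
        if is_complete(text):
            return name, text, attempts
        attempts.append((name, "incomplete"))
    return None, None, attempts
```

The attempt log gives a record of which methods were tried and why they were rejected, which can feed the "Document What's Missing" step if no extractor succeeds.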

Resilient Context Extraction

Objective

Critical Rule

Workflow Steps

1. Scan Context for Attachments

2. Evaluate Relevance

3. Extract Data with Validation

4. Apply Fallback Extraction Strategies

4a. Alternative Library/Tool Approach

4b. Shell-Based Extraction

4c. Targeted Re-Reading

4d. Document What's Missing

5. Cite Source Explicitly

6. Fallback to Search (Last Resort Only)

Checklist

Example Usage

Warnings

Tool-Specific Notes

Recovery Protocol

Fallback extraction strategies for common file formats

Use when read_file produces incomplete results

Usage: ./extraction_fallback.sh <command> <file>

Commands: docx_raw, xlsx_csv, pdf_text, raw_strings
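
The contents of `extraction_fallback.sh` are not shown here, so the following is only a Python sketch of what the four commands might do, not the script itself. It assumes `docx_raw` reads `word/document.xml` from the .docx archive, `raw_strings` pulls printable ASCII runs from the raw bytes, and `xlsx_csv` and `pdf_text` delegate to the `in2csv` and `pdftotext` command-line tools if they are installed. The helper `run_cli` and the `main` function are hypothetical.

```python
import re
import shutil
import subprocess
import zipfile
from html import unescape


def docx_raw(path):
    """Read word/document.xml from the .docx zip and strip the XML tags."""
    with zipfile.ZipFile(path) as archive:
        xml = archive.read("word/document.xml").decode("utf-8", errors="replace")
    xml = re.sub(r"</w:p>", "\n", xml)  # keep paragraph breaks as newlines
    return unescape(re.sub(r"<[^>]+>", "", xml))


def raw_strings(path, min_len=4):
    """Return runs of printable ASCII, similar in spirit to the `strings` utility."""
    with open(path, "rb") as handle:
        data = handle.read()
    runs = re.findall(rb"[\x20-\x7e]{%d,}" % min_len, data)
    return "\n".join(run.decode("ascii") for run in runs)


def run_cli(tool, *args):
    """Run an external tool and return its stdout; raise if it is not installed."""
    if shutil.which(tool) is None:
        raise RuntimeError(f"{tool} is not installed")
    result = subprocess.run([tool, *args], capture_output=True, text=True, check=True)
    return result.stdout


# Command names follow the list above; the implementations are assumptions.
COMMANDS = {
    "docx_raw": docx_raw,
    "xlsx_csv": lambda path: run_cli("in2csv", path),
    "pdf_text": lambda path: run_cli("pdftotext", path, "-"),
    "raw_strings": raw_strings,
}


def main(argv):
    """Dispatch `<command> <file>` and return the extracted text."""
    if len(argv) != 2 or argv[0] not in COMMANDS:
        raise SystemExit("Usage: <command> <file>")
    return COMMANDS[argv[0]](argv[1])
```

Calling `print(main(sys.argv[1:]))` from a script entry point (after `import sys`) would give this the same command-line shape as the usage line above.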
