Name: Pdf Processing
Author: Void0312Aurora

Preferred workflow:

Extract full text into a stable workspace artifact such as artifacts/raw_text.txt.
Recover figure-adjacent context into a smaller artifact such as artifacts/figure_context.txt.
Render the figure-relevant pages into image artifacts under artifacts/page_previews/.
Crop or isolate the target figure into a stable reference image such as artifacts/figure_reference.png.
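
The figure-adjacent context step can be sketched with the standard library alone. This is a minimal sketch, assuming the paper refers to figures as "Fig. N" or "Figure N"; the function name figure_context and the window parameter are hypothetical, not part of any existing tool.

```python
import re


def figure_context(raw_text: str, figure_number: int, window: int = 2) -> str:
    """Return every line that mentions the figure, plus `window` lines on each side."""
    lines = raw_text.splitlines()
    # Assumed reference styles: "Fig. 3", "Fig 3", "Figure 3", "Figure 3a".
    # The negative lookahead keeps "Figure 31" from matching figure 3.
    pattern = re.compile(
        rf"\bfig(?:ure|\.)?\s*{figure_number}(?!\d)", re.IGNORECASE
    )
    keep = set()
    for i, line in enumerate(lines):
        if pattern.search(line):
            start = max(0, i - window)
            stop = min(len(lines), i + window + 1)
            keep.update(range(start, stop))
    # Emit the kept lines in their original order.
    return "\n".join(lines[i] for i in sorted(keep))
```

Read artifacts/raw_text.txt, pass its contents to figure_context, and write the result to artifacts/figure_context.txt so the smaller artifact stays traceable to the full text.
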
Turn recovered evidence into explicit intermediate files:
- caption.txt
- equations.txt or equations.md
- parameters.json
- plot_semantics.json
- figure_reference_context.json
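
Writing the intermediate files listed above could look like the following. This is a sketch under stated assumptions: the helper name write_evidence and the shape of each JSON payload are hypothetical, since the workflow names the files but does not fix a schema.

```python
import json
from pathlib import Path


def write_evidence(out_dir, caption, equations, parameters,
                   plot_semantics, reference_context):
    """Persist recovered evidence as the explicit files named in the workflow."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    # Plain-text artifacts: one caption, one equation per paragraph.
    (out / "caption.txt").write_text(caption.strip() + "\n", encoding="utf-8")
    (out / "equations.md").write_text("\n\n".join(equations) + "\n", encoding="utf-8")

    # JSON artifacts: sorted keys give stable diffs between runs.
    json_files = {
        "parameters.json": parameters,
        "plot_semantics.json": plot_semantics,
        "figure_reference_context.json": reference_context,
    }
    for name, payload in json_files.items():
        text = json.dumps(payload, indent=2, sort_keys=True) + "\n"
        (out / name).write_text(text, encoding="utf-8")

    return sorted(p.name for p in out.iterdir())
```

Keeping each kind of evidence in its own file lets a reviewer check the caption, equations, and parameters independently of the plotting code.
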
If direct text extraction is weak (for example, on scanned pages or when the output is garbled), use OCR on rendered pages or cropped figure regions and persist the OCR outputs under artifacts/ocr/.
Preserve the exact commands and intermediate files so that a reviewer can audit provenance.
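
One way to preserve exact commands is to run every external tool through a small wrapper that appends a record to a log. This is a minimal sketch; the helper name run_logged and the log path artifacts/commands.jsonl are assumptions, not an established convention.

```python
import json
import subprocess
from datetime import datetime, timezone
from pathlib import Path


def run_logged(cmd, log_path="artifacts/commands.jsonl"):
    """Run `cmd` (a list of strings) and append a provenance record as one JSON line."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    record = {
        "time": datetime.now(timezone.utc).isoformat(),
        "cmd": cmd,
        "returncode": result.returncode,
        # Keep only the tail of stderr so the log stays small.
        "stderr_tail": result.stderr[-500:],
    }
    log = Path(log_path)
    log.parent.mkdir(parents=True, exist_ok=True)
    with log.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")
    return result
```

For example, assuming the Poppler command-line tools are installed and paper.pdf is a placeholder input name, run_logged(["pdftotext", "-layout", "paper.pdf", "artifacts/raw_text.txt"]) extracts the text and records the call. If the tool is not on PATH, subprocess.run raises FileNotFoundError before anything is logged.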

For paper figure reproduction, visual evidence is mandatory.
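
A check like the following could enforce that requirement before reproduction starts. This is a sketch that assumes the artifact layout named in the workflow; the function name missing_visual_evidence and the accepted image suffixes are hypothetical.

```python
from pathlib import Path

# Assumed image formats for rendered page previews.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def missing_visual_evidence(artifacts_dir="artifacts"):
    """Return a list of missing visual artifacts; an empty list means all are present."""
    root = Path(artifacts_dir)
    missing = []

    previews = root / "page_previews"
    has_preview = previews.is_dir() and any(
        p.suffix.lower() in IMAGE_SUFFIXES for p in previews.iterdir()
    )
    if not has_preview:
        missing.append("page_previews/ (no rendered page images)")

    reference = root / "figure_reference.png"
    # An empty file does not count as evidence.
    if not reference.is_file() or reference.stat().st_size == 0:
        missing.append("figure_reference.png")

    return missing
```

Treat a non-empty result as a hard stop: it means the reproduction would rest on extracted text alone.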
