Detect image duplication and tampering in manuscript figures using computer vision algorithms
ID: 195
Uses computer vision (CV) algorithms to scan every image in a paper manuscript and flag potential duplication or local tampering (traces of Photoshop-style editing).
# Scan single PDF file
python scripts/main.py --input paper.pdf --output report.json
# Scan image folder
python scripts/main.py --input ./images/ --output report.json
# Specify similarity threshold (default 0.85)
python scripts/main.py --input paper.pdf --threshold 0.90 --output report.json
# Enable tampering detection
python scripts/main.py --input paper.pdf --detect-tampering --output report.json
# Generate visualization report
python scripts/main.py --input paper.pdf --visualize --output report.json
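The `--threshold` option is a similarity score between 0 and 1, but the source does not say how that score is computed. As one plausible illustration (an assumption, not the tool's confirmed method), a perceptual average hash can be turned into such a score by taking 1 minus the normalized Hamming distance between two hashes. The sketch below uses only the standard library and works on grayscale pixel grids given as nested lists; `average_hash`, `similarity`, and `is_duplicate` are hypothetical names introduced for this example.

```python
from typing import List, Sequence

Gray = Sequence[Sequence[int]]  # rows of grayscale pixel values, 0-255


def average_hash(pixels: Gray) -> List[int]:
    """Return one bit per pixel: 1 if the pixel is brighter than the mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return [1 if value > mean else 0 for value in flat]


def similarity(hash_a: List[int], hash_b: List[int]) -> float:
    """Similarity in [0, 1]: 1 minus the normalized Hamming distance."""
    if len(hash_a) != len(hash_b):
        raise ValueError("hashes must have the same length")
    differing = sum(a != b for a, b in zip(hash_a, hash_b))
    return 1.0 - differing / len(hash_a)


def is_duplicate(a: Gray, b: Gray, threshold: float = 0.85) -> bool:
    """Flag a pair when its similarity reaches the threshold (assumed rule)."""
    return similarity(average_hash(a), average_hash(b)) >= threshold


# Tiny 4x4 "images" for illustration only.
original = [
    [10, 200, 30, 220],
    [15, 210, 25, 230],
    [12, 205, 35, 225],
    [18, 215, 28, 235],
]
# Same picture with every pixel raised by 10 (a uniform brightness change).
brighter = [[value + 10 for value in row] for row in original]
# Photographic negative of the original.
inverted = [[255 - value for value in row] for row in original]

print(is_duplicate(original, brighter))  # brightness shift keeps the hash
print(is_duplicate(original, inverted))  # inversion flips every bit
```

Under this scoring, raising the threshold flags fewer pairs, which matches the "higher is stricter" behavior described for `--threshold`.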
| Parameter | Type | Default | Required | Description |
|---|---|---|---|---|
| `--input` | string | - | Yes | Input PDF file or image folder path |
| `--output` | string | report.json | No | Output report path |
| `--threshold` | float | 0.85 | No | Similarity threshold (0-1), higher is stricter |
| `--detect-tampering` | flag | false | No | Enable tampering/PS trace detection |
| `--visualize` | flag | false | No | Generate visualization comparison images |
| `--temp-dir` | string | ./temp | No | Temporary file directory |
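For readers who want to see how these parameters fit together, here is a minimal sketch of an `argparse` declaration that matches the table. The actual parser in `scripts/main.py` is not shown in the source, so `build_parser` and `unit_interval` are hypothetical names, and the range check on the threshold is an assumption based on the documented 0-1 range.

```python
import argparse


def unit_interval(text: str) -> float:
    """Parse a float and reject values outside the documented 0-1 range."""
    value = float(text)
    if not 0.0 <= value <= 1.0:
        raise argparse.ArgumentTypeError("threshold must be between 0 and 1")
    return value


def build_parser() -> argparse.ArgumentParser:
    """Declare the options listed in the parameter table (illustrative only)."""
    parser = argparse.ArgumentParser(
        description="Detect image duplication and tampering in manuscript figures"
    )
    parser.add_argument("--input", required=True,
                        help="Input PDF file or image folder path")
    parser.add_argument("--output", default="report.json",
                        help="Output report path")
    parser.add_argument("--threshold", type=unit_interval, default=0.85,
                        help="Similarity threshold (0-1), higher is stricter")
    parser.add_argument("--detect-tampering", action="store_true",
                        help="Enable tampering/PS trace detection")
    parser.add_argument("--visualize", action="store_true",
                        help="Generate visualization comparison images")
    parser.add_argument("--temp-dir", default="./temp",
                        help="Temporary file directory")
    return parser


# Parse an explicit argument list so the example runs without a real CLI call.
args = build_parser().parse_args(["--input", "paper.pdf", "--threshold", "0.90"])
print(args.input, args.threshold, args.detect_tampering, args.output)
```

Flags declared with `action="store_true"` default to `False`, which lines up with the `false` defaults in the table.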
{
  "summary": {
    "total_images": 12,
    "duplicates_found": 2,
    "tampering_detected": 1,
    "processing_time": "3.5s"
  },
  "duplicates": [
    {
      "group_id": 1,
      "similarity": 0.98,
      "images": [
        {"page": 2, "index": 1, "path": "..."},
        {"page": 5, "index": 3, "path": "..."}
      ]
    }
  ],
  "tampering": [
    {
      "image": "page_3_img_2.png",
      "suspicious_regions": [
        {"x": 120, "y": 80, "width": 50, "height": 50, "confidence": 0.92}
      ]
    }
  ]
}
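A report with this shape is straightforward to post-process. The sketch below turns the duplicate groups and suspicious regions into one-line findings; it assumes only the keys shown in the example report, and `summarize_report` and its `min_confidence` cutoff are hypothetical additions, not part of the tool.

```python
import json
from typing import Any, Dict, List


def summarize_report(report: Dict[str, Any], min_confidence: float = 0.9) -> List[str]:
    """Build human-readable findings from a report dictionary."""
    findings: List[str] = []
    for group in report.get("duplicates", []):
        locations = ", ".join(
            f"page {image['page']} image {image['index']}"
            for image in group["images"]
        )
        findings.append(
            f"duplicate group {group['group_id']} "
            f"(similarity {group['similarity']:.2f}): {locations}"
        )
    for entry in report.get("tampering", []):
        for region in entry["suspicious_regions"]:
            # Skip low-confidence regions; the cutoff is an illustrative choice.
            if region["confidence"] >= min_confidence:
                findings.append(
                    f"{entry['image']}: suspicious {region['width']}x{region['height']} "
                    f"region at ({region['x']}, {region['y']}), "
                    f"confidence {region['confidence']:.2f}"
                )
    return findings


# In practice the report would come from a file, e.g.:
#     with open("report.json") as handle:
#         report = json.load(handle)
sample = json.loads("""
{
  "duplicates": [
    {"group_id": 1, "similarity": 0.98,
     "images": [{"page": 2, "index": 1, "path": "..."},
                {"page": 5, "index": 3, "path": "..."}]}
  ],
  "tampering": [
    {"image": "page_3_img_2.png",
     "suspicious_regions": [
       {"x": 120, "y": 80, "width": 50, "height": 50, "confidence": 0.92}]}
  ]
}
""")

findings = summarize_report(sample)
for line in findings:
    print(line)
```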
opencv-python>=4.8.0
numpy>=1.24.0
Pillow>=10.0.0
PyPDF2>=3.0.0
pdf2image>=1.16.0
imagehash>=4.3.0
scikit-image>=0.21.0
matplotlib>=3.7.0
from scripts.main import ImageDuplicationDetector

detector = ImageDuplicationDetector(
    threshold=0.85,
    detect_tampering=True
)

results = detector.scan("paper.pdf")
detector.save_report(results, "report.json")
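When several manuscripts need checking, the same API can be wrapped in a small batch helper. The sketch below relies only on the `scan(path)` method shown above; `scan_many` and the `Scanner` protocol are hypothetical names, and a stub detector stands in for `ImageDuplicationDetector` so the example runs on its own.

```python
from typing import Any, Dict, Iterable, Protocol


class Scanner(Protocol):
    """Anything exposing scan(path), such as the detector shown above."""

    def scan(self, path: str) -> Any: ...


def scan_many(detector: Scanner, paths: Iterable[str]) -> Dict[str, Any]:
    """Scan each path and keep going if one input fails."""
    results: Dict[str, Any] = {}
    for path in paths:
        try:
            results[path] = detector.scan(path)
        except Exception as exc:
            # Record the failure instead of aborting the whole batch.
            results[path] = {"error": str(exc)}
    return results


class StubDetector:
    """Stand-in for the real detector; returns an empty summary."""

    def scan(self, path: str) -> Dict[str, Any]:
        if path == "missing.pdf":
            raise FileNotFoundError(path)
        return {"summary": {"total_images": 0}}


batch = scan_many(StubDetector(), ["a.pdf", "missing.pdf"])
print(batch)
```

With the real class, the call would presumably be `scan_many(ImageDuplicationDetector(threshold=0.85), paths)`, assuming `scan` raises an exception on unreadable input, which the source does not confirm.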
| Risk Indicator | Assessment | Level |
|---|---|---|
| Code Execution | Python scripts executed locally | Medium |
| Network Access | No external API calls | Low |
| File System Access | Read input files, write output files | Medium |
| Instruction Tampering | Standard prompt guidelines | Low |
| Data Exposure | Output files saved to workspace | Low |
# Python dependencies
pip install -r requirements.txt