Name: Ocr Pic
Author: Maggot4703

pip install pypdf Pillow pytesseract numpy

|<-- 100px colour band -->|<-- 100px white text area -->|
  NAVY   = cyan (R≈0, G≈255, B≈255)
  MARINE = red  (R≈255, G≈0,   B≈0)
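
The band colours above suggest a simple way to split a counter and detect its type. The following is a minimal sketch, not the reference script's actual logic: it assumes the band and text area meet at mid-width, and the names `classify_band`, `split_counter`, `detect_type` and the `max_distance` threshold are hypothetical.

```python
import math

# Reference band colours taken from the counter description above.
REFERENCE_COLOURS = {
    "NAVY": (0, 255, 255),   # cyan
    "MARINE": (255, 0, 0),   # red
}


def classify_band(mean_rgb, max_distance=120.0):
    """Return the counter type whose reference colour is nearest to mean_rgb.

    Returns "UNKNOWN" when no reference colour lies within max_distance
    (an assumed threshold, measured as Euclidean distance in RGB space).
    """
    best_kind, best_dist = "UNKNOWN", float("inf")
    for kind, reference in REFERENCE_COLOURS.items():
        dist = math.dist(mean_rgb, reference)
        if dist < best_dist:
            best_kind, best_dist = kind, dist
    return best_kind if best_dist <= max_distance else "UNKNOWN"


def split_counter(image):
    """Split a PIL image at mid-width into (colour band, text area)."""
    width, height = image.size
    mid = width // 2
    band = image.crop((0, 0, mid, height))
    text_area = image.crop((mid, 0, width, height))
    return band, text_area


def detect_type(image):
    """Classify a counter image by the mean colour of its left band."""
    import numpy as np  # imported lazily so classify_band works without NumPy

    band, _ = split_counter(image.convert("RGB"))
    mean_rgb = np.asarray(band, dtype=float).reshape(-1, 3).mean(axis=0)
    return classify_band(tuple(mean_rgb))
```

The right-hand crop returned by `split_counter` is the part that would be passed on to OCR; a near-white or mixed band falls through to `"UNKNOWN"` rather than being forced into one of the two types.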

python ocr-pic.py --pdf file.pdf --pages 37 38 --output-dir /tmp/out
python ocr-pic.py --pdf file.pdf --pages 37-40
python ocr-pic.py --pdf file.pdf --pages 37,39-41 --output-dir /tmp/out

python ocr-pic.py <pdf_path> <page1> [page2 ...] <output_dir>

python ocr-pic.py
# → Tk file-chooser opens for PDF selection
# → Terminal prompt for pages (accepts "37", "37,38", or "37-40")
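
The page formats accepted above (single pages, comma lists, and ranges) can be expanded with a small parser. This is an illustrative sketch only: `parse_pages` is a hypothetical helper, and sorting and de-duplicating the result is an assumption, not confirmed behaviour of `ocr-pic.py`.

```python
def parse_pages(tokens):
    """Expand page specs into a sorted list of unique 1-based page numbers.

    Each token may be a single page ("37"), a comma list ("37,38"),
    a range ("37-40"), or a mix of these ("37,39-41").
    """
    pages = set()
    for token in tokens:
        for part in token.split(","):
            part = part.strip()
            if not part:
                continue  # tolerate stray commas such as "37,"
            if "-" in part:
                start_text, end_text = part.split("-", 1)
                start, end = int(start_text), int(end_text)
                if start > end:
                    raise ValueError(f"descending range: {part!r}")
                pages.update(range(start, end + 1))
            else:
                pages.add(int(part))
    if any(page < 1 for page in pages):
        raise ValueError("page numbers are 1-based")
    return sorted(pages)
```

For example, `parse_pages(["37", "38"])` covers the space-separated form, while `parse_pages(["37,39-41"])` covers the mixed form.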

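import io

from PIL import Image
from pypdf import PdfReader

# Example inputs; replace with your own PDF path and 1-based page number.
pdf_path = "file.pdf"
page_number = 37

reader = PdfReader(pdf_path)
# reader.pages is 0-indexed, so subtract 1 from the 1-based page number.
imgs = list(reader.pages[page_number - 1].images)
for img_obj in imgs:
    # img_obj.data holds the embedded image bytes; Pillow decodes them in memory.
    raw = Image.open(io.BytesIO(img_obj.data))
    print(f"{img_obj.name}: {raw.size} px")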

# Page 37

--- R2C2 | NAVY | X5.png ---
  Unit ID: 0304 Chronor
  Ship Code: Co01
  ...
  TL: TL 13

--- R3C3 | NAVY | X9.png ---
  ...
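
The report layout shown above can be produced with two small formatting helpers. This is a sketch under assumptions: `format_counter` and `format_page` are hypothetical names, and the position label (for example `R2C2`) is treated as an opaque string supplied by the caller.

```python
def format_counter(position, kind, image_name, fields):
    """Format one counter block: a header line, then indented label/value lines.

    fields is a sequence of (label, value) pairs in display order.
    """
    lines = [f"--- {position} | {kind} | {image_name} ---"]
    lines += [f"  {label}: {value}" for label, value in fields]
    return "\n".join(lines)


def format_page(page_number, counters):
    """Format a page heading followed by its counter blocks.

    counters is a sequence of (position, kind, image_name, fields) tuples;
    blocks are separated by one blank line.
    """
    blocks = [f"# Page {page_number}"]
    blocks += [format_counter(*counter) for counter in counters]
    return "\n\n".join(blocks) + "\n"
```

Passing the fields as ordered pairs rather than a dictionary keeps repeated labels and the original line order intact.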

Line	Label	Example
1	Unit ID	`0304 Chronor`
2	Ship Code	`C001` (1 letter + 3 digits)
3	Squadron Type	`Cruiser`
4	Jump	`J-2`
5	Streamlining	`USL` (CAPS)
6	DF	`DF 7`
7	AF	`AF 3`
8	BF	`BF 0`
9	TF	`TF 7`
10	TL	`TL 13`

Line	Label	Example
1	Unit ID	`0304 Chronor`
2	Code	`0003` (4 digits)
3	Size	`2-Regiment`
4	Quality	`1-Elite`
5	Mobility	`1-Armoured`
6	Mobility	`GRAV` or blank
7	Type	`0-Infantry`
8	CF	`CF 20`
9	TF	`TF 5`
10	TL	`TL 13`
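
The two layouts above map each OCR line to a label purely by position, which can be sketched as follows. The names `LAYOUTS` and `label_lines` are hypothetical, and the sketch assumes the caller preserves blank lines (such as an empty MARINE line 6) so that positions do not shift.

```python
NAVY_LABELS = [
    "Unit ID", "Ship Code", "Squadron Type", "Jump", "Streamlining",
    "DF", "AF", "BF", "TF", "TL",
]
MARINE_LABELS = [
    "Unit ID", "Code", "Size", "Quality", "Mobility",
    "Mobility", "Type", "CF", "TF", "TL",
]
LAYOUTS = {"NAVY": NAVY_LABELS, "MARINE": MARINE_LABELS}


def label_lines(kind, ocr_lines):
    """Pair each OCR line with its label according to the counter type.

    Missing trailing lines are padded with empty strings and surplus
    lines are dropped, so the result always has one pair per label.
    """
    labels = LAYOUTS[kind]
    padded = list(ocr_lines[:len(labels)])
    padded += [""] * (len(labels) - len(padded))
    return [(label, line.strip()) for label, line in zip(labels, padded)]
```

A list of pairs is used instead of a dictionary because the MARINE layout lists the `Mobility` label twice.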

Field	Raw OCR	Corrected	Reason
DF, TF	`DF ?`	`DF 7`	trailing `?` = misread `7`
BF	`BF O`	`BF 0`	trailing `O` = misread `0`
TL	`Tl 13`	`TL 13`	`Tl`/`TI` prefix → `TL` (cosmetic)
TL	`TI if`	`TL`	`if` = unrecoverable — value stripped

OCR output	Actual	Position
`?`	`7`	DF, TF number
`O`	`0`	BF number
`Tl` / `TI`	`TL`	TL label prefix
`o`	`0`	Ship Code digits (e.g. `Co01` = `C001`)
`1?`	`12`	TL trailing digit (verify against PDF)
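
The corrections in the two tables above can be expressed as per-field rules. What follows is a hedged sketch rather than the reference script's code: `correct_field` is a hypothetical helper, and it only flags the `1?` case for review instead of rewriting it, since the table asks for that value to be verified against the PDF.

```python
import re


def correct_field(label, raw):
    """Apply the confirmed OCR fixes to one field.

    Returns (value, needs_review); needs_review is True when the value
    was unrecoverable or still contains an unresolved "?".
    """
    value = raw.strip()
    review = False
    if label in ("DF", "TF"):
        value = re.sub(r"\?$", "7", value)        # trailing "?" is a misread 7
    elif label == "BF":
        value = re.sub(r"O$", "0", value)         # trailing "O" is a misread 0
    elif label == "Ship Code":
        # Keep the leading letter; "o" among the digits is a misread 0.
        value = value[:1] + value[1:].replace("o", "0")
    elif label == "TL":
        match = re.match(r"^(?:TL|Tl|TI)\s*(.*)$", value)
        if match:
            number = match.group(1)
            if re.fullmatch(r"\d+", number):
                value = f"TL {number}"            # prefix normalised to TL
            elif "?" in number:
                value = f"TL {number}"            # left as-is, flagged below
            else:
                value, review = "TL", True        # unrecoverable value stripped
    # Any "?" that survives correction is residual noise worth flagging.
    return value, review or "?" in value
```

For instance, `correct_field("TL", "TI if")` yields `("TL", True)`, keeping the label while marking the counter for a manual check.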

Ocr Pic

ocr-pic Skill

Purpose

When to Use

Dependencies

Counter Structure

Field Layouts

NAVY (10 lines)

MARINE (10 lines)

Usage

Named flags (recommended)

Legacy positional (backward-compatible)

Fully interactive (no arguments)

Procedure

1. Confirm inputs

2. Inspect images on the page (read-only check)

3. View a raw image

4. Preprocessing (right-half crop)

5. Type detection

6. Post-processing corrections

7. Output format

8. Flag residual noise

Known Font Confusions (confirmed)

Reference Script
