Evaluate, curate, and prepare images for upload to Wikimedia Commons. Use this skill whenever the user wants to upload photos to Wikimedia Commons, contribute images to Wikipedia, prepare stock photos for Commons, assess image quality for Commons, create Wikimedia description boxes, generate {{Information}} templates, rename images for Commons conventions, or categorize photos for Wikimedia. Also trigger when the user mentions "Commons upload", "Wikimedia", "Commons-tauglich", "Bilder hochladen", "Commons-Beschreibung", or has a batch of images they want to filter and prepare for contribution. This skill covers the full pipeline from raw image directory to upload-ready files with descriptions — not just one step.
Take a directory of images and upload them to Wikimedia Commons: technically vetted, visually reviewed, deduplicated, renamed, described, categorized, metadata-stripped, and uploaded via Pywikibot.
Ten steps. Steps 1-7 produce the upload-ready set. Steps 8-10 handle the actual upload.
Raw images
→ 1. Resolution check (drop sub-2MP)
→ 2. EXIF extraction (flag technical issues)
→ 3. Format duplicate pruning (.JPG vs .jpeg)
→ 4. Gather location context for image clusters
→ 5. Visual review + tier classification + near-duplicate flagging
→ 6. Resolve near-duplicates (pick best per group)
→ 7. Copy+rename to upload/, strip metadata, generate descriptions
→ 8. Dry-run preview (user confirmation gate)
→ 9. Upload to Commons via Pywikibot
→ 10. Post-upload verification + log
Use these defaults unless the user explicitly overrides them for a given session:
Username: Mike is Michi. If the user mentions a different username, license, or copyright situation, use that instead. Otherwise, proceed directly with these values. Do NOT ask for confirmation each time.
Do NOT ask for location context yet. That comes in step 4 after the visual landscape of the image set is known.
The cheapest filter. Run resolution extraction across all images first (sips, PIL, or exiftool — whichever is fastest on the platform). Drop anything below 2 megapixels.
This is a hard cut, not a flag. Sub-2MP images have no realistic Commons value.
Report: total images, how many dropped, how many survive to step 2.
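The hard cut can be sketched as follows. This is a minimal stdlib-only sketch that assumes width and height were already extracted (via sips, PIL, or exiftool); the filenames and dimensions are hypothetical.

```python
# 2-megapixel hard cut: anything below MIN_PIXELS is dropped, not flagged.
MIN_PIXELS = 2_000_000

def split_by_resolution(images):
    """images: iterable of (filename, width, height). Returns (keep, drop)."""
    keep, drop = [], []
    for name, w, h in images:
        (keep if w * h >= MIN_PIXELS else drop).append(name)
    return keep, drop

keep, drop = split_by_resolution([
    ("IMG_0001.JPG", 4032, 3024),  # 12.2 MP -> survives
    ("IMG_0002.JPG", 1280, 960),   # 1.2 MP  -> hard-dropped
])
print(f"kept {len(keep)}, dropped {len(drop)}")
```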
Run on survivors only. Extract in a single batch operation — not one subprocess per file.
Prefer exiftool -csv for batch extraction. It handles all fields in one pass and
outputs structured data. Fall back to sips+mdls or Python PIL only if exiftool is
unavailable.
| Metric | How to get it |
|---|---|
| Resolution (already have from step 1) | — |
| File size | stat / os.path.getsize |
| ISO | EXIF ISOSpeedRatings / kMDItemISOSpeed |
| Shutter speed | EXIF ExposureTime / kMDItemExposureTimeSeconds |
| Aperture | EXIF FNumber / kMDItemFNumber |
| Focal length | EXIF FocalLength / kMDItemFocalLength |
| Date taken | EXIF DateTimeOriginal / kMDItemContentCreationDate |
| Camera model | EXIF Model (used to set ISO threshold: phone vs camera) |
| Lens ID | EXIF LensModel (used to detect front camera — see step 2b) |
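A batch call such as `exiftool -csv -ISO -ExposureTime -FNumber -FocalLength -DateTimeOriginal -Model -LensModel *.JPG > exif.csv` yields one CSV for the whole directory, which the stdlib parses directly. In this sketch the sample rows are hand-written stand-ins for real exiftool output:

```python
import csv, io

# Hand-written stand-in for `exiftool -csv` output (not real camera data).
sample = """SourceFile,ISO,ExposureTime,FNumber,Model,LensModel
IMG_0302.JPG,800,1/60,1.8,iPhone 15 Pro,iPhone 15 Pro back triple camera
IMG_0303.JPG,3200,1/15,1.8,iPhone 15 Pro,iPhone 15 Pro back triple camera
"""
rows = list(csv.DictReader(io.StringIO(sample)))
by_file = {r["SourceFile"]: r for r in rows}  # index once, reuse in later steps
print(by_file["IMG_0303.JPG"]["ISO"])
```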
After extracting EXIF, check the LensModel field. On iPhones, the front-facing
camera has a distinct lens identifier (e.g., "iPhone 17 Pro front camera 2.22mm f/1.9"
vs rear lenses). Flag all front-camera images as likely selfies and report
them separately.
These are not auto-dropped, but they are excluded from visual review by default. Present the count and a sample filename to the user. If the user says some front-camera shots are worth reviewing, view those specifically. Otherwise skip them all in step 5.
This saves enormous context on batches where 20-40% of images are personal portraits.
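The filter itself is a substring check on LensModel. A hedged sketch; the lens strings below are illustrative examples, not guaranteed EXIF values, and matching on "front" is an assumption based on observed iPhone identifiers:

```python
# Flag likely selfies: any LensModel mentioning the front camera.
def is_front_camera(lens_model):
    return "front" in (lens_model or "").lower()

lens_by_file = {
    "IMG_0410.JPG": "iPhone 17 Pro front camera 2.22mm f/1.9",
    "IMG_0411.JPG": "iPhone 17 Pro back triple camera 6.765mm f/1.78",
    "IMG_0412.JPG": None,  # missing LensModel -> not flagged
}
selfies = sorted(f for f, lens in lens_by_file.items() if is_front_camera(lens))
print(f"{len(selfies)} likely selfies: {selfies}")
```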
These are flags shown to the user, not automatic rejections.
| Flag | Condition | Why |
|---|---|---|
| High ISO | > 1250 (phone sensor) / > 3200 (dedicated camera) | Noise |
| Slow shutter | > 1/30s handheld | Motion blur risk |
| Small file | < 500 KB for a multi-MP image | Over-compressed |
Determine phone vs camera from the EXIF camera model field. If unavailable, assume phone thresholds (more conservative).
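The three flag rules can be sketched as one function. The brand list used to detect a dedicated camera is a crude illustrative heuristic, not part of the pipeline spec; unknown models fall through to the conservative phone thresholds, as above.

```python
from fractions import Fraction

def technical_flags(iso, exposure, size_bytes, model=""):
    """Return step-2 warning flags for one image (thresholds from the table)."""
    # Assume phone thresholds unless the model names a known camera maker.
    dedicated = any(b in model for b in ("Canon", "Nikon", "Sony", "FUJIFILM"))
    flags = []
    if iso > (3200 if dedicated else 1250):
        flags.append("high ISO")
    if Fraction(exposure) > Fraction(1, 30):  # exposure strings like "1/15"
        flags.append("slow shutter")
    if size_bytes < 500 * 1024:
        flags.append("small file")
    return flags

print(technical_flags(3200, "1/15", 2_000_000, "iPhone 15 Pro"))
```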
Present a summary of the flagged images to the user.
Save the full report to technical_scan_results.md so it survives context resets.
If the image count is large (100+), suggest the user start a fresh context window before step 5.
Identify files with the same base name but different extensions (e.g., IMG_0302.JPG
and IMG_0302.jpeg). Keep the higher-resolution version, drop the other. Report which
pairs were found and which version was kept.
This is deterministic — no user input needed.
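A sketch of the pruning logic, assuming the dimensions from step 1 are at hand (the values below are illustrative):

```python
from collections import defaultdict
from pathlib import PurePath

# filename -> (width, height), as gathered in step 1
images = {
    "IMG_0302.JPG": (4032, 3024),
    "IMG_0302.jpeg": (2016, 1512),
    "IMG_0305.JPG": (4032, 3024),
}
groups = defaultdict(list)
for name, (w, h) in images.items():
    groups[PurePath(name).stem].append((w * h, name))  # group by base name

kept, dropped = [], []
for stem, variants in sorted(groups.items()):
    variants.sort(reverse=True)  # highest pixel count first
    kept.append(variants[0][1])
    dropped.extend(n for _, n in variants[1:])
print("kept:", kept, "| dropped:", dropped)
```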
Use EXIF GPS data to identify location clusters, then reverse-geocode each cluster center via the OpenStreetMap Nominatim API to get actual place names automatically.
Group images by date and GPS proximity. Images within ~0.005° (~500m) of each other belong to the same cluster. For each cluster, compute the center lat/lon.
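The proximity grouping can be sketched as a greedy single pass; this is a simplification (each cluster is anchored at its first point rather than a recomputed mean), and the coordinates are invented sample data:

```python
# Greedy GPS clustering using the ~0.005 degree proximity rule.
THRESH = 0.005

def cluster_by_gps(points):
    """points: list of (filename, lat, lon). Join the first cluster whose
    seed is within THRESH on both axes, else start a new cluster."""
    clusters = []
    for name, lat, lon in points:
        for c in clusters:
            if abs(c["lat"] - lat) < THRESH and abs(c["lon"] - lon) < THRESH:
                c["members"].append(name)
                break
        else:
            clusters.append({"lat": lat, "lon": lon, "members": [name]})
    return clusters

clusters = cluster_by_gps([
    ("a.JPG", 43.2965, 5.3698),
    ("b.JPG", 43.2967, 5.3701),  # a few tens of meters away -> same cluster
    ("c.JPG", 45.4642, 9.1900),  # different city -> new cluster
])
print(f"{len(clusters)} clusters")
```

A production pass would then average each cluster's member coordinates to get the center passed to Nominatim.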
Hit the Nominatim reverse endpoint for each cluster center. This eliminates guesswork and gives real neighborhood/street-level names.
import urllib.request, json, time

clusters = [("A", lat, lon), ...]  # from clustering step
for label, lat, lon in clusters:
    url = (f"https://nominatim.openstreetmap.org/reverse?"
           f"lat={lat}&lon={lon}&format=json&zoom=16&addressdetails=1")
    req = urllib.request.Request(url,
        headers={"User-Agent": "commons-upload-pipeline/1.0"})
    resp = urllib.request.urlopen(req)
    data = json.loads(resp.read())
    addr = data.get("address", {})
    suburb = addr.get("suburb") or addr.get("neighbourhood") or addr.get("quarter") or ""
    road = addr.get("road", "")
    city = addr.get("city") or addr.get("town") or ""
    print(f"Cluster {label}: {suburb}, {road}, {city}")
    time.sleep(1.1)  # Nominatim requires max 1 req/sec
Rate limit: Nominatim enforces 1 request per second. Always sleep 1.1s between calls. Set a descriptive User-Agent string.
Present a table of clusters with the resolved place names. Ask the user to confirm or correct. The goal: when you reach step 7 (descriptions), every image already has its location pinned down. No retroactive patching.
If any clusters lack GPS data entirely, ask the user for those locations manually. If the Nominatim result is too vague (e.g., just a highway name), note which clusters need refinement and ask during visual review when actual content is visible.
A single pass. View each surviving image, assess it, and immediately classify it. Do not separate "review" and "classification" into distinct steps — they're the same cognitive act.
| Criterion | What to look for |
|---|---|
| Subject matter | Is it identifiable? Would a Wikipedia article use it? |
| Composition | Clean framing, no distracting elements |
| Focus / sharpness | Is the subject in focus? |
| Lighting | Blown highlights, crushed shadows, harsh midday light |
| Obstructions | Cables, poles, fingers, watermarks, logos |
| People | Recognizable faces → model release concern. Flag for user. |
| Redundancy | Near-duplicate of another image in the batch — mark the group |
Present results as a table per tier: filename, subject description, notes on issues. For Tier 2 near-duplicate groups, indicate which images belong to the same group.
If any location clusters from step 4 were unresolved, ask now — you can see the actual content.
For each near-duplicate group flagged in step 5, recommend the single best image and list the rest as drops. Explain the choice briefly (sharper, better composition, less obstruction, etc.).
The user confirms before anything is finalized.
Final upload set = all Tier 1 + Tier 2 after deduplication.
One step, not three. For each image in the final set:
- copy it to upload/ with the new name in a single cp operation
- append its {{Information}} block into wikimedia_descriptions.txt
No intermediate "move then rename" — the file lands in upload/ with its final name directly.
After copying to upload/, strip non-photographic metadata embedded by phone apps.
The MOOD: STOCK app embeds junk in several EXIF/XMP fields: title set to
"MOOD: STOCK DIGI-N", UserComment filled with JSON theme config, Description fields
with app labels.
Step A — macOS extended attributes:
for f in upload/*.JPG upload/*.jpg; do
xattr -d com.apple.metadata:kMDItemComment "$f" 2>/dev/null
xattr -d com.apple.metadata:kMDItemDescription "$f" 2>/dev/null
done
Step B — EXIF/XMP fields via exiftool:
exiftool -overwrite_original \
-UserComment= \
-Description= \
-ImageDescription= \
-XMP-dc:Description= \
upload/*.JPG
Run both steps before generating descriptions so there's no confusion about what "description" means. This is mandatory, not optional — files uploaded with app metadata get flagged on Commons.
Pattern: [Subject] [Location] [Year].[ext]
IMG_0326.JPG → Wilder Kaiser massif panorama Tyrol 2026.JPG
IMG_0594.JPG → Naviglio Grande canal Milan 2026.JPG
IMG_0742.JPG → Port Hercule Monaco with yachts 2026.JPG
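The copy+rename can be sketched as below. The mapping is illustrative; real target names come out of the visual review in step 5, and the existence guard lets the sketch run even without the sample files present.

```python
import shutil
from pathlib import Path

# Hypothetical rename mapping following [Subject] [Location] [Year].[ext]
renames = {
    "IMG_0326.JPG": "Wilder Kaiser massif panorama Tyrol 2026.JPG",
    "IMG_0594.JPG": "Naviglio Grande canal Milan 2026.JPG",
}
dest = Path("upload")
dest.mkdir(exist_ok=True)
for old, new in renames.items():
    src = Path(old)
    if src.exists():
        shutil.copy2(src, dest / new)  # copy + final name in one operation
print("targets:", sorted(renames.values()))
```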
Before writing any descriptions, discover the real Commons category names for every subject and location in the upload set. Do not guess a category name and then merely check whether it exists; search first and pick from the results.
Method: Use the Commons API list=search in namespace 14 (categories) with keyword
queries. This returns actual category names with their real capitalization, punctuation,
and disambiguation patterns.
# One loop, all subjects at once. Run this BEFORE writing descriptions.
for q in "Notre-Dame+Garde+Marseille" "Vieux-Port+Marseille" "Cours+Julien+Marseille" \
"Invader+mosaics" "trams+Marseille"; do
echo "=== $q ==="
curl -s "https://commons.wikimedia.org/w/api.php?action=query&list=search\
&srsearch=${q}&srnamespace=14&srlimit=5&format=json" \
| python3 -c "
import sys, json
data = json.load(sys.stdin)
for r in data['query']['search']:
print(r['title'])"
done
Why search-first matters: Commons naming is unpredictable. "Notre-Dame de la Garde" does not exist — the real category is "Basilique Notre-Dame de la Garde." "Vieux-Port de Marseille" does not exist — it's "Vieux-Port (Marseille)." Even a guess like "Palais de la Bourse (Marseille)" misses: the real category uses a lowercase 'b' and 'à Marseille' instead of parentheses. You will not guess these correctly.
Do not use action=query&titles=Category:Guessed Name as the discovery step. That
only confirms or denies an exact string — useless when the real name differs from your
guess. Reserve exact-title checks for a final validation pass only, after you already
have names from search results.
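For that final validation pass, the response shape matters: with action=query&titles=..., a missing page comes back under a negative page id. A parsing sketch where `response` is a hand-written stand-in for the Commons API JSON, not a live result:

```python
import json

# Stand-in for `action=query&titles=Category:...&format=json` output.
response = json.loads("""{
  "query": {"pages": {
    "-1":    {"title": "Category:Guessed Name", "missing": ""},
    "12345": {"title": "Category:Vieux-Port (Marseille)"}
  }}
}""")
exists = {p["title"]: int(pid) > 0
          for pid, p in response["query"]["pages"].items()}
print(exists)
```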
Collect all verified category names into a lookup list, then reference that list when
writing each {{Information}} block.
Minimum categories: Aim for at least 3 verified categories per image. Use multiple search queries per subject if the first query doesn't yield enough. Search for the subject (species, building, artwork), the specific location (neighborhood, park, museum), and the broader geographic area (city, arrondissement, canton). One real category is better than a guessed one, but most images should hit 3.
Generate wikimedia_descriptions.txt following the exact format in
references/information-template.txt. One {{Information}} block per image, bilingual
descriptions (English + German), {{Taken on}} date template with location, and
categories drawn from the verified lookup list above.
Before any actual upload, run the upload script in dry-run mode so the user can review what will happen.
cd upload/
python upload_to_commons.py --dry-run
This prints each filename, file size, and a description preview. Present the output to the user and wait for explicit confirmation before proceeding to step 9.
This is a hard gate. Do not proceed to actual upload without user confirmation.
Before the first upload, check that a .venv with pywikibot exists in the working
directory. If not, create it:
python3 -m venv .venv
.venv/bin/pip install pywikibot
Pywikibot looks for config files in the current working directory. Since the upload
script runs from upload/, place both config files inside the upload/ directory.
This is the single most common setup error — putting them in the parent directory
will cause a "username undefined" crash.
Check for user-config.py and user-password.py in the upload/ directory. If they
don't exist, generate them there:
user-config.py:
family = 'commons'
mylang = 'commons'
usernames['commons']['commons'] = 'Mike is Michi'
password_file = 'user-password.py'
user-password.py:
('Mike is Michi', BotPassword('commons-upload', 'PASTE_BOT_PASSWORD_HERE'))
For the bot password, check the memory file reference_commons_credentials.md. Never
hardcode credentials into skill files or commit them.
cd upload/
.venv/bin/python upload_to_commons.py --delay 5
The upload script is located at
~/.claude/skills/commons-upload/scripts/upload_to_commons.py. Copy it to the
upload/ directory before running, or invoke it with its full path.
Flags:
- --dry-run — preview without uploading
- --file "pattern1" "pattern2" — upload only matching filenames
- --delay N — seconds between uploads (default 5)
- --overwrite — re-upload files that already exist on Commons (useful after stripping metadata from previously uploaded files)
- --no-verify — skip post-upload verification
The script writes upload_log.txt with timestamps, filenames, status, and Commons URLs.
The upload script runs verification automatically after uploads complete (unless
--no-verify was passed). It checks each file via the Commons API to confirm it exists on the wiki.
Propagation delay: Commons may take 30-60 seconds to index newly uploaded files. The script's built-in verification often runs too early and reports false "MISSING" results. If the script reports all files as missing immediately after a successful upload batch, don't trust it. Instead, wait 30 seconds and run a manual spot-check on 2-3 files via the API:
curl -s "https://commons.wikimedia.org/w/api.php?action=query\
&titles=File:Example+Name.JPG&format=json" | python3 -c "
import sys, json
pages = json.load(sys.stdin)['query']['pages']
for pid, p in pages.items():
print('EXISTS' if int(pid) > 0 else 'MISSING', p['title'])"
If spot-checks confirm existence, the batch is fine. Only investigate individual files that still show as missing after 60+ seconds.
Review upload_log.txt and present a summary: total uploaded, skipped, failed, and
links to the Commons file pages.
The complete end-to-end flow when the user invokes this skill:
1. Resolution check — drop sub-2MP
2. EXIF extraction — flag technical issues
2b. Front-camera filter — flag likely selfies, skip in visual review
3. Format duplicate pruning — keep best version per base name
4. Gather location context — reverse geocode via Nominatim
5. Visual review + tiers — classify (skip selfies + sample personal clusters)
6. Resolve near-duplicates — pick best per group
7. Copy+rename+describe — upload/ folder with descriptions
7a. Strip metadata — exiftool + xattr cleanup
7b. Generate descriptions — wikimedia_descriptions.txt (3+ categories each)
8. Dry-run preview — user reviews before upload
9. Upload to Commons — venv + pywikibot + bot password (config in upload/)
10. Post-upload verification — API check (wait for propagation) + log review
Steps 1-7 produce the upload-ready set. Steps 8-10 handle the actual upload. The user must confirm between step 8 and step 9.
To replace a file that already exists on Commons, re-run with the --overwrite flag. The script will re-upload even if the file already exists on Commons.
For failed uploads, check upload_log.txt for specifics. Common causes: network timeout, file too large, bot password expired. Re-run with --file "failed_name*" to retry specific files.