Collect company intelligence for sourcing or research by automating Google Chrome against LinkedIn, company websites, and Google Maps. Use when Codex needs to gather non-China companies for one or more industry keywords, extract a strict set of company profile fields, find contact email addresses with source attribution, enforce per-keyword quotas, and keep long-running collection work observable with OpenClaw heartbeat monitoring.
Collect company profiles in a repeatable way when the user wants lead discovery or market scanning: LinkedIn first, then enrichment from official websites and Google Maps.
Read references/output-schema.md before collecting. Read references/heartbeat-and-browser.md before opening the browser or starting a long run.
Run run_linkedin_company_scout.py when the task is a straight collection run and the environment is macOS with Google Chrome plus Selenium available.
Run run_full_pipeline.py when the user wants the full automation chain.
Defaults:
- Collect 5 companies per keyword unless the user specifies another number.
- Enable heartbeat monitoring before long runs with `openclaw system heartbeat enable`.
Use this command for the common case:

```bash
python3 /Users/m1/.codex/skills/linkedin-company-scout/scripts/run_linkedin_company_scout.py \
  --keywords "industrial design,hardware design,smart wearable" \
  --count 5 \
  --output-dir /Users/m1/Documents/Playground/linkedin-company-scout-output
```
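After a run finishes, the per-keyword quota can be sanity-checked from the JSON results. A minimal sketch, assuming the scout writes a JSON array of records that each carry a `keyword` field (per the output schema); the function name is illustrative:

```python
import json
from collections import Counter
from pathlib import Path

def keyword_counts(results_path, expected_per_keyword=5):
    """Count collected companies per keyword and flag shortfalls.

    Assumes the results file is a JSON array of records, each with a
    'keyword' field as described in references/output-schema.md.
    Returns (counts, shortfalls) where shortfalls maps any keyword
    that came in under quota to its actual count.
    """
    records = json.loads(Path(results_path).read_text(encoding="utf-8"))
    counts = Counter(rec["keyword"] for rec in records)
    short = {kw: n for kw, n in counts.items() if n < expected_per_keyword}
    return counts, short
```

Run it against `linkedin_company_scout_results.json` in the output directory; a non-empty shortfall dict means a keyword missed its quota and the run may need more search pages.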
Use this command when the user wants deep collection, such as 100 companies per keyword with pagination:

```bash
python3 /Users/m1/.codex/skills/linkedin-company-scout/scripts/run_linkedin_company_scout.py \
  --keywords "industrial design,hardware design,smart wearable" \
  --count 100 \
  --max-search-pages 20 \
  --output-dir /Users/m1/Documents/Playground/linkedin-company-scout-output-100
```
Useful flags:
- `--no-heartbeat`: skip OpenClaw heartbeat enablement.
- `--chrome-profile-dir <path>`: keep a dedicated Chrome profile with a persistent LinkedIn login.
- `--debug-port <port>`: change the Chrome debugging port if 9222 is occupied.
- `--linkedin-wait-seconds <n>`: allow more time for manual login.
- `--max-search-pages <n>`: scan additional LinkedIn result pages when one page is not enough.

Use this one command when the user asks for the complete execution flow:
```bash
python3 /Users/m1/.codex/skills/linkedin-company-scout/scripts/run_full_pipeline.py \
  --keywords "industrial design" \
  --count 200 \
  --output-dir /Users/m1/Documents/Playground/linkedin-company-scout-output-industrial-200-verified \
  --db-path /Users/m1/Documents/Playground/linkedin-company-scout-output-industrial-200-verified/results.db \
  --max-search-pages 400 \
  --no-heartbeat \
  --send-email \
  --send-backend imap-smtp-email \
  --smtp-password "<SMTP_PASSWORD>" \
  --refresh-dashboard
```
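Once the pipeline has written `results.db`, send status can be spot-checked directly with Python's `sqlite3`. A sketch under stated assumptions: the table name (`companies`) and `status` column are illustrative, not confirmed by the skill, so adjust them to the actual schema:

```python
import sqlite3

def status_summary(db_path, table="companies"):
    """Group records in the pipeline DB by send status (e.g. 'sent').

    The table name and 'status' column are assumptions; change them to
    match the real schema inside results.db before relying on this.
    """
    with sqlite3.connect(db_path) as conn:
        rows = conn.execute(
            f"SELECT status, COUNT(*) FROM {table} GROUP BY status"
        ).fetchall()
    return dict(rows)
```

This gives a quick count of how many records a repeat campaign would skip, since previously sent records are excluded from resends by default.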
Key behavior:
- Sends are deduplicated: records with status='sent' in prior campaigns are skipped unless you pass `--allow-resend`.
- Emails use the general-purpose template (通用模版) in the template DB.
- `--refresh-dashboard` regenerates `linkedin-dashboard/dashboard-data.js`.

Expected outputs:
- `linkedin_company_scout_results.json`
- `linkedin_company_scout_results.csv`
- `run_metadata.json`

Email collection rules:
- Search these website pages for addresses: Contact, About, Footer, Legal, Imprint, Support, Team.
- Accept `hello@`, `info@`, `contact@`, `support@`, or a departmental email clearly relevant to external inquiries.
- Record `email_source` as `official_website`, `google_maps`, or `not_found`.

Record these fields for every company: `keyword`, `company_name`, `company_website`, `company_intro`, `industry`, `location`, `linkedin_url`, `email`, `email_source`, `notes`.
- `email_source` must be one of `official_website`, `google_maps`, or `not_found`.
- Never guess an address such as [email protected] unless the exact address is shown publicly.
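The email rules above can be enforced mechanically before writing a record. A minimal sketch, assuming the accepted generic prefixes and `email_source` values listed in this document; the function names and regex are illustrative, and departmental addresses are deliberately allowed through since the rules permit them:

```python
import re

GENERIC_PREFIXES = {"hello", "info", "contact", "support"}
EMAIL_SOURCES = {"official_website", "google_maps", "not_found"}

def is_generic_inbox(email):
    """True when the local part is one of the accepted generic prefixes."""
    return email.split("@", 1)[0].lower() in GENERIC_PREFIXES

def check_record(email, email_source):
    """Return a list of rule violations for an email/email_source pair.

    An empty list means the record passes. Departmental addresses are
    not rejected here because the collection rules accept any address
    clearly relevant to external inquiries.
    """
    problems = []
    if email_source not in EMAIL_SOURCES:
        problems.append(f"invalid email_source: {email_source}")
    if email_source == "not_found":
        if email:
            problems.append("email present but source is not_found")
        return problems
    if not email or not re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", email):
        problems.append("malformed or missing email")
    return problems
```

A record with `email_source='not_found'` must carry no address at all, while any sourced record must carry a syntactically plausible one.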
linkedin_company_scout_results.jsonlinkedin_company_scout_results.csvrun_metadata.jsonContact, About, Footer, Legal, Imprint, Support, Team.hello@, info@, contact@, support@, or a departmental email clearly relevant to external inquiries.official_website.google_maps.not_found.keywordcompany_namecompany_websitecompany_introindustrylocationlinkedin_urlemailemail_sourcenotes.official_website, google_maps, or not_found.[email protected] unless the exact address is shown publicly.