Trace company affiliation for each repo using script (known mapping + GitHub API) and LLM Web Search for remaining unknowns
Determine which repos are backed by commercial companies. Uses a 3-layer strategy: known mapping → GitHub API → LLM.
output/repo_exp.csv — expanded repo list (from step ⑧)python3 scripts/trace_companies.py output/repo_exp.csv -o output/company.csv --summary
The script applies 3 layers automatically:
huggingface → Hugging Face, alibaba → 阿里巴巴). Confidence = S.GET /orgs/{owner} for the , , fields. Uses field directly, or infers from blog domain. Confidence = A.companyblogdescriptioncompanycompany_name=unknown, trace_method=待LLM研究.Check the summary output:
If all repos are resolved (unknown = 0), skip to Step 4.
For repos still marked unknown, use Web Search to determine company affiliation. Do NOT guess.
"{org_name}" company or "{project_name}" developed bycompany_name — company name, or unknown if truly independentevidence — actual URLs and facts from searchconfidence — S/A/B/CunknownunknownShow results grouped by company for review. The user may correct assignments.
After user confirmation, update output/company.csv.
Print: