Discover platform components by exploring breadcrumbs (installers, operators, dependencies) in checkouts directory. Outputs component-map.json for platforms without manifest scripts.
Discover which repositories in a checkouts directory are actual platform components (shipped in the product) vs. side projects, tools, or helpers.
This is used for platforms that don't have a central manifest script (like ODH/RHOAI's get_all_manifests.sh). Instead, we explore "breadcrumbs" to build a component map:
Required:
--platform=<name> - Platform identifier (e.g., "aap", "ansible")
--checkouts-dir=<path> - Directory containing cloned repos
Optional:
--entry-repo=<name> - Repo to use as the exploration entry point
--architecture-dir=<path> - Output directory (default: architecture)
--exclude=<pattern> - Additional repos to exclude (comma-separated)
List all subdirectories in the checkouts directory:
ls -1 {checkouts_dir}/
This gives you the universe of possible components.
Exclude obvious non-components:
.* (hidden)
*-docs, *-documentation
*-ci, *-tools, *-testing, *-test
must-gather, cli, additional-images
Create an initial list of candidate repos.
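The filtering above can be sketched as a small helper. This is a minimal sketch; `candidate_repos` and the pattern list are illustrative, not part of any existing tool:

```python
import fnmatch

# Default exclusion patterns from the step above; --exclude adds more.
EXCLUDE_PATTERNS = [
    ".*",                                      # hidden directories
    "*-docs", "*-documentation",               # documentation repos
    "*-ci", "*-tools", "*-testing", "*-test",  # CI / tooling / test repos
    "must-gather", "cli", "additional-images",
]

def candidate_repos(repo_names, extra_excludes=()):
    """Return the repos that survive the exclusion filters."""
    patterns = EXCLUDE_PATTERNS + list(extra_excludes)
    return [name for name in repo_names
            if not any(fnmatch.fnmatch(name, p) for p in patterns)]
```

For example, `candidate_repos([".github", "awx-operator", "ansible-docs"])` keeps only `awx-operator`.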
Before breadcrumb exploration, check whether the platform has formal release inclusion signals — annotations, manifests, or metadata that explicitly declare which repos ship in the product. Large platforms (OpenShift, OKD, etc.) often have these; small platforms typically don't.
Why this matters: Without payload signals, the skill must infer "is this shipped?" from heuristics (has Dockerfile? has operator structure?). On a platform with 800+ repos where 179 are operators, heuristics produce a useless component map. Payload signals give a definitive answer.
Probe procedure — sample 5-10 repos (prefer repos named cluster-*-operator or *-controller) and scan their manifest directories for known signal patterns.
IMPORTANT: Path rules for signal scanning:
Scan: manifests/ AND install/ directories (some components use one or the other)
Exclude: vendor/, testdata/, pkg/*/testdata/, and test/ paths — these contain vendored copies of other components' manifests and test fixtures that will produce false positives (a single repo's vendor/ can contain 97 YAML files with release annotations from OTHER operators)
# Correct: exclude vendor and testdata
grep -r --exclude-dir=vendor --exclude-dir=testdata "include.release.openshift.io" {sample_repo}/manifests/ {sample_repo}/install/ 2>/dev/null
# WRONG: scanning all YAML will hit vendor/ and testdata/
grep -r "include.release.openshift.io" {sample_repo}/ 2>/dev/null
grep -r --exclude-dir=vendor --exclude-dir=testdata "include.release.openshift.io\|release.openshift.io\|operator.openshift.io/managed" {sample_repo}/manifests/ {sample_repo}/install/ 2>/dev/null
These annotations declare that a component ships in a specific release profile (self-managed, single-node, hypershift, etc.).
grep -r --exclude-dir=vendor --exclude-dir=testdata "capability.openshift.io/name\|operator.openshift.io/capability" {sample_repo}/manifests/ {sample_repo}/install/ 2>/dev/null
These mark a payload component as optional — it ships by default but can be disabled. Components WITHOUT this annotation (but WITH release inclusion) are core.
CRITICAL: Check the Deployment, not just any manifest. A capability annotation on a sub-resource (dashboard, credential request, operator group) means "this sub-resource is conditional on that capability" — it does NOT mean the operator itself is optional. To determine if the operator is truly optional, check specifically whether the operator's Deployment manifest (kind: Deployment) has the capability annotation:
# Find the operator's Deployment manifest and check for capability annotation
grep -rl "kind: Deployment" {repo}/manifests/ {repo}/install/ 2>/dev/null | xargs grep -l "capability.openshift.io/name" 2>/dev/null
If the Deployment has capability.openshift.io/name: X → the operator is optional_platform with capability: "X"
If the Deployment has no capability annotation → the operator is core_platform (with some conditional sub-resources)
Extract the actual capability name from the annotation value (e.g., capability.openshift.io/name: openshift-samples → "capability": "openshift-samples"). Do NOT use a generic placeholder like "optional-component".
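The Deployment-level check can be sketched in Python. This is a hypothetical `deployment_capability` helper operating on raw manifest text; in practice you would run the grep commands above:

```python
import re

CAP_ANNOTATION = "capability.openshift.io/name"

def deployment_capability(manifest_text):
    """Return the capability name if this manifest is a Deployment that
    carries the capability annotation, else None. Annotations on other
    kinds (dashboards, credential requests) are deliberately ignored."""
    if not re.search(r"^kind:\s*Deployment\s*$", manifest_text, re.MULTILINE):
        return None
    m = re.search(re.escape(CAP_ANNOTATION) + r':\s*"?([\w.-]+)"?', manifest_text)
    return m.group(1) if m else None
```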
ls {sample_repo}/manifests/image-references {sample_repo}/install/image-references 2>/dev/null
A structured file listing container images the repo contributes to the release payload.
ls {sample_repo}/bundle/manifests/*.clusterserviceversion.yaml 2>/dev/null
If a central catalog repo exists (e.g., certified-operators, redhat-operators), check whether this repo's CSV is listed there.
ls {sample_repo}/charts/ {sample_repo}/Chart.yaml 2>/dev/null
For Helm-based platforms, check if a central chart index references this repo.
If signals are found in 3+ sampled repos, this platform has formal payload signals. Set discovery_method: "release_payload_signals" and proceed with a full signal scan:
Scan ALL repos in the checkouts directory for the detected signal type(s). Apply the path rules above (scan manifests/ + install/, exclude vendor/ + testdata/). Classify each repo into a tier:
| Tier | Criteria | Example |
|---|---|---|
| core_platform | Has release inclusion annotations and NO capability annotation on its Deployment | cluster-etcd-operator, cluster-kube-apiserver-operator |
| optional_platform | Has release inclusion annotations AND capability annotation on its Deployment | cluster-samples-operator (capability: openshift-samples) |
| payload_component | Has image-references but no release/capability annotations | Supporting images shipped in payload |
| ecosystem | No release signals at all | aws-account-operator |
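The tier mapping in the table reduces to a small decision function. A sketch; the boolean inputs are assumed to come from the signal scans above:

```python
def classify_tier(has_release_annotation, capability_on_deployment, has_image_references):
    """Map scanned release payload signals to a tier, per the table above."""
    if has_release_annotation:
        # Release inclusion + capability annotation on the Deployment = optional.
        return "optional_platform" if capability_on_deployment else "core_platform"
    if has_image_references:
        # Ships images into the payload but carries no release annotations.
        return "payload_component"
    return "ecosystem"
```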
Important subtlety: A repo like cluster-kube-apiserver-operator may have capability.openshift.io/name: Console on dashboard manifests — this means "only create this dashboard if Console capability is enabled," NOT "this operator is optional." Only the Deployment-level annotation determines the operator's tier.
Some components predate the release payload mechanism and naturally won't annotate themselves:
The cluster-version-operator (CVO) reads include.release.openshift.io annotations and reconciles all other operators. It doesn't annotate itself because it IS the reconciler. If a repo contains CVO logic (look for ClusterVersion controller code, or repo name matches *-version-operator), classify as core_platform automatically.
If a repo is named installer or contains cluster bootstrapping logic (terraform, ignition configs), include it as core_platform with type: "installer".
Bootstrap components (e.g., cluster-bootstrap) may use install/ instead of manifests/. Check both.
General rule: if a component is part of the bootstrap chain that brings the platform into existence before the normal operator lifecycle starts, it's core_platform even without release annotations.
Record the tier for each repo. This tiering drives the rest of the discovery process:
core_platform + optional_platform → full breadcrumb exploration in Steps 3-5
payload_component → include as component, lighter exploration
ecosystem → skip breadcrumb exploration, go directly to excluded (can be pulled back in by dependency analysis in Step 5a/5b)
If signals are NOT found (fewer than 3 sampled repos match), this platform doesn't have formal payload signals. Set discovery_method: "breadcrumb" and proceed to Step 3 with the full candidate list as before.
If release payload signals were found (Step 2a): Limit candidate repos to those in the core_platform, optional_platform, and payload_component tiers. Do NOT treat every operator-shaped repo as an entry point — only those with release signals.
If --entry-repo specified, start there. Otherwise, search for common entry points:
Operator repos (high-value entry points):
Look for: bundle/, config/manager/, operator.yaml
Repo names: *-operator, operator
Installer repos:
Look for: install.yml, site.yml, playbooks/
Repo names: installer, *-installer, deployment
Platform repos:
Repo names: platform, automation-platform, *-platform
List discovered entry points and pick the best one (or use all).
For each entry point, look for references to other repos:
Search for container image references:
grep -r "image:" {entry_repo}/config/ {entry_repo}/manifests/ {entry_repo}/bundle/
Extract repo names from image paths like:
quay.io/ansible/awx-operator:latest → awx-operator
registry.redhat.io/ansible/eda-server:1.0 → eda-server
Search for role/collection references:
grep -r "role:" {entry_repo}/
grep -r "collection:" {entry_repo}/
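The image-to-repo-name mapping shown above can be sketched as a tiny helper. A hypothetical `repo_name_from_image`; handling of digest-pinned references is an assumption:

```python
def repo_name_from_image(image_ref):
    """quay.io/ansible/awx-operator:latest -> awx-operator"""
    last = image_ref.rsplit("/", 1)[-1]            # drop the registry/org path
    return last.split(":", 1)[0].split("@", 1)[0]  # drop tag or digest suffix
```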
Python (requirements.txt, pyproject.toml):
find {entry_repo} -name "requirements*.txt" -o -name "pyproject.toml"
cat {found_files}
Look for patterns like:
django-ansible-base>=1.0.0 - First-party package (matches repo name)
-e git+https://github.com/ansible/django-ansible-base.git - Editable install from git
file:///path/to/local/repo - Local dependency
Go (go.mod):
find {entry_repo} -name "go.mod"
cat {found_files}
Look for:
github.com/ansible/common-lib v1.0.0 - First-party module
replace github.com/ansible/foo => ../foo - Local replacement
Key insight: If a dependency name matches a repo in the checkouts directory, it's likely a first-party shared library!
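The first-party matching described here can be sketched as follows. A minimal helper; the `org` default and the parsing rules are illustrative assumptions, not a spec:

```python
import re

def first_party_deps(dep_file_text, checkout_names, org="ansible"):
    """Return dependencies (from go.mod or requirements.txt text) whose name
    matches a repo in the checkouts directory — likely first-party libraries."""
    found = set()
    # go.mod style: github.com/<org>/<module> v1.0.0 (also matches `replace` lines)
    for m in re.finditer(r"github\.com/%s/([A-Za-z0-9_.-]+)" % re.escape(org),
                         dep_file_text):
        if m.group(1) in checkout_names:
            found.add(m.group(1))
    # requirements.txt style: bare package name before a version specifier
    for line in dep_file_text.splitlines():
        name = re.split(r"[<>=!\[ ]", line.strip(), maxsplit=1)[0]
        if name in checkout_names:
            found.add(name)
    return sorted(found)
```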
cat {entry_repo}/.gitmodules
find {entry_repo} -path "*/.github/workflows/*.yml" -o -path "*/.gitlab-ci.yml"
cat {found_files}
Look for references to other repos in the checkouts directory (e.g., sibling repos checked out or images built during CI).
This step runs in BOTH breadcrumb and release_payload_signals modes. Even when payload signals identified the components, you still need dependency analysis to discover shared libraries (Step 5a) and populate the dependency graph.
In release_payload_signals mode, two additional scans are required:
5.1: Discover operand repos via image-references. Operators often deploy separate repos as operands (the workload the operator manages). These operand repos typically have NO release annotations of their own — the operator carries the annotations and the operand is just a container image the operator deploys.
Scan the image-references file (in manifests/ or install/) of each discovered operator. For each image name listed:
tier: "payload_component", discovered_via: "operator_operand", referenced_by: ["{operator-name}"]
shipped: true (it's deployed by the operator)
# Example: console-operator's image-references lists "console" as an operand
cat {operator_repo}/manifests/image-references
# - name: console-operator ← the operator itself
# - name: console ← the operand (separate repo!)
Common operator→operand pairs:
console-operator → console (web UI)
cluster-ingress-operator → router/haproxy-router
cluster-image-registry-operator → image-registry
cluster-dns-operator → coredns
5.2: Scan go.mod for shared libraries. Scan go.mod (or equivalent) of each discovered core_platform and optional_platform component. Look for first-party dependencies (same GitHub org) that match repos in the checkouts directory. This is how shared libraries like library-go, api, client-go get discovered.
As you discover references:
Record discovered_via and referenced_by
Set shipped: true if deployed directly
Update the dependency_graph — even in signal mode, record which components depend on which
Track the dependency graph:
{
"cluster-etcd-operator": ["api", "library-go", "client-go"],
"cluster-kube-apiserver-operator": ["api", "library-go", "client-go", "apiserver-library-go"],
"cluster-network-operator": ["api", "library-go", "client-go"],
...
}
After building the dependency graph, analyze it to find shared libraries:
Reverse the dependency graph to count consumers:
{
"awx-operator": ["installer"], # 1 consumer
"eda-operator": ["installer"], # 1 consumer
"awx-api": ["awx-operator"], # 1 consumer
"django-ansible-base": ["awx-operator", "eda-operator", "automation-hub-operator"] # 3 consumers!
}
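The reversal can be sketched as below. A minimal sketch; the `min_consumers=2` threshold is an assumed cut-off, not prescribed here:

```python
from collections import defaultdict

def reverse_graph(dependency_graph):
    """Invert {component: [deps]} into {dep: [consumers]}."""
    consumers = defaultdict(list)
    for component, deps in dependency_graph.items():
        for dep in deps:
            consumers[dep].append(component)
    return dict(consumers)

def shared_library_candidates(dependency_graph, min_consumers=2):
    """Deps consumed by min_consumers+ components are shared-library candidates."""
    return {dep: sorted(c)
            for dep, c in reverse_graph(dependency_graph).items()
            if len(c) >= min_consumers}
```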
Shared library detection criteria:
For detected shared libraries:
type: "shared_library"
shipped: false (not deployed directly)
architecturally_significant: true
consumer_count and consumers: [...]
Examples:
Include:
django-ansible-base - Shared Django utilities used by AWX, EDA, Hub
ansible-common-auth - Shared authentication library
platform-sdk - SDK used by multiple operators
Do NOT include:
django - Third-party (not in platform org)
postgres - Third-party infrastructure
one-off-util - Only used by one component
Some third-party repos aren't utilities you use — they're contracts you implement. These define the CRDs, APIs, or interface specifications that shape your platform's architecture. Excluding them loses critical architectural context.
After shared library detection, scan excluded third-party repos for API contract significance:
Look for repos that define API types: apis/, config/crd/, protobuf definitions (.proto files)
Look for Kubernetes API machinery (GroupVersionResource, runtime.Object implementations)
How to measure architectural impact:
# Count references to the repo's types in core platform code
grep -r "Gateway\|HTTPRoute\|GRPCRoute" {platform_repo}/pilot/ | wc -l
# If this is in the hundreds across core packages, it's architectural
The key distinction: tool vs. contract
django is a tool — you call its functions, but your architecture doesn't revolve around Django types
gateway-api is a contract — your controllers exist to reconcile its CRDs, your entire ingress model is defined by its types
For detected API specifications:
type: "api_specification"
shipped: false (not your code)
architecturally_significant: true
upstream_org field to clarify ownership (e.g., "kubernetes-sigs")
consumer_count and consumers: [...]
Examples:
Include:
gateway-api - Kubernetes Gateway API (defines CRDs that Istio's control plane reconciles)
operator-framework/api - OLM API types (if platform operators are built around them)
open-cluster-management/api - OCM API types (if platform implements OCM contracts)
Do NOT include:
envoy - Upstream runtime dependency (you embed it, but don't implement its API spec)
go-control-plane - Utility library (you call it, architecture doesn't revolve around its types)
client-go - Kubernetes client library (tool, not contract)
For each discovered component, determine its type. Do NOT default everything to "operator" just because it has manifests. Check what the repo actually contains:
operator — Has a Deployment running a controller/reconciler binary AND owns its own operator lifecycle:
main.go or equivalent entrypoint with controller-runtime/operator-sdk imports
config/manager/ or operator bundle structure (OLM bundle, CSV)
Distinguishes from controller: operators manage their own installation/upgrade lifecycle via OLM or equivalent
controller — Runs a reconciliation loop but does NOT own its own operator lifecycle:
Distinguishes from operator: controllers are deployed by operators, not independently installable
service — Runs as a workload but is not a controller/operator:
ui — A user-facing web application (frontend, dashboard):
Distinguishes from service: the primary purpose is user-facing UI, not a headless API
installer — Bootstraps the cluster or platform:
Examples: installer, assisted-installer
asset — Ships static content, not running code:
cluster-update-keys (signing keys), origin-branding (UI branding), okd-machine-os (OS image definition), driver-toolkit (base container image)shared_library — Code imported by other components (detected in Step 5a)
api_specification — Defines API contracts the platform implements (detected in Step 5b)
Quick heuristic: If the repo has no main.go/cmd/ and no Deployment that runs a binary it builds, it's not an operator.
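The type decision can be sketched as an ordered decision function. A sketch; the boolean inputs stand in for the structural checks listed above, and the precedence order is an assumption:

```python
def classify_type(has_controller_entrypoint, owns_operator_lifecycle,
                  runs_workload, is_frontend, is_bootstrapper):
    """Order matters: check the most specific signals first."""
    if is_bootstrapper:
        return "installer"
    if has_controller_entrypoint:
        # Reconciler binary; only an operator if it owns its own lifecycle.
        return "operator" if owns_operator_lifecycle else "controller"
    if is_frontend:
        return "ui"
    if runs_workload:
        return "service"
    # No running code at all: static content.
    return "asset"
```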
architecturally_significant is independent of type. Do NOT set architecturally_significant: false just because something is an asset or service instead of an operator. A component is architecturally significant if:
Default to architecturally_significant: true for all core_platform and optional_platform components regardless of type. Only set it to false for payload_component tier repos that are clearly peripheral (e.g., a deprecated shim that's still shipped but unused).
If release payload signals were found (Step 2a): The default is inverted. Repos without release signals are ecosystem tier and should be excluded unless they were pulled in as a shared library (Step 5a) or API specification (Step 5b). Do NOT apply the "possible shipped components" heuristics below to ecosystem-tier repos — the release signals are the authoritative source.
If NO release payload signals were found: Use the heuristics below for repos not discovered via breadcrumbs:
Possible shipped components (include with lower confidence):
Has a Dockerfile or Containerfile
Has deployment manifests (config/, manifests/)
Has operator structure (bundle/, config/manager/)
Definitely not shipped (exclude):
After Step 6, you will have repos that fall into low or medium confidence buckets — they have some signals suggesting they're shipped components but not enough for a definitive classification. Instead of making a single-pass decision on these borderline repos, use a multi-reviewer consensus process to reduce false positives and false negatives.
When to trigger consensus review:
In breadcrumb mode (no release payload signals): repos classified as "Low confidence" or "Medium confidence" in Step 6.
In manifest mode: repos in the excluded list that match ANY of these patterns:
Repo name suggests a runtime component (server, runtime, gateway, proxy, scheduler)
Skip consensus for repos that are clearly non-components (docs-only, CI tooling, archived). Only spend reviewer cycles on genuinely ambiguous cases.
For each borderline repo, spawn 3 independent reviewer agents in parallel using the Task tool with subagent_type=Explore. Each reviewer examines the same repo but through a different evaluation lens:
Reviewer A — Structural Analysis:
Examine the repo at {checkout_path}. Determine whether this repo is a shipped
platform component based on its STRUCTURE. Look for:
- Dockerfile/Containerfile (builds a container image?)
- Kubernetes manifests, Helm charts, kustomize overlays (deployed to a cluster?)
- Operator patterns: main.go/cmd/, controller-runtime imports, CRD definitions
- Service patterns: API server code, gRPC/REST endpoints, daemon entrypoints
- Asset patterns: static content only, no running code
Return a JSON object with exactly these fields:
{
"vote": "include" | "exclude" | "unsure",
"suggested_type": "operator" | "controller" | "service" | "ui" | "asset" | "shared_library" | "other",
"rationale": "<one sentence explaining your reasoning>"
}
Reviewer B — Relational Analysis:
Examine the repo at {checkout_path}. Determine whether this repo is a shipped
platform component based on its RELATIONSHIPS to other components. Check:
- Is this repo's name referenced as a container image in any included operator's
manifests, CSV, or source code? (Search the operator repos for image refs matching
this repo name)
- Does this repo's go.mod / requirements.txt import or get imported by included components?
- Is this repo referenced in CI/CD configs of included components?
- Does this repo define CRDs that included operators reconcile?
The included operators are: {list of already-included component keys}
Return a JSON object with exactly these fields:
{
"vote": "include" | "exclude" | "unsure",
"suggested_type": "operator" | "controller" | "service" | "ui" | "asset" | "shared_library" | "other",
"rationale": "<one sentence explaining your reasoning>",
"referenced_by": ["<list of components that reference this repo, if any>"]
}
Reviewer C — Functional Analysis:
Examine the repo at {checkout_path}. Determine whether this repo is a shipped
platform component based on its FUNCTION — what does it actually do at runtime?
- Read the README, top-level docs, and main entrypoint to understand the repo's purpose
- Is this a production runtime workload (serves traffic, processes data, manages resources)?
- Is this a development/build tool (used during CI/CD but not deployed to production)?
- Is this a test utility, documentation repo, or helper script collection?
- Is this a serving runtime or model server (deployed by an operator on demand)?
Return a JSON object with exactly these fields:
{
"vote": "include" | "exclude" | "unsure",
"suggested_type": "operator" | "controller" | "service" | "ui" | "asset" | "shared_library" | "other",
"rationale": "<one sentence explaining your reasoning>"
}
After all three reviewers return, aggregate their votes:
| Votes | Decision | Confidence |
|---|---|---|
| 3/3 include | Include the repo | "high" |
| 3/3 exclude | Exclude the repo | "high" |
| 2/3 include | Include the repo | "medium" |
| 2/3 exclude | Exclude the repo | "medium" |
| 3-way split or all unsure | Include the repo, flag for human review | "disputed" |
For the suggested_type, use the majority type if 2+ reviewers agree. If all three suggest different types, prefer the structural reviewer's suggestion (Reviewer A) as the tiebreaker since it's based on concrete repo contents.
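The aggregation table and tiebreak rule can be sketched as follows. A minimal sketch; the handling of mixed unsure votes that match no table row is an assumption (they default to the disputed bucket):

```python
from collections import Counter

def aggregate(reviews):
    """reviews: list of 3 dicts with 'vote' and 'suggested_type' keys,
    ordered [structural, relational, functional].
    Returns (decision, confidence, suggested_type) per the table above."""
    votes = Counter(r["vote"] for r in reviews)
    if votes["include"] == 3 or votes["exclude"] == 3:
        decision = "include" if votes["include"] == 3 else "exclude"
        confidence = "high"
    elif votes["include"] == 2 or votes["exclude"] == 2:
        decision = "include" if votes["include"] == 2 else "exclude"
        confidence = "medium"
    else:
        # 3-way split or unsure-heavy: keep the repo, flag for human review.
        decision, confidence = "include", "disputed"
    types = Counter(r["suggested_type"] for r in reviews)
    suggested, count = types.most_common(1)[0]
    if count < 2:
        suggested = reviews[0]["suggested_type"]  # Reviewer A (structural) breaks ties
    return decision, confidence, suggested
```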
For repos that go through consensus review, add a consensus field to their entry in the component map:
{
"confidence": "high|medium|disputed",
"consensus": {
"votes": {"include": 2, "exclude": 1},
"reviewers": {
"structural": {"vote": "include", "type": "service", "rationale": "Has Dockerfile and kustomize manifests for production deployment"},
"relational": {"vote": "include", "type": "service", "rationale": "Image referenced by data-science-pipelines-operator CSV"},
"functional": {"vote": "exclude", "type": "other", "rationale": "Operand binary only, no standalone deployment lifecycle"}
}
}
}
For repos excluded via consensus, move them to the excluded section but preserve the consensus data:
"excluded": {
"some-repo": {
"reason": "consensus_exclude",
"confidence": "medium",
"consensus": {
"votes": {"include": 1, "exclude": 2},
"reviewers": {
"structural": {"vote": "include", "type": "service", "rationale": "..."},
"relational": {"vote": "exclude", "type": "other", "rationale": "..."},
"functional": {"vote": "exclude", "type": "other", "rationale": "..."}
}
}
}
}
Repos excluded with "confidence": "disputed" or "medium" should be highlighted in the Step 10 summary so the user knows to review them.
Use model: "haiku" for reviewer agents to minimize cost and latency — the structural/relational/functional checks are straightforward exploration tasks.
For each discovered component, check if GENERATED_ARCHITECTURE.md exists:
ls {checkouts_dir}/{repo_name}/GENERATED_ARCHITECTURE.md
Set has_architecture: true/false accordingly.
Create the component map structure:
{
"metadata": {
"platform": "{platform}",
"discovery_method": "breadcrumb|release_payload_signals",
"entry_point": "{entry_repo or 'multiple'}",
"discovered_at": "{ISO timestamp}",
"checkouts_dir": "{checkouts_dir}",
"total_repos_scanned": {count},
"components_discovered": {count},
"components_excluded": {count}
},
"components": {
"{component-key}": {
"key": "{component-key}",
"repo_org": "{org}",
"repo_name": "{repo-name}",
"ref": "main",
"source_folder": "config",
"checkout_path": "{full-path}",
"has_architecture": false,
"type": "operator|controller|service|ui|installer|asset|shared_library|api_specification",
"tier": "core_platform|optional_platform|payload_component|ecosystem",
"discovered_via": "release_payload_signal|operator_operand|operator_bundle|container_image|dependency|installer",
"referenced_by": ["installer"],
"shipped": true,
"architecturally_significant": true,
"consumer_count": 3,
"consumers": ["awx-operator", "eda-operator", "hub-operator"],
"capability": "optional-capability-name-if-applicable",
"confidence": "high|medium|disputed",
"consensus": {
"votes": {"include": 2, "exclude": 1},
"reviewers": {
"structural": {"vote": "include", "type": "service", "rationale": "..."},
"relational": {"vote": "include", "type": "service", "rationale": "..."},
"functional": {"vote": "exclude", "type": "other", "rationale": "..."}
}
}
}
},
"dependency_graph": {
"{repo}": ["{dep1}", "{dep2}"]
},
"excluded": {
"{repo-name}": "{reason}",
"{repo-name-reviewed}": {
"reason": "consensus_exclude",
"confidence": "high|medium",
"consensus": {
"votes": {"include": 0, "exclude": 3},
"reviewers": {
"structural": {"vote": "exclude", "type": "other", "rationale": "..."},
"relational": {"vote": "exclude", "type": "other", "rationale": "..."},
"functional": {"vote": "exclude", "type": "other", "rationale": "..."}
}
}
}
}
}
Write to architecture/{platform}/component-map.json.
Always overwrite the existing file if one is present — the user is re-running discovery to get updated results. Do NOT skip writing because the file already exists.
# Use Write tool
Output a summary to the user:
================================================================================
Component Discovery Complete
================================================================================
Platform: {platform}
Checkouts directory: {checkouts_dir}
Discovery method: {Breadcrumb exploration | Release payload signals}
Results:
Total repositories scanned: {total}
Components discovered: {discovered}
Components excluded: {excluded}
--- If release payload signals were found: ---
Release payload signals detected: {signal_types}
Core platform ({count}):
✓ cluster-etcd-operator (type: operator, tier: core_platform)
✓ cluster-kube-apiserver-operator (type: operator, tier: core_platform)
✓ machine-config-operator (type: operator, tier: core_platform)
...
Optional platform ({count}):
✓ cluster-samples-operator (type: operator, tier: optional_platform, capability: openshift-samples)
✓ console-operator (type: operator, tier: optional_platform, capability: Console)
...
Shared libraries / API specs:
✓ library-go (type: shared_library, used by: N components) [ARCHITECTURALLY SIGNIFICANT]
✓ gateway-api (type: api_specification, upstream: kubernetes-sigs) [ARCHITECTURALLY SIGNIFICANT]
...
Consensus-reviewed (included):
✓ console (type: service, confidence: high, votes: 3/3 include)
structural: include — "Has Dockerfile, deployed as pod"
relational: include — "Image referenced by console-operator"
functional: include — "Web UI served in production"
...
Consensus-reviewed (excluded — review recommended):
✗ some-tool (confidence: medium, votes: 2/3 exclude)
...
Disputed (needs human review):
⚠ ambiguous-repo (confidence: disputed, votes: 1/1/1)
...
Ecosystem (excluded — no release payload signals):
✗ aws-account-operator (ecosystem)
✗ addon-operator (ecosystem)
... and {N} more
--- If NO release payload signals found (breadcrumb mode): ---
Entry points used:
- {entry1}
- {entry2}
Discovered components:
✓ awx-operator (type: operator, via: operator_bundle, ref by: installer)
✓ eda-operator (type: operator, via: operator_bundle, ref by: installer)
✓ awx-api (type: service, via: container_image, ref by: awx-operator)
✓ django-ansible-base (type: shared_library, used by: 3 components) [ARCHITECTURALLY SIGNIFICANT]
✓ gateway-api (type: api_specification, upstream: kubernetes-sigs) [ARCHITECTURALLY SIGNIFICANT]
...
Consensus-reviewed (included):
✓ data-science-pipelines (type: service, confidence: medium, votes: 2/3 include)
structural: include — "Has Dockerfile and kustomize manifests"
relational: include — "Image referenced by data-science-pipelines-operator"
functional: exclude — "Operand only, no standalone lifecycle"
...
Consensus-reviewed (excluded — review recommended):
✗ some-helper-tool (confidence: medium, votes: 2/3 exclude)
structural: include — "Has Dockerfile"
relational: exclude — "Not referenced by any included component"
functional: exclude — "CI/CD helper, not a production workload"
...
Disputed (needs human review):
⚠ ambiguous-repo (confidence: disputed, votes: 1/1/1)
structural: include — "..."
relational: exclude — "..."
functional: unsure — "..."
...
Excluded repositories:
✗ ansible-docs (documentation_only)
✗ ansible-ci-tools (development_tooling)
...
Output: architecture/{platform}/component-map.json
Next steps:
1. Review component-map.json (edit if needed)
2. Run: python main.py generate-architecture --platform={platform}
3. Run: python main.py collect-architectures --platform={platform}
================================================================================
Definitive (release payload signals — Step 2a):
include.release.openshift.io/* or equivalent release inclusion annotation → definitely in payload
NO capability.openshift.io/name on Deployment manifest → tier: core_platform (always installed)
capability.openshift.io/name on Deployment manifest → tier: optional_platform (can be disabled)
image-references manifest → ships container images in the release
Bootstrap-chain components (CVO, installer) → tier: core_platform even without annotations
When payload signals are available, they override all heuristic confidence levels below.
High confidence (definitely deployed) — breadcrumb mode fallback:
Medium confidence (probably deployed):
Low confidence (maybe deployed):
Critical shared libraries:
Detection method:
Why include them:
Always exclude:
Exception — do NOT exclude external API specifications:
Include them as type: "api_specification" per Step 5b
How to distinguish first-party from third-party:
Special cases:
The user can review and edit component-map.json after generation
Common mistake #1: Excluding repos because they're "just dependencies"
Why this is wrong:
Common mistake #2: Excluding external repos because they're "third-party upstream"
Why this is wrong for API specs:
Common mistake #3: Marking an operator as optional because ANY manifest has a capability annotation
Why this is wrong:
cluster-kube-apiserver-operator has capability.openshift.io/name: Console on dashboards but is absolutely core — without it there's no API server
Common mistake #4: Classifying every repo with manifests as type: "operator"
Why this is wrong:
A repo that ships signing keys (cluster-update-keys) is not an operator
A repo that ships UI branding (origin-branding) is not an operator
A repo that ships a base container image (driver-toolkit) is not an operator
Check for controller code before assigning type: "operator" (see Step 5c)
Common mistake #5: Excluding bootstrap components that lack release annotations
Why this is wrong:
Components in the bootstrap chain (CVO, installer, cluster-bootstrap) are core_platform even without annotations
Rule of thumb:
First-party dependency used by multiple components → type: "shared_library"
External API contract the platform implements → type: "api_specification"
Example distinction:
Include: ansible/django-ansible-base (first-party, used by AWX + EDA + Hub)
Include: kubernetes-sigs/gateway-api (external, but Istio's control plane implements its CRDs)
Include: cluster-version-operator (core_platform, even without release annotations — it's the reconciler)
Exclude: django/django (third-party utility, not in ansible org)
Exclude: postgres (infrastructure, third-party)
Exclude: envoyproxy/go-control-plane (third-party library you call, not a contract you implement)