Transcript-first nightly pipeline for mining AI conversations into structured knowledge notes, candidate research seeds, and next-day leverage. Use when: 对话沉淀 (conversation distillation), 对话整理 (conversation organizing), conversation mining, transcript mining, 昨天的对话 (yesterday's conversations), nightly conversation knowledge, review AI chat transcripts, extract insights from conversations, or run the 03:00 conversation stage.
The 02:00 article-integration stage may enrich Brain context before this stage runs.
This stage may read article-derived knowledge as context, but it does not normalize article notes.
The 04:00 amplifier may read this stage's notes, daily suggestions block, and candidate research seeds.
Preferred upstream read order for 03:00: shared nightly digest → 03-KNOWLEDGE/99-SYSTEM/03-INTEGRATION-REPORTS/run-reports/YYYY-MM-DD/ → stable indexes only if needed
Do not rely on legacy article report paths under 12-REVIEWS/KNOWLEDGEBASE/ for this split pipeline
Degraded mode (QMD unhealthy): skip Steps 3–4 and, in Step 5, read the top N largest transcripts per project group directly.
If a project brief is available, read it first as a routing anchor.
When QMD is healthy, run the full 7-step flow.
Step 1: Build daily manifest
Run helper scripts to list the previous day's transcripts and produce a manifest.
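The manifest step above can be sketched in shell. The size-sorted TSV layout and the date-in-filename convention are assumptions for illustration, not the pipeline's fixed contract:

```shell
# Minimal sketch of Step 1: collect the previous day's transcripts into a
# size-sorted TSV manifest. Assumes filenames embed the date as YYYY-MM-DD.
build_manifest() {
  root="$1"; target_date="$2"; manifest="$3"
  : > "$manifest"
  find "$root" -type f -name "*${target_date}*" | while read -r f; do
    printf '%s\t%s\n' "$(wc -c < "$f")" "$f" >> "$manifest"
  done
  sort -rn -o "$manifest" "$manifest"   # largest first; degraded mode reads top N
}

# demo run against a throwaway tree
demo=$(mktemp -d)
printf 'short'                    > "$demo/chat-2024-01-02-a.md"
printf 'a longer transcript body' > "$demo/chat-2024-01-02-b.md"
build_manifest "$demo" 2024-01-02 "$demo/manifest.tsv"
cat "$demo/manifest.tsv"
```

Sorting by size puts the densest transcripts first, which is exactly what the degraded-mode "top N largest" read order needs.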
Step 2: Group by project
Infer the project mapping primarily from filenames and transcript titles. This step should stay simple and explicit.
When a likely project is identified, resolve it to a stable project anchor:
prefer an existing project_ref if already known
otherwise use scripts/resolve-project-ref.py <inferred-name> to map to a known project brief under $BRAIN_ROOT/05-PROJECTS/
if confidence is low, leave the project unresolved rather than hallucinating
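A filename-based inference pass for Step 2 might look like the sketch below. The token-to-project table is purely illustrative; a resolved name would still go through scripts/resolve-project-ref.py to obtain a stable project_ref:

```shell
# Sketch of Step 2: infer a likely project from the transcript filename.
# The token -> project table here is a hypothetical example, not a real mapping.
infer_project() {
  case "$(basename "$1" | tr '[:upper:]' '[:lower:]')" in
    *brain*)   echo "brain" ;;
    *nightly*) echo "nightly-pipeline" ;;
    *)         echo "UNRESOLVED" ;;   # low confidence: leave unresolved, do not guess
  esac
}

infer_project "chat-2024-01-02-Brain-notes.md"
infer_project "chat-2024-01-02-misc.md"
```

The explicit UNRESOLVED branch encodes the rule above: when confidence is low, leave the project unresolved rather than hallucinating.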
Step 3: Build a high-recall candidate pool with QMD
For each project group:
create or refresh a QMD collection
run qmd-safe.sh update
run qmd-safe.sh embed
use qmd-safe.sh query / qmd-safe.sh vsearch with reranking to retrieve likely high-value transcript chunks
This layer is for recall, not final judgment.
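The per-group QMD sequence can be sketched as a dry run. The echoed lines stand in for real qmd-safe.sh invocations (whose exact flags are skill-local), and the "conv-<group>-<date>" collection naming scheme is an assumption:

```shell
# Dry-run sketch of Step 3 for one project group: refresh the collection,
# embed, then retrieve candidates. echo stands in for real qmd-safe.sh calls.
qmd_recall() {
  group="$1"; date="$2"
  coll="conv-${group}-${date}"   # assumed naming scheme, for illustration
  for step in update embed query vsearch; do
    echo "qmd-safe.sh $step  # collection: $coll"
  done
}

qmd_recall brain 2024-01-02
```

Keeping the sequence in one function makes it easy to loop over all project groups from the Step 2 mapping.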
Step 4: Run Surveillance scan
Use a fast, cheap model layer to quickly scan the recalled candidates.
Surveillance is a candidate-recall layer, not a knowledge-authoring layer.
It should answer:
which conversations are most likely worth {{MAIN_AGENT_NAME}} review
which ones may change today's execution quality
which ones deserve outside reinforcement later
Recommended signal dimensions:
project relevance: is it clearly tied to an active project?
structural density: does it contain decisions, comparisons, review conclusions, architecture edges, rules, or reusable methods?
next-day leverage: can it influence tomorrow's execution or judgment?
pattern / anomaly recurrence: does it repeat across multiple transcripts or reveal a systemic issue?
Surveillance output should be:
top candidate transcript paths per project
one-line reason each candidate is worth human review
priority bucket (P1, P2, P3) for {{MAIN_AGENT_NAME}} attention
optional external research seed
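One compact shape for the Surveillance output above is a tab-separated shortlist: priority bucket, project, transcript path, one-line reason. The layout and example rows are illustrative assumptions, not a fixed schema:

```shell
# Illustrative Surveillance shortlist (assumed TSV layout, made-up example rows).
shortlist=$(printf '%s\t%s\t%s\t%s\n' \
  P1 brain transcripts/2024-01-02-arch-review.md "storage-layout decision; changes today's execution" \
  P2 nightly-pipeline transcripts/2024-01-02-qmd-tuning.md "reusable rerank-threshold method")

printf '%s\n' "$shortlist"
```

A one-row-per-candidate format like this keeps the handoff to {{MAIN_AGENT_NAME}} compact, which matters under the noisy/low-confidence failure mode.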
Step 5: Final human-quality synthesis
Before final synthesis, if a project can be identified with reasonable confidence, read the corresponding Brain-side project brief first.
When applicable, daily suggestions should point back to the relevant project brief so the recommendation is not floating without project context.
Daily briefing feed-in
Provide a compact block titled:
昨日沉淀带来的新想法 (new ideas from yesterday's conversation distillation)
Failure Handling
No transcripts found → still write the machine report + digest section; state clearly that the transcript root or target date was missing
QMD unhealthy → first try the skill-local qmd-safe.sh auto-repair once; if still unhealthy, enter degraded mode explicitly and still write the machine report + digest section
Surveillance noisy / low confidence → still pass only a compact candidate shortlist to {{MAIN_AGENT_NAME}}, not long summaries
No high-value insights → produce a short "no-signal" machine report + digest section; do not force a note