Name: Pmc2bioc
Author: vimalinx

EDirect shell pipeline that converts PMC <article> XML into BioC-style collection / document / passage XML. It lifts front-matter metadata into the first passage, then emits abstract, title, and body passages for downstream text-mining workflows.

Quick Start

Command: pmc2bioc
Local executable: /home/vimalinx/miniforge3/envs/bio/bin/pmc2bioc
Environment prerequisite: add /home/vimalinx/miniforge3/envs/bio/bin to PATH so that pmc2bioc can find xtract, transmute, and the other EDirect helpers it calls
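
The PATH prerequisite above can be sketched as follows. This is a minimal example, assuming a POSIX shell and the install directory listed above; the variable name EDIRECT_BIN is introduced here for illustration only. It prepends the directory once and then reports which helpers resolve.

```shell
# Assumed install location, taken from the "Local executable" line above.
EDIRECT_BIN="/home/vimalinx/miniforge3/envs/bio/bin"

# Prepend the directory only if PATH does not already contain it.
case ":$PATH:" in
  *":$EDIRECT_BIN:"*) ;;
  *) PATH="$EDIRECT_BIN:$PATH" ;;
esac
export PATH

# Report whether each helper can be found on the updated PATH.
for tool in pmc2bioc xtract transmute; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "found: $tool"
  else
    echo "missing: $tool"
  fi
done
```

Placing these lines in a shell startup file makes the setting persist across sessions.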

When To Use This Tool

Convert PMC full-text XML into BioC collection XML.
Feed PMC articles into BioC-oriented NLP, annotation, or corpus-building pipelines.
Preserve article metadata, title, abstract, and body passages in a text-mining-friendly structure.
Use this on PMC article XML, not on PubMed citation XML or summary XML.
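
One way to enforce the last point is to check the input before converting it. The sketch below is illustrative rather than part of the tool: xml_kind is a hypothetical helper that classifies XML on standard input by the record element it contains, and the commented usage assumes pmc2bioc reads XML on standard input and writes BioC XML to standard output, as EDirect pipeline scripts typically do.

```shell
# Hypothetical helper (not part of EDirect): classify XML read from stdin
# by the record element it contains.
xml_kind() {
  input=$(cat)
  case "$input" in
    *'<PubmedArticle'*)            echo pubmed ;;   # PubMed citation XML
    *'<DocumentSummary'*)          echo summary ;;  # summary XML
    *'<article '*|*'<article>'*)   echo pmc ;;      # PMC full-text article XML
    *)                             echo unknown ;;
  esac
}

# Usage sketch; article.xml is a placeholder file name, and the
# stdin/stdout behavior of pmc2bioc is an assumption:
#
#   if [ "$(xml_kind < article.xml)" = pmc ]; then
#     pmc2bioc < article.xml > article.bioc.xml
#   else
#     echo "article.xml is not PMC article XML" >&2
#   fi
```

The check is a plain substring match, so it only catches obvious mix-ups and does not validate the document.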
