Name: Text Analyst
Author: nealcaren

| Method | Recommended Language | Rationale |
| --- | --- | --- |
| Topic Models (LDA, STM) | R | The `stm` package is widely regarded as the gold standard and offers stronger diagnostics |
| Dictionary/Sentiment | R | The tidytext workflow is concise and has broad lexicon support |
| Visualization | R | ggplot2 produces publication-ready figures |
| Transformers/BERT | Python | HuggingFace ecosystem and GPU support |
| BERTopic | Python | Neural topic modeling; available only in Python |
| Named Entity Recognition | Python | spaCy is the industry standard |
| Supervised Classification | Either | sklearn and tidymodels are both excellent |
| Word Embeddings | Python | gensim is more mature; sentence-transformers covers sentence-level embeddings |
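
The table above can be read as a simple lookup from method to language. The following is a minimal sketch of that idea; `RECOMMENDED_LANGUAGE` and `recommend_language` are hypothetical names introduced only for illustration and are not part of the skill itself.

```python
# Hypothetical lookup mirroring the language-selection table.
# The keys and the function name are illustrative assumptions.
RECOMMENDED_LANGUAGE = {
    "topic models": "R",
    "dictionary/sentiment": "R",
    "visualization": "R",
    "transformers/bert": "Python",
    "bertopic": "Python",
    "named entity recognition": "Python",
    "supervised classification": "Either",
    "word embeddings": "Python",
}


def recommend_language(method: str) -> str:
    """Return the table's recommended language for a method (case-insensitive)."""
    key = method.strip().lower()
    if key not in RECOMMENDED_LANGUAGE:
        raise KeyError(f"No recommendation recorded for method: {method!r}")
    return RECOMMENDED_LANGUAGE[key]
```

A call such as `recommend_language("BERTopic")` returns the entry from the table, while an unlisted method raises a `KeyError` rather than guessing.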

project/
├── data/
│   ├── raw/              # Original text files
│   └── processed/        # Cleaned corpus, DTMs
├── code/
│   ├── 00_master.R       # or 00_master.py
│   ├── 01_preprocess.R
│   ├── 02_analysis.R
│   └── 03_validation.R
├── output/
│   ├── tables/
│   └── figures/
├── dictionaries/         # Custom lexicons if used
└── memos/                # Phase outputs
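
The layout above could be created programmatically. This is a minimal sketch using only the standard library; `PROJECT_DIRS` and `scaffold_project` are hypothetical names, and the script creates only the directories, not the numbered code files.

```python
from pathlib import Path

# Directories taken from the folder layout above.
PROJECT_DIRS = [
    "data/raw",        # original text files
    "data/processed",  # cleaned corpus, DTMs
    "code",
    "output/tables",
    "output/figures",
    "dictionaries",    # custom lexicons if used
    "memos",           # phase outputs
]


def scaffold_project(root: str) -> list[Path]:
    """Create the folder layout under `root` and return the directory paths."""
    base = Path(root)
    created = []
    for rel in PROJECT_DIRS:
        path = base / rel
        # parents=True builds intermediate folders; exist_ok=True makes re-runs safe.
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created
```

Running `scaffold_project("project")` from the working directory would produce the same tree, and running it again leaves existing folders untouched.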

| Guide | Topics |
| --- | --- |
| `01_dictionary_methods.md` | Lexicons, custom dictionaries, validation |
| `02_topic_models.md` | LDA, STM, BERTopic theory and selection |
| `03_supervised_classification.md` | Training data, features, evaluation |
| `04_embeddings.md` | Word2Vec, GloVe, BERT concepts |
| `05_sentiment_analysis.md` | Dictionary vs ML approaches |
| `06_validation_strategies.md` | Human coding, diagnostics, robustness |
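
As an illustration of the dictionary methods listed above, the following is a rough sketch of lexicon-based scoring in plain Python. The toy lexicon, the tokenizer regex, and the function name `score_document` are all assumptions made for this example; they are not taken from the guides, and a real project would load a validated lexicon from `dictionaries/`.

```python
import re
from collections import Counter

# Hypothetical toy lexicon, for illustration only.
TOY_LEXICON = {
    "positive": {"good", "great", "excellent"},
    "negative": {"bad", "poor", "terrible"},
}


def score_document(text: str, lexicon: dict[str, set[str]]) -> dict[str, float]:
    """Return the share of tokens in `text` that match each lexicon category."""
    # Simplistic tokenizer (assumption): lowercase runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    total = len(tokens)
    return {
        category: (sum(counts[w] for w in words) / total if total else 0.0)
        for category, words in lexicon.items()
    }
```

Scores are normalized by document length so that long and short texts are comparable; whether that normalization suits a given corpus is a design choice to check during validation.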

| Guide | Topics |
| --- | --- |
| `01_preprocessing.md` | tidytext, quanteda |
| `02_dictionary_sentiment.md` | tidytext lexicons, TF-IDF |
| `03_topic_models.md` | topicmodels, stm |
| `04_supervised.md` | tidymodels for text |
| `05_embeddings.md` | text2vec |
| `06_visualization.md` | ggplot2 for text |

| Guide | Topics |
| --- | --- |
| `01_preprocessing.md` | nltk, spaCy, sklearn |
| `02_dictionary_sentiment.md` | VADER, TextBlob |
| `03_topic_models.md` | gensim, BERTopic |
| `04_supervised.md` | sklearn, transformers |
| `05_embeddings.md` | gensim, sentence-transformers |
| `06_visualization.md` | matplotlib, pyLDAvis |

Task: Phase 0 Research Design
subagent_type: general-purpose

Computational Text Analysis Agent

Core Principles

Language Selection

Analysis Phases

Phase 0: Research Design & Method Selection

Phase 1: Corpus Preparation & Exploration

Phase 2: Method Specification

Phase 3: Main Analysis

Phase 4: Validation & Robustness

Phase 5: Output & Interpretation

Folder Structure

Technique Guides

Conceptual Guides (language-agnostic)

R Technique Guides

Python Technique Guides

Invoking Phase Agents
