Skill File

Ai Research Watch

Name: Ai Research Watch
Author: K4M1coder

**WORKFLOW SKILL**: AI technical watch covering arXiv analysis, paper methodology, benchmark tracking, conference monitoring, and reproducibility assessment. USE FOR: analyzing papers, tracking SOTA, evaluating new methods, summarizing research trends, comparing approaches. USE WHEN: evaluating a new technique, staying current on AI research, assessing paper claims.

K4M1coder, 0 stars, April 6, 2026

Occupation
Category: Project Management

Skill Content

Track state-of-the-art AI research, analyze papers, and evaluate new methods for practical applicability.

When to Use

Evaluating a new paper or technique for adoption
Tracking benchmarks and SOTA for a specific task
Summarizing research trends in a domain
Assessing reproducibility of published results
Comparing competing approaches for a project

Core Concepts

Concept	Description
SOTA Tracking	Monitoring best results on standard benchmarks
Paper Analysis	Structured reading: claims, methods, results, limitations
Reproducibility	Can results be replicated? Code, data, compute requirements
Ablation Study	Which components matter? Sensitivity analysis
Transfer Potential
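
The structured reading and reproducibility concepts above can be captured in a small record type. This is a minimal sketch, not part of the original skill: the class name `PaperNotes`, its field names, and both helper methods are hypothetical, chosen only to mirror the table (claims, methods, results, limitations; code, data, compute).

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class PaperNotes:
    """Hypothetical note template for one paper (names are illustrative)."""

    title: str
    # Structured reading: one list per section named in the table.
    claims: list[str] = field(default_factory=list)
    methods: list[str] = field(default_factory=list)
    results: list[str] = field(default_factory=list)
    limitations: list[str] = field(default_factory=list)
    # Reproducibility: code, data, compute requirements.
    code_released: bool = False
    data_available: bool = False
    compute_documented: bool = False

    def missing_sections(self) -> list[str]:
        """Return the reading sections that are still empty."""
        sections = ("claims", "methods", "results", "limitations")
        return [name for name in sections if not getattr(self, name)]

    def reproducibility_gaps(self) -> list[str]:
        """Return which reproducibility ingredients are absent."""
        checks = {
            "code": self.code_released,
            "data": self.data_available,
            "compute": self.compute_documented,
        }
        return [name for name, ok in checks.items() if not ok]


if __name__ == "__main__":
    # Placeholder content, not taken from any real paper.
    notes = PaperNotes(title="Example paper", claims=["Placeholder claim"])
    print(notes.missing_sections())
    print(notes.reproducibility_gaps())
```

Keeping the gaps as lists rather than a single pass/fail flag makes it easy to see what to look for on a second read.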

Kyutai Open-Source Reference: Published Papers

Paper	Topic	Relevance
arxiv:2410.00037	Moshi: full-duplex speech-text	Core architecture
arxiv:2502.03382	Hibiki: speech translation	Cross-lingual
arxiv:2509.06926	Pocket-TTS: lightweight TTS	Edge deployment
arxiv:2505.18825	LSD: latent speech diffusion	Generative method
arxiv:2106.09685	LoRA	Fine-tuning method
arxiv:2104.09864	RoPE	Position encoding
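
The identifiers in this table can be turned into abstract-page links. The sketch below assumes references are written as `arxiv:<id>` exactly as in the table and uses the public `https://arxiv.org/abs/<id>` URL pattern; the function name `arxiv_abs_url` and the regular expression are illustrative, and only new-style identifiers (YYMM.NNNNN, optional version suffix) are handled.

```python
import re

# New-style arXiv identifiers: four digits, a dot, four or five digits,
# optionally followed by a version suffix such as "v2".
_ARXIV_REF = re.compile(r"^arxiv:(\d{4}\.\d{4,5})(v\d+)?$", re.IGNORECASE)


def arxiv_abs_url(ref: str) -> str:
    """Convert a reference like 'arxiv:2106.09685' to its abstract URL."""
    match = _ARXIV_REF.match(ref.strip())
    if match is None:
        raise ValueError(f"not a recognised arXiv reference: {ref!r}")
    identifier = match.group(1) + (match.group(2) or "")
    return "https://arxiv.org/abs/" + identifier


if __name__ == "__main__":
    for ref in ("arxiv:2106.09685", "arxiv:2104.09864"):
        print(arxiv_abs_url(ref))
```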

Key Conferences

Conference	When	Focus
NeurIPS	December	General ML
ICML	July	General ML
ICLR	May	Representation learning
ACL/EMNLP	July/December	NLP
Interspeech	September	Speech
ICASSP	April/June	Signal processing
CVPR/ICCV	June/October	Computer vision

Phase 2: Benchmark Tracking

Language Models

Benchmark	Measures	Key Models
MMLU	Knowledge, reasoning	GPT-4, Claude, Llama
HumanEval	Code generation	Codex, DeepSeek-Coder
HellaSwag	Common sense	Most LLMs
ARC	Science reasoning	Most LLMs

Speech/Audio

Benchmark	Measures	Key Models
LibriSpeech	ASR (WER)	Whisper, Conformer
PESQ/STOI	Audio quality	Mimi, EnCodec, DAC
MOS	Subjective quality	TTS systems
VoiceBox benchmark	Speech generation	VoiceBox, Moshi
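
For the WER metric listed against LibriSpeech, a minimal sketch of the standard definition may help when checking a paper's reported numbers: word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. The function name is illustrative, and the whitespace tokenisation is an assumption; published evaluations usually normalise case and punctuation first.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = word-level Levenshtein distance / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(
                min(
                    prev[j] + 1,         # deletion (reference word dropped)
                    curr[j - 1] + 1,     # insertion (extra hypothesis word)
                    prev[j - 1] + cost,  # substitution or exact match
                )
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One of six reference words is missing from the hypothesis.
    print(round(word_error_rate("the cat sat on the mat",
                                "the cat sat on mat"), 3))  # 0.167
```

Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference.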

Vision

Benchmark	Measures	Key Models
ImageNet	Classification	ViT, EfficientNet
COCO	Detection/Segmentation	DETR, SAM
FID/IS	Image generation quality	Diffusion models
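
One practical detail when tracking SOTA across these tables is that metrics point in different directions: WER and FID improve as they fall, while accuracy-style scores and IS improve as they rise. The sketch below is one way to record a best-so-far entry per benchmark under that assumption. `Result`, `LOWER_IS_BETTER`, and `update_sota` are hypothetical names, and the values in the usage example are placeholders, not reported results.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Result:
    model: str
    metric: str
    value: float


# Metrics where a smaller value is better; everything else is treated
# as higher-is-better (an assumption of this sketch).
LOWER_IS_BETTER = {"WER", "FID"}


def is_improvement(new: Result, best: Optional[Result]) -> bool:
    """True if `new` beats the current best for the same metric."""
    if best is None:
        return True
    if new.metric != best.metric:
        raise ValueError("cannot compare results on different metrics")
    if new.metric in LOWER_IS_BETTER:
        return new.value < best.value
    return new.value > best.value


def update_sota(table: dict[str, Result], benchmark: str, new: Result) -> bool:
    """Store `new` as the best result for `benchmark` if it improves on it."""
    if is_improvement(new, table.get(benchmark)):
        table[benchmark] = new
        return True
    return False


if __name__ == "__main__":
    sota: dict[str, Result] = {}
    # Placeholder models and values for illustration only.
    update_sota(sota, "LibriSpeech", Result("model-a", "WER", 5.0))
    update_sota(sota, "LibriSpeech", Result("model-b", "WER", 4.0))
    update_sota(sota, "LibriSpeech", Result("model-c", "WER", 6.0))
    print(sota["LibriSpeech"].model)  # model-b
```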

Phase 4: Applicability Assessment

Criterion	Weight	Score (1-5)
Relevance to audio/speech	30%
Implementation complexity	20%
Compute requirements	20%
Quality improvement expected	20%
Maturity / reproducibility	10%
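
The weights above combine into a single figure as a weighted average. The sketch below assumes every criterion is scored so that 5 is the most favourable outcome (for example, 5 on compute requirements means cheap to run); the table does not state the direction, so treat that as an assumption. The dictionary keys and the function name are shorthand introduced here.

```python
# Weights copied from the criteria table; keys are shorthand labels.
WEIGHTS = {
    "relevance": 0.30,
    "implementation_complexity": 0.20,
    "compute_requirements": 0.20,
    "quality_improvement": 0.20,
    "maturity": 0.10,
}


def applicability_score(scores: dict) -> float:
    """Weighted average of 1-5 scores; the result is also on a 1-5 scale."""
    if set(scores) != set(WEIGHTS):
        raise ValueError("scores must cover exactly the five criteria")
    for name, value in scores.items():
        if not 1 <= value <= 5:
            raise ValueError(f"{name} must be between 1 and 5, got {value}")
    # Weights sum to 1.0, so no further normalisation is needed.
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


if __name__ == "__main__":
    # Illustrative scores for a hypothetical paper.
    example = {
        "relevance": 5,
        "implementation_complexity": 3,
        "compute_requirements": 2,
        "quality_improvement": 4,
        "maturity": 3,
    }
    print(round(applicability_score(example), 2))  # 3.6
```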

Ai Research Watch

When to Use

Core Concepts

Procedure

Phase 1: Paper Analysis

Phase 2: Benchmark Tracking

Language Models

Speech/Audio

Vision

Phase 3: Research Trend Monitoring

Phase 4: Applicability Assessment

Kyutai Open-Source Reference — Published Papers

Key Conferences

Output Format
