Data Scientists

ClickHouse Query

Run ClickHouse queries for analytics, metrics analysis, and event data exploration. Use when you need to query ClickHouse directly, analyze metrics, check event tracking data, or test query performance. Read-only by default.

civitai 7.1k

LLM & AI

Chroma

Open-source embedding database for AI applications. Store embeddings and metadata, perform vector and full-text search, filter by metadata. Simple 4-function API. Scales from notebooks to production clusters. Use for semantic search, RAG applications, or document retrieval. Best for local development and open-source projects.

Bioinformatics

Sparse Autoencoder Training

Provides guidance for training and analyzing Sparse Autoencoders (SAEs) using SAELens to decompose neural network activations into interpretable features. Use when discovering interpretable features, analyzing superposition, or studying monosemantic representations in language models.

GRPO RL Training

Expert guidance for GRPO/RL fine-tuning with TRL for reasoning and task-specific model training.

SimPO Training

Simple Preference Optimization for LLM alignment. Reference-free alternative to DPO with better performance (+6.4 points on AlpacaEval 2.0). No reference model needed, more efficient than DPO. Use for preference alignment when you want simpler, faster training than DPO/PPO.
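As a rough illustration of what "reference-free" means here, the sketch below computes a SimPO-style loss for a single preference pair: each response is scored by its length-normalized log-probability under the policy alone, and the chosen response is pushed above the rejected one by a target margin. The function name `simpo_loss` and the default values of `beta` and `gamma` are illustrative assumptions, not settings taken from this listing or from any particular library.

```python
import math


def simpo_loss(logp_chosen, len_chosen, logp_rejected, len_rejected,
               beta=2.0, gamma=1.0):
    """SimPO-style loss for one preference pair (hypothetical helper).

    logp_* are summed token log-probabilities under the policy model;
    len_* are the response lengths in tokens. beta and gamma are
    illustrative defaults, not recommended hyperparameters.
    """
    # Implicit reward: length-normalized log-probability, no reference model.
    reward_chosen = beta * logp_chosen / len_chosen
    reward_rejected = beta * logp_rejected / len_rejected
    margin = reward_chosen - reward_rejected - gamma
    # -log(sigmoid(margin)), written in a numerically stable form.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# The loss is smaller when the chosen response is the more likely one.
good = simpo_loss(logp_chosen=-5.0, len_chosen=10,
                  logp_rejected=-20.0, len_rejected=10)
bad = simpo_loss(logp_chosen=-20.0, len_chosen=10,
                 logp_rejected=-5.0, len_rejected=10)
print(good < bad)
```

In a real training run this quantity would be computed on batched tensors and averaged; the scalar version is only meant to show the shape of the objective.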

Fine-Tuning with TRL

Fine-tune LLMs using reinforcement learning with TRL: SFT for instruction tuning, DPO for preference alignment, PPO/GRPO for reward optimization, and reward model training. Use when you need RLHF, want to align a model with preferences, or want to train from human feedback. Works with HuggingFace Transformers.

Constitutional AI

Anthropic's method for training harmless AI through self-improvement. Two-phase approach - supervised learning with self-critique/revision, then RLAIF (RL from AI Feedback). Use for safety alignment, reducing harmful outputs without human labels. Powers Claude's safety system.

Software Developers

Data Scientists

Software Quality Assurance Analysts and Testers

Network and Computer Systems Administrators

Information Security Analysts

Web Developers

Database Architects

Web and Digital Interface Designers

Database Administrators

Computer Systems Analysts

Computer and Information Systems Managers

Computer Programmers

Postsecondary Computer Science Teachers

Computer and Information Research Scientists

Computer User Support Specialists

Computer Network Architects

Electronics Engineers

Data Entry Keyers

Statistical Assistants

Network Support Specialists

Office Equipment and ATM Repairers

Hardware Engineers

Office Machine Operators

CNC Machine Operators

Lab Tools

A/B Test Analysis

Analyze A/B test results with statistical significance, sample size validation, confidence intervals, and ship/extend/stop recommendations. Use when evaluating experiment results, checking if a test reached significance, interpreting split test data, or deciding whether to ship a variant.
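One common way to check whether a conversion-rate test reached significance is a two-proportion z-test with a confidence interval on the difference. The sketch below uses only the standard library; the function name `two_proportion_z_test`, the 95% level, and the sample numbers are illustrative assumptions and are not taken from the listed skill.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Compare conversion rates of control (A) and variant (B).

    Returns (z, two_sided_p_value, ci_95_for_difference), where the
    difference is rate_B - rate_A. Hypothetical helper for illustration.
    """
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    diff = rate_b - rate_a

    # Pooled standard error under the null hypothesis of equal rates.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se_pooled = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = diff / se_pooled

    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))

    # Unpooled standard error for the 95% confidence interval.
    se_diff = math.sqrt(rate_a * (1 - rate_a) / n_a
                        + rate_b * (1 - rate_b) / n_b)
    ci = (diff - 1.96 * se_diff, diff + 1.96 * se_diff)
    return z, p_value, ci


# Made-up example: 100/1000 conversions in control, 150/1000 in variant.
z, p_value, ci = two_proportion_z_test(100, 1000, 150, 1000)
print(f"z={z:.2f}, p={p_value:.4f}, 95% CI=({ci[0]:.3f}, {ci[1]:.3f})")
```

A ship/extend/stop decision would normally also weigh the pre-planned sample size and the minimum effect worth shipping, which this sketch does not cover.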

phuryn 10.2k

Sales & Marketing

Sentiment Analysis

Analyze user feedback data to identify segments with sentiment scores, JTBD, and product satisfaction insights. Use when analyzing user feedback at scale, running sentiment analysis on reviews or surveys, or identifying satisfaction patterns.

phuryn 10.2k

Sales & Marketing

GS Quant

This document covers the core workflows for using the `gs_quant` library: establishing a session, constructing and resolving instruments, building portfolios, pricing historically, pricing with live market data, and extracting results.

goldmansachs 10.1k

Data Engineering

OD Expert

Anomaly detection expert backed by PyOD's ADEngine. Drives autonomous detection workflows on tabular, time series, graph, text, and image data — profiling, planning, multi-detector comparison, quality assessment, iteration, and reporting. Encodes deep OD knowledge so non-expert users get expert-quality results without driving every decision.

yzhao062 9.8k

Data Analysis

Do some data analysis

nix-community 9.7k

Phoenix Evals New Metric

Create a new built-in classification evaluator for Phoenix evals. Use this skill whenever the user asks to create a new eval, build a new metric, add a new builtin evaluator, create an LLM-as-a-judge metric, or add a new classification evaluator to Phoenix.

Arize-ai 9.3k

Sales Analytics

Database schema and business logic for sales data analysis including customers, orders, and revenue.

alibaba 9.3k

Train SFT

SFT training reference for the ART framework. Use when the user asks to create, write, or help with an SFT training script, fine-tune a model, train from a JSONL dataset, do distillation, or anything related to supervised fine-tuning.

OpenPipe 9.2k

Education

Train RL

RL training reference for the ART framework. Use when the user asks to create, write, or help with an RL training script, reinforcement learning, GRPO, reward functions, RULER scoring, rollout functions, or anything related to RL fine-tuning.

OpenPipe 9.2k

Data Engineering

Data Research

Structured data research: search sources, extract structured data, archive raw sources, maintain canonical tracker pages, deduplicate. Parameterized via YAML recipes for investor updates, donations, company updates, or any email-to-structured-data pipeline.

garrytan 9.1k

Debugging

Sentiment Analysis

Extracts the true audience mood and key feedback by analyzing comment sentiment and keyword frequency.

google 8.9k

Understand Knowledge

Analyze a Karpathy-pattern LLM wiki knowledge base and generate an interactive knowledge graph with entity extraction, implicit relationships, and topic clustering.

Lum1104 8.5k

Data Engineering

ML Pipeline Expert

Designs and implements production-grade ML pipeline infrastructure: configures experiment tracking with MLflow or Weights & Biases, creates Kubeflow or Airflow DAGs for training orchestration, builds feature store schemas with Feast, deploys model registries, and automates retraining and validation workflows. Use when building ML pipelines, orchestrating training workflows, automating model lifecycle, implementing feature stores, managing experiment tracking systems, setting up DVC for data versioning, tuning hyperparameters, or configuring MLOps tooling like Kubeflow, Airflow, MLflow, or Prefect.

LLM & AI

Fine-Tuning Expert

Use when fine-tuning LLMs, training custom models, or adapting foundation models for specific tasks. Invoke for configuring LoRA/QLoRA adapters, preparing JSONL training datasets, setting hyperparameters for fine-tuning runs, adapter training, transfer learning, finetuning with Hugging Face PEFT, OpenAI fine-tuning, instruction tuning, RLHF, DPO, or quantizing and deploying fine-tuned models. Trigger terms include: LoRA, QLoRA, PEFT, finetuning, fine-tuning, adapter tuning, LLM training, model training, custom model.

Pandas Pro

Performs pandas DataFrame operations for data analysis, manipulation, and transformation. Use when working with pandas DataFrames, data cleaning, aggregation, merging, or time series analysis. Invoke for data manipulation tasks such as joining DataFrames on multiple keys, pivoting tables, resampling time series, handling NaN values with interpolation or forward-fill, groupby aggregations, type conversion, or performance optimization of large datasets.

LLM & AI

RAG Architect

Designs and implements production-grade RAG systems by chunking documents, generating embeddings, configuring vector stores, building hybrid search pipelines, applying reranking, and evaluating retrieval quality. Use when building RAG systems, vector databases, or knowledge-grounded AI applications requiring semantic search, document retrieval, context augmentation, similarity search, or embedding-based indexing.

Spark Engineer

Use when writing Spark jobs, debugging performance issues, or configuring cluster settings for Apache Spark applications, distributed data processing pipelines, or big data workloads. Invoke to write DataFrame transformations, optimize Spark SQL queries, implement RDD pipelines, tune shuffle operations, configure executor memory, process .parquet files, handle data partitioning, or build structured streaming analytics.