Use when converting legacy Zoom aligner output into SAM and the read length must be supplied explicitly.
Use when removing invariant columns from a tab-delimited table, especially in EDirect or bioinformatics comparison pipelines.
Use when working with Variant Call Format (VCF) files and you need to filter, summarize, or manipulate variant data.
Use when masking repetitive or low-complexity regions in genomic sequences before alignment or database searches.
Publish and manage research papers on Hugging Face Hub. Supports creating paper pages, linking papers to models/datasets, claiming authorship, and generating professional markdown-based research articles.
2-8 scientific lens passes to surface hidden requirements that single-pass analysis misses. USE WHEN iterative depth, deep exploration, multi-angle analysis, multiple perspectives, examine from angles, surface hidden requirements.
Deep brainstorming and solution exploration based on research findings.
Use when you need to shift genomic intervals in BED/GFF/VCF files by a specified number of base pairs, either uniformly or strand-specifically.
Use when you need to sort BED, GFF, or VCF interval files for downstream bedtools processing, or rank records by feature size or score.
Use when aligning long RNA-seq reads to a reference genome or generating genome indexes for spliced transcript alignment.
Use when aligning RNA-seq reads to a reference genome with splice-aware mapping, generating genome indices, or performing related operations like lift-over and BAM input processing.
Use when aligning RNA-seq reads to a reference genome with splice-aware mapping, generating genome indexes, or performing splice junction detection.
Use when aligning RNA-seq reads to a reference genome or generating splice-aware genome indices for transcript alignment.
Use when building an index from a reference sequence for Subread alignment tools.
Use when you need to remove overlapping portions of one interval set from another, such as subtracting blacklist, repeat, or annotation regions from BED, GFF, VCF, or BAM-like inputs.
Use when converting `spdi2tbl`-style variant rows into reference and altered product sequences for coding or protein variants.
Use when you need to annotate BAM alignments with a two-character tag based on overlaps with BED, GFF, or VCF annotation files, such as labeling reads by feature class or interval source.
Use when smoke-testing an Entrez Direct installation with the bundled long-form example suite or the focused `-test` trace mode.
Use when downloading or updating pre-formatted BLAST databases from NCBI or cloud providers (AWS, GCP).
Use when analyzing biological sequence, annotation, variation, expression, or comparative genomics data with reproducible computational workflows.
Use when applying VCF variants to a reference FASTA to generate a consensus sequence.
Use when you need to validate VCF files for format compliance and detect issues like duplicate positions.
Use when VCF files have incorrect ploidy annotations for sex chromosomes or mitochondrial DNA, particularly when processing samples with known sex but mismatched genotype fields.
Use when aligning RNA-seq reads to a reference genome with junction detection, including exon-exon junctions and gene fusions.
Use when simulating paired-end short reads from a reference FASTA for testing, benchmarking, or pipeline validation.
Use when you need to find features in one file that fall within a configurable window around features in another file, including strand-aware upstream and downstream proximity searches.
Use when you need to create adjacent or sliding windows across a genome or BED file for binning genomic regions into fixed-size or fixed-count intervals.
Use when scanning a reference index for all high-similarity genomic locations of one specific read sequence string.
Use when you need to calculate the transition/transversion (Ts/Tv) ratio from VCF files for variant call quality assessment.
Use when comparing variant samples against background samples to identify unique genotypes and novel variants in VCF files.